Redaction

eDiscovery Trends: Christine Musil of Informative Graphics Corporation (IGC)


This is the second of the 2012 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?
  2. Which trend(s), if any, haven’t emerged to this point like you thought they would?
  3. What are your general observations about LTNY this year and how it fits into emerging trends? (Note: Christine was interviewed the night before the show, so there were obviously no observations at that point)
  4. What are you working on that you’d like our readers to know about?

Today’s thought leader is Christine Musil. Christine has a diverse career in engineering and marketing spanning 18 years. She has been with IGC since March 1996, when she started as a technical writer and quality assurance engineer. After moving to marketing in 2001, she has applied her in-depth knowledge of IGC's products and benefits to marketing initiatives, including branding, overall messaging, and public relations. She has also been a contributing author to a number of publications on archiving formats, redaction, and viewing technology in the enterprise.

What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?  And which trend(s), if any, haven’t emerged to this point like you thought they would?

That's a hard question.  Especially for us, because we're somewhat tangential to the market and not as deeply enmeshed in it as a lot of the other vendors are.  I think the number of acquisitions in the industry was what we expected, though maybe the M&A players themselves were surprising.  For example, I didn't personally see the recent ADI acquisition (Applied Discovery acquired by Siris Capital) coming.  And while we weren’t surprised that Clearwell was acquired, we thought that their being acquired by Symantec was an interesting move.

So, we expect the consolidation to continue.  We watched the major content management players like EMC and OpenText to see if they would acquire additional, targeted eDiscovery providers to round out some of their solutions, but through 2011 they didn’t seem to have decided whether they're “all in”, despite some previous acquisitions in the space.  We had wondered whether some of them had decided they're out again, though EMC is here in force for Kazeon this year.  So, I think that’s some of what surprised me about the market.

Other trends that I see are potentially more changes in the FRCP (Federal Rules of Civil Procedure) and probably a continued push towards project-based pricing.  We have certainly felt the pressure to do more project-based pricing, so we're watching that.  Escalating data volumes have caused cost increases and, obviously, something's going to have to give there.  That's where I think we’re going to see more regulation come out through new FRCP rules to provide more proportionality to the discovery process, or clients will simply dictate more pricing alternatives.

What are you working on that you’d like our readers to know about?

We just announced a new release of our Brava!® product, version 7.1, at the show.  The biggest additions to Brava are in the Enterprise version, and we’re debuting the new Brava Changemark® Viewer for smartphones as well as an upcoming Brava HTML client for tablets.  iPads have been a bigger game changer than I think a lot of people even anticipated.  So, we’re excited about it.  Also new with Brava 7.1 is video collaboration, along with improved enterprise readiness and performance for very large deployments.

We also just announced the results of our Redaction Survey, which we conducted to gauge user adoption of electronic redaction software.  Nearly 65% of the survey respondents were from law firms, a key indicator of the importance of redaction within the legal community.  Of the respondents, 25% indicated that they are still doing redaction manually, with markers or redaction tape, 32% are redacting electronically, and nearly 38% are using a combined approach with both paper-based and software-driven redaction.  Among those who redact electronically, the reasons they prefer it included the professional look of the redactions, time savings, efficiency and the “environmental friendliness” of doing it electronically.

For us, it's exciting moving into those areas, and our partnerships continue to be a bright spot as well.  We have partnerships with LexisNexis and Clearwell, both of which are unaffected by the recent acquisitions.  So, that's what's new at IGC.

Thanks, Christine, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Best Practices: When Preparing Production Sets, Quality is Job 1


OK, I admit I stole that line from an old Ford commercial. 😉

Yesterday, we talked about addressing parameters of production up front to ensure that those requirements make sense and avoid foreseeable production problems well before the production step.  Today, we will talk about quality control (QC) mechanisms to make sure that the production is complete and accurate.

Quality Control Checks

There are a number of checks that can and should be performed on the production set, prior to producing it to the requesting party.  Here are some examples:

  • File Counts: The most obvious check you can perform is to ensure that the count of files matches the count of documents or pages you have identified to be produced (a scripted sketch of these checks follows this list).  However, depending on the production, there may be multiple file counts to check:
    • Image Files: If you have agreed with opposing counsel to produce images for all documents, then there will be a count of images to confirm.  If you’re producing multi-page image files (typically, PDF or TIFF), the count of images should match the count of documents being produced.  If you’re producing single-page image files (usually TIFF), then the count should match the number of pages being produced.
    • Text Files: When producing image files, you may also be producing searchable text files.  Again, the count should match either the documents (multi-page text files) or pages (single-page text files), with one possible exception: if a document or page has no searchable text, are you still producing an empty file for it?  If not, you will need to know how many of those instances there are and adjust the expected count accordingly for QC purposes.
    • Native Files: Native files (if produced) are typically at the document level, so you would want to confirm that one exists for each document being produced.
    • Subset Counts: If the documents are being produced in a certain organized manner (e.g., a folder for each custodian), it’s a good idea to identify subset counts at those levels and verify those counts as well.  Not only does this provide an extra level of count verification, but it helps to find the problem more quickly if the overall count is off.
    • Verify Counts on Final Production Media: If you’re verifying counts of the production set before copying it to the media (which is common when burning files to CD or DVD), you will need to verify those counts again after copying to ensure that all files made it to the final media.
  • Sampling of Results: Unless the production is relatively small, it may be impractical to open every last file to be produced to confirm that it is correct.  If so, employ accepted statistical sampling procedures (such as those described here and here for searching) to identify an appropriate sample size and randomly select that sample to open and confirm that the correct files were selected, HASH values of produced native files match the original source versions of those files, images are clear and text files contain the correct text.
  • Redacted Files: If any redacted files are being produced, each of these (not just a sample subset) should be reviewed to confirm that redactions of privileged or confidential information made it to the produced file.  Many review platforms overlay redactions which have to be burned into the images at production time, so it’s easy for mistakes in the process to cause those redactions to be left out or burned in at the wrong location.
  • Inclusion of Logs: Depending on agreed upon parameters, the production may include log files such as:
    • Production Log: Listing of all files being produced, with an agreed upon list of metadata fields to identify those files.
    • Privilege Log: Listing of responsive files not being produced because of privilege (and possibly confidentiality as well).  This listing often identifies the privilege being asserted for each file in the privilege log.
    • Exception Log: Listing of files that could not be produced because of a problem with the file.  Examples of types of exception files are included here.
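
To make the count and HASH checks concrete, here is a minimal Python sketch of the kind of script a litigation support tech might run before a production goes out the door.  The folder layout (IMAGES, TEXT, NATIVES), the single-page TIFF assumption and the expected counts are all hypothetical; adapt them to your own production structure.

```python
import hashlib
import random
from pathlib import Path

PRODUCTION = Path("PROD001")   # hypothetical production volume root
EXPECTED_DOCS = 1250           # document count from your review platform
EXPECTED_PAGES = 9834          # page count from your review platform

def count_files(folder: Path, pattern: str) -> int:
    """Count produced files matching a glob pattern (directories excluded)."""
    return sum(1 for p in folder.rglob(pattern) if p.is_file())

def md5(path: Path) -> str:
    """HASH a produced file so it can be compared to its source version."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# File counts: single-page TIFFs should match the page count;
# native files should match the document count.
print("images: ", count_files(PRODUCTION / "IMAGES", "*.tif"), "expected", EXPECTED_PAGES)
print("natives:", count_files(PRODUCTION / "NATIVES", "*"), "expected", EXPECTED_DOCS)
print("text:   ", count_files(PRODUCTION / "TEXT", "*.txt"),
      "expected", EXPECTED_PAGES, "(less any pages with no searchable text)")

# Sampling: compare HASH values of a random sample of produced natives
# against values recorded at collection time.  This map would come from
# your review platform or collection log; the entry below is illustrative.
source_hashes = {"DOC000123.xlsx": "0cc175b9c0f1b6a831c399e269772661"}
natives = sorted(p for p in (PRODUCTION / "NATIVES").rglob("*") if p.is_file())
for produced in random.sample(natives, k=min(5, len(natives))):
    expected = source_hashes.get(produced.name)
    if expected and md5(produced) != expected:
        print("HASH mismatch:", produced.name)
```

Per the last check in the list above, the same counts should be run once more against the final production media after the files are copied.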

Each production will have different parameters, so the QC requirements will differ as well; these are examples, not necessarily a comprehensive list of all potential QC checks to perform.

So, what do you think?  Can you think of other appropriate QC checks to perform on production sets?  If so, please share them!  As well as any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Production is the “Ringo” of the eDiscovery Phases


Since eDiscovery Daily debuted over 14 months ago, we’ve covered a lot of case law decisions related to eDiscovery.  65 posts related to case law to date, in fact.  We’ve covered cases associated with sanctions related to failure to preserve data, issues associated with incomplete collections, inadequate searching methodologies, and inadvertent disclosures of privileged documents, among other things.  We’ve noted that 80% of the costs associated with eDiscovery are in the Review phase and that volume of data and sources from which to retrieve it (including social media and “cloud” repositories) are growing exponentially.  Most of the “press” associated with eDiscovery ranges from the “left side of the EDRM model” (i.e., Information Management, Identification, Preservation, Collection) through the stages to prepare materials for production (i.e., Processing, Review and Analysis).

All of those phases lead to one inevitable stage in eDiscovery: Production.  Yet, few people talk about the actual production step.  If Preservation, Collection and Review are the “John”, “Paul” and “George” of the eDiscovery process, Production is “Ringo”.

It’s the final crucial step in the process, and if it’s not handled correctly, all of the due diligence spent in the earlier phases could mean nothing.  So, it’s important to plan for production up front and to apply a number of quality control (QC) checks to the actual production set to ensure that the production process goes as smoothly as possible.

Planning for Production Up Front

When discussing the production requirements with opposing counsel, it’s important to ensure that those requirements make sense, not only from a legal standpoint, but from a technical standpoint as well.  Involve support and IT personnel in the process of deciding those parameters, as they will be the people who have to meet them.  Issues to be addressed include, but are not limited to:

  • Format of production (e.g., paper, images or native files);
  • Organization of files (e.g., organized by custodian, legal issue, etc.);
  • Numbering scheme (e.g., Bates labels for images, sequential file names for native files);
  • Handling of confidential and privileged documents, including log requirements and stamps to be applied;
  • Handling of redactions;
  • Format and content of production log;
  • Production media (e.g., CD, DVD, portable hard drive, FTP, etc.).

I was involved in a case recently where opposing counsel was requesting an unusual production format in which the names of the files would be the subject lines of the emails being produced (for example, “Re: Completed Contract, dated 12/01/2011”).  Two issues with that approach: 1) the proposed format only addressed emails, and 2) Windows file names don’t support certain characters, such as colons (:) or slashes (/).  I provided that feedback to the attorneys so that they could address it with opposing counsel and hopefully agree on a revised format that made more sense.  So, let the tech folks confirm the feasibility of the production parameters.
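
For what it’s worth, the file name issue is easy to demonstrate.  Here is a minimal sketch (the helper below is hypothetical, not something from any review platform) of what a subject-line-based naming scheme would require before files could even be written to a Windows volume:

```python
import re

# Characters Windows forbids in file names: \ / : * ? " < > |
INVALID = re.compile(r'[\\/:*?"<>|]')

def subject_to_filename(subject: str, ext: str = ".msg") -> str:
    """Replace forbidden characters so an email subject can serve as a file name."""
    return INVALID.sub("_", subject).strip() + ext

print(subject_to_filename("Re: Completed Contract, dated 12/01/2011"))
# -> Re_ Completed Contract, dated 12_01_2011.msg
```

Sanitizing is only part of the problem, of course; duplicate subject lines would still collide as file names.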

The workflow throughout the eDiscovery process should also keep in mind the end goal of meeting the agreed upon production requirements.  For example, if you’re producing native files with metadata, you may need to take appropriate steps to keep the metadata intact during the collection and review process so that the metadata is not inadvertently changed. For some file types, metadata is changed merely by opening the file, so it may be necessary to collect the files in a forensically sound manner and conduct review using copies of the files to keep the originals intact.
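
One common way to support that workflow is to record a HASH of each collected file up front and let reviewers work only from copies.  Here is a minimal sketch of that idea; the folder names and the choice of MD5 are assumptions, not a prescription:

```python
import hashlib
import json
import shutil
from pathlib import Path

COLLECTED = Path("collected")   # forensically collected originals (hypothetical)
WORKING = Path("working")       # copies that reviewers are allowed to open

def md5(path: Path) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

WORKING.mkdir(exist_ok=True)
baseline = {}
for original in COLLECTED.rglob("*"):
    if original.is_file():
        baseline[original.name] = md5(original)
        # copy2 preserves file-system timestamps on the working copy;
        # review happens on the copy, never on the original
        shutil.copy2(original, WORKING / original.name)

# The baseline lets you prove later that the originals never drifted.
Path("baseline_hashes.json").write_text(json.dumps(baseline, indent=2))
```

If anyone questions the integrity of the originals later, re-hashing them against the baseline settles it.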

Tomorrow, we will talk about preparing the production set and performing QC checks to ensure that the ESI being produced to the requesting party is complete and accurate.

So, what do you think?  Have you had issues with production planning in your cases?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Avoiding eDiscovery Nightmares: 10 Ways CEOs Can Sleep Easier


I found this article in the CIO Central blog on Forbes.com from Robert D. Brownstone – it’s a good summary of issues for organizations to consider so that they can avoid major eDiscovery nightmares.  The author counts down his top ten list David Letterman style (clever!) to provide a nice, easy-to-follow summary of the issues.  Here’s a recap, with my ‘two cents’ on each item:

10. Less is more: The U.S. Supreme Court ruled unanimously in 2005 in the Arthur Andersen case that a “retention” policy is actually a destruction policy.  It’s important to routinely dispose of old data that is no longer needed to have less data subject to discovery and just as important to know where that data resides.  My two cents: A data map is a great way to keep track of where the data resides.

9. Sing Kumbaya: They may speak different languages, but you need to find a way to bridge the communication gap between Legal and IT to develop an effective litigation-preparedness program.  My two cents: Require cross-training so that each department can understand the terms and concepts important to the other.  And, don’t forget the records management folks!

8. Preserve or Perish: Assign the litigation hold protocol to one key person, either a lawyer or a C-level executive, to decide when a litigation hold must be issued.  Ensure an adequate process and memorialize steps taken – and not taken.  My two cents: Memorialize is the key word, because an organization that has a defined process and the documentation to back it up is much more likely to be given leeway in the courts than a company that doesn’t document its decisions.

7. Build the Three-Legged Stool: A successful eDiscovery approach involves knowledgeable people, great technology, and up-to-date written protocols.  My two cents: Up-to-date written protocols are the first thing to slide when people get busy – don’t let it happen.

6. Preserve, Protect, Defend: Your techs need the knowledge to avoid altering metadata, maintain chain-of-custody information and limit access to a working copy for processing and review.  My two cents: A good review platform will assist greatly in all three areas.

5. Natives Need Not Make You Restless: Consider exchanging files to be produced in their original/”native” formats to avoid huge out-of-pocket costs of converting thousands of files to image format.  My two cents: Be sure to address how redactions will be handled as some parties prefer to image those while others prefer to agree to alter the natives to obscure that information.

4. Get M.A.D.?  Then Get Even: Apply the Mutually Assured Destruction (M.A.D.) principle to agree with the other side to take costly volumes of data, such as digital voicemails and back-up data created down the road, off the table.  My two cents: That’s assuming, of course, you have the same levels of data.  If one party has a lot more data than the other party, there may be no incentive for that party to agree to concessions.

3. Cooperate to Cull Aggressively and to Preserve Clawback Rights: Setting expectations regarding culling efforts and reaching a clawback agreement with opposing counsel enables each side to cull more aggressively to reduce eDiscovery costs.  My two cents: Some parties will agree on search terms up front while others will feel that gives away case strategy, so the level of cooperation may vary from case to case.

2. QA/QC: Employ Quality Assurance (QA) tests throughout review to ensure a high accuracy rate, then perform Quality Control (QC) testing before the data goes out the door, building time in the schedule for that QC testing.  Also, consider involving a search-methodology expert.  My two cents: I cannot stress that last point enough – the ability to illustrate how you got from the large collection set to the smaller production set will be imperative to responding to any objections you may encounter to the produced set.

1. Never Drop Your Laptop Bag and Run: Dig in, learn as much as you can and start building repeatable, efficient approaches.  My two cents: It’s the duty of your attorneys and providers to demonstrate competency in eDiscovery best practices.  How will you know whether they have or not unless you develop that competency yourself?

So, what do you think?  Are there other ways for CEOs to avoid eDiscovery nightmares?   Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscoveryJournal Webinar: More on Native Format Production and Redaction

As noted yesterday, eDiscoveryJournal conducted a webinar last Friday with some notable eDiscovery industry thought leaders (George Socha, Craig Ball and Tom O’Connor) regarding issues associated with native format production and redaction.  The discussion was moderated by Greg Buckles, co-founder of eDiscoveryJournal, who has over 20 years of experience in discovery and consulting.

What follows are more highlights of the discussion, based on my observations and notes from the webinar.  If anyone who attended the webinar feels that there are any inaccuracies in this account, please feel free to submit a comment to this post and I will be happy to address it.

More highlights of the discussion:

  • Redaction – Is it Possible, Practical, Acceptable?: George said it’s certainly possible and practical, but the biggest problem he sees is that redaction is often done without agreement between parties as to how it will be done.  Tom noted that the knee-jerk reaction for most of his clients is “no” – to do it effectively, you need to know your capabilities and what information you’re trying to change.  Craig indicated that it’s not only possible and practical, but often desirable; however, when removing information such as columns from databases or spreadsheets, you need to know the data dependencies and the possibility of “breaking” the file by removing that data.  Craig also remarked that certain file types (such as Microsoft Office files) are now stored in XML format, making it easier to redact them natively without breaking functionality.
  • How to Authenticate Redacted Files based on HASH Value?: Craig said you don’t – redaction changes the file.  Although Craig indicated that some research has been done on “near-HASH” values, George noted that there is currently no such thing and that the HASH value changes completely with a change as small as one character (see the short demonstration after this list).  Tom noted that it’s “tall weeds” when discussing HASH values with clients to authenticate files, as many don’t fully understand the issues – it’s a “where angels fear to tread” concern.
  • Biggest Piece of Advice Regarding Redaction?: Craig said that redaction of native files is hard – so what?  Is the percentage of files requiring redaction so great that it needs to drive the process?  If it’s a small percentage, you can always simply TIFF the files requiring redaction and redact the TIFFs.  George indicated that one of the first things he advises clients to do is to work with the other side on how to handle redactions, and if they won’t work with you, go to the judge to address it.  Tom indicated that he asks the client questions to find out what issues are associated with the redaction, such as what the client wants to accomplish, the percentage of redaction expected, etc., and then provides advice based on those answers.
  • Redaction for Confidentiality (e.g., personal information, trade secrets, etc.): George noted that, while in many cases it’s not a big issue, in some cases it’s a huge issue.  There are currently 48 states that have at least some laws regarding safeguarding personal information, and there are also efforts underway to do so at a national level.  We’re a long way from coming up with an effective way to address this issue.  Craig said that sometimes there are ways to address it programmatically – in one case where he served as special master, his client had a number of spreadsheets with columns of confidential data, and they were able to identify a way to handle those programmatically.  Tom has worked on cases where redaction of social security numbers through search and replace was necessary, but there was a discussion and agreement with opposing counsel before proceeding.
  • How to Guarantee that Redaction Actually Deletes the Data and Doesn’t Just Obscure it?: Tom said he had a situation on a criminal case where they received police reports from the Federal government with information on protected witnesses, which they gave back.  There is not a “cookie-cutter” approach, but you have to understand the data, what’s possible and provide diligent QC.  Craig indicated that he conducts searches for the redacted data to confirm it has been deleted.  Greg noted that you have to make sure that the search tool will reach all of the redacted areas of the file.  George said too often people simply fail to check the results – providers often say that they can’t afford to perform the QC, but law firms often don’t do it either, so it falls through the cracks.  Tom recommends to his law firm clients that they take responsibility to perform that check as they are responsible for the production.  As part of QC, it’s important to have a different set of eyes and even different QC/search tools to confirm successful redaction.
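
Two of those points are easy to see in a few lines of Python.  The first half of this sketch shows why there is no workable “near-HASH”: changing a single character produces a completely different digest.  The second half is a rough version of the leak search Craig described; the folder and redacted terms are hypothetical, and a real check must reach every text layer of every produced file:

```python
import hashlib
from pathlib import Path

# Why there is no "near-HASH": changing one character of the input
# produces a completely different digest.
print(hashlib.sha1(b"The payment was $1,500,000").hexdigest())
print(hashlib.sha1(b"The payment was $1,500,001").hexdigest())

# A rough version of the leak search: scan the text of produced files
# for strings that were supposed to be redacted.  Folder and terms
# are illustrative only.
REDACTED_TERMS = ["555-12-3456", "Project Bluebird"]
for text_file in Path("PROD001/TEXT").rglob("*.txt"):
    content = text_file.read_text(errors="ignore")
    for term in REDACTED_TERMS:
        if term in content:
            print(f"LEAK: {term!r} still present in {text_file.name}")
```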

Thanks to eDiscoveryJournal for a very informative webinar!

So, what do you think?  Do you have any other questions about native format production and redaction?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscoveryJournal Webinar: Debate on Native Format Production and Redaction


eDiscoveryJournal conducted a webinar last Friday with some notable eDiscovery industry thought leaders regarding issues associated with native format production and redaction.  The panel included George Socha of Socha Consulting, LLC and co-founder of EDRM; Craig Ball of Craig D. Ball, P.C. and author of numerous articles on eDiscovery and computer forensics; and Tom O’Connor, a nationally known consultant, speaker and writer in the area of computerized litigation support systems.  All three panelists are nationally recognized speakers and experts on eDiscovery topics.  The panel discussion was moderated by Greg Buckles, co-founder of eDiscoveryJournal, who is also a recognized expert with over 20 years of experience in discovery and consulting.

I wrote an article a few years ago on review and production of native files, so this is a subject of particular interest to me.  What follows are highlights of the discussion, based on my observations and notes from the webinar.  If anyone who attended the webinar feels that there are any inaccuracies in this account, please feel free to submit a comment to this post and I will be happy to address it.

Having said that, here are the highlights:

  • Definition of Native Files: George noted that the technical definition of native files is “in the format as used during the normal course of business”, but in the application of that concept, there is no real consensus.  Tom, who has worked on a number of multi-party cases, has found consensus difficult, as parties have different interpretations of what defines native files.  Craig noted that it’s less about format than about ensuring a “level of information parity” so that both sides have the opportunity to access the same information for those files.
  • “Near-Native” Files: George noted that there is a “quasi-native” or “near-native” format, which is still a native format, even if it isn’t in the original form.  If you have a huge SQL database but only produce a relevant subset out of it in a smaller SQL database, that would be an example of a “near-native” format.  Individual Outlook MSG files are another example; as Craig noted, they are smaller components of the original Outlook mailbox container for which individual message metadata is preserved.
  • Position of Producing Native Files: Craig noted that the position is often to produce in a less usable format (such as TIFF images) because of attorneys’ fear that the opposition will be able to get more information out of the native files than they did.  George noted that you can expect expert fees to double or even quadruple when experts are asked to work with image files as opposed to native files.
  • Negotiation and Production of Metadata: Tom noted that there is a lack of understanding by attorneys as to how metadata differs for each file format.  Craig noted that there is certain “dog tag” metadata, such as file name, path, last modified date and time, custodian name and hash value, that serves as a “driver’s license” for files, whereas the rest of the more esoteric metadata completes the “DNA” of each file (see the illustrative record after this list).  George noted that the EDRM XML project is working towards facilitating standard transfer of file metadata between parties.
  • Advice on Meet and Confer Preparation: When asked by Greg what factor is most important when preparing for meet and confer, Craig said it depends partly on whether you’re the primary producing or requesting party in the case.  Some people prefer “dumbed down” images, so it’s important to know what format you can handle, the issues in the case and, of course, cost considerations.  George noted that there is little or no attention paid to how the files are going to be used later in the case at depositions and trial, and that it’s important to think about how you plan to use the files in presentation and work backward.  Tom noted it’s really important to understand your collection as completely as possible and ask questions such as: What do you have?  How much?  What formats?  Where does it reside?  Tom indicated that he’s astonished how difficult it is for many of his clients to answer these questions.
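
To make the “dog tag” point concrete, here is the kind of minimal per-file record that metadata amounts to.  This is purely illustrative; it is not the EDRM XML schema, which defines its own structure:

```python
# "Dog tag" metadata: the minimal identifying fields for a produced file.
# Field names are illustrative only; the actual EDRM XML schema defines
# its own elements.  This is just the concept in code form.
dog_tags = {
    "file_name": "2011-Q3-forecast.xlsx",
    "file_path": r"\\FILESRV01\finance\forecasts",
    "last_modified": "2011-09-28T16:42:07",
    "custodian": "Jane Doe",
    "md5_hash": "9e107d9d372bb6826bd81d3542a419d6",
}

# The more esoteric, format-specific fields (track-changes authors, email
# BCC recipients, etc.) fill out what Craig called the file's "DNA".
```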

Want to know more?  Tune in tomorrow for the second half of the webinar!  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Thought Leader Q&A: Christine Musil of Informative Graphics Corporation


Tell me about your company and the products you represent.  Informative Graphics Corp. (IGC) is a leading developer of commercial software to view, collaborate on, redact and publish documents. Our products are used by corporations, law firms and government agencies around the world to access and safely share content without altering the original document.

What are some examples of how electronic redaction has been relevant in eDiscovery lately?  Redaction is walking the line between being responsive and protecting privilege and privacy.  A great recent example of a redaction mistake with pretty broad implications involved the lawyers for former Illinois governor Rod Blagojevich requesting a subpoena of President Obama.  The court filing included areas that had been improperly redacted by Blagojevich’s lawyers.  While nothing new or shocking was revealed, this snafu put his reputation up for public inspection and opinion once again.

What are some of the pitfalls in redacting PDFs?  The big pitfall is not understanding what a redaction is and why it is important to do it correctly. People continue to make the mistake of using a drawing tool to cover text and then publishing the document to PDF. The drawing shape visually blocks the text, but someone can use the Text tool in Acrobat to highlight the text and paste it into Notepad.  Using a true electronic redaction tool like Redact-It and being properly trained to use it is essential. 
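
That pitfall is also easy to QC against: run any PDF text extractor over the “redacted” file and see whether the sensitive text comes back.  A minimal sketch, assuming the third-party pdfminer.six package (an assumption on my part, not IGC tooling) is installed:

```python
# pip install pdfminer.six   (third-party extractor; an assumption here,
# not IGC's tooling or any specific review platform)
from pdfminer.high_level import extract_text

text = extract_text("produced_document.pdf")   # hypothetical produced file
for term in ["555-12-3456", "Jane Doe"]:       # strings that should be gone
    if term in text:
        print(f"NOT truly redacted: {term!r} is still extractable")
```

If the supposedly redacted strings print, a drawing shape was layered over the text and the content is still in the file.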

Is there such a thing as native redaction?  This is such a hot topic that I recently wrote a white paper on the subject titled “The Reality of Native Format Production and Redaction.”  The answer is: it depends on who you ask.  From a realistic perspective, no, there is no such thing as native redaction.  There is no tool that supports multiple formats and gives you back the document in the same format as the original.  Even if there were such a tool, this seems dangerous and ripe for abuse (what else might “accidentally” get changed while they are at it?).

You recently joined EDRM’s XML section.  What are you currently working on in that endeavor, to the extent you can talk about it, and why do you think XML is an important part of the EDRM?  The EDRM XML project is all about creating a single, universal format for eDiscovery.  The organization’s goal is really to eliminate issues around the multitude of formats in the world and streamline review and production.  Imagine never again receiving a CD full of flat TIFF files with separate text files!  This whole issue of how users control and see document content is at the core of what IGC does, which makes this project a great fit for IGC’s expertise.

About Christine Musil

Christine Musil is Director of Marketing for Informative Graphics Corporation, a viewing, annotation and content management software company based in Arizona. Informative Graphics makes several products including Redact-It, an electronic redaction solution used by law firms, corporate legal departments, government agencies and a variety of other professional service companies.