Electronic Discovery Archives

Court Rules that Joint Stipulation Supports Plaintiff’s Production of Images Instead of Native Files – eDiscovery Case Law

October 13, 2014

In Melian Labs, Inc. v. Triology LLC, No. 13-cv-04791-SBA (N.D. Cal. Sept. 4, 2014), California Magistrate Judge Kandis A. Westmore denied the defendant’s request to compel production in native form because the production format had already been agreed upon as the parties’ ESI protocol in their Joint Rule 26(f) Report, which supported production in “paper, PDF, or TIFF format”.

In this trademark dispute, the plaintiff sought a declaratory judgment that its website did not infringe upon the defendant’s trademark, but rather, that the defendant’s use of the trademark infringed on the plaintiff’s senior trademark rights.

On March 26, 2014, the parties filed a case management conference statement (referred to as the “Joint Rule 26(f) Report”), and informed the district court that:

“With respect to the production of electronic data and information, the parties agree that the production of metadata beyond the following fields are not necessary in this lawsuit absent a showing of a compelling need: Date Sent, Time Sent, Date Received, Time Received, To, From, CC, BCC, and Email Subject. The parties agree to produce documents electronic form in paper, PDF, or TIFF format, and spreadsheets and certain other electronic files in native format when it is more practicable to do so.”

The plaintiff began its document production on June 23 and had produced 1218 pages of documents to date. On August 1, the defendant complained about the format of the plaintiff’s document production of its electronically stored information (“ESI”), claiming that the produced PDFs were stripped of all metadata in violation of the agreement of the parties and that the spreadsheets were not produced in native format. The defendant contended that the plaintiff’s production of “7 large PDF image documents, which each appear to be a compilation of ESI improperly collected and produced,” were violative of Federal Rule of Civil Procedure 34(b)(2)(E), because they were not produced in their native format and are not reasonably usable. The defendant also contended that the plaintiff failed to comply with the Joint Rule 26(f) Report by refusing to produce all spreadsheets in native format – the plaintiff acknowledged that some of its spreadsheet printouts were difficult to read, and, in those cases, it produced the spreadsheets in native format (Excel) upon request, but contended that the parties never agreed to produce all spreadsheets in native format.

Judge Westmore stated that “Triology’s complaint is purely one of form and, at this juncture, it is not claiming that Melian’s production is incomplete. Rule 34(b) only requires that the parties produce documents as they are kept in the usual course of business or in the form ordinarily maintained unless otherwise stipulated. Fed. R. Civ. P. 34(b)(2)(E). The parties’ Joint Rule 26(f) Report is a stipulation, and, therefore, Rule 34(b) does not govern. Further, the Joint Rule 26(f) Report does not require that all ESI be produced electronically. Instead, it states that ESI may be produced in paper, PDF or TIFF.”

Judge Westmore also noted that “Triology fails to articulate why metadata is important to emails, when every email should contain the information sought on the face of the document.” As a result, she ruled that the defendant’s “request to compel the production of all emails in a searchable or native format is denied”.

So, what do you think? Did the Joint Rule 26(f) Report allow the plaintiff to produce PDFs with no metadata or was the defendant still entitled to native files with at least the email metadata? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Twitter Sues for the Right to be More Transparent – Social Tech eDiscovery

October 10, 2014

Back in July, we took a look at Twitter’s Transparency Report to show government requests for data over the last six months of 2013 (we had previously looked at their very first report here). However, because Twitter is barred by law from disclosing certain details on government surveillance requests, the Transparency Report is not as transparent as Twitter would like. So, on Tuesday, Twitter filed suit against the FBI and the Justice Department, seeking the ability to release more detailed information on government surveillance of Twitter users.

As reported by The Huffington Post, Twitter is asking a judge for permission to publish its full transparency report, including the number of so-called “national security letters” and Foreign Intelligence Surveillance Act orders that it receives. Twitter claims that restrictions on its ability to speak about government surveillance requests are unconstitutional under the First Amendment.

“We’ve tried to achieve the level of transparency our users deserve without litigation, but to no avail,” Twitter said in a blog post announcing the lawsuit, which was filed in federal court. “It’s our belief that we are entitled under the First Amendment to respond to our users’ concerns and to the statements of U.S. government officials by providing information about the scope of U.S. government surveillance – including what types of legal process have not been received. We should be free to do this in a meaningful way, rather than in broad, inexact ranges.

So, today, we have filed a lawsuit in federal court seeking to publish our full Transparency Report, and asking the court to declare these restrictions on our ability to speak about government surveillance as unconstitutional under the First Amendment. The Ninth Circuit Court of Appeals is already considering the constitutionality of the non-disclosure provisions of the NSL law later this week.”

Transparency reports are typically issued by companies to disclose statistics related to requests for user data, records, and website content. These reports indicate how often, and under what authority, governments request data or records over the given period. Through these reports, the public can learn what private information governments gain access to via search warrants, court subpoenas and other methods. Many other major communication platforms provide Transparency Reports as well, such as Facebook, LinkedIn, Google and Microsoft.

According to Twitter’s most recent Transparency Report, the company received 2,058 requests for information on its users over the previous six months from governments around the world – a 46 percent increase from the 1,410 requests received in the prior six-month period. Over 61 percent of those requests (1,257 total) came from the US Government (Japan was next on the list with a mere 192 requests).

Twitter said it supports the USA Freedom Act of 2014, which was introduced earlier this year by Sen. Patrick Leahy (D-Vt.). The bill would allow for greater public reporting about government surveillance requests.

A copy of Twitter’s filed complaint can be found here.

So, what do you think? Do you agree with Twitter that they deserve the right to greater transparency? Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Throwback Thursdays – How things evolved, part 3

October 9, 2014

So far in this blog series, we’ve taken a look at the ‘litigation support culture’ circa 1980, and we’ve covered how databases were built and used. We’ve come a long way since then, and in the past couple of weeks we’ve discussed how things have evolved – we’ll continue that this week. First, though, if you missed the earlier posts in this series, they can be found here, here, here, here, here, here, here, here, here, here and here.

In the past couple of weeks we’ve talked about the form in which document collections were stored and how it evolved – first in paper form, then on microfilm, then microfiche, and then as digital images. Database content has evolved too. Early databases included coded information only. In the mid 1980’s, litigation support professionals started thinking and talking about OCR (optical character recognition) technology, mostly because one of the mainstream litigation support vendors promoted the advantages of full-text databases.

The primary advantage was, of course, the availability of all words on a document for searching. There was a price-tag though, because the starting point was still paper. Text was captured in an OCR scanning process. Like image technology, full-text took a while to catch on in our industry. The biggest hurdle initially was a lack of confidence in the results – with good reason. At the time, searching the internet wasn’t mainstream, so the average litigation team member wasn’t comfortable with employing a less-than-rigid search method.

In addition, search technology was less advanced than it is today, so there was a greater burden on the user to get a search right. And, OCR technology wasn’t as advanced either, so there were a lot of errors in the scanning process – errors that affected search results. Over time, however, these things changed. Average business people became more and more comfortable searching text (thanks in large part to Google); search technology advanced; and OCR technology advanced.

Eventually, including full-text in a database became the norm, and even started replacing coded information. Another factor that contributed to the evolution of full-text was the cost to store data. It used to be expensive. I remember sitting in meetings where attorneys debated on things like using abbreviations and punctuation in databases because of the expense of storage – they looked for every way they could to cut down on the data that was stored. As storage costs went down over the years, it became easier to justify including full-text in databases.

These changes – databases that included images and full-text, coupled with advanced search technology – brought a huge change in how litigation databases were used. Databases were no longer a ‘back-office’ tool – they were used directly by attorneys, and they provided attorneys with very, very fast access to their documents. By the mid 1990’s litigation databases were not only mainstream, but they were regularly portable. Not only did attorneys have almost-immediate access to their documents – they had that access even when not in the office.

This brings us up to the 1990’s, at which point electronic discovery quickly emerged as the next big advancement. I won’t cover the evolution of it in this series… CloudNine has documented that well here in its eDiscovery Daily Blog.

This post concludes the Throwback Thursday blog series. I hope you enjoyed this look back at the way things used to be in our industry!

Please let us know if there are eDiscovery topics you’d like to see us cover in eDiscoveryDaily.


Survey of Corporate Counsel Finds that there is Much Room for Improvement in Handling eDiscovery – eDiscovery Trends

October 8, 2014

Yesterday, we discussed a new self-assessment test that enables organizations to measure their eDiscovery “maturity”. Today, we look at a new survey of corporate counsel from BDO Consulting that shows that they feel there is substantial room for improvement when evaluating their organizations’ effectiveness in managing eDiscovery.

According to BDO’s press release promoting its inaugural Inside E-Discovery Survey, corporate counsel give their internal and external resources a grade of 6.5 out of 10 for overall effectiveness in handling and managing their eDiscovery. The survey was completed by 100 senior in-house counsel and is scheduled to be released in late October.

Here are the top critical factors identified by in-house legal professionals as impacting their eDiscovery process:

Nearly half (48.4 percent) of respondents ranked understanding the universe of potentially responsive evidence early in the case as the most critical factor, more than three times as much as the next ranked factor;
15.6 percent of respondents ranked predicting the total cost of eDiscovery as the most critical factor;
14.1 percent of respondents ranked reducing eDiscovery review fees as the most important factor; and
12.5 percent of respondents reported the ability to use previously collected and processed electronically stored information (ESI) for other matters as the most important factor, pointing to a desire among corporate counsel to achieve efficiencies by reusing prior work product.

When it comes to selecting eDiscovery providers, quality of provider (cited by 47.6 percent of respondents) was named the most important selection factor twice as often as cost (23.8 percent).

Other findings:

Response to Challenges: To respond to the increasing challenges of eDiscovery, 31.4 percent of respondents reported implementing new guidelines or policies within the past year to streamline and improve their response to litigation. Just over one in four (25.5 percent) say they have adopted tools and technologies, while 15.7 percent say they have hired an outside vendor.
Top eDiscovery Challenges Going Forward: When asked about key forward-looking challenges with regards to eDiscovery management, the largest percentage of in-house counsel (22.5 percent) say managing mobile and social networking data is the number one issue they will face in the near future, followed by cost control (17.5 percent), new regulations (15 percent) and automating processes (12.5 percent).
Few Organizations are Ahead of the Curve: Only 5.4 percent of respondents identify their organization as an “early adopter” when it comes to its willingness to adopt new tools and technologies. In addition, only 17.6 percent currently use customized customer portals to view and track project statistics and only 16.2 percent use data visualization techniques to assist in priority processing or review.
Spending on the Rise: 43.2 percent of corporate counsel respondents predict eDiscovery spend will increase within the next year, while a mere 6.2 percent expect it to decrease.

An infographic of the survey results is available from BDO Consulting here.

So, what do you think? Do any of these results surprise you? Please share any comments you might have or if you’d like to know more about a particular topic.


How Mature is Your Organization in Handling eDiscovery? – eDiscovery Best Practices

October 7, 2014

A new self-assessment resource from EDRM helps you answer that question.

A few days ago, EDRM announced the release of the EDRM eDiscovery Maturity Self-Assessment Test (eMSAT-1), the “first self-assessment resource to help organizations measure their eDiscovery maturity” (according to their press release linked here).

As stated in the press release, eMSAT-1 is a downloadable Excel workbook containing 25 worksheets (actually 27 worksheets when you count the Summary sheet and the List sheet of valid choices at the end) organized into seven sections covering various aspects of the e-discovery process. Complete the worksheets and the assessment results are displayed in summary form at the beginning of the spreadsheet. eMSAT-1 is the first of several resources and tools being developed by the EDRM Metrics group, led by Clark and Dera Nevin, with assistance from a diverse collection of industry professionals, as part of an ambitious Maturity Model project.

The seven sections covered by the workbook are:

General Information Governance: Contains ten questions to answer regarding your organization’s handling of information governance.
Data Identification, Preservation & Collection: Contains five questions to answer regarding your organization’s handling of these “left side” phases.
Data Processing & Hosting: Contains three questions to answer regarding your organization’s handling of processing, early data assessment and hosting.
Data Review & Analysis: Contains two questions to answer regarding your organization’s handling of search and review.
Data Production: Contains two questions to answer regarding your organization’s handling of production and protecting privileged information.
Personnel & Support: Contains two questions to answer regarding your organization’s hiring, training and procurement processes.
Project Conclusion: Contains one question to answer regarding your organization’s processes for managing data once a matter has concluded.

Each question is a separate sheet, with five answers ranked from 1 to 5 to reflect your organization’s maturity in that area (with descriptions to associate with each level of maturity). Each question defaults to a value of 1. The five answers are:

1: No Process, Reactive
2: Fragmented Process
3: Standardized Process, Not Enforced
4: Standardized Process, Enforced
5: Actively Managed Process, Proactive

Once you answer all the questions, the Summary sheet shows your overall average, as well as your average for each section. It’s an easy workbook to use, with input areas defined by cells in yellow. The whole workbook is editable, so perhaps the next edition could lock down the calculated-only cells. Nonetheless, the workbook is intuitive and provides a nice exercise for an organization to grade its level of eDiscovery maturity.
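
For readers who want to see the arithmetic behind a summary like this, here is a minimal Python sketch. It is not the workbook's actual formula: the section names and question counts come from the list above, but the function name `summarize` is hypothetical, and treating the overall average as the mean of all 25 answers (rather than the mean of the seven section averages) is an assumption.

```python
from statistics import mean

# Section names and question counts as listed in the post (25 questions total).
SECTIONS = {
    "General Information Governance": 10,
    "Data Identification, Preservation & Collection": 5,
    "Data Processing & Hosting": 3,
    "Data Review & Analysis": 2,
    "Data Production": 2,
    "Personnel & Support": 2,
    "Project Conclusion": 1,
}


def summarize(answers):
    """Return (section_averages, overall_average).

    `answers` maps each section name to a list of maturity scores (1-5),
    one score per question in that section.
    """
    for section, count in SECTIONS.items():
        scores = answers[section]
        if len(scores) != count:
            raise ValueError(f"{section}: expected {count} answers")
        if any(score not in (1, 2, 3, 4, 5) for score in scores):
            raise ValueError(f"{section}: scores must be integers from 1 to 5")

    section_averages = {name: mean(answers[name]) for name in SECTIONS}
    # Assumption: the overall figure averages every question equally.
    all_scores = [score for name in SECTIONS for score in answers[name]]
    return section_averages, mean(all_scores)


# Start from the default value of 1 for every question, then answer one section.
answers = {name: [1] * count for name, count in SECTIONS.items()}
answers["Data Production"] = [3, 4]
section_averages, overall = summarize(answers)
print(section_averages["Data Production"], overall)
```

Because sections have different numbers of questions, averaging all answers weights General Information Governance (ten questions) far more heavily than Project Conclusion (one question); averaging the section averages instead would weight each section equally.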

You can download a copy of the eMSAT-1 Excel workbook from here, as well as get more information on how to use it (the page also describes how to provide feedback to make the next iterations even better).

The EDRM Maturity Model Self-Assessment Test is the fourth release in recent months by the EDRM Metrics team. In June 2013, the new Metrics Model was released, in November 2013 a supporting glossary of terms for the Metrics Model was published and in November 2013 the EDRM Budget Calculators project kicked off (with four calculators covered by us here, here, here and here). They’ve been busy.

So, what do you think? How mature is your organization in handling eDiscovery? Please share any comments you might have or if you’d like to know more about a particular topic.


Court Approves Use of Predictive Coding, Disagrees that it is an “Unproven Technology” – eDiscovery Case Law

October 6, 2014

In Dynamo Holdings v. Commissioner of Internal Revenue, Docket Nos. 2685-11, 8393-12 (U.S. Tax Ct. Sept 17, 2014), U.S. Tax Court Judge Ronald Buch ruled that the petitioners “may use predictive coding in responding to respondent’s discovery request” and if “after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time”.

The cases involved various transfers from one entity to a related entity where the respondent determined that the transfers were disguised gifts to the petitioner’s owners and the petitioners asserted that the transfers were loans.

The respondent requested that the petitioners produce the electronically stored information (ESI) contained on two specified backup storage tapes or simply produce the tapes themselves. The petitioners asserted that it would “take many months and cost at least $450,000 to do so”, requesting that the Court deny the respondent’s motion as a “fishing expedition” in search of new issues that could be raised in these or other cases. Alternatively, the petitioners requested that the Court let them use predictive coding to efficiently and economically identify the non-privileged information responsive to respondent’s discovery request. The respondent opposed the petitioners’ request to use predictive coding, calling it “unproven technology” and added that petitioners could simply give him access to all data on the two tapes and preserve the right (through a “clawback agreement”) to later claim that some or all of the data is privileged.

Judge Buch called the request to use predictive coding “somewhat unusual” and stated that “although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery… Yet that is, in essence, what the parties are asking the Court to consider – whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time.”

With regard to the respondent’s categorization of predictive coding as “unproven technology”, Judge Buch stated “We disagree. Although predictive coding is a relatively new technique, and a technique that has yet to be sanctioned (let alone mentioned) by this Court in a published Opinion, the understanding of e-discovery and electronic media has advanced significantly in the last few years, thus making predictive coding more acceptable in the technology industry than it may have previously been. In fact, we understand that the technology industry now considers predictive coding to be widely accepted for limiting e-discovery to relevant documents and effecting discovery of ESI without an undue burden.”

As a result, Judge Buch ruled that “[p]etitioners may use predictive coding in responding to respondent’s discovery request. If, after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time.”

So, what do you think? Should predictive coding have been allowed in this case? Please share any comments you might have or if you’d like to know more about a particular topic.


Good Processing Requires a Sound Process – Best of eDiscovery Daily

October 3, 2014

Home at last! Today, we are recovering from our trip, after arriving back home one day late and without our luggage. Satan, thy name is Lufthansa! Anyway, for these past two weeks except for Jane Gennarelli’s Throwback Thursday series, we have been re-publishing some of our more popular and frequently referenced posts. Today’s post is a topic that comes up often with our clients. Enjoy! New posts next week!

As we discussed Wednesday, working with electronic files in a review tool is NOT just simply a matter of loading the files and getting started. Electronic files are diverse and can represent a whole collection of issues to address in order to process them for loading. To address those issues effectively, processing requires a sound process.

eDiscovery providers like (shameless plug warning!) CloudNine Discovery process electronic files regularly to enable their clients to work with those files during review and production. As a result, we are aware of some of the information that must be provided by the client to ensure that the resulting processed data meets their needs and have created an EDD processing spec sheet to gather that information before processing. Examples of information we collect from our clients:

Do you need de-duplication? If so, should it be performed at the case or the custodian level?
Should Outlook emails be extracted in MSG or HTM format?
What time zone should we use for email extraction? Typically, it’s the local time zone of the client or Greenwich Mean Time (GMT). If you don’t think that matters, consider this example.
Should we perform Optical Character Recognition (OCR) for image-only files that don’t have corresponding text? If we don’t OCR those files, these could be responsive files that are missed during searching.
If any password-protected files are encountered, should we attempt to crack those passwords or log them as exception files?
Should the collection be culled based on a responsive date range?
Should the collection be culled based on key terms?
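
To make the first question on that list concrete, here is a minimal Python sketch of the difference between case-level and custodian-level de-duplication. It is an illustration under stated assumptions, not a description of how CloudNine or any other processing tool works: the function name `dedupe`, the tuple layout, and the choice to hash raw file bytes with SHA-256 are all hypothetical (production tools often hash emails on selected fields instead).

```python
import hashlib


def dedupe(files, level="case"):
    """Split files into (kept, duplicates) by content hash.

    `files` is an iterable of (custodian, filename, content_bytes) tuples.
    level="case":      keep one copy of each distinct file in the whole case.
    level="custodian": keep one copy of each distinct file per custodian.
    """
    if level not in ("case", "custodian"):
        raise ValueError("level must be 'case' or 'custodian'")

    seen = set()
    kept, duplicates = [], []
    for custodian, filename, content in files:
        digest = hashlib.sha256(content).hexdigest()
        # The only difference between the two modes is the scope of the key.
        key = digest if level == "case" else (custodian, digest)
        if key in seen:
            duplicates.append((custodian, filename))
        else:
            seen.add(key)
            kept.append((custodian, filename))
    return kept, duplicates


# Hypothetical collection: the same memo held by two custodians.
collection = [
    ("alice", "memo.docx", b"Q3 forecast"),
    ("bob", "memo-copy.docx", b"Q3 forecast"),
    ("alice", "memo (1).docx", b"Q3 forecast"),
]
print(dedupe(collection, level="case"))
print(dedupe(collection, level="custodian"))
```

At the case level only the first copy of the memo survives; at the custodian level both custodians keep one copy, which preserves the record of who held the document at the cost of a larger review set.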

Those are some general examples for native processing. If the client requests creation of image files (many still do, despite the well documented advantages of native files), there are a number of additional questions we ask regarding the image processing. Some examples:

Generate as single-page TIFF, multi-page TIFF, text-searchable PDF or non text-searchable PDF?
Should color images be created when appropriate?
Should we generate placeholder images for unsupported or corrupt files that cannot be repaired?
Should we create images of Excel files? If so, we proceed to ask a series of questions about formatting preferences, including orientation (portrait or landscape), scaling options (auto-size columns or fit to page), printing gridlines, printing hidden rows/columns/sheets, etc.
Should we endorse the images? If so, how?

Those are just some examples. Questions about print format options for Excel, Word and PowerPoint take up almost a full page by themselves – there are a lot of formatting options for those files and we identify default parameters that we typically use. Don’t get me started.

We also ask questions about load file generation (if the data is not being loaded into our own review tool, OnDemand®), including what load file format is preferred and parameters associated with the desired load file format.

This isn’t a comprehensive list of questions we ask, just a sample to illustrate how many decisions must be made to effectively process electronic data. Processing data is not just a matter of feeding native electronic files into the processing tool and generating results; it requires a sound process to ensure that the resulting output will meet the needs of the case.
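
As one illustration of why these decisions matter, consider the time zone and date range questions mentioned earlier. The Python sketch below is hypothetical throughout: the timestamp, the date range, the helper `in_date_range`, and the fixed UTC-6 offset standing in for a client's local zone are all assumptions chosen for the example.

```python
from datetime import date, datetime, timedelta, timezone

# Fixed UTC-6 offset standing in for a client's local zone (an assumption
# for illustration; a real tool would use a full time zone database).
LOCAL = timezone(timedelta(hours=-6), "UTC-06:00")


def in_date_range(stamp, start, end):
    """Return True if the calendar date of `stamp` lies in [start, end]."""
    return start <= stamp.date() <= end


# One email, sent at a single instant, rendered in two time zones.
sent_gmt = datetime(2014, 3, 1, 2, 30, tzinfo=timezone.utc)
sent_local = sent_gmt.astimezone(LOCAL)

# Hypothetical responsive date range: February 2014.
start, end = date(2014, 2, 1), date(2014, 2, 28)

print(sent_gmt.isoformat(), in_date_range(sent_gmt, start, end))
print(sent_local.isoformat(), in_date_range(sent_local, start, end))
```

The same message falls outside the date range when its sent time is rendered in GMT and inside it when rendered in the local zone, so the time zone chosen at extraction can change which emails survive a date cull.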

So, what do you think? How do you handle processing of electronic files? Please share any comments you might have or if you’d like to know more about a particular topic.

P.S. – No hamsters were harmed in the making of this blog post.


eDiscovery Blog Throwback Thursdays – How things evolved, Part 2

October 2, 2014

So far in this blog series, we’ve taken a look at the ‘litigation support culture’ circa 1980, and we’ve covered how databases were built and used. We’ve come a long way since then, and in last week’s blog, we started discussing how things have evolved. In the next posts, we’ll continue discussing how things evolved, but first, if you missed the earlier posts in this series, they can be found here, here, here, here, here, here, here, here, here, and here.

Last week, I described the use of microfilm and microfiche to store document collections. As most of you know, the next step in the evolution process was a move to storing documents as images.

This was a huge step in the world of litigation support, and honestly it was long overdue when it finally became adopted as a standard. Like so many advancements, it was ‘looked at’ and ‘talked about’ for years before it became the norm. One of the most significant hurdles was simply cost: while the cost to scan documents to create images wasn’t much different than the costs to photocopy or film, image viewing technology was expensive. Firms did not already have this technology, and corporate clients were not willing to bear the cost. Eventually, however, it caught on. By the late 1980’s more and more litigation teams were building databases with images.

There were other changes happening that helped this along – a couple of which meant using images only made sense:

The use of computers in general was becoming more widespread. Computers were no longer only used by large companies. Small and mid-sized companies were using them. PCs were introduced to the world so large main-frame computers and mini computers were not the only option. Desktop computers were becoming widespread.
Because the use of computers was growing, more and more commercial software products were available, including commercial litigation support products. Two of the first popular commercial products were Inmagic and BRS Search.

Because of these changes, technology use in law firms grew. Law firms were buying computers for use by attorneys and paralegals. Law firms started hiring IT staff. Law firms started hiring litigation support professionals and buying litigation support software. In short, law firms were developing internal resources to build and maintain databases. They were creating an infrastructure that could support the use of images.

Including images in litigation support databases caused another shift in the way databases were used: because the documents themselves were immediately available in a database, databases were being used more and more often directly by attorneys. They were no longer a ‘back-office’ function. For many years, it was common for law firms to have ‘walk-up’ litigation support stations, but these ‘walk-up’ stations were often used by attorneys, and eventually it became normal to see a computer on every desk in a law firm.

Tune in next week and we’ll continue discussion of how the litigation world circa 1980 evolved and got to where it is today.

Please let us know if there are eDiscovery topics you’d like to see us cover in eDiscoveryDaily.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

The Files are Already Electronic, How Hard Can They Be to Load? – Best of eDiscovery Daily

October 1, 2014

Come fly with me! Today we are winding our way back home from Paris, by way of Frankfurt. For the next two weeks except for Jane Gennarelli’s Throwback Thursday series, we will be re-publishing some of our more popular and frequently referenced posts. Today’s post is a topic that relates to a question that I get asked often. Enjoy!

Since hard copy discovery became electronic discovery, I’ve worked with a number of clients who expect that working with electronic files in a review tool is simply a matter of loading the files and getting started. Unfortunately, it’s not that simple!

Back when most discovery was paper based, the usefulness of the documents was understandably limited. Documents were paper and they all required conversion to image to be viewed electronically, optical character recognition (OCR) to capture their text (though not 100% accurately) and coding (i.e., data entry) to capture key data elements (e.g., author, recipient, subject, document date, document type, names mentioned, etc.). It was a problem, but it was a consistent problem – all documents needed the same treatment to make them searchable and usable electronically.

Though electronic files are already electronic, that doesn’t mean that they’re ready for review as is. They don’t just represent one problem, they can represent a whole collection of problems. For example:

- Image-only electronic files such as TIFF or image-only PDF files may be electronic, but they still have no searchable text. They still require OCR to generate searchable text so that they can be searched effectively. It’s important to account for image-only files when self-collecting, as keyword searches will miss these files.
- Outlook emails are typically stored in a “container” file like an EDB (Exchange Database), OST (Outlook Offline Storage Table) or PST (Outlook Personal Storage Table). To work with the emails individually, they typically require processing to break them out into individual MSG (Outlook MSG) files. That processing is also necessary to break out the attachments from the emails so that they can be reviewed or categorized individually, if required. And, if the emails are stored in Lotus Notes, there is no equivalent single message format, so those emails generally require conversion to HTML format during processing.
- Databases are large, structured collections of data, but they don’t relate easily to a document format, so they require some analysis to determine if, and in what form, they should be produced.
- In almost every collection, there are some files that cannot be processed or searched. Corrupt files, password-protected files and other types of exception files are frequent components of your ESI collection, and it can become very expensive to make these files searchable or reviewable.
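
To make the examples above a little more concrete, here is a minimal Python sketch of a first-pass triage that groups collected files by the kind of treatment they are likely to need before review. The function name, the treatment labels and the extension map are all hypothetical and for illustration only. The .nsf (Lotus Notes) and .mdb/.accdb (Access) entries are assumptions that are not mentioned in the post, and a real processing tool would identify files by their content rather than trusting extensions.

```python
from pathlib import PurePath

# Hypothetical extension-to-treatment map (illustration only). Real tools
# identify files by signature; extensions can be wrong or missing.
TREATMENT_BY_EXTENSION = {
    ".tif": "ocr",
    ".tiff": "ocr",
    ".pdf": "check_for_text_layer",  # a PDF may or may not be image-only
    ".pst": "extract_container",
    ".ost": "extract_container",
    ".edb": "extract_container",
    ".nsf": "convert_notes_mail",    # assumption: Lotus Notes database file
    ".mdb": "analyze_database",      # assumption: example database formats
    ".accdb": "analyze_database",
}


def triage(file_names):
    """Group file names by the treatment they likely need before review."""
    buckets = {}
    for name in file_names:
        extension = PurePath(name).suffix.lower()
        treatment = TREATMENT_BY_EXTENSION.get(extension, "standard_processing")
        buckets.setdefault(treatment, []).append(name)
    return buckets
```

Calling triage() on a list of collected file names returns a dictionary keyed by treatment label, which gives a rough idea of how much OCR, container extraction and database analysis a collection may involve before anyone starts reviewing.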

These are just a few examples of why working with electronic files for review isn’t necessarily straightforward. Of course, when processed correctly, electronic files include considerable metadata that provides useful information about how and when the files were created and used, and by whom. They’re way more useful than paper documents. So, it’s still preferable to work with electronic files instead of hard copy files whenever they are available. But, despite what you might think, that doesn’t make them ready to review as is.

So, what do you think? Have you encountered difficulties or challenges when processing electronic files? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.

When Preparing Production Sets, Quality is Job 1 – Best of eDiscovery Daily

September 30, 2014

OK, I admit I stole that line from an old Ford commercial… 😉

France Strikes Back! Today, we’re heading back to Paris for one final evening before heading home (assuming the Air France pilots let us). For the next two weeks except for Jane Gennarelli’s Throwback Thursday series, we will be re-publishing some of our more popular and frequently referenced posts. Today’s post is a best practice topic for preparing production sets. Enjoy!

Yesterday, we talked about addressing parameters of production up front to ensure that those requirements make sense and avoid foreseeable production problems well before the production step. Today, we will talk about quality control (QC) mechanisms to make sure that the production is complete and accurate.

Quality Control Checks

There are a number of checks that can and should be performed on the production set, prior to producing it to the requesting party. Here are some examples:

File Counts: The most obvious check you can perform is to ensure that the count of files matches the count of documents or pages you have identified to be produced. However, depending on the production, there may be multiple file counts to check:
- Image Files: If you have agreed with opposing counsel to produce images for all documents, then there will be a count of images to confirm. If you’re producing multi-page image files (typically, PDF or TIFF), the count of images should match the count of documents being produced. If you’re producing single-page image files (usually TIFF), then the count should match the number of pages being produced.
- Text Files: When producing image files, you may also be producing searchable text files. Again, the count should match either the documents (multi-page text files) or pages (single-page text files) with one possible exception. If a document or page has no searchable text, are you still producing an empty file for those? If not, you will need to be aware of how many of those instances there are and adjust the count accordingly to verify for QC purposes.
- Native Files: Native files (if produced) are typically at the document level, so you would want to confirm that one exists for each document being produced.
- Subset Counts: If the documents are being produced in a certain organized manner (e.g., a folder for each custodian), it’s a good idea to identify subset counts at those levels and verify those counts as well. Not only does this provide an extra level of count verification, but it helps to find the problem more quickly if the overall count is off.
- Verify Counts on Final Production Media: If you’re verifying counts of the production set before copying it to the media (which is common when burning files to CD or DVD), you will need to verify those counts again after copying to ensure that all files made it to the final media.
- Sampling of Results: Unless the production is relatively small, it may be impractical to open every last file to be produced to confirm that it is correct. If so, employ accepted statistical sampling procedures (such as those described here and here for searching) to identify an appropriate sample size and randomly select that sample to open and confirm that the correct files were selected, that hash values of produced native files match the original source versions of those files, that images are clear and that text files contain the correct text.
- Redacted Files: If any redacted files are being produced, each of these (not just a sample subset) should be reviewed to confirm that redactions of privileged or confidential information made it to the produced file. Many review platforms overlay redactions which have to be burned into the images at production time, so it’s easy for mistakes in the process to cause those redactions to be left out or burned in at the wrong location. Very Important! – You also need to confirm that the redacted text has been removed from any text files that have been produced.
- Inclusion of Logs: Depending on agreed upon parameters, the production may include log files such as:
  - Production Log: Listing of all files being produced, with an agreed upon list of metadata fields to identify those files.
  - Privilege Log: Listing of responsive files not being produced because of privilege (and possibly confidentiality as well). This listing often identifies the privilege being asserted for each file in the privilege log.
  - Exception Log: Listing of files that could not be produced because of a problem with the file. Examples of types of exception files are included here.
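
As a rough illustration of the file count and subset count checks above, here is a minimal Python sketch that counts produced files per top-level subfolder (for example, one folder per custodian) and reports any subset whose count differs from what was expected. The function names and the folder-per-custodian layout are assumptions for illustration; the expected counts would come from your own review platform or production log.

```python
from collections import Counter
from pathlib import Path


def count_by_subfolder(root, extensions):
    """Count files with the given extensions under each top-level subfolder."""
    root = Path(root)
    wanted = {ext.lower() for ext in extensions}
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in wanted:
            relative = path.relative_to(root)
            # Files sitting directly under root are grouped as "(root)".
            key = relative.parts[0] if len(relative.parts) > 1 else "(root)"
            counts[key] += 1
    return counts


def compare_counts(expected, actual):
    """Return {subset: (expected, actual)} for every subset whose counts differ."""
    mismatches = {}
    for key in sorted(set(expected) | set(actual)):
        if expected.get(key, 0) != actual.get(key, 0):
            mismatches[key] = (expected.get(key, 0), actual.get(key, 0))
    return mismatches
```

Running the same two functions against the staging folder and then against the final production media gives a simple way to confirm that nothing was lost in the copy. An empty result from compare_counts() means every subset matched.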

Each production will have different parameters, so the QC requirements will differ. Treat these as examples, not as a comprehensive list of all potential QC checks to perform.
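
For the sampling check described earlier, here is a minimal Python sketch that picks a sample size, randomly selects that many documents and compares hash values of the produced native files against their source versions. It is a sketch under stated assumptions, not a prescribed method: the sample size uses the standard formula for estimating a proportion with a finite population correction (defaulting to 95% confidence and a 5% margin of error), SHA-256 is simply the hash chosen for this example, and the function names and the list of (source, produced) path pairs are hypothetical.

```python
import hashlib
import math
import random


def sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion, with finite population correction."""
    if population <= 0:
        return 0
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large native files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_mismatches(pairs, seed=None):
    """Randomly sample (source_path, produced_path) pairs and return those that differ."""
    rng = random.Random(seed)
    sample = rng.sample(pairs, sample_size(len(pairs)))
    return [(src, dst) for src, dst in sample if sha256_of(src) != sha256_of(dst)]
```

With these defaults, a production of 10,000 documents calls for a sample of 370. Any pair returned by hash_mismatches() is a native file that changed somewhere between collection and production and should be investigated before the production goes out.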

So, what do you think? Can you think of other appropriate QC checks to perform on production sets? If so, please share them, along with any comments you might have, or let us know if you’d like to know more about a particular topic.
