Searching

Despite 18 Missing Emails in Production, Court Denies Request for “Discovery on Discovery” – eDiscovery Case Law

In Freedman v. Weatherford Int’l, 12 Civ. 2121 (LAK) (JCF) (S.D.N.Y. Sept. 12, 2014), New York Magistrate Judge James C. Francis IV denied the plaintiffs’ request to, among other things, require the defendant to produce “certain reports comparing the electronic search results from discovery in this action to the results from prior searches” – despite the fact that the plaintiffs identified 18 emails that the defendant did not produce and that were ultimately produced by a third party.

Case Background

In this securities fraud class action, Judge Francis had previously denied three motions to compel by the plaintiffs seeking production of “(1) ‘certain reports comparing the electronic search results from discovery in this action to the results from prior searches’; (2) ‘documents concerning an investigation undertaken by [the] Audit Committee’ of [the] defendant…; and (3) ‘documents concerning an investigation undertaken by the law firm Latham & Watkins LLP’.”  Recalling those denials, Judge Francis noted: “Although I recognized that such ‘discovery on discovery’ is sometimes warranted, I nevertheless denied the request because the plaintiffs had not ‘proffered an adequate factual basis for their belief that the current production is deficient.’”

However, Judge Francis granted reconsideration and asked for further briefing on the second item, based on the plaintiffs’ presentation of “new evidence, unavailable at the time [they] filed their [earlier] motion, which allegedly reveals deficiencies in [Weatherford’s] current production.”

Eighteen Missing Emails

The new evidence referenced by the plaintiffs consisted of 18 emails from “critical custodians at Weatherford” that were produced (after briefing on the original motion to compel was complete) not by the defendants, but by a third party, causing the plaintiffs to contend that Weatherford’s production was “significantly deficient.”  The plaintiffs contended that providing them with a “report of the documents ‘hit’” by the search terms used in connection with the Latham and Audit Committee investigations would identify additional relevant documents that had not been produced.

Judge’s Ruling

However, Judge Francis disagreed, stating “the suggested remedy is not suited to the task. The plaintiffs admit that of those 18 e-mails only three, at most, would have been identified by a search using the terms from the investigations.”  He also cited Da Silva Moore, noting that “[T]he Federal Rules of Civil Procedure do not require perfection…Weatherford has reviewed ‘millions of documents [] and [produced] hundreds of thousands,’ comprising ‘nearly 4.4 million pages,’ in this case…It is unsurprising that some relevant documents may have fallen through the cracks. But, most importantly, the plaintiffs’ proposed exercise is unlikely to remedy the alleged discovery defects. In light of its dubious value, I will not require Weatherford to provide the requested report.”

So, what do you think?  Was the decision justified or should the defendant have been held to a higher standard?  Please share any comments you might have or if you’d like to know more about a particular topic.

Those Pesky Email Signatures and Disclaimers – eDiscovery Best Practices

Are email signatures and disclaimers causing more trouble than they’re worth?  According to one author, perhaps they are.

Earlier this week, Jeff Bennion wrote an interesting post on the Above the Law blog (‘Please Consider the Environment Before Printing’ Email Signatures Are Hurting the Environment) in which he noted that, about five years ago, people started putting ‘Please consider the environment before printing this e-mail’ in their email signatures (along with a Webdings font character of a tree).

Bennion states that this is “the Kony 2012 of the environmental battles – it’s a noble war, but a pointless battle” and that the printing of emails is only a tiny fraction of the paper that lawyers waste.  Instead, he notes, “the ‘please consider the environment’ email signature is more like one of those ‘I voted’ stickers — both serve no purpose other than proclaiming your self-righteousness for performing a civic duty”.

In fact, per a Time magazine article, the internet accounts for a good deal of the pollution in the world.  In a 2011 article, cleantechnica.com reported that there were about 500,000 data centers in the world, each consuming power on the order of 10 megawatts.  That’s a lot more than 1.21 gigawatts.  Great Scott!

When comparing a Word file containing data that might go into an email against the same data plus the email signature, Bennion observes that the version with the signature contains 0.3 KB more data than the one without.  Extrapolated across the roughly 90 billion business emails sent every day, that works out to 27,000 GB of extra, useless data being added to internet storage servers daily (about 10 million GB per year), though he acknowledges that not all business emails include the signature.  “The point is that it is a pointless gesture that, as a whole, does more harm than good,” Bennion states.
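
If you want to check the math yourself, here’s a quick back-of-the-napkin sketch in Python (the 0.3 KB and 90 billion figures are Bennion’s; the rounding is ours):

    # Back-of-the-napkin math using Bennion's figures
    extra_kb_per_email = 0.3            # extra data added by the signature
    business_emails_per_day = 90e9      # ~90 billion business emails sent daily

    extra_gb_per_day = extra_kb_per_email * business_emails_per_day / 1e6   # KB -> GB
    extra_gb_per_year = extra_gb_per_day * 365

    print(f"{extra_gb_per_day:,.0f} GB per day")     # ~27,000 GB
    print(f"{extra_gb_per_year:,.0f} GB per year")   # ~10 million GB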

And, the same holds true for those confidential and privileged email disclaimers at the bottom of emails, which he observes “take up about 10-20 times more wasted space than the ‘please stop printing your emails’ disclaimer” – “roughly the environmental equivalent of clubbing 3 baby seals a month”.  Some interesting takes.

These email signatures and disclaimers also affect eDiscovery costs, both in terms of extra data to process and extra data to host.  They can also lead to false hits when searching text, and they can skew conceptual clustering or predictive coding of documents (both of which are based on the text content of the documents) unless steps are taken to remove that boilerplate from indices and ignore it when performing those processes – all of which can mean extra work and extra cost.
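
For illustration only, here’s a minimal Python sketch of what stripping that boilerplate before indexing might look like – the patterns are hypothetical, and any real collection would need its own patterns, tuned per custodian:

    import re

    # Hypothetical patterns -- any real collection needs its own, tuned per custodian.
    # Disclaimers sit at the bottom of messages, so each pattern consumes to the end.
    BOILERPLATE_PATTERNS = [
        re.compile(r"please consider the environment before printing.*",
                   re.IGNORECASE | re.DOTALL),
        re.compile(r"this (e-?mail|message)[^.]*confidential.*",
                   re.IGNORECASE | re.DOTALL),
    ]

    def strip_boilerplate(body: str) -> str:
        """Remove known signature/disclaimer text before indexing or clustering."""
        for pattern in BOILERPLATE_PATTERNS:
            body = pattern.sub("", body)
        return body.strip()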

So, what do you think?  Do you use “please stop printing your emails” signatures and confidential and privileged email disclaimers?  Please share any comments you might have or if you’d like to know more about a particular topic.

Text Overlays on Image-Only PDF Files Can Be Problematic – eDiscovery Best Practices

Recently, we at CloudNine Discovery received a set of Adobe PDF files from a client that raised an issue with how those files are handled for searching and review purposes.  It serves as a cautionary tale for anyone working with image-only PDFs in their document collection.  Here’s a recap.

The client was using OnDemand Discovery®, which is our new Client Side add-on to OnDemand® that allows clients to upload their own native data for automated processing and loading into new or existing projects.  The collection was purported to consist mostly of image-only PDF files.  PDF files are created in two ways:

  1. By saving or printing from applications to a PDF file: Many applications, such as Microsoft Office applications like Word, Excel and PowerPoint, provide the ability to save the document or spreadsheet you’ve created to a PDF file, which is common when you want to “publish” the document.  If the application you’re using doesn’t provide that option, you can print the document to PDF using any of several PDF printer drivers available (some of which are free).  PDFs created this way usually include the text of the file from which they were created.
  2. By scanning or otherwise creating an image to a PDF file: Typically, this occurs either by scanning hard copy documents to PDF or by receiving documents in an image-only PDF form (such as through fax software).  PDFs created this way are images and do not include the text of the document from which they came.

Like many processing tools, such as LAW PreDiscovery®, OnDemand Discovery is programmed to handle PDF files by extracting the text if present or, if not, performing OCR on the files to capture text from the images.  Text from the file is always preferable to OCR text because it’s far more accurate, which is why OCR is typically performed only on the PDF files lacking text.
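
Here’s a minimal sketch of that “extract if present, otherwise OCR” logic in Python, using the pdfminer.six, pdf2image and pytesseract libraries (our choice for the example – any text extractor and OCR engine pair works the same way):

    from pdfminer.high_level import extract_text   # pip install pdfminer.six
    from pdf2image import convert_from_path        # pip install pdf2image
    import pytesseract                             # pip install pytesseract

    def get_pdf_text(path: str) -> str:
        """Prefer the embedded text layer; fall back to OCR only if it's absent."""
        text = extract_text(path)
        if text and text.strip():
            return text            # text layer present -- more accurate than OCR
        # Image-only PDF: rasterize each page and OCR it
        pages = convert_from_path(path)
        return "\n".join(pytesseract.image_to_string(page) for page in pages)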

After the client loaded their data, we did a spot Quality Control check (like we always do) and discovered that the text for several of the documents only consisted of Bates numbers.

Why?

Because the Bates numbers were added as text overlays to the pre-existing image-only PDF files.  When the processing software examined each file, it found extractable text, so it extracted that text instead of OCRing the file.  In effect, adding the Bates numbers as text overlays meant the files were no longer image-only PDFs.  The content portion of the text was therefore never captured, so it wasn’t available for indexing and searching.  These documents were essentially rendered non-searchable even after processing.

How did this happen?  Likely through Adobe Acrobat’s Bates Numbering functionality, which is available in later versions of Acrobat (version 8 and higher).  It does exactly that – applies a text-overlay Bates number to each page of the document.  Once that happens, eDiscovery processing software will not perform OCR on the image-only PDF.
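
One defensive quality control check – a sketch under our own assumptions, not a feature of any particular processing tool – is to flag files whose extracted text looks like nothing but Bates numbers and route them to OCR anyway:

    import re

    # Hypothetical Bates format, e.g. "ABC0000123" -- adjust to your numbering scheme
    BATES_PATTERN = re.compile(r"^[A-Z]{2,5}\d{6,10}$")

    def looks_bates_only(extracted_text: str) -> bool:
        """True if every non-blank line of the extracted text is just a Bates number."""
        lines = [line.strip() for line in extracted_text.splitlines() if line.strip()]
        return bool(lines) and all(BATES_PATTERN.match(line) for line in lines)

    # Files flagged by this check have a text layer but no real content,
    # and should be routed to OCR despite "having text".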

What can you do about it?  If you haven’t applied Bates numbers to the files yet (or have a backup of the files from before they were applied – highly recommended) and they haven’t been produced, process the files before putting Bates numbers on the images to ensure that you capture the most text available.  And if opposing counsel will be producing any image-only PDF files, request the text as well (along with a load file) so that you can maximize your ability to search their production (of course, your first choice should be to receive native format productions whenever possible – here’s a link to an excellent guide on that subject).

If the Bates numbers are already applied and you don’t have a backup of the files without them (oops!), you’re faced with additional processing charges to convert them to TIFF and perform OCR of the text AND the Bates number – a totally unnecessary charge if you plan ahead.

So, what do you think?  Have you dealt with image-only PDF files with text overlaid Bates numbers?  Please share any comments you might have or if you’d like to know more about a particular topic.

How Mature is Your Organization in Handling eDiscovery? – eDiscovery Best Practices

A new self-assessment resource from EDRM helps you answer that question.

A few days ago, EDRM announced the release of the EDRM eDiscovery Maturity Self-Assessment Test (eMSAT-1), the “first self-assessment resource to help organizations measure their eDiscovery maturity” (according to their press release linked here).

As stated in the press release, eMSAT-1 is a downloadable Excel workbook containing 25 worksheets (actually 27 worksheets when you count the Summary sheet and the List sheet of valid choices at the end) organized into seven sections covering various aspects of the e-discovery process. Complete the worksheets and the assessment results are displayed in summary form at the beginning of the spreadsheet.  eMSAT-1 is the first of several resources and tools being developed by the EDRM Metrics group, led by Clark and Dera Nevin, with assistance from a diverse collection of industry professionals, as part of an ambitious Maturity Model project.

The seven sections covered by the workbook are:

  1. General Information Governance: Contains ten questions to answer regarding your organization’s handling of information governance.
  2. Data Identification, Preservation & Collection: Contains five questions to answer regarding your organization’s handling of these “left side” phases.
  3. Data Processing & Hosting: Contains three questions to answer regarding your organization’s handling of processing, early data assessment and hosting.
  4. Data Review & Analysis: Contains two questions to answer regarding your organization’s handling of search and review.
  5. Data Production: Contains two questions to answer regarding your organization’s handling of production and protecting privileged information.
  6. Personnel & Support: Contains two questions to answer regarding your organization’s hiring, training and procurement processes.
  7. Project Conclusion: Contains one question to answer regarding your organization’s processes for managing data once a matter has concluded.

Each question is a separate sheet, with five answers ranked from 1 to 5 to reflect your organization’s maturity in that area (with descriptions associated with each level of maturity).  Each question defaults to a value of 1.  The five answers are:

  • 1: No Process, Reactive
  • 2: Fragmented Process
  • 3: Standardized Process, Not Enforced
  • 4: Standardized Process, Enforced
  • 5: Actively Managed Process, Proactive

Once you answer all the questions, the Summary sheet shows your overall average, as well as your average for each section.  It’s an easy workbook to use, with input areas defined by cells in yellow.  The whole workbook is editable, so perhaps the next edition could lock down the calculated cells.  Nonetheless, the workbook is intuitive and provides a nice exercise for an organization to grade its level of eDiscovery maturity.
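
The scoring itself is just simple averaging; here’s the same calculation sketched outside the workbook in Python (the section names and answers below are purely illustrative):

    # Illustrative eMSAT-style scoring: one answer (1-5) per question
    answers = {
        "General Information Governance": [3, 2, 4, 1, 2, 3, 3, 2, 1, 2],
        "Data Identification, Preservation & Collection": [2, 3, 3, 4, 2],
        "Project Conclusion": [1],
    }

    for section, scores in answers.items():
        print(f"{section}: {sum(scores) / len(scores):.1f}")

    overall = [score for scores in answers.values() for score in scores]
    print(f"Overall average: {sum(overall) / len(overall):.1f}")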

You can download a copy of the eMSAT-1 Excel workbook from here, as well as get more information on how to use it (the page also describes how to provide feedback to make the next iterations even better).

The EDRM Maturity Model Self-Assessment Test is the fourth release in recent months by the EDRM Metrics team.  In June 2013, the new Metrics Model was released; in November 2013, a supporting glossary of terms for the Metrics Model was published; and in November 2013, the EDRM Budget Calculators project kicked off (with four calculators covered by us here, here, here and here).  They’ve been busy.

So, what do you think?  How mature is your organization in handling eDiscovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

Court Approves Use of Predictive Coding, Disagrees that it is an “Unproven Technology” – eDiscovery Case Law

In Dynamo Holdings v. Commissioner of Internal Revenue, Docket Nos. 2685-11, 8393-12 (U.S. Tax Ct. Sept. 17, 2014), Tax Court Judge Ronald Buch ruled that the petitioners “may use predictive coding in responding to respondent's discovery request” and that if, “after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time”.

The cases involved various transfers from one entity to a related entity, where the respondent determined that the transfers were disguised gifts to the petitioners' owners and the petitioners asserted that the transfers were loans.

The respondent requested that the petitioners produce the electronically stored information (ESI) contained on two specified backup storage tapes, or simply produce the tapes themselves.  The petitioners asserted that it would "take many months and cost at least $450,000 to do so", requesting that the Court deny the respondent's motion as a "fishing expedition" in search of new issues that could be raised in these or other cases.  Alternatively, the petitioners requested that the Court let them use predictive coding to efficiently and economically identify the non-privileged information responsive to respondent's discovery request.  The respondent opposed the petitioners' request to use predictive coding, calling it "unproven technology", and added that the petitioners could simply give him access to all data on the two tapes and preserve the right (through a "clawback agreement") to later claim that some or all of the data is privileged.

Judge Buch called the request to use predictive coding “somewhat unusual” and stated that “although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery… Yet that is, in essence, what the parties are asking the Court to consider – whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time.”

With regard to the respondent’s categorization of predictive coding as “unproven technology”, Judge Buch stated “We disagree. Although predictive coding is a relatively new technique, and a technique that has yet to be sanctioned (let alone mentioned) by this Court in a published Opinion, the understanding of e-discovery and electronic media has advanced significantly in the last few years, thus making predictive coding more acceptable in the technology industry than it may have previously been. In fact, we understand that the technology industry now considers predictive coding to be widely accepted for limiting e-discovery to relevant documents and effecting discovery of ESI without an undue burden.”

As a result, Judge Buch ruled that “[p]etitioners may use predictive coding in responding to respondent's discovery request. If, after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time.”

So, what do you think?  Should predictive coding have been allowed in this case?  Please share any comments you might have or if you’d like to know more about a particular topic.

Proximity, Not Absence, Makes the Heart Grow Fonder – Best of eDiscovery Daily

God Save the Queen!  Today is our first full day in London and we’re planning to visit the Tower of London, which is only about a thousand years old.  For the next two weeks, except for Jane Gennarelli’s Throwback Thursday series, we will be re-publishing some of our more popular and frequently referenced posts.  Today’s post covers a topic that has come up often as I work with clients and one I have referenced frequently over the years.  Enjoy!

Recently, I assisted a large corporate client where there were several searches conducted across the company’s enterprise-wide document management systems (DMS) for ESI potentially responsive to the litigation.  Some of the individual searches on these systems retrieved over 200,000 files by themselves!

Document management systems are great for what they are intended to do – provide a storage archive for documents generated within the organization, track versions of those documents, and enable individuals to locate specific documents for reference or modification (among other things).  However, few of them are developed with litigation retrieval in mind.  Sure, they have search capabilities, but using them can sometimes be like using a sledgehammer to hammer a thumbtack into the wall – advanced features to increase the precision of those searches are often lacking.

Let’s say in an oil company you’re looking for documents related to “oil rights” (such as “oil rights”, “oil drilling rights”, “oil production rights”, etc.).  You could perform phrase searches, but any variations that you didn’t think of would be missed (e.g., “rights to drill for oil”, etc.).  You could perform an AND search (i.e., “oil” AND “rights”), and that could very well retrieve all of the files related to “oil rights”, but it would also retrieve a lot of files where “oil” and “rights” appear but have nothing to do with each other.  A search for “oil” AND “rights” in an oil company’s DMS may retrieve every published and copyrighted document in the system mentioning the word “oil”.  Why?  Because almost every published and copyrighted document contains the phrase “All Rights Reserved”.

That’s an example of the type of issue we were encountering with some of those searches that yielded 200,000 files with hits.  And, that’s where proximity searching comes in.  Proximity searching is simply looking for two or more words that appear close to each other in the document (e.g., “oil within 5 words of rights”) – the search will only retrieve the file if those words are as close as specified to each other, in either order.  Proximity searching helped us reduce that collection to a more manageable number for review, even though the enterprise-wide document management system didn’t have a proximity search feature.

How?  We wound up taking a two-step approach to get the collection to a more likely responsive set.  First, we ran the “AND” search in the DMS, understanding that we would retrieve a large number of files, and exported those results.  After indexing them with an eDiscovery review application that has more precise search alternatives (at CloudNine Discovery, we use OnDemand®), we performed a second search on the set using proximity searching to limit the result set to only files where the terms were near each other.  Then we tested the results and revised where necessary to arrive at a result set that maximized both recall and precision.
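
To make the concept concrete, here’s a minimal Python sketch of the proximity test itself (simple tokenization for illustration – real review tools index far more robustly):

    import re

    def within_proximity(text: str, word1: str, word2: str, distance: int = 5) -> bool:
        """True if word1 and word2 appear within `distance` words of each other,
        in either order."""
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        hits1 = [i for i, token in enumerate(tokens) if token == word1]
        hits2 = [i for i, token in enumerate(tokens) if token == word2]
        return any(abs(i - j) <= distance for i in hits1 for j in hits2)

    print(within_proximity("rights to drill for oil", "oil", "rights"))  # True
    print(within_proximity(
        "All rights reserved. This report describes the company's crude oil pipeline.",
        "oil", "rights"))                                                # False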

The result?  We were able to reduce an initial result set of 200,000 files to just over 5,000 likely responsive files by applying the proximity search to the first result set.  And we probably saved $50,000 to $100,000 in review costs on a single search.

I also often use proximity searches as alternatives to phrase searches to broaden the recall of those searches to identify additional potentially responsive hits.  For example, a search for “Doug Austin” doesn’t retrieve “Austin, Doug” and a search for “Dye 127” may not retrieve “Dye #127”.  One character difference is all it takes for a phrase search to miss a potentially responsive file.  With proximity searching, you can look for these terms close to each other and catch those variations.

So, what do you think?  Do you use proximity searching in your culling for review?  Please share any comments you might have or if you’d like to know more about a particular topic.

Don’t Get “Wild” with Wildcards – Best of eDiscovery Daily

Vive la France!  Today is our second full day in Paris and we’re planning to visit Versailles, which Marie Antoinette loved so much, she lost her head.  For the next two weeks, except for Jane Gennarelli’s Throwback Thursday series, we will be re-publishing some of our more popular and frequently referenced posts.  Today’s post is one that we published on our very first day and have referenced frequently over the years.  Enjoy!

Several months ago, I provided search strategy assistance to a client that had already agreed upon several searches with opposing counsel.  One search related to mining activities, so the attorney decided to use a wildcard of “min*” to retrieve variations like “mine”, “mines” and “mining”.

That one search retrieved over 300,000 files with hits.

Why?  Because there are 269 words in the English language that begin with the letters “min”.  Words like “mink”, “mind”, “mint” and “minion” were all being retrieved in this search for files related to “mining”.  We ultimately had to go back to opposing counsel and negotiate a revised search that was more appropriate.

How do you ensure that you’re retrieving all variations of your search term?

Stem Searches

One way to capture the variations is with stem searching.  Applications that support stem searching give you the ability to enter the root word (e.g., “mine”) and locate that word and all of its variations – without having to use wildcards at all.

Other Methods

If your application doesn’t support stem searches, Morewords.com shows a list of words that begin with your search string (e.g., to get all 269 words beginning with “min”, go here – simply substitute any characters for “min” to see the words that start with those characters).  Choose the variations you want and incorporate them into the search instead of the wildcard – i.e., use “(mine or mines or mining)” instead of “min*” to retrieve a more relevant result set.

Some applications let you select the wildcard variations you wish to use.  OnDemand® enables you to type in the wildcard string, display all the words – in your collection – that begin with that string, and select the variations on which to search.  As a result, you can avoid all of the non-relevant variations and limit the search to the relevant hits.
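
If your tool offers neither feature, the same idea is easy to sketch in Python – expand the wildcard against the collection’s own vocabulary, then build the query from just the variations a reviewer selects (the vocabulary below is illustrative; in practice it would come from the review tool’s index):

    def expand_wildcard(prefix: str, vocabulary: set) -> list:
        """All words in the collection's index that begin with `prefix`."""
        return sorted(word for word in vocabulary if word.startswith(prefix))

    # Illustrative vocabulary; in practice this comes from the review tool's index
    vocabulary = {"mine", "mines", "mining", "mink", "mind", "mint", "minion"}
    candidates = expand_wildcard("min", vocabulary)
    # ['mind', 'mine', 'mines', 'mining', 'minion', 'mink', 'mint']

    keep = ["mine", "mines", "mining"]            # reviewer selects the relevant forms
    query = "(" + " OR ".join(keep) + ")"         # -> "(mine OR mines OR mining)"
    print(query)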

So, what do you think?  Have you ever been “burned” by wildcard searching?  Do you have any other suggested methods for effectively handling them?  Please share any comments you might have or if you’d like to know more about a particular topic.

Is Technology Assisted Review Older than the US Government? – eDiscovery Trends

A lot of people consider Technology Assisted Review (TAR) and Predictive Coding (PC) to be new technology.  We attempted to debunk that myth last year after our third annual thought leader interview series by summarizing comments from some of the thought leaders, who noted that TAR and PC really just apply artificial intelligence to the review process.  But the foundation for TAR may go back much farther than you might think.

In the BIA blog post Technology Assisted Review: It’s not as new as you think it is, Robin Athlyn Thompson and Brian Schrader take a look at the origins of at least one theory behind TAR: the “Naive Bayes classifier”, which is based on theorems essentially introduced to the public in 1812 – though the theorems themselves existed quite a bit earlier than that.

Bayes’s theorem is named after Rev. Thomas Bayes (who died in 1761), who first showed how to use new evidence to update beliefs.  He lived so long ago that there is no widely accepted portrait of him.  His friend Richard Price edited and presented his work in 1763, after Bayes’s death, as An Essay towards solving a Problem in the Doctrine of Chances.  Bayes’s algorithm remained unknown until it was independently rediscovered and further developed by Pierre-Simon Laplace, who first published the modern formulation in his 1812 Théorie analytique des probabilités (Analytic Theory of Probabilities).
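
To see how little machinery the core idea requires, here’s a toy Naive Bayes relevance ranker in Python using scikit-learn – the training examples are invented for the illustration, and real TAR systems are far more sophisticated, but the Bayesian core is the same:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Invented training examples: documents a reviewer has already coded
    docs = ["oil drilling rights agreement",
            "quarterly drilling production report",
            "holiday party invitation",
            "cafeteria menu for next week"]
    labels = [1, 1, 0, 0]   # 1 = responsive, 0 = not responsive

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)
    model = MultinomialNB().fit(X, labels)

    # Rank a new, uncoded document by its probability of being responsive
    new_doc = vectorizer.transform(["rights to drill for oil"])
    print(model.predict_proba(new_doc)[0][1])   # P(responsive)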

Thompson and Schrader go on to discuss more recent uses of artificial intelligence algorithms to map trends, including Amazon’s “More Like This” functionality, which recommends other items that you may like based on previous purchases.  That technology has been around for nearly two decades – can you believe it’s been that long? – and is one of the key factors in Amazon’s success over that time.

So, don’t scoff at the use of TAR because it’s “new technology” – that thinking is “naïve”.  Some of the foundational statistical theories for TAR go further back than the birth of our country.

So, what do you think?  Has your organization used technology assisted review on a case yet?  Please share any comments you might have or if you’d like to know more about a particular topic.

Court Rules to Limit Scope of Discovery, Noting that “Searching for ESI is only one discovery tool” – eDiscovery Case Law

In United States v. Univ. of Neb. at Kearney, 4:11CV3209 (D. Neb. Aug. 25, 2014), Nebraska Magistrate Judge Cheryl R. Zwart denied the government’s motion to compel discovery, finding that “ESI is neither the only nor the best and most economical discovery method for obtaining the information the government seeks” and stating that searching for ESI “should not be deemed a replacement for interrogatories, production requests, requests for admissions and depositions”.

This Fair Housing Act case was brought by the government, which claimed that students were prohibited or hindered from having “emotional assistance animals” in university housing when such animals were needed to accommodate the requesting students’ mental disabilities.  Discovery had been a lengthy and disputed process since the parties filed a Stipulation and Order Regarding Discovery back in March of 2012.

The scope of ESI was a major part of the dispute. The defendants objected that the government’s search parameters were too expansive, and the cost of compliance would be unduly burdensome. The defendants explained that the cost of retrieval, review, and production would approach a million dollars, and provided an outline identifying the document “hits” and the estimated discovery costs.  The government served revised search terms on April 14, 2014. Although narrowed, the government’s search terms would still yield 51,131 responsive documents, and based on the defendants’ estimate, would require the defendants to expend an additional $155,574 to retrieve, review, and produce the responsive ESI.

To date, the defendants had paid $122,006 to third-party vendors for processing the government’s ESI requests, and they proposed that the requests be narrowed to the “housing” or “residential” context.  The defendants’ search terms would yield 10,997 responsive documents.  The government did not want to limit the scope of discovery and recommended that all of the ESI be produced, subject to a clawback agreement, so that the government could search it.  The defendants argued such an agreement would violate the Family Educational Rights and Privacy Act by disclosing students’ personally identifiable information without their notice and consent.

The court had ordered the parties to provide answers to specific questions regarding their efforts at resolving the ESI dispute as part of any motion to compel filed.  The government’s responsive statement did not include information comparing the cost of its proposed document retrieval method to the amount at issue in the case, any cost/benefit analysis of the discovery methods proposed, or a statement of who should bear those costs.

Judge Zwart stated that she “will not order the university to produce ESI without first reviewing the disclosure, even with the protection afforded under a clawback order. And if UNK must review the more than 51,000 documents requested by the government’s proposed ESI requests, the cost in both dollars and time exceeds the value to be gained by the government’s request.”

Illustrating the lack of proportionality in the government’s requests, Judge Zwart stated “Searching for ESI is only one discovery tool. It should not be deemed a replacement for interrogatories, production requests, requests for admissions and depositions, and it should not be ordered solely as a method to confirm the opposing party’s discovery is complete. For example, the government proposes search terms such as ‘document* w/25 policy.’ The broadly used words “document” and “policy” will no doubt retrieve documents the government wants to see, along with thousands of documents that have no bearing on this case. And to what end? Through other discovery means, the government has already received copies of UNK’s policies for the claims at issue.”

As a result, Judge Zwart stated that “the court is convinced ESI is neither the only nor the best and most economical discovery method for obtaining the information the government seeks. Standard document production requests, interrogatories, and depositions should suffice—and with far less cost and delay.”

So, what do you think?  Were the government’s requests overbroad, or should its motion to compel have been granted?  Please share any comments you might have or if you’d like to know more about a particular topic.

Though it was “Switching Horses in Midstream”, Court Approves Plaintiff’s Predictive Coding Plan – eDiscovery Case Law

In Bridgestone Americas Inc. v. Int’l Bus. Mach. Corp., No. 3:13-1196 (M.D. Tenn. July 22, 2014), Tennessee Magistrate Judge Joe B. Brown, acknowledging that he was “allowing Plaintiff to switch horses in midstream”, nonetheless ruled that the plaintiff could use predictive coding to search documents for discovery, even though keyword searching had already been performed.

In this case, where the plaintiff sued the defendant over a $75 million computer system that it claimed threw its “entire business operation into chaos”, the plaintiff requested that the court allow the use of predictive coding in reviewing over two million documents.  The defendant objected, noting that the request was an unwarranted change to the original case management order, which did not include predictive coding, and that it would be unfair to use predictive coding after an initial screening had been done with keyword search terms.

Judge Brown conducted a lengthy telephone conference with the parties on June 25 and began the analysis in his order by observing that “[p]redictive coding is a rapidly developing field in which the Sedona Conference has devoted a good deal of time and effort to, and has provided various best practices suggestions”, also noting that “Magistrate Judge Peck has written an excellent article on the subject and has issued opinions concerning predictive coding.”  “In the final analysis”, Judge Brown continued, “the use of predictive coding is a judgment call, hopefully keeping in mind the exhortation of Rule 26 that discovery be tailored by the court to be as efficient and cost-effective as possible.”

As a result, noting that “we are talking about millions of documents to be reviewed with costs likewise in the millions”, Judge Brown permitted the plaintiff “to use predictive coding on the documents that they have presently identified, based on the search terms Defendant provided.”  Judge Brown acknowledged that he was “allowing Plaintiff to switch horses in midstream”, so “openness and transparency in what Plaintiff is doing will be of critical importance.”

This case has circumstances similar to Progressive Cas. Ins. Co. v. Delaney, where the plaintiff also sought to shift from the agreed-upon discovery methodology for privilege review to a predictive coding methodology.  In that case, however, the plaintiff did not consult with either the court or the requesting party about its intention to change review methodology, and the plaintiff’s lack of transparency and cooperation resulted in an order to produce documents according to the agreed-upon methodology.  It pays to cooperate!

So, what do you think?  Should the plaintiff have been allowed to shift from the agreed-upon methodology, or did the volume of the collection warrant the switch?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.