Searching

Do You Test Your Search Terms Before Proposing Them to Opposing Counsel?: eDiscovery Best Practices

If you don’t, you should.  When litigation is anticipated, it’s never too early to begin collecting potentially responsive data and assessing it by performing searches and testing the results.  However, if you wait until after the meet and confer with opposing counsel, it can be too late.

On the very first day we introduced eDiscovery Daily, we discussed the danger of using wildcards in your searches (and how they can retrieve vastly different results than you intended).  Let me recap that example.

Several years ago, I provided search strategy assistance to a client that had already agreed upon several searches with opposing counsel.  One search related to mining activities, so the attorney decided to use a wildcard of “min*” to retrieve variations like “mine”, “mines” and “mining”.

That one search retrieved over 300,000 files with hits.

Why?  Because there are 269 words in the English language that begin with the letters “min”.  Words like “mink”, “mind”, “mint” and “minion” were all being retrieved in this search for files related to “mining”.  We ultimately had to go back to opposing counsel and attempt to negotiate a revised search that was more appropriate.
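
By the way, testing a wildcard like this doesn’t require litigation software – even a quick script against a word list will preview the blowout.  Here’s a minimal sketch in Python (the word list path is an assumption; any dictionary file will do):

```python
import re

# Minimal sketch: preview what a proposed wildcard would match before
# agreeing to it.  The word list path is an assumption; any dictionary
# file will do.
with open("/usr/share/dict/words") as f:
    words = {w.strip().lower() for w in f if w.strip()}

wildcard = re.compile(r"^min")          # the "min*" wildcard from the story
matches = sorted(w for w in words if wildcard.match(w))

print(f"{len(matches)} distinct words begin with 'min'")
print(matches[:10])                     # e.g., "mince", "mind", "mink", ...

# The safer alternative: propose only the variations you actually intend.
intended = {"mine", "mines", "mining", "miner", "miners", "minable"}
print(sorted(set(matches) - intended))  # everything the wildcard drags in
```

Even this crude preview would have flagged “min*” as dangerously broad before it was ever proposed to opposing counsel.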

What made that process difficult was the negotiation with opposing counsel.  My client had already agreed on over 200 terms with opposing counsel and had proposed many of those terms, including this one.  The attorneys had prepared these terms without assistance from a technology consultant (or “geek” if you prefer) – I was brought into the project after the terms were negotiated and agreed upon – and without testing any of the terms.

Since the terms had been agreed upon, opposing counsel was understandably resistant to modifying any of them.  It wasn’t their problem that my client faced having to review all of these files – it was my client’s proposed term that they now wanted to modify.  Fortunately, for this term, we were ultimately able to provide a clear indication that many of the retrieved documents in this search were non-responsive and were able to get opposing counsel to agree to a modified list of variations of “mine” that included “minable”, “minefield”, “minefields”, “miner” and “minings” (among other variations).  We were able to reduce the result set to less than 12,000 files with hits, saving our client a “mint”, which they certainly didn’t “mind” (because we were able to drop “mint” and “mind” and over 200 other words from the responsive hit list).

However, there were several other inefficient terms that opposing counsel refused to renegotiate and my client was forced to review thousands of additional files that they shouldn’t have had to review.  Had the client included a technical member on the team at the beginning and had they tested each of these searches before negotiating terms with opposing counsel, they would have been able to figure out which terms were overbroad and would have been able to determine more efficient search terms to propose, saving thousands of dollars in review costs.

So, what do you think?  Do you test your search terms before proposing them to opposing counsel?  If not, why not?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Court Acknowledges Lack of Expertise to Recommend Search Methodology, Orders Parties to Confer: eDiscovery Case Law

In ACI Worldwide Corp. v. MasterCard Technologies, LLC and MasterCard International, Inc., 8:14CV31 (Jul. 13, 2015), Nebraska Magistrate Judge F.A. Gossett, acknowledging that the Court “simply does not have the expertise necessary to determine the best methodology to be employed in retrieving the requested materials in a safe, non-obtrusive, and cost-effective manner”, ordered the parties to “once again” confer in an effort to reach an agreement regarding the search methodology to be employed by the defendants in retrieving the information requested by the plaintiff.

Case Background

In this action where the plaintiff alleged the defendants violated a licensing agreement and disclosed confidential information regarding the plaintiff’s middleware, the plaintiff sought ESI from the defendants to determine whether they continued using information regarding the middleware after expiration of the license agreement and whether they still use it in their source code today.

The defendants objected to producing the ESI as requested, stating that the requests were burdensome and also claiming risks that the requests posed to the defendants’ production systems. The plaintiff, in an effort to address the defendants’ concerns, revised the discovery requests several times and devised a search protocol for the defendants to use in retrieving the requested information – when the defendants refused to use the devised search protocol, the plaintiff filed a motion to compel.

Judge’s Ruling

Noting that “Defendants do not dispute the relevance of the requested information”, Judge Gossett found that “Plaintiff has shown a particular need for the information and that the information is relevant to the issues involved in this action”.  Judge Gossett stopped short of granting the plaintiff’s motion though, stating:

“However, the Court simply does not have the expertise necessary to determine the best methodology to be employed in retrieving the requested materials in a safe, non-obtrusive, and cost-effective manner. Based on the information before it, the Court does not even know whether a search methodology or protocol exists (or could exist) which would allow the requested information to reasonably be retrieved.”

As a result, Judge Gossett chose to “order the parties to once again confer in an effort to reach an agreement regarding the search methodology to be employed in retrieving the requested information”, with a plan to “refer the matter to a special master” if the parties were unable to agree.

So, what do you think?  Should the court have been able to recommend the methodology or was the judge wise to order the parties to try again to work it out?  Please share any comments you might have or if you’d like to know more about a particular topic.

Here are a Few Common Myths About Technology Assisted Review: eDiscovery Best Practices

A couple of years ago, after my annual LegalTech New York interviews with various eDiscovery thought leaders (a list of which can be found here, with links to each interview), I wrote a post about some of the perceived myths that exist regarding Technology Assisted Review (TAR) and what it means to the review process.  After a recent discussion with a client where their misperceptions regarding TAR were evident, it seemed appropriate to revisit this topic and debunk a few myths that others may believe as well.

  1. TAR is New Technology

Actually, with all due respect to each of the various vendors that have their own custom algorithm for TAR, the technology behind TAR as a whole is not new.  Ever heard of artificial intelligence?  TAR, in fact, applies artificial intelligence to the review process.  With all of the acronyms we use to describe TAR, here’s one more for consideration: “Artificial Intelligence for Review” or “AIR”.  It may not catch on, but I like it.  (Much to my disappointment, it didn’t.)

Maybe attorneys would be more receptive to it if they understood it as artificial intelligence?  As Laura Zubulake pointed out in my interview with her, “For years, algorithms have been used in government, law enforcement, and Wall Street.  It is not a new concept.”  With that in mind, Ralph Losey predicts that “The future is artificial intelligence leveraging your human intelligence and teaching a computer what you know about a particular case and then letting the computer do what it does best – which is read at 1 million miles per hour and be totally consistent.”

  2. TAR is Just Technology

Treating TAR as just the algorithm that “reviews” the documents is shortsighted.  TAR is a process that includes the algorithm.  Without a sound approach for identifying appropriate example documents for the collection, ensuring educated and knowledgeable reviewers to appropriately code those documents and testing and evaluating the results to confirm success, the algorithm alone would simply be another case of “garbage in, garbage out” and doomed to fail.  In a post from last week, we referenced Tom O’Connor’s recent post where he quoted Maura Grossman, probably the most recognized TAR expert, who stated that “TAR is a process, not a product.”  True that.

  3. TAR and Keyword Searching are Mutually Exclusive

I’ve talked to some people who think that TAR and keyword searching are mutually exclusive, i.e., that you wouldn’t perform keyword searching on a case where you plan to use TAR.  Not necessarily.  Ralph Losey continues to advocate a “multimodal” approach, describing it as: “more than one kind of search – using TAR, but also using keyword search, concept search, similarity search, all kinds of other methods that we have developed over the years to help train the machine.  The main goal is to train the machine.”

  4. TAR Eliminates Manual Review

Many people (including the New York Times) think of TAR as the death of manual review, with all attorney reviewers being replaced by machines.  Actually, manual review is a part of the TAR process in several aspects, including: 1) Subject matter knowledgeable reviewers are necessary to perform review to create a training set of documents for the technology, 2) After the process is performed, both sets (the included and excluded documents) are sampled and the samples are reviewed to determine the effectiveness of the process, and 3) The resulting responsive set is generally reviewed to confirm responsiveness and also to determine whether the documents are privileged.  Without manual review to train the technology and verify the results, the process would fail.
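
To make the sampling in step 2 concrete, here’s a minimal sketch of how reviewer-coded samples of the included and excluded sets can be turned into rough precision and recall estimates (all of the counts below are hypothetical, purely for illustration):

```python
# Minimal sketch of the post-TAR sampling step described above.
# All counts below are hypothetical, purely for illustration.

def responsive_rate(sample):
    """Fraction of a reviewer-coded sample marked responsive (1) vs. not (0)."""
    return sum(sample) / len(sample)

included_sample = [1] * 270 + [0] * 30   # sample of TAR's "responsive" set
excluded_sample = [1] * 5 + [0] * 295    # sample of TAR's "non-responsive" set

precision = responsive_rate(included_sample)   # ~0.90
elusion = responsive_rate(excluded_sample)     # ~0.017 (responsive docs missed)

# Scale the sampled rates to the (assumed) full set sizes to estimate recall.
included_total, excluded_total = 40_000, 160_000
found = precision * included_total
missed = elusion * excluded_total
recall = found / (found + missed)
print(f"precision ≈ {precision:.2f}, recall ≈ {recall:.2f}")   # ≈0.90, ≈0.93
```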

  5. TAR Has to Be Perfect to Be Useful

Detractors of TAR note that it can miss plenty of responsive documents and is nowhere near 100% accurate.  In one recent case, the producing party estimated that as many as 31,000 relevant documents may have been missed by the TAR process.  However, they also estimated that a much more costly manual review would have missed as many as 62,000 relevant documents.

Craig Ball’s analogy about the two hikers who encounter an angry grizzly bear is appropriate – one hiker doesn’t have to outrun the bear, just the other hiker.  Craig notes: “That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.”

So, what do you think?  Do you agree that these are myths?  Can you think of any others?  Please share any comments you might have or if you’d like to know more about a particular topic.

Keyword Searching Isn’t Dead, If It’s Done Correctly: eDiscovery Best Practices

In the latest post of the Advanced Discovery blog, Tom O’Connor (who is an industry thought leader and has been a thought leader interviewee on this blog several times) posed an interesting question: Is Keyword Searching Dead?

In his post, Tom recapped the discussion of a session with the same name at the recent Today’s General Counsel Institute in New York City where Tom was a co-moderator of the session along with Maura Grossman, a recognized Technology Assisted Review (TAR) expert, who was recently appointed as Special Master in the Rio Tinto case.  Tom then went on to cover some of the arguments for and against keyword searching as discussed by the panelists and participants in the session, while also noting that numerous polls and client surveys show that the majority of people are NOT using TAR today.  So, they must be using keyword searching, right?

Should they be?  Is there still room for keyword searching in today’s eDiscovery landscape, given the advances that have been made in recent years in TAR technology?

There is, if it’s done correctly.  Tom quotes Maura in the article as stating that “TAR is a process, not a product.”  The same could be said for keyword searching.  If the process within which the keyword searches are performed is flawed, you could either retrieve far more documents to review than necessary (driving up eDiscovery costs) or leave yourself open to challenges in the courtroom regarding your approach.  Many lawyers at corporations and law firms identify search terms (and, in many cases, agree on those terms with opposing counsel) without any testing to confirm their validity.

Way back in the first few months of this blog (over four years ago), I advocated an approach to searching that I called “STARR”: Search, Test, Analyze, Revise (if necessary) and Repeat (also, if necessary).  With an effective platform (using advanced search capabilities such as “fuzzy”, wildcard, synonym and proximity searching), knowledge and experience of that platform, and knowledge of search best practices, you can start with a well-planned search that can be confirmed or adjusted using the “STARR” approach.
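
If it helps to visualize the loop, here’s a sketch of what a “STARR” loop might look like in Python – run_search(), sample_and_review() and revise() are hypothetical stand-ins for your actual search platform and review workflow, not a real API:

```python
import random

# Sketch of the "STARR" loop.  The three helpers below are hypothetical
# stand-ins for your real search platform and review workflow.

def run_search(query):
    """Stand-in: return hit IDs for a query (here, just fake some hits)."""
    return list(range(random.randint(5_000, 300_000)))

def sample_and_review(hits, n=100):
    """Stand-in: reviewers code a random sample (1 = responsive)."""
    return [random.choice([0, 1]) for _ in random.sample(hits, min(n, len(hits)))]

def revise(query, codings):
    """Stand-in: narrow the query based on what the sample showed."""
    return query + " AND <narrowing term>"

def starr(query, target_precision=0.75, max_rounds=5):
    for round_no in range(1, max_rounds + 1):
        hits = run_search(query)                    # Search
        codings = sample_and_review(hits)           # Test a random sample
        precision = sum(codings) / len(codings)     # Analyze
        print(f"round {round_no}: {len(hits):,} hits, precision ≈ {precision:.2f}")
        if precision >= target_precision:
            return query, hits
        query = revise(query, codings)              # Revise (if necessary)...
    return query, hits                              # ...and Repeat (if necessary)

starr('"oil rights" OR min*')
```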

And, even when you’ve been searching databases for as long as I have (decades now), an effective process is key because you never know what you will find until you test the results.  My favorite example from recent years (walked through in this earlier post) comes from work I did for a petroleum (oil) company looking for documents related to “oil rights”: a search for “oil AND rights” retrieved almost every published and copyrighted document in the company.  Why?  Because almost every published and copyrighted document in the company contained the phrase “All Rights Reserved”.  Testing and an iterative process eventually enabled me to find the search that offered the best balance of recall and precision.
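
To see why, consider a crude proximity test.  This sketch (toy documents and a simple tokenizer, both assumptions for illustration) shows how “oil AND rights” hits copyright boilerplate while a proximity search does not:

```python
import re

def within(text, a, b, distance=5):
    """Crude proximity test: do words a and b occur within `distance` words?"""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

# Toy documents (assumptions for illustration).
boilerplate = "Acme Oil Company annual report.  Copyright 2015.  All Rights Reserved."
relevant = "The deed conveys the oil rights for the west field."

for doc in (boilerplate, relevant):
    hits_and = "oil" in doc.lower() and "rights" in doc.lower()
    hits_near = within(doc, "oil", "rights")
    print(f"AND: {hits_and}  NEAR/5: {hits_near}  ->  {doc!r}")

# The AND search hits both documents; the proximity search hits only the
# document that is actually about oil rights.
```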

Like TAR, keyword searching is a process, not a product.  And, you can quote me on that.  (-:

So, what do you think?  Is keyword searching dead?  And, please share any comments you might have or if you’d like to know more about a particular topic.

Pitfalls Associated with Self-Collection of Data by Custodians: eDiscovery Best Practices

In a prior article, we covered the Burd v. Ford Motor Co. case where the court granted the plaintiff’s motion for a deposition of a Rule 30(b)(6) witness on the defendant’s search and collection methodology involving self-collection of responsive documents by custodians based on search instructions provided by counsel.  In light of that case and a recent client experience of mine, I thought it would be appropriate to revisit this topic that we addressed a couple of years ago.

I’ve worked with a number of attorneys who have turned over the collection of potentially responsive files to the individual custodians of those files, or to someone in the organization responsible for collecting those files (typically, an IT person).  Self-collection by custodians, unless managed closely, can be a wildly inconsistent process (at best).  In some cases, those attorneys have instructed those individuals to perform various searches to turn “self-collection” into “self-culling”.  Self-culling can cause at least two issues:

  1. You have to go back to the custodians and repeat the process if additional search terms are identified.
  2. Potentially responsive image-only files will be missed with self-culling.

It’s not uncommon for additional searches to be required over the course of a case, even when search terms are agreed to by the parties up front (search terms are frequently renegotiated), so the self-culling process has to be repeated when new or modified terms are identified.

It’s also common to have a number of image-only files within any collection, especially if the custodians frequently scan executed documents or use fax software to receive documents from other parties.  Image-only PDF or TIFF files can make up as much as 20% of a collection.  When custodians are asked to perform “self-culling” by running their own searches of their data, these files will typically be missed.
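
One way to gauge the scale of the problem is to flag files with little or no extractable text.  Here’s a minimal sketch in Python (it assumes the pdfminer.six library, plus an illustrative folder path and character threshold):

```python
import os
from pdfminer.high_level import extract_text  # assumes pdfminer.six is installed

def find_image_only_pdfs(folder, min_chars=25):
    """Flag PDFs with little or no extractable text (likely scans/faxes)."""
    flagged = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(root, name)
            try:
                text = extract_text(path) or ""
            except Exception:
                text = ""                 # unreadable file: treat as image-only
            if len(text.strip()) < min_chars:
                flagged.append(path)      # candidate for OCR before any culling
    return flagged

# Illustrative path; point it at a custodian's collected files.
for path in find_image_only_pdfs("./custodian_files"):
    print("needs OCR:", path)
```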

For these reasons, I usually advise against self-culling by custodians in litigation.  I also typically don’t recommend that the organization’s internal IT department perform self-culling either, unless they have the capability to process that data to identify image-only files and perform Optical Character Recognition (OCR) on them to capture text.  If your IT department doesn’t have the capabilities and experience to do so (which includes a well-documented process and chain of custody), it’s generally best to collect all potentially responsive files from the custodians and turn them over to a qualified eDiscovery provider to perform the culling.  Most qualified eDiscovery providers, including (shameless plug warning!) CloudNine™, perform OCR as needed to include image-only files in the resulting potentially responsive document set before culling.  With the full data set available, there is also no need to go back to the custodians to perform additional searches to collect additional data (unless, of course, the case requires supplemental productions).

Most organizations that have their custodians perform self-collection of files for eDiscovery probably don’t expect that they will have to explain that process to the court.  Ford sure didn’t.  If your organization plans to have its custodians self-collect, you’d better be prepared to explain that process, which includes discussing your approach for handling image-only files.

So, what do you think?  Do you self-collect data for discovery purposes?  If so, how do you account for image-only files?  Please share any comments you might have or if you’d like to know more about a particular topic.

“Da Silva Moore Revisited” Will Be Visited by a Newly Appointed Special Master: eDiscovery Case Law

In Rio Tinto Plc v. Vale S.A., 14 Civ. 3042 (RMB)(AJP) (S.D.N.Y. Jul. 15, 2015), New York Magistrate Judge Andrew J. Peck, at the request of the defendant, entered an Order appointing Maura Grossman as a special master in this case to assist with issues concerning Technology-Assisted Review (TAR).

Back in March (as covered here on this blog), Judge Peck approved the proposed protocol for technology assisted review (TAR) presented by the parties, titling his opinion “Predictive Coding a.k.a. Computer Assisted Review a.k.a. Technology Assisted Review (TAR) — Da Silva Moore Revisited”.  Alas, as some unresolved issues remained regarding the parties’ TAR-based productions, Judge Peck decided to prepare the order appointing Grossman as special master for the case.  Grossman, of course, is a recognized TAR expert, who (along with Gordon Cormack) wrote Technology-Assisted Review in E-Discovery can be More Effective and More Efficient than Exhaustive Manual Review and also the Grossman-Cormack Glossary of Technology Assisted Review (covered on our blog here).

While noting that it has “no objection to Ms. Grossman’s qualifications”, the plaintiff issued several objections to the appointment, including:

  • The defendant should have agreed much earlier to appointment of a special master: Judge Peck’s response was that “The Court certainly agrees, but as the saying goes, better late than never. There still are issues regarding the parties’ TAR-based productions (including an unresolved issue raised at the most recent conference) about which Ms. Grossman’s expertise will be helpful to the parties and to the Court.”
  • The plaintiff stated a “fear that [Ms. Grossman’s] appointment today will only cause the parties to revisit, rehash, and reargue settled issues”: Judge Peck stated that “the Court will not allow that to happen. As I have stated before, the standard for TAR is not perfection (nor of using the best practices that Ms. Grossman might use in her own firm’s work), but rather what is reasonable and proportional under the circumstances. The same standard will be applied by the special master.”
  • One of the defendant’s lawyers had three conversations with Ms. Grossman about TAR issues: Judge Peck noted that the plaintiff did not explain how one contact, in connection with The Sedona Conference, “should or does prevent Ms. Grossman from serving as special master”, and noted that, as to the other two, the plaintiff “does not suggest that Ms. Grossman did anything improper in responding to counsel’s question, and Ms. Grossman has made clear that she sees no reason why she cannot serve as a neutral special master”, a statement with which he agreed.

Judge Peck did agree with the plaintiff on allocation of the special master’s fees, stating that the defendant’s “propsal [sic] is inconsistent with this Court’s stated requirement in this case that whoever agreed to appointment of a special master would have to agree to pay, subject to the Court reallocating costs if warranted”.

So, what do you think?  Was the appointment of a special master (albeit an eminently qualified one) appropriate at this stage of the case?  Please share any comments you might have or if you’d like to know more about a particular topic.

“Quality is Job 1” at Ford, Except When it Comes to Self-Collection of Documents: eDiscovery Case Law

In Burd v. Ford Motor Co., Case No. 3:13-cv-20976 (S.D. W. Va. July 8, 2015), West Virginia Magistrate Judge Cheryl A. Eifert granted the plaintiff’s motion for a deposition of a Rule 30(b)(6) witness on the defendant’s search and collection methodology, but did not rule on the issue of whether the defendant had a reasonable collection process or adequate production, denying the plaintiff’s motion as “premature” on that request.

Case Background

In these cases involving alleged events of sudden unintended acceleration in certain Ford vehicles, the plaintiffs, in December 2014, requested regularly scheduled discovery conferences in an effort to expedite what they anticipated would be voluminous discovery.

At the February 10, 2015 conference, the plaintiffs raised concerns regarding the reasonableness of the searches being performed by the defendant in its effort to respond to plaintiffs’ requests for documents.  While conceding that it had not produced e-mail in certain instances because it did not understand that the request sought e-mail communications, the defendant indicated that it had conducted a “sweep” of the emails of ten to twenty key custodians.  That “sweep” was described as a “self-selection” process conducted by the individual employees, who had each been given information about the plaintiffs’ claims, as well as suggested search terms.  However, excerpts of deposition transcripts of the defendant’s witnesses provided by the plaintiffs revealed that some of those key employees performed limited searches or no searches at all.

Also, the Court ordered the parties to meet, confer, and agree on search terms.  The defendant objected to sharing its search terms, contending that the plaintiff sought improper “discovery on discovery,” and deemed the plaintiff’s request “overly burdensome” given that each custodian developed his or her own search terms after discussing the case with counsel.

Judge’s Ruling

Noting that the defendant’s “generic objections to ‘discovery on discovery’ and ‘non-merits’ discovery are outmoded and unpersuasive”, Judge Eifert stated, as follows:

“Here, there have been repeated concerns voiced by Plaintiffs regarding the thoroughness of Ford’s document search, retrieval, and production. Although Ford deflects these concerns with frequent complaints of overly broad and burdensome requests, it has failed to supply any detailed information to support its position. Indeed, Ford has resisted sharing any specific facts regarding its collection of relevant and responsive materials. At the same time that Ford acknowledges the existence of variations in the search terms and processes used by its custodians, along with limitations in some of the searches, it refuses to expressly state the nature of the variations and limitations, instead asserting work product protection. Ford has cloaked the circumstances surrounding its document search and retrieval in secrecy, leading to skepticism about the thoroughness and accuracy of that process. This practice violates ‘the principles of an open, transparent discovery process.’”

Judge Eifert also rejected the defendant’s claim of work product protection regarding the search terms, stating that “[u]ndoubtedly, the search terms used by the custodians and the names of the custodians that ran searches can be disclosed without revealing the substance of discussions with counsel.”  As a result, Judge Eifert granted the plaintiff’s motion for a deposition of a Rule 30(b)(6) witness on the defendant’s search and collection methodology, but did not rule on the issue of whether the defendant had a reasonable collection process or adequate production, denying the plaintiff’s motion as premature on that request.

So, what do you think?  Was the order for a deposition of a Rule 30(b)(6) witness the next appropriate step?  Please share any comments you might have or if you’d like to know more about a particular topic.

This Study Discusses the Benefits of Including Metadata in Machine Learning for TAR: eDiscovery Trends

A month ago, we discussed the Discovery of Electronically Stored Information (DESI) workshop and the papers describing research or practice that were presented there, and a couple of weeks later we covered one of those papers.  Today, let’s cover another paper from the workshop.

The Role of Metadata in Machine Learning for Technology Assisted Review (by Amanda Jones, Marzieh Bazrafshan, Fernando Delgado, Tania Lihatsh and Tamara Schuyler) examines the role of metadata in machine learning for technology assisted review (TAR), particularly with respect to the algorithm development process.

Let’s face it, we all generally agree that metadata is a critical component of ESI for eDiscovery.  But, opinions are mixed as to its value in the TAR process.  For example, the Grossman-Cormack Glossary of Technology Assisted Review (which we covered here in 2012) includes metadata as one of the “typical” identified features of a document that are used as input to a machine learning algorithm.  However, a couple of eDiscovery software vendors have both produced documentation stating that “machine learning systems typically rely upon extracted text only and that experts engaged in providing document assessments for training should, therefore, avoid considering metadata values in making responsiveness calls”.

So, the authors conducted a study to establish the potential benefit of incorporating metadata into TAR algorithm development processes, as well as to evaluate the benefits of using extended metadata and the field origins of that metadata.  Extended metadata fields included Primary Custodian, Record Type, Attachment Name, Bates Start, Company/Organization, Native File Size, Parent Date and Family Count, to name a few.  They evaluated three distinct data sets (one drawn from Topic 301 of the TREC 2010 Interactive Task, and two proprietary business data sets) and generated a random sample of 4,500 documents for each (split into a 3,000 document Control Set and a 1,500 document Training Set).
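
To illustrate the general idea (and to be clear, this is not the authors’ actual implementation), here’s a minimal sketch of one common way to combine body text with metadata fields in a single model, using scikit-learn and toy data:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy training data; the field names echo the study's examples, but the data,
# model choice and feature encoding are all assumptions for illustration.
docs = pd.DataFrame({
    "body_text": ["re: license terms for middleware", "lunch on friday?",
                  "draft agreement attached", "fantasy football picks"],
    "record_type": ["email", "email", "attachment", "email"],
    "custodian": ["a. jones", "b. smith", "a. jones", "b. smith"],
    "responsive": [1, 0, 1, 0],
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "body_text"),            # body text features
    ("meta", OneHotEncoder(handle_unknown="ignore"),     # metadata features
     ["record_type", "custodian"]),
])
model = Pipeline([("features", features), ("clf", LogisticRegression())])
model.fit(docs, docs["responsive"])
print(model.predict_proba(docs)[:, 1])  # ranking scores, as a TAR tool might use
```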

The metric they used throughout to compare model performance is Area Under the Receiver Operating Characteristic Curve (AUROC). Say what?  According to the report, the metric indicates the probability that a given model will assign a higher ranking to a randomly selected responsive document than a randomly selected non-responsive document.
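
That pairwise interpretation is easy to verify with toy data.  In this sketch (the labels and scores are made up), counting responsive/non-responsive pairs gives the same number as scikit-learn’s roc_auc_score:

```python
from itertools import product
from sklearn.metrics import roc_auc_score

# Toy scores (assumptions) to verify the interpretation: AUROC equals the
# fraction of (responsive, non-responsive) pairs where the responsive
# document gets the higher score.
labels = [1, 1, 1, 0, 0, 0, 0]                  # 1 = responsive
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]    # model ranking scores

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = [s for s, y in zip(scores, labels) if y == 0]
pairwise = sum(p > n for p, n in product(pos, neg)) / (len(pos) * len(neg))

print(pairwise)                        # 0.9166...
print(roc_auc_score(labels, scores))   # 0.9166... (same value)
```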

Their finding was that incorporating metadata as an integral component of machine learning processes for TAR improved results (based on the AUROC metric).  In particular, models incorporating extended metadata significantly outperformed models based on body text alone in every condition for every data set.  While there’s still a lot to learn about the use of metadata in modeling for TAR, it’s an interesting study and a good start to the discussion.

A copy of the twelve-page study (including Bibliography and Appendix) is available here.  There is also a link to the PowerPoint presentation file from the workshop, which is a condensed way to look at the study, if desired.

So, what do you think?  Do you agree with the report’s findings?  Please share any comments you might have or if you’d like to know more about a particular topic.

Here’s One Study That Shows Potential Savings from Technology Assisted Review: eDiscovery Trends

A couple of weeks ago, we discussed the Discovery of Electronically Stored Information (DESI) workshop and the papers describing research or practice presented at the workshop that was held earlier this month.  Today, let’s cover one of those papers.

The Case for Technology Assisted Review and Statistical Sampling in Discovery (by Christopher H. Paskach, F. Eli Nelson and Matthew Schwab) aims to show how Technology Assisted Review (TAR) and statistical sampling can significantly reduce risk and improve productivity in eDiscovery processes.  The easy-to-read, six-page report concludes with the observation that, with measures like statistical sampling, “attorney stakeholders can make informed decisions about the reliability and accuracy of the review process, thus quantifying actual risk of error and using that measurement to maximize the value of expensive manual review. Law firms that adopt these techniques are demonstrably faster, more informed and productive than firms who rely solely on attorney reviewers who eschew TAR or statistical sampling.”

The report begins with an introduction that includes a history of eDiscovery, starting with printing documents, “Bates” stamping them, and scanning and using Optical Character Recognition (OCR) programs to capture text for searching.  As the report notes, “Today we would laugh at such processes, but in a profession based on ‘stare decisis,’ changing processes takes time.”  Of course, as we know now, “studies have concluded that machine learning techniques can outperform manual document review by lawyers”.  The report also references key cases such as Da Silva Moore, Kleen Products and Global Aerospace, which were among the first of many cases to approve the use of technology assisted review for eDiscovery.

Probably the most interesting portion of the report is the section titled Cost Impact of TAR, which illustrates a case scenario comparing the cost of TAR to the cost of manual review.  On a strictly relevance-based review of 90,000 documents (after keyword filtering, which implies a multimodal approach to TAR), the TAR approach was over $57,000 less expensive ($136,225 vs. $193,500 for manual review).  The report illustrates the comparison with both a spreadsheet of the numbers and a pie chart comparison of costs, based on the assumptions provided.  Sounds like the basis for a budgeting tool!
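
In that spirit, here’s a minimal budgeting sketch.  Every rate and parameter below is a hypothetical assumption for illustration, not a figure from the report:

```python
# Hypothetical budgeting sketch in the spirit of the report's comparison.
# Every figure below is an assumption for illustration, not from the report.
docs_after_filtering = 90_000
review_rate = 50        # documents per reviewer-hour (assumed)
reviewer_cost = 100     # dollars per reviewer-hour (assumed)

# Manual review: attorneys review every document.
manual_cost = docs_after_filtering / review_rate * reviewer_cost

# TAR: review a training/validation subset plus the machine-ranked
# responsive set, plus technology/project costs.
training_docs = 5_000
predicted_responsive = 30_000
tar_tech_cost = 25_000  # processing, hosting and TAR fees (assumed)
tar_cost = (training_docs + predicted_responsive) / review_rate * reviewer_cost \
           + tar_tech_cost

print(f"manual ≈ ${manual_cost:,.0f}")            # ≈ $180,000
print(f"TAR    ≈ ${tar_cost:,.0f}")               # ≈ $95,000
print(f"saving ≈ ${manual_cost - tar_cost:,.0f}")
```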

Anyway, the report goes on to discuss the benefits of statistical sampling to validate the results, demonstrating that the only way to attempt to do so in a manual review scenario is to review the documents multiple times, which is prone to human error and inconsistent assessments of responsiveness.  The report then covers necessary process changes to realize the benefits of TAR and statistical sampling and concludes with the declaration that:

“Companies and law firms that take advantage of the rapid advances in TAR will be able to keep eDiscovery review costs down and reduce the investment in discovery by getting to the relevant facts faster. Those firms who stick with unassisted manual review processes will likely be left behind.”
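
Circling back to the statistical sampling point: the size of a validation sample typically comes from the standard normal-approximation formula, n = z² × p(1−p) / e².  Here’s a quick sketch (the confidence level and margin of error shown are common choices, assumed here):

```python
import math

# Sketch of the standard normal-approximation sample size calculation used
# to plan a validation sample; the confidence level and margin of error are
# common defaults, assumed here for illustration.
def sample_size(z=1.96, margin=0.02, p=0.5):
    """n = z^2 * p * (1 - p) / margin^2 (95% confidence, ±2% by default)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(sample_size())              # 2401 documents for ±2% at 95% confidence
print(sample_size(margin=0.05))   # 385 documents for ±5% at 95% confidence
```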

The report is a quick, easy read and can be viewed here.

So, what do you think?  Do you agree with the report’s findings?  Please share any comments you might have or if you’d like to know more about a particular topic.

Plaintiff Ordered to Image its Sources of ESI, Respond to Disputed Discovery Requests: eDiscovery Case Law

In Electrified Discounters, Inc. v. MI Technologies, Inc. et al., Case No. 3:13cv1332 (RNC) (D. Conn. May 19, 2015), Connecticut Magistrate Judge Donna F. Martinez granted the defendant’s motion to compel the plaintiff’s responses to discovery and ordered the plaintiff to “image its sources of electronically stored information (‘ESI’), including its hard drives and QuickBook files”.

Case Background

In this trademark infringement case between competitors who sell replacement lamps for rear projection televisions and front projectors via online marketplaces, the defendants filed a motion to compel the plaintiff’s responses to discovery, arguing that the plaintiff failed to issue a timely litigation hold and that the plaintiff’s production of ESI was “careless and indifferent.”  Specifically, the defendant stated that the plaintiff anticipated filing a lawsuit against the defendant in 2011, but that the plaintiff’s attorney admittedly did not counsel his client regarding its duty to retain relevant information until 2013, when the lawsuit was filed.

Additionally, in March 2015, the plaintiff’s company president testified in his deposition that he routinely deletes emails based on their age when his mailbox becomes full, that he deletes emails about once a month, that he continued to delete emails during this litigation and that, on the day before his deposition, he deleted approximately 1,000 emails.  Other records were also admittedly destroyed by the plaintiff company, which responded to the defendant’s request for the plaintiff’s lamp sales records that “[a]s part of its routine business practices, Electrified discards its records of lamps sales after approximately one year following payment.”

Judge’s Ruling

With regard to the defendant’s criticism of the plaintiff’s failure to institute a timely litigation hold and its careless and indifferent production efforts after the duty to preserve arose, Judge Martinez stated “After reviewing the deposition testimony of Electrified’s witnesses, the court agrees that the defendant’s concern is well-founded.”  Those depositions included one of the plaintiff’s employees, who testified that the company uses a QuickBooks program containing detailed inventory and sales records dating back to 2006, as well as the company president, who also acknowledged that the QuickBooks database contains inventory and sales information.

Citing Pension Committee and Zubulake, Judge Martinez stated that “The duty to preserve evidence is ‘well established.’”  With regard to the plaintiff’s admitted preservation failures, she stated “This cannot continue. Pending the final disposition of all claims in this action, plaintiff Electrified is ordered to preserve all documents, electronically-stored information, and/or tangible things that might be relevant to this subject matter or reasonably calculated to lead to the discovery of admissible evidence in this action.”  In an attempt to limit further spoliation of data, Judge Martinez stated that the plaintiff “shall image its sources of electronically stored information (‘ESI’), including its hard drives and QuickBook files.”

With regard to the twenty discovery requests in dispute, Judge Martinez granted the defendant’s motion to compel for each one, ordering the plaintiff to search and produce responsive ESI within 14 days of the order.  She also ordered the plaintiff “to show cause by June 2, 2015 why the court should not award defendant [requested] attorney’s fees incurred in the making of the motion to compel pursuant to Rule 37(a)(5).”

So, what do you think?  Are sanctions the next step in this case?  Please share any comments you might have or if you’d like to know more about a particular topic.
