To Keyword Cull or Not to Keyword Cull? That is the Question: eDiscovery Trends

June 7, 2017

We’re seeing a lot of discussion about whether to perform keyword searching before predictive coding. We’ve even seen a recent case where a judge weighed in as to whether TAR with or without keyword searching is preferable. Now, we have a new article published in the Richmond Journal of Law and Technology that weighs in as well.

In Calling an End to Culling: Predictive Coding and the New Federal Rules of Civil Procedure (PDF version here), Stephanie Serhan, a law student, looks at the 2015 Federal Rules amendments (particularly Rules 1 and 26(b)(1)) as justification for applying predictive coding “at the outset on the entire universe of documents in a case.” Serhan concludes that doing so is “far more accurate, and is not more costly or time-consuming, especially when the parties collaborate at the outset.”

Serhan discusses the importance of timing to predictive coding and explains the technical difference between predictive coding at the outset of a case vs. predictive coding after performing keyword searches. One problem with keyword culling that Serhan notes is that it “is not as accurate because the party may lose many relevant documents if the documents do not contain the specified search terms, have typographical errors, or use alternative phraseologies.” Serhan assumes that those “relevant documents removed by keyword culling would likely have been identified using predictive coding at the outset instead.”
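To make that concern concrete, here is a minimal sketch of how a keyword cull can drop relevant documents before predictive coding ever sees them. The sample documents, the search term and the `keyword_cull` function are all hypothetical illustrations, not taken from Serhan's article or from any particular review platform.

```python
import re

def keyword_cull(documents, terms):
    """Keep only documents that contain at least one search term as a whole word."""
    patterns = [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    ]
    return [doc for doc in documents if any(p.search(doc) for p in patterns)]

# Hypothetical collection: the first three documents concern the same deal.
documents = [
    "Please review the merger agreement before Friday.",
    "Notes on the mergr timeline are attached.",         # typographical error
    "The deal to combine the two companies is close.",   # alternative phrasing
    "Lunch menu for the cafeteria this week.",           # not relevant
]

kept = keyword_cull(documents, ["merger"])
culled = [doc for doc in documents if doc not in kept]

print("Kept:", kept)
print("Culled:", culled)
```

In this toy example only the first document survives the cull. The misspelled and rephrased documents are removed along with the irrelevant one, so a predictive coding model trained afterward would never get the chance to rank them.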

Serhan also compares the efficiency and cost of the two methods and concludes that the “actual cost of predictive coding will likely be substantially equal in both methods since the majority of the costs will be incurred in both methods.” She also looks at TAR-related cases, both before and after the 2015 Rules changes.

More and more people have concluded that predictive coding should be done without keyword culling, and with good reason. Applying predictive coding to a set unaltered by keywords would likely be not only more accurate but also more efficient, because keyword searching requires its own methodology, including testing of the results (and of the documents not retrieved), before moving on. Unless cost considerations require limiting the volume of collected data, there is no need to apply keyword culling before predictive coding.

Culling that does make sense includes hash-based deduplication, elimination of clearly non-responsive domains and other activities that remove clearly redundant or non-responsive ESI from the collection. That type of culling is different from keyword culling because it does not depend on guessing which terms relevant documents will contain.
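As a rough illustration of why hash-based deduplication is a safe kind of culling, the sketch below removes only files whose contents are byte-for-byte identical. The file names and contents are hypothetical, and the choice of SHA-256 is an assumption made for the example; the hash algorithm and deduplication scope used in practice vary by platform.

```python
import hashlib

def dedupe_by_hash(files):
    """Split (name, content_bytes) pairs into unique files and exact duplicates."""
    first_seen = {}   # digest -> name of the first file with that content
    unique = []
    duplicates = []   # (duplicate name, name of the file it duplicates)
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()  # SHA-256 is an assumption
        if digest in first_seen:
            duplicates.append((name, first_seen[digest]))
        else:
            first_seen[digest] = name
            unique.append(name)
    return unique, duplicates

# Hypothetical collection: two custodians hold the same attachment.
files = [
    ("custodian_a/contract.txt", b"Signed agreement, final version."),
    ("custodian_b/contract_copy.txt", b"Signed agreement, final version."),
    ("custodian_a/notes.txt", b"Signed agreement, draft version."),
]

unique, duplicates = dedupe_by_hash(files)
print("Unique:", unique)
print("Duplicates:", duplicates)
```

Because a file is dropped only when its digest matches one already kept, no distinct content leaves the collection. A near-duplicate such as the draft in the example is retained.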

So, what do you think? To keyword cull or not to keyword cull? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Daily Blog