Analysis

EDRM Needs Your Input on its TAR Guidelines: eDiscovery Best Practices

I’m here in Durham, NC at the annual EDRM Spring Workshop at Duke Law School and, as usual, the Workshop is a terrific opportunity to discuss the creation of standards and guidelines for the legal community, as well as to network with like-minded people on eDiscovery topics.  I’ll have more to report about this year’s Workshop next week.  But one part of the Workshop that I will touch on now is the release of the public comment version of EDRM’s Technology Assisted Review (TAR) Guidelines.

Last Friday, EDRM released the preliminary draft of its TAR Guidelines for public comment (you can download it here).  EDRM and the Bolch Judicial Institute at Duke Law are seeking comments from the bench, bar, and public on the draft. Nearly 50 volunteer lawyers, eDiscovery experts, software developers, scholars and judges worked on it under the auspices of EDRM. A version of the document was presented at the Duke Distinguished Lawyers’ conference on Technology Assisted Review, held September 7-8, 2017. At that event, 15 judges and nearly 100 lawyers and practitioners provided feedback and comments on the draft. The document was further revised based on discussions at that conference, additional review by judges and additional review by EDRM members over the past couple of months (a process that produced significant changes and a much tighter, briefer guideline document). With the assistance of four law student fellows of the Bolch Judicial Institute, this draft was finalized in May 2018 for public comment.

So, calling this a preliminary draft is a bit of a misnomer as it has already been through several iterations of review and edit.  Now, it’s the public’s turn.

EDRM states that “Comments on this preliminary draft will be carefully considered by the drafting team and an advisory group of judges as they finalize the document for publication. Please send comments on this draft, whether favorable, adverse, or otherwise, as soon as possible, but no later than Monday, July 16, 2018. Comments must be submitted in tracked edits (note: the guidelines are in a Word document for easy ability to track changes) and submitted via email to edrm@law.duke.edu. All comments will be made available to the public.”

That’s all well and good and EDRM will hopefully get a lot of useful feedback on the guideline document.  However, one thing I have observed about public comment periods is that the people who tend to provide comments (i.e., geeks like us who attend EDRM workshops) are people who already understand TAR (and think they know how best to explain it to others).  If the goal of the EDRM TAR guidelines is to help the general bench and bar better understand TAR, then it’s important for the average attorney to review the document and provide comments as to how useful it is.

So, if you’re an attorney or legal technology practitioner who doesn’t understand TAR, I encourage (even challenge) you to review these guidelines and provide feedback.  Point out what you learned from the document, what was confusing, and whether you feel you have a better understanding of TAR and the considerations for when and where to use it.  Ask yourself afterward whether you have a better idea of how to get started using TAR and whether you understand the differences between TAR approaches.  If these guidelines can help many members of the legal profession better understand TAR, that will be the true measure of their effectiveness.

Oh, and by the way, Europe’s General Data Protection Regulation is now in effect!  Are you ready?  If not, you might want to check out this webcast.

So, what do you think?  Will these guidelines help the average attorney or judge better understand TAR?  Please share any comments you might have or if you’d like to know more about a particular topic.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

A Fresh Comparison of TAR and Keyword Search: eDiscovery Best Practices

Bill Dimm of Hot Neuron (the company behind Clustify, which provides document clustering and predictive coding technologies, among others) is one of the smartest people I know when it comes to technology assisted review (TAR).  So, I’m always interested to hear what he has to say about TAR, how it can be used and how effective it is compared to other methods (such as keyword searching).  His latest blog post on the Clustify site talks about an interesting exercise that did exactly that: it compared TAR to keyword search in a real classroom scenario.

In TAR vs. Keyword Search Challenge on the Clustify blog, Bill challenged the audience during the NorCal eDiscovery & IG Retreat to create keyword searches that would work better than technology-assisted review (predictive coding) for two topics.  Half of the room was tasked with finding articles about biology (science-oriented articles, excluding medical treatment) and the other half searched for articles about current law (excluding proposed laws or politics).  Bill then ran one of the searches against TAR in Clustify live during the presentation (he couldn’t run the others during the session due to time constraints, but he did so afterward and covered those results on his blog, providing the specific searches to which he compared TAR).

To evaluate the results, Bill measured the recall from the top 3,000 and top 6,000 hits on the search query (3% and 6% of the population respectively) and also included the recall achieved by looking at all docs that matched the search query, just to see what recall the search queries could achieve if you didn’t worry about pulling in a ton of non-relevant docs.  For the TAR results he used TAR 3.0 (which is like Continuous Active Learning, but applied to cluster centers only) trained with (a whopping) two seed documents (one relevant from a keyword search and one random non-relevant document) followed by 20 iterations of 10 top-scoring cluster centers, for a total of 202 training documents.  To compare to the top 3,000 search query matches, the 202 training documents plus 2,798 top-scoring documents were used for TAR, so the total document review (including training) would be the same for TAR and the search query.
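The recall metric and the matched review budgets described above can be sketched as follows. Note that the relevant-document counts used here are hypothetical, for illustration only; the only figures taken from Bill’s description are the 202 training documents (2 seeds plus 20 iterations of 10 cluster centers) and the 2,798 top-scoring documents that round out the 3,000-document TAR budget:

```python
def recall(retrieved_relevant, total_relevant):
    """Fraction of all relevant documents captured by a review set."""
    return retrieved_relevant / total_relevant

# Hypothetical: suppose the collection contains 5,000 relevant documents.
total_relevant = 5000

# If a keyword search's top 6,000 hits contained 2,500 relevant documents
# (again, hypothetical), its recall at that review budget would be 0.5:
keyword_recall = recall(2500, total_relevant)

# The TAR 3.0 review budget from the post: 202 training documents
# (2 seeds + 20 iterations x 10 cluster centers) plus 2,798 top-scoring
# documents, matching the search query's top-3,000 budget exactly.
tar_review_budget = 202 + 2798
```

The point of matching the budgets is that any difference in recall then reflects retrieval quality alone, not a difference in how many documents each method asks reviewers to read.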

The result: TAR beat keyword search across the board for both tasks.  The top 3,000 documents returned by TAR achieved higher recall than the top 6,000 documents for any keyword search.  Based on this exercise, TAR achieved better results (higher recall) with half as much document review compared to any of the keyword searches.  The top 6,000 documents returned by TAR achieved higher recall than all of the documents matching any individual keyword search, even when the keyword search returned 27,000 documents.

Bill acknowledges that the audience had limited time to construct queries, they weren’t familiar with the data set, and they couldn’t do sampling to tune their queries, so the keyword searching wasn’t optimal.  Then again, for many of the attorneys I’ve worked with, that sounds pretty normal.  :o)

One reader commented about email headers and footers cluttering up results, and Bill pointed out that “Clustify has the ability to ignore email header data (even if embedded in the middle of the email due to replies) and footers” – which I’ve seen and is actually pretty cool.  Irrespective of the specifics of the technology, Bill’s exercise is a terrific, fresh demonstration of how TAR can outperform keyword search – as Bill notes in his response to the commenter, “humans could probably do better if they could test their queries, but they would probably still lose”.  Very interesting.  You’ll want to check out the details of his test via the link here.

So, what do you think?  Do you think this is a valid comparison of TAR and keyword searching?  Why or why not?  Please share any comments you might have or if you’d like to know more about a particular topic.

Don’t Miss Our Webcast Today on Technology Assisted Review!: eDiscovery Webcasts

What is Technology Assisted Review (TAR)? Why don’t more lawyers use it? Find out in our webcast today!

Today at noon CST (1:00pm EST, 10:00am PST), CloudNine will conduct the webcast Getting Off the Sidelines and into the Game using Technology Assisted Review. In this one-hour webcast that’s CLE-approved in selected states, we will discuss what TAR really is, when it may be appropriate to consider it for your case, what challenges can impact the use of TAR and how to get started. Topics include:

  • Understanding the Goals for Retrieving Responsive ESI
  • Defining the Terminology of TAR
  • Different Forms of TAR and How They Are Used
  • Acceptance of Predictive Coding by the Courts
  • How Big Does Your Case Need to Be to Use Predictive Coding?
  • Considerations for Using Predictive Coding
  • Challenges to an Effective Predictive Coding Process
  • Confirming a Successful Result with Predictive Coding
  • How to Get Started with Your First Case Using Predictive Coding
  • Resources for More Information

Once again, I’ll be presenting the webcast, along with Tom O’Connor, who recently wrote an article about TAR that we covered on this blog.  To register for it, click here.  Even if you can’t make it, go ahead and register to get a link to the slides and to the recording of the webcast (if you want to check it out later).  If you want to learn about TAR, what it is and how to get started, this is the webcast for you!

So, what do you think?  Do you use TAR to assist in review in your cases?  Please share any comments you might have or if you’d like to know more about a particular topic.
