Analysis

Getting Off the Sidelines and into the Game using Technology Assisted Review: eDiscovery Webcasts

The use of Technology Assisted Review (TAR) has been accepted in the courts for several years, but most lawyers still don’t use it and many still don’t know what it is or how it works. Why not?  We will discuss this and other questions in a webcast next week.

On Wednesday, April 25 at noon CST (1:00pm EST, 10:00am PST), CloudNine will conduct the webcast Getting Off the Sidelines and into the Game using Technology Assisted Review. In this one-hour webcast that’s CLE-approved in selected states, we will discuss what TAR really is, when it may be appropriate to consider for your case, what challenges can impact the use of TAR and how to get started. Topics include:

  • Understanding the Goals for Retrieving Responsive ESI
  • Defining the Terminology of TAR
  • Different Forms of TAR and How They Are Used
  • Acceptance of Predictive Coding by the Courts
  • How Big Does Your Case Need to Be to use Predictive Coding?
  • Considerations for Using Predictive Coding
  • Challenges to an Effective Predictive Coding Process
  • Confirming a Successful Result with Predictive Coding
  • How to Get Started with Your First Case using Predictive Coding
  • Resources for More Information

Once again, I’ll be presenting the webcast, along with Tom O’Connor, who recently wrote an article about TAR that we are currently covering on this blog (parts one and two were published last week; the remaining two parts will be published this week).  To register for it, click here.  Even if you can’t make it, go ahead and register to get a link to the slides and to the recording of the webcast (if you want to check it out later).  If you want to learn about TAR, what it is and how to get started, this is the webcast for you!

So, what do you think?  Do you use TAR to assist in review in your cases?  Please share any comments you might have or if you’d like to know more about a particular topic.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

You May Be a User of Predictive Coding Technology and Not Realize It: eDiscovery Trends

At the Houston ACEDS luncheon/TAR panel last week, we asked a few questions of the audience to gauge their understanding and experience with Technology Assisted Review (TAR).  Some of the questions (like “have you used TAR on a case?”) were obvious questions to ask.  Others might not have been so obvious.

Like, “do you watch movies and TV shows on Netflix or Amazon Prime?”  Or, “do you listen to music on Pandora or Spotify?”

So, why would we ask a question like that on a TAR panel?

Because those sites are examples of uses of artificial intelligence and supervised machine learning.

But first, this week’s eDiscovery Tech Tip of the Week is about Boolean Searching.  When performing searches, the ability to combine multiple criteria into a single search is key to achieving a proper balance of recall and precision.  Using OR operators between search terms helps expand recall by retrieving documents that meet ANY of the criteria, while using AND or AND NOT operators between search terms helps improve precision by retrieving only documents that include all terms (AND) or exclude certain terms (AND NOT).
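To make the recall and precision trade-off concrete, here is a minimal Python sketch. The six-document corpus, the search terms and the set of responsive documents are all made up for illustration, and the helper names (docs_with, recall_precision) are hypothetical; it does not model any real review platform.

```python
# Toy corpus: document id -> set of lowercase terms (hypothetical data).
DOCS = {
    1: {"merger", "contract", "draft"},
    2: {"merger", "lunch"},
    3: {"contract", "signed"},
    4: {"lunch", "menu"},
    5: {"merger", "contract", "newsletter"},
    6: {"merger", "contract", "signed"},
}
# Hypothetical ground truth: the documents a reviewer judged responsive.
RESPONSIVE = {1, 3, 6}


def docs_with(term):
    """Return the ids of documents that contain the term."""
    return {doc_id for doc_id, terms in DOCS.items() if term in terms}


def recall_precision(hits):
    """Recall: share of responsive docs retrieved. Precision: share of hits that are responsive."""
    true_hits = hits & RESPONSIVE
    recall = len(true_hits) / len(RESPONSIVE)
    precision = len(true_hits) / len(hits) if hits else 0.0
    return recall, precision


or_hits = docs_with("merger") | docs_with("contract")   # merger OR contract
and_hits = docs_with("merger") & docs_with("contract")  # merger AND contract
and_not_hits = and_hits - docs_with("newsletter")       # (merger AND contract) AND NOT newsletter

for label, hits in [("OR", or_hits), ("AND", and_hits), ("AND NOT", and_not_hits)]:
    recall, precision = recall_precision(hits)
    print(f"{label}: hits={sorted(hits)} recall={recall:.2f} precision={precision:.2f}")
```

On this toy data, the OR search retrieves every responsive document, but only 60% of its hits are responsive. The AND search gives up a third of the recall for somewhat better precision, and adding AND NOT removes the last non-responsive hit.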

Grouping of those parameters properly is important as well.  My first name is Dozier, so a search for my name could be represented as Doug or Douglas or Dozier and Austin or it could be represented as (Doug or Douglas or Dozier) and Austin.  One of them is right.  Guess which one!  Regardless, Boolean searching is an important part of efficient search and retrieval of documents to meet discovery requirements.
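Here is a small Python sketch of why the grouping matters. It assumes the search engine evaluates AND before OR when there are no parentheses, which is a common convention (and the one Python itself follows), but you should check how your own platform handles it. The sample documents are invented.

```python
def ungrouped(terms):
    # Doug OR Douglas OR Dozier AND Austin
    # If AND binds tighter than OR, this is really
    # Doug OR Douglas OR (Dozier AND Austin).
    return "doug" in terms or "douglas" in terms or "dozier" in terms and "austin" in terms


def grouped(terms):
    # (Doug OR Douglas OR Dozier) AND Austin
    return ("doug" in terms or "douglas" in terms or "dozier" in terms) and "austin" in terms


# Hypothetical documents, reduced to sets of lowercase terms.
other_doug = {"doug", "flutie", "quarterback"}
the_author = {"doug", "austin", "ediscovery"}

print(ungrouped(other_doug), grouped(other_doug))  # the ungrouped search lets a different Doug through
print(ungrouped(the_author), grouped(the_author))  # both searches find the intended person
```

Under that precedence assumption, the version without parentheses matches any document that mentions a Doug or a Douglas, whether or not Austin appears, so the parenthesized version is the one that does what was intended.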

To see an example of how Boolean Searching is conducted using our CloudNine platform, click here (requires BrightTalk account, which is free).

Anyway, back to the topic of the day.  Let’s take Pandora, for example.  I was born in the 60’s – yes, I look GREAT for my age, :o) – and so I’m a fan of classic rock.  Pandora is a site where you can set up “stations” of your favorite artists.  If you’re a fan of classic rock and you’re born in the 60’s, you probably love an artist like Jimi Hendrix.  Right?

Well, I do and I have a Pandora account, so I set up a Jimi Hendrix “station”.  But, Pandora doesn’t just play Jimi Hendrix on that station, it plays other artists and songs it thinks I might like that are in a similar genre.  Artists like Stevie Ray Vaughan (The Sky is Crying), Led Zeppelin (Kashmir), The Doors (Peace Frog) and Ten Years After (I’d Love to Change the World), which is the example you see above.  For each song, you can listen to it, skip it, or give it a “thumbs up” or “thumbs down” (for the record, I wouldn’t give any of the above songs a “thumbs down”).  If you give a song a “thumbs up”, you’re more likely to hear the song again and if you give the song a “thumbs down”, you’re less likely to hear it again (at least in theory).

Does something sound familiar about that?

You’re training the system.  Pandora is using the feedback you give it to (hopefully) deliver more songs that you like and fewer of the songs you don’t like to improve your listening experience.  One nice thing about it is that you get to listen to songs or artists you may not have heard before and learn to enjoy them as well (that’s how I got to be a fan of The Black Keys, for example).

If you watch a show or movie on Netflix and you log in sometime afterward, Netflix will suggest shows for you to watch, based on what you’ve viewed previously (especially if you rate what you watched highly).

That’s what supervised machine learning is and what a predictive coding algorithm does.  “Thumbs up” is the same as marking a document responsive, “thumbs down” is the same as marking a document non-responsive.  The more documents (or songs or movies) you classify, the more likely you’re going to receive relevant and useful documents (or songs or movies) going forward.
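For the curious, the “thumbs up/thumbs down” idea can be sketched in a few lines of Python. This is a deliberately simple illustration, not how Pandora, Netflix or any predictive coding product actually works: the FeedbackRanker class, its word-weight scoring and the sample documents are all hypothetical.

```python
class FeedbackRanker:
    """Toy supervised learner: each piece of feedback nudges per-word weights."""

    def __init__(self):
        self.weights = {}  # word -> learned weight

    def train(self, words, thumbs_up):
        """Raise the weight of each word on a thumbs up, lower it on a thumbs down."""
        delta = 1.0 if thumbs_up else -1.0
        for word in words:
            self.weights[word] = self.weights.get(word, 0.0) + delta

    def score(self, words):
        """Higher score means more like the items that earned a thumbs up."""
        return sum(self.weights.get(word, 0.0) for word in words)

    def rank(self, candidates):
        """Order unreviewed items from highest to lowest score."""
        return sorted(candidates, key=lambda name: self.score(candidates[name]), reverse=True)


ranker = FeedbackRanker()
# Reviewer decisions (made-up documents, reduced to sets of words).
ranker.train({"pricing", "agreement", "competitor"}, thumbs_up=True)   # responsive
ranker.train({"lunch", "menu"}, thumbs_up=False)                       # non-responsive
ranker.train({"pricing", "meeting"}, thumbs_up=True)                   # responsive

unreviewed = {
    "doc_a": {"pricing", "competitor", "call"},
    "doc_b": {"lunch", "friday"},
    "doc_c": {"meeting", "agenda"},
}
print(ranker.rank(unreviewed))
```

With this feedback, doc_a (which shares words with the responsive examples) rises to the top and doc_b (which looks like the lunch email) sinks to the bottom. Every additional decision adjusts the weights, which is the sense in which classifying more documents improves what you see next.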

When it comes to teaching the legal community about predictive coding, “I’d love to change the world, but I don’t know what to do”.  Maybe I can start by teaching people about Pandora!  So, you say you’ve never used a predictive coding algorithm before?  Maybe you have, after all.  :o)

Speaking of predictive coding, is that the same as TAR or not?  If you want to learn more about what TAR is and what it could also be, check out our webcast Getting Off the Sidelines and into the Game using Technology Assisted Review on Wednesday, April 25.  Tom O’Connor and I will discuss a lot of topics related to the use of TAR, including what TAR is (or what people think it is), considerations and challenges to using TAR and how to get started using it.  To register, click here!

So, what do you think?  Have you used a predictive coding algorithm before?  Has your answer changed after reading this post?  :o)  Please share any comments you might have or if you’d like to know more about a particular topic.

Why Is TAR Like a Bag of M&M’s?, Part Two: eDiscovery Best Practices

Editor’s Note: Tom O’Connor is a nationally known consultant, speaker, and writer in the field of computerized litigation support systems.  He has also been a great addition to our webinar program, participating with me on several recent webinars.  Tom has also written several terrific informational overview series for CloudNine, including eDiscovery and the GDPR: Ready or Not, Here it Comes (which we covered as a webcast), Understanding eDiscovery in Criminal Cases (which we also covered as a webcast) and ALSP – Not Just Your Daddy’s LPO.  Now, Tom has written another terrific overview regarding Technology Assisted Review titled Why Is TAR Like a Bag of M&M’s? that we’re happy to share on the eDiscovery Daily blog.  Enjoy! – Doug

Tom’s overview is split into four parts, so we’ll cover each part separately.  The first part was covered on Tuesday.  Here’s part two.

History and Evolution of Defining TAR

Most people would begin the discussion by agreeing with this framing statement made by Maura Grossman and Gordon Cormack in their seminal article, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, XVII RICH. J.L. & TECH. 11 (2011):

Overall, the myth that exhaustive manual review is the most effective—and therefore, the most defensible—approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.

A technology-assisted review process may involve, in whole or in part, the use of one or more approaches including, but not limited to, keyword search, Boolean search, conceptual search, clustering, machine learning, relevance ranking, and sampling.

So, TAR began as a process and in the early stage of the discussion, it was common to refer to various TAR tools under the heading “analytics” as illustrated by the graphic below from Relativity.

Copyright © Relativity

That general heading was often divided into two main categories:

Structured Analytics

  • Email threading
  • Near duplicate detection
  • Language detection

Conceptual Analytics

  • Keyword expansion
  • Conceptual clustering
  • Categorization
  • Predictive Coding

That definition of Predictive Coding as part of the TAR process held for quite some time. In fact, the current EDRM definition of Predictive Coding still refers to it as:

An industry-specific term generally used to describe a Technology-Assisted Review process involving the use of a Machine Learning Algorithm to distinguish Relevant from Non-Relevant Documents, based on a Subject Matter Expert’s Coding of a Training Set of Documents

But before long, the definition began to erode and TAR started to become synonymous with Predictive Coding. Why?  For several reasons, I believe.

  1. The Grossman-Cormack glossary of 2013 used the phrase “Coding” to define both TAR and PC and I think various parties then conflated the two. (See No. 2 below)
  2. Continued use of the terms interchangeably. See, e.g., Ralph Losey’s “TARCourse,” where the very beginning of the first chapter states, “We also added a new class on the historical background of the development of predictive coding.”  (which is, by the way, an excellent read).
  3. Any discussion of TAR involves selecting documents using algorithms and most attorneys react to math the way the Wicked Witch of the West reacted to water.

Again, Ralph Losey provides a good example.  (I’m not trying to pick on Ralph; he is just such a prolific writer that his examples are everywhere…and deservedly so). He refers to gain curves, x-axis vs y-axis, Horvitz-Thompson estimators, recall rates, prevalence ranges and my personal favorite “word-based tf-idf tokenization strategy.”

“Danger. Danger. Warning. Will Robinson.”

  4. Marketing: the simple fact is that some vendors sell predictive coding tools. Why talk about other TAR tools when you don’t make them? Easier to call your tool TAR and leave it at that.

The problem became so acute that by 2015, according to a 2016 ACEDS News Article, Maura Grossman and Gordon Cormack trademarked the terms “Continuous Active Learning” and “CAL”, claiming those terms’ first commercial use on April 11, 2013 and January 15, 2014. In an ACEDS interview earlier in the year, Maura stated that “The primary purpose of our patents is defensive; that is, if we don’t patent our work, someone else will, and that could inhibit us from being able to use it. Similarly, if we don’t protect the marks ‘Continuous Active Learning’ and ‘CAL’ from being diluted or misused, they may go the same route as technology-assisted review and TAR.”

So then, what exactly is TAR? Everyone agrees that manual review is inefficient, but nobody can agree on what software the lawyers should use and how. I still prefer to go back to Maura and Gordon’s original definition. We’re talking about a process, not a product.

TAR isn’t a piece of software. It’s a process that can include many different steps, several pieces of software, and many decisions by the litigation team. Ralph calls it the multi-modal approach: a combination of people and computers to get the best result.

In short, analytics are the individual tools. TAR is the process you use to combine the tools you select.  The next consideration, then, is how to make that selection.

We’ll publish Part 3 – Uses for TAR and When to Use or Not Use It – next Tuesday.

So, what do you think?  How would you define TAR?  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © Mars, Incorporated and its Affiliates.

Why Is TAR Like a Bag of M&M’s?: eDiscovery Best Practices

Editor’s Note: Tom O’Connor is a nationally known consultant, speaker, and writer in the field of computerized litigation support systems.  He has also been a great addition to our webinar program, participating with me on several recent webinars.  Tom has also written several terrific informational overview series for CloudNine, including eDiscovery and the GDPR: Ready or Not, Here it Comes (which we covered as a webcast), Understanding eDiscovery in Criminal Cases (which we also covered as a webcast) and ALSP – Not Just Your Daddy’s LPO.  Now, Tom has written another terrific overview regarding Technology Assisted Review titled Why Is TAR Like a Bag of M&M’s? that we’re happy to share on the eDiscovery Daily blog.  Enjoy! – Doug

Tom’s overview is split into four parts, so we’ll cover each part separately.  Here’s the first part.

Introduction

Over the past year I have asked this question several different ways in blogs and webinars about technology assisted review (TAR). Why is TAR like ice cream? Think Baskin Robbins. Why is TAR like golf? Think an almost incomprehensible set of rules and explanations. Why is TAR like baseball, basketball or football? Think never-ending arguments about the best team ever.

And now my latest analogy. Why is TAR like a bag of M&M’s?  Because there are multiple colors with sometimes a new one thrown in and sometimes they have peanuts inside but sometimes they have chocolate.  And every now and then you get a bag of Reese’s Pieces and think to yourself, “hmmmm, this is actually better than M&M’s.”

Two recent cases spurred this new rumination on TAR. First came the decision in Winfield, et al. v. City of New York, No. 15-CV-05236 (LTS) (KHP) (S.D.N.Y. Nov. 27, 2017) (covered by eDiscovery Daily here), where Magistrate Judge Parker ordered the parties to meet and confer on any disputes with regard to a TAR process “with the understanding that reasonableness and proportionality, not perfection and scorched-earth, must be their guiding principles.”  More recent is the wonderfully crafted validation protocol (covered by ACEDS here) from Special Master Maura Grossman in the In Re Broiler Chicken Antitrust Litigation, (Jan. 3, 2018) matter currently pending in the Northern District of Illinois.

Both of these cases harkened back to Aurora Cooperative Elevator Company v. Aventine Renewable Energy or Independent Living Center of Southern California v. City of Los Angeles, a 2015 case where the court ordered the use of predictive coding after extensive discovery squabbles, and the 2016 decision in Hyles v. New York City (covered by eDiscovery Daily here) where Judge Peck, in declining to order the parties to use TAR, used the phrase on page 1 of his Order, “TAR (technology assisted review, aka predictive coding) …”.

Which brings me to my main point of discussion. Before we can decide on whether or not to use TAR we have to decide what TAR is.  This discussion will focus on the following topics:

  1. History and Evolution of Defining TAR
  2. Uses for TAR and When to Use or Not Use It
  3. Justification for Using TAR
  4. Conclusions

We’ll publish Part 2 – History and Evolution of Defining TAR – on Thursday.

So, what do you think?  How would you define TAR?  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © Mars, Incorporated and its Affiliates.

Houstonians, Here’s a Terrific Panel Discussion on TAR Right in Your Own Backyard: eDiscovery Best Practices

Next month, I have the privilege of moderating a panel on the current state of the acceptance of technology assisted review (TAR) with three terrific panelists, courtesy of the Association of Certified E-Discovery Specialists (ACEDS).  If you’re in Houston on April 3rd, you might want to check it out!

The panel is titled From Asking About It to Asking For It: The Evolution of the Acceptance and Use of TAR and it will be held at the offices of BoyarMiller law firm at 2925 Richmond Avenue, Houston, Texas  77098 (their offices are on the 14th floor).  The event will begin at 11:30am and will conclude at 1:30pm.  Lunch will be served!

Our panelists will be Christopher Cafiero, J.D., Southwest Territory Manager of Catalyst Repository Systems (and former trial lawyer), Gary Wiener, Independent eDiscovery Consultant, SME and Attorney and Rohit Kelkar, Vice President of R&D at Servient.  We will discuss several topics related to the current state of TAR, including the state of approval of TAR within the legal community, differences in approaches and preferred methods to TAR, disclosure of the use of TAR to opposing parties, and recommendations for those using TAR for the first time.

If you’re in Houston and you’d like to attend, register by clicking here.  Honestly, I don’t know how many people will be able to attend, so I recommend that you register early (but not often) to make sure you can get in.  If you want to learn about TAR in the Houston area, this is an excellent opportunity!

So, what do you think?  Are you interested in learning about TAR and are you going to be in the Houston area on April 3rd?  If so, we’d love to see you there!  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.
