Analysis

Fall 2019 Predictive Coding Technologies and Protocols Survey Results: eDiscovery Trends

So many topics, so little time!  Rob Robinson published the latest Predictive Coding Technologies and Protocols Survey on his excellent ComplexDiscovery site last week, but this is the first chance I’ve had to cover it.  The results are in and here are some of the findings from the largest response group for this survey yet.

As Rob notes in the results post here, the third Predictive Coding Technologies and Protocols Survey was initiated on August 23 and concluded on September 5, with individuals invited to participate directly by ComplexDiscovery and indirectly by industry website, blog, and newsletter mentions, including a big assist from the Association of Certified E-Discovery Specialists (ACEDS).  It’s a non-scientific survey designed to help provide a general understanding of the use of predictive coding technologies and protocols by data discovery and legal discovery professionals within the eDiscovery ecosystem, and it had two primary educational objectives:

  • To provide a consolidated listing of potential predictive coding technology and protocol definitions. While not all-inclusive or comprehensive, the listing was vetted with selected industry predictive coding experts for completeness and accuracy, thus it appears to be profitable for use in educational efforts.
  • To ask eDiscovery ecosystem professionals about their usage and preferences of predictive coding platforms, technologies, and protocols.

There were 100 total respondents in the survey (a nice, round number!).  Here are some of the more notable results:

  • 39 percent of responders were from law firms, 37 percent of responders were from software or services provider organizations, and the remaining 24 percent of responders were either part of a consultancy (12 percent), a corporation (6 percent), the government (3 percent), or another type of entity (3 percent).
  • 86 percent of responders shared that they did have a specific primary platform for predictive coding versus 14 percent who indicated they did not.
  • There were 31 different platforms noted as primary predictive platforms by responders, nine of which received more than one vote and they accounted for more than three-quarters of responses (76 percent).
  • Active Learning was the most used predictive coding technology, with 86 percent reporting that they use it in their predictive coding efforts.
  • Just over half (51 percent) of responders reported using only one predictive coding technology in their predictive coding efforts.
  • Continuous Active Learning (CAL) was (by far) the most used predictive coding protocol, with 82 percent reporting that they use it in their predictive coding efforts.
  • Maybe the most interesting stat: 91 percent of responders reported using technology-assisted review in more than one area of data and legal discovery. So, the uses of TAR are certainly expanding!

Rob has reported several other results and provided graphs for additional details.  To check out all of the results, click here.  Want to compare to the previous two surveys?  They’re here and here.  :o)

So, what do you think?  Do any of the results surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © FremantleMedia North America, Inc.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

The March Toward Technology Competence (and Possibly Predictive Coding Adoption) Continues: eDiscovery Best Practices

I know, because it’s “March”, right?  :o)  Anyway, it’s about time is all I can say.  My home state of Texas has finally added its name to the list of states that have adopted the ethical duty of technology competence for lawyers, becoming the 36th state to do so.  And, we have a new predictive coding survey to check out.

As discussed on Bob Ambrogi’s LawSites blog, just last week (February 26), the Supreme Court of Texas entered an order amending Paragraph 8 of Rule 1.01 of the Texas Disciplinary Rules of Professional Conduct. The amended comment now reads (emphasis added):

Maintaining Competence

  8. Because of the vital role of lawyers in the legal process, each lawyer should strive to become and remain proficient and competent in the practice of law, including the benefits and risks associated with relevant technology. To maintain the requisite knowledge and skill of a competent practitioner, a lawyer should engage in continuing study and education. If a system of peer review has been established, the lawyer should consider making use of it in appropriate circumstances. Isolated instances of faulty conduct or decision should be identified for purposes of additional study or instruction.

The new phrase above (“including the benefits and risks associated with relevant technology”) mirrors the one adopted in 2012 by the American Bar Association in amending the Model Rules of Professional Conduct to make clear that lawyers have a duty to be competent not only in the law and its practice, but also in technology.  Hard to believe it’s been seven years already!  Now, we’re up to 36 states that have formally adopted this duty of technology competence.  Just 14 to go!

Also, this weekend, Rob Robinson published the results of the Predictive Coding Technologies and Protocols Spring 2019 Survey on his excellent Complex Discovery blog.  Like the first version of the survey he conducted back in September last year, the “non-scientific” survey is designed to help provide a general understanding of the use of predictive coding technologies, protocols, and workflows by data discovery and legal discovery professionals within the eDiscovery ecosystem.  This survey had 40 respondents, up from 31 the last time.

I won’t steal Rob’s thunder, but here are a couple of notable stats:

  • Approximately 62% of responders (62.5%) use more than one predictive coding technology in their predictive coding efforts: That’s considerably higher than I would have guessed;
  • Continuous Active Learning (CAL) was the most used predictive coding protocol with 80% of responders reporting that they use it in their predictive coding efforts: I would have expected that CAL was the leader, but not as dominant as these stats show; and
  • 95% of responders use technology-assisted review in more than one area of data and legal discovery: Which seems a good sign to me that practitioners aren’t just limiting it to identification of relevant documents in review anymore.

Rob’s findings, including several charts, can be found here.

So, what do you think?  Which state will be next to adopt an ethical duty of technology competence for lawyers?  Please share any comments you might have or if you’d like to know more about a particular topic.

EDRM Releases the Final Version of its TAR Guidelines: eDiscovery Best Practices

During last year’s EDRM Spring Workshop, I discussed on this blog that EDRM had released the preliminary draft of its Technology Assisted Review (TAR) Guidelines for public comment.  They gave a mid-July deadline for comments and I even challenged the people who didn’t understand TAR very well to review it and provide feedback – after all, those are the people who would hopefully stand to benefit the most from these guidelines.  Now, over half a year later, EDRM has released the final version of its TAR Guidelines.

The TAR Guidelines (available here) have certainly gone through a lot of review.  In addition to the public comment period last year, they were discussed in the last two EDRM Spring meetings (2017 and 2018), presented at the Duke Distinguished Lawyers’ conference on Technology Assisted Review in 2017 for feedback, and worked on extensively during that time.

As indicated in the press release, more than 50 volunteer judges, practitioners, and eDiscovery experts contributed to the drafting process over a two-year period. Three drafting teams worked on various iterations of the document, led by Matt Poplawski of Winston & Strawn, Mike Quartararo of eDPM Advisory Services, and Adam Strayer of Paul, Weiss, Rifkind, Wharton & Garrison. Tim Opsitnick of TCDI and U.S. Magistrate Judge James Francis IV (Southern District of New York, Ret.) assisted in editing the document and incorporating comments from the public comment period.

“We wanted to address the growing confusion about TAR, particularly marketing claims and counterclaims that undercut the benefits of various versions of TAR software,” said John Rabiej, deputy director of the Bolch Judicial Institute of Duke Law School, which oversees EDRM. “These guidelines provide guidance to all users of TAR and apply across the different variations of TAR. We avoided taking a position on which variation of TAR is more effective, because that very much depends on facts specific to each case. Instead, our goal was to create a definitive document that could explain what TAR is and how it is used, to help demystify it and to help encourage more widespread adoption.”  EDRM/Duke Law also provide a TAR Q&A with Rabiej here.

The 50-page document contains four chapters: The first chapter defines technology assisted review and the TAR process. The second chapter lays out a standard workflow for the TAR process. The third chapter examines alternative tasks for applying TAR, including prioritization, categorization, privilege review, and quality and quantity control. Chapter four discusses factors to consider when deciding whether to use TAR, such as document set, cost, timing, and jurisdiction.

“Judges generally lack the technical expertise to feel comfortable adjudicating disputes involving sophisticated search methodologies. I know I did,” said Magistrate Judge Francis, who assisted in editing the document. “These guidelines are intended, in part, to provide judges with sufficient information to ask the right questions about TAR. When judges are equipped with at least this fundamental knowledge, counsel and their clients will be more willing to use newer, more efficient technologies, recognizing that they run less risk of being caught up in a discovery quagmire because a judge just doesn’t understand TAR. This, in turn, will further the goals of Rule 1 of the Federal Rules of Civil Procedure: to secure the just, speedy, and inexpensive determination of litigation.”

EDRM just announced the release of the final version of the TAR guidelines yesterday, so I haven’t had a chance to read it completely through yet, but a quick comparison to the public comment version from last May seems to indicate the same topics and sub-topics that were covered back then, so there certainly appears to be no major rewrite as a result of the public comment feedback.  I look forward to reading it in detail and determining what specific changes were made.

So, what do you think?  Will these guidelines help the average attorney or judge better understand TAR?  Please share any comments you might have or if you’d like to know more about a particular topic.

Mike Q Says the Weakest Link in TAR is Humans: eDiscovery Best Practices

We started the week with a post from Tom O’Connor (his final post in his eDiscovery Project Management from Both Sides series).  And, we’re ending the week covering an article from Mike Quartararo on Technology Assisted Review (TAR).  You would think we were inadvertently promoting our webcast next week or something.  :o)

Remember The Weakest Link? That was the early 2000s game show with the sharp-tongued British hostess (Anne Robinson) telling contestants who were eliminated “You are the weakest link.  Goodbye!”  Anyway, in Above the Law (Are Humans The Weak Link In Technology-Assisted Review?), Mike takes a look at the debate as to which tool is the superior tool for conducting TAR and notes the lack of scientific studies that point to any particular TAR software or algorithm being dramatically better or, more importantly, significantly more accurate, than any other.  So, if it’s not the tool that determines the success or failure of a TAR project, what is it?  Mike says when TAR has problems, it’s because of the people.

Of course, Mike knows quite a bit about TAR.  He’s managed his “share of” projects, has used “various flavors of TAR” and notes that “none of them are perfect and not all of them exceed all expectations in all circumstances”.  Mike has also been associated with the EDRM TAR project (which we covered earlier this year here) for two years as a team leader, working with others to draft proposed standards.

When it comes to observations about TAR that everyone should be able to agree on, Mike identifies three: 1) that TAR is not Artificial Intelligence, just “machine learning – nothing more, nothing less”, 2) that TAR technology works and “TAR applications effectively analyze, categorize, and rank text-based documents”, and 3) “using a TAR application — any TAR application — saves time and money and results in a reasonable and proportional outcome.”  Seems logical to me.

So, when TAR doesn’t work, “the blame may fairly be placed at the feet (and in the minds) of humans.”  We train the software by categorizing the training documents, we operate the software, we analyze the outcome.  So, it’s our fault.

Last month, we covered this case where the plaintiffs successfully requested additional time for discovery when defendant United Airlines, using TAR to manage its review process, produced 3.5 million documents.  However, sampling by the plaintiffs (and later confirmed by United) found that the production contained only 600,000 documents that were responsive to their requests (about 17% of the total production).  That seems like a far less than ideal TAR result to me.  Was that because of human failure?  Perhaps, when it comes down to it, the success of TAR being dependent on humans points us back to the long-used phrase regarding humans and computers: Garbage In, Garbage Out.
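
For anyone who wants to check the math, that 17% figure is what TAR practitioners call precision: the share of produced documents that are actually responsive.  Here’s a minimal sketch in Python using only the two numbers reported above; the function name is my own and is not taken from the court filings or from any TAR tool.

```python
def precision(responsive_produced: int, total_produced: int) -> float:
    """Fraction of produced documents that are actually responsive."""
    if total_produced <= 0:
        raise ValueError("total_produced must be positive")
    return responsive_produced / total_produced


# Figures as reported in the post: 600,000 responsive out of 3.5 million produced.
united_precision = precision(600_000, 3_500_000)
print(f"{united_precision:.1%}")  # prints 17.1%
```

Note that this says nothing about recall (the share of all responsive documents that made it into the production), which can’t be computed from the figures in this post.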

So, what do you think?  Should TAR be considered Artificial Intelligence?  As always, please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © British Broadcasting Corporation

Plaintiffs Granted Discovery Extension Due to Defendant’s TAR Review Glitch: eDiscovery Case Law

In the case In Re Domestic Airline Travel Antitrust Litigation, MDL Docket No. 2656, Misc. No. 15-1404 (CKK), (D.D.C. Sept. 13, 2018), District of Columbia District Judge Colleen Kollar-Kotelly granted the Plaintiffs’ Motion for an Extension of Fact Discovery Deadlines (over the defendants’ objections) for six months, citing defendant “United’s production of core documents that varied greatly from the control set in terms of the applicable standards for recall and precision and included a much larger number of non-responsive documents [than] was anticipated” (United’s core production of 3.5 million documents contained only 600,000 documents that were responsive).

Case Background

This case involves a multidistrict class action brought by the plaintiffs (purchasers of air passenger transportation for domestic travel), who allege that the defendant airlines willingly conspired to engage in unlawful restraint of trade.  The plaintiffs filed the instant Motion for Extension of Time to Complete Discovery, requesting an extension of six months, predicated on an “issue with United’s ‘core’ document production.”  They asserted that defendant United produced more than 3.5 million [core] documents to the Plaintiffs, but “due to United’s technology assisted review process (‘TAR’), only approximately 17%, or 600,000, of the documents produced are responsive to Plaintiffs’ requests,” and that the plaintiffs (despite having staffed their discovery review with 70 attorneys) required additional time to sort through them.

Both defendants (Delta and United) opposed the plaintiffs’ request for an extension, questioning whether the plaintiffs had staffed the document review with 70 attorneys and suggesting the Court review the plaintiffs’ counsel’s monthly time sheets to verify that statement.  Delta also questioned why it would take the plaintiffs so long to review the documents and tried to extrapolate how long it would take to review the entire set of documents based on a review of 3 documents per minute (an analysis that the plaintiffs called “preposterous”).  United indicated that it engaged “over 180 temporary contract attorneys to accomplish its document production and privilege log process within the deadlines” set by the Court, so the plaintiffs should be expected to engage in the same expenditure of resources.  But, the plaintiffs contended that they “could not have foreseen United’s voluminous document production made up [of] predominantly non-responsive documents resulting from its deficient TAR process when they jointly proposed an extension of the fact discovery deadline in February 2018.”
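
To get a feel for that kind of extrapolation, here’s a rough back-of-the-envelope sketch in Python.  It uses only figures mentioned in this post (3.5 million documents, 3 documents per minute, 70 attorneys) and assumes, purely for illustration, that every attorney reviews at the same constant rate.  It is not Delta’s actual analysis, which isn’t reproduced here.

```python
def review_hours_per_attorney(total_docs: int,
                              docs_per_minute: float,
                              attorneys: int) -> float:
    """Hours each attorney would spend if the work were split evenly."""
    if docs_per_minute <= 0 or attorneys <= 0:
        raise ValueError("rate and attorney count must be positive")
    total_minutes = total_docs / docs_per_minute
    return total_minutes / 60 / attorneys


# Illustrative inputs only: the review rate is the number the plaintiffs disputed.
hours = review_hours_per_attorney(3_500_000, 3, 70)
print(f"{hours:.0f} hours per attorney")  # prints 278 hours per attorney
```

The result is only as good as the assumed review rate, which is exactly the input the plaintiffs objected to; halve the rate and the estimate doubles.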

Judge’s Ruling

Judge Kollar-Kotelly noted that “Plaintiffs contend that a showing of diligence involves three factors — (1) whether the moving party diligently assisted the Court in developing a workable scheduling order; (2) that despite the diligence, the moving party cannot comply with the order due to unforeseen or unanticipated matters; and (3) that the party diligently sought an amendment of the schedule once it became apparent that it could not comply without some modification of the schedule.”  She noted that “there is no dispute that the parties diligently assisted the Court in developing workable scheduling orders through their preparation of Joint Status Reports prior to the status conferences in which discovery issues and scheduling were discussed, and in their meetings with the Special Master, who is handling discovery matters in this case.”

Judge Kollar-Kotelly also observed that “United’s core production of 3.5 million documents — containing numerous nonresponsive documents — was unanticipated by Plaintiffs, considering the circumstances leading up to that production” and that “Plaintiffs devoted considerable resources to the review of the United documents prior to filing this motion seeking an extension”.  Finding also that “Plaintiffs’ claim of prejudice in not having the deadlines extended far outweighs any inconvenience that Defendants will experience if the deadlines are extended”, Judge Kollar-Kotelly found “that Plaintiffs have demonstrated good cause to warrant an extension of deadlines in this case based upon Plaintiffs’ demonstration of diligence and a showing of nominal prejudice to the Defendants, if an extension is granted, while Plaintiffs will be greatly prejudiced if the extension is not granted.”  As a result, she granted the motion to request the extension.

So, what do you think?  Was the court right to have granted the extension?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case opinion link courtesy of eDiscovery Assistant.

Also, if you’re going to be in Houston on Thursday, September 27, just a reminder that I will be speaking at the second annual Legal Technology Showcase & Conference, hosted by the Women in eDiscovery (WiE), Houston Chapter, South Texas College of Law and the Association of Certified E-Discovery Specialists (ACEDS).  I’ll be part of the panel discussion AI and TAR for Legal: Use Cases for Discovery and Beyond at 3:00pm and CloudNine is also a Premier Platinum Sponsor for the event (as well as an Exhibitor, so you can come learn about us too).  Click here to register!

Survey Says! Predictive Coding Technologies and Protocols Survey Results: eDiscovery Trends

Last week, I discussed the predictive coding survey that Rob Robinson was conducting on his Complex Discovery site (along with the overview of key predictive coding related terms).  The results are in and here are some of the findings.

As Rob notes in the results post here, the Predictive Coding Technologies and Protocols Survey was initiated on August 31 and concluded on September 15.  It’s a non-scientific survey designed to help provide a general understanding of the use of predictive coding technologies and protocols by data discovery and legal discovery professionals within the eDiscovery ecosystem, and it had two primary educational objectives:

  • To provide a consolidated listing of potential predictive coding technology and protocol definitions. While not all-inclusive or comprehensive, the listing was vetted with selected industry predictive coding experts for completeness and accuracy, thus it appears to be profitable for use in educational efforts.
  • To ask eDiscovery ecosystem professionals about their usage and preferences of predictive coding platforms, technologies, and protocols.

There were 31 total respondents in the survey.  Here are some of the more notable results:

  • More than 80% of responders (80.64%) shared that they did have a specific primary platform for predictive coding versus just under 20% (19.35%), who indicated they did not.
  • There were 12 different platforms noted as primary predictive platforms by responders, but only three platforms received more than one vote and they accounted for more than 50% of responses (61%).
  • Active Learning was the most used predictive coding technology, with more than 70% of responders (70.96%) reporting that they use it in their predictive coding efforts.
  • Just over two-thirds of responders (67.74%) use more than one predictive coding technology in their predictive coding efforts, while just under one-third (32.25%) use only one.
  • Continuous Active Learning (CAL) was (by far) the most used predictive coding protocol, with more than 87% of responders (87.09%) reporting that they use it in their predictive coding efforts.
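
With only 31 respondents, each percentage above corresponds to a small whole number of people.  The sketch below (Python) backs out those implied counts.  To be clear, the counts are my inference from the percentages and the respondent total quoted in this post, not numbers reported by the survey, and the helper names are hypothetical.  The percentages appear to be cut off at two decimals rather than rounded, so the check truncates as well.

```python
import math

RESPONDENTS = 31  # total respondents reported in the post

# Percentages exactly as quoted in the bullets above.
reported = {
    "has a primary platform": 80.64,
    "no primary platform": 19.35,
    "uses Active Learning": 70.96,
    "uses more than one technology": 67.74,
    "uses only one technology": 32.25,
    "uses CAL": 87.09,
}


def implied_count(percent: float, respondents: int = RESPONDENTS) -> int:
    """Nearest whole number of respondents implied by a percentage."""
    return round(percent / 100 * respondents)


def truncated_percent(count: int, respondents: int = RESPONDENTS) -> float:
    """Percentage cut off (not rounded) at two decimal places."""
    return math.floor(count / respondents * 10_000) / 100


counts = {label: implied_count(pct) for label, pct in reported.items()}
for label, n in counts.items():
    print(f"{label}: {n} of {RESPONDENTS} ({truncated_percent(n)}%)")
```

Seen this way, the gap between, say, Active Learning and CAL is a handful of respondents, which is a useful reminder of why the survey is described as non-scientific.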

Rob has reported several other results and provided graphs for additional details.  To check out all of the results, click here.

So, what do you think?  Do any of the results surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © FremantleMedia North America, Inc.

If You’re an eDiscovery Professional Interested in Predictive Coding, Here is a Site You May Want to Check Out: eDiscovery Trends

On his Complex Discovery site, Rob Robinson does a great job of analyzing trends in the eDiscovery industry and often uses surveys to gauge sentiment within the industry for things like industry business confidence.  Now, Rob is providing an overview and conducting a survey regarding predictive coding technologies and protocols for representatives of leading eDiscovery providers that should prove interesting.

On his site at Predictive Coding Technologies and Protocols: Overview and Survey, Rob notes that “it is increasingly more important for electronic discovery professionals to have a general understanding of the technologies that may be implemented in electronic discovery platforms to facilitate predictive coding of electronically stored information.”  To help in that, Rob provides working lists of predictive coding technologies and TAR protocols that are worth a review.

You probably know what Active Learning is.  Do you know what Latent Semantic Analysis is? What about Logistic Regression?  Or a Naïve Bayesian Classifier?  If you don’t, Rob discusses definitions for these different types of predictive coding technologies and others.

Then, Rob also provides a list of general TAR protocols that includes Simple Passive Learning (SPL), Simple Active Learning (SAL), Continuous Active Learning (CAL) and Scalable Continuous Active Learning (S-CAL), as well as the Hybrid Multimodal Method used by Ralph Losey.

Rob concludes with a link to a simple three-question survey designed to help electronic discovery professionals identify the specific machine learning technologies and protocols used by eDiscovery providers in delivering the technology-assisted review feature of predictive coding.  It literally takes 30 seconds to complete.  To find out the questions, you’ll have to check out the survey.  ;o)

So far, Rob has received 19 responses (mine was one of those).  It will be interesting to see the results when he closes the survey and publishes the results.

So, what do you think?  Are you an expert in predictive coding technologies and protocols?  Please share any comments you might have or if you’d like to know more about a particular topic.
