Never Mind! Plaintiffs Not Required to Use Predictive Coding After All – eDiscovery Case Law

Remember EORHB v. HOA Holdings, where, in a surprise ruling, the judge instructed both parties to use predictive coding?  Well, the judge has changed his mind.

As reported by Robert Hilson on the Association of Certified E-Discovery Specialists® (ACEDS) website (subscription required), Delaware Chancery Court Vice Chancellor J. Travis Laster has revised his decision in EORHB, Inc. v. HOA Holdings, LLC, No. 7409-VCL (Del. Ch. May 6, 2013).  The new order allows the defendants to continue to use computer assisted review with their chosen vendor, no longer requires both parties to use the same vendor, and permits the plaintiffs, “based on the low volume of relevant documents expected to be produced”, to perform document review “using traditional methods.”

Here is the text of this very short order:

WHEREAS, on October 15, 2012, the Court entered an Order providing that, “[a]bsent a modification of this order for good cause shown, the parties shall (i) retain a single discovery vendor to be used by both sides, and (ii) conduct document review with the assistance of predictive coding;”

WHEREAS, the parties have proposed that HOA Holdings LLC and HOA Restaurant Group LLC (collectively, “Defendants”) retain ediscovery vendor Kroll OnTrack for electronic discovery;

WHEREAS, the parties have agreed that, based on the low volume of relevant documents expected to be produced in discovery by EORHB, Inc., Coby G. Brooks, Edward J. Greene, James P. Creel, Carter B. Wrenn and Glenn G. Brooks (collectively, “Plaintiffs”), the cost of using predictive coding assistance would likely be outweighed by any practical benefit of its use;

WHEREAS, the parties have agreed that there is no need for the parties to use the same discovery review platform;

WHEREAS, the requested modification of the Order will not prejudice any of the parties;

NOW THEREFORE, this –––– day of May 2013, for good cause shown, it is hereby ORDERED that:

(i) Defendants may retain ediscovery vendor Kroll OnTrack and employ Kroll OnTrack and its computer assisted review tools to conduct document review;

(ii) Plaintiffs and Defendants shall not be required to retain a single discovery vendor to be used by both sides; and

(iii) Plaintiffs may conduct document review using traditional methods.

Here is a link to the order from the article by Hilson.

So, what do you think?  Should a party ever be ordered to use predictive coding?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

More Updates from the EDRM Annual Meeting – eDiscovery Trends

Yesterday, we shared some general observations from the Annual Meeting of the Electronic Discovery Reference Model (EDRM) group and discussed some significant efforts and accomplishments by the (suddenly heavily talked about) EDRM Data Set project.  Here are updates from other projects within EDRM.

It should be noted that these are summary updates and that they focus on accomplishments for the past year and deliverables that are imminent.  Over the next few weeks, eDiscovery Daily will cover each project in more depth, with more details regarding planned activities for the coming year.

Model Code of Conduct (MCoC)

The MCoC was introduced in 2011 and became available for organizations to subscribe to last year.  To learn more about the MCoC, you can read the code online here or download it as a 22-page PDF file here.  Subscribing is easy!  To voluntarily subscribe to the MCoC, register on the EDRM website here.  Identify your organization, provide information for an authorized representative and answer four verification questions (truthfully, of course) to affirm your organization’s commitment to the spirit of the MCoC, and your organization is in!  You can also provide a logo for EDRM to include when adding you to the list of subscribing organizations.  Pending a survey of EDRM members to determine whether any changes are needed, this project has been completed.  Team leaders include Eric Mandel of Zelle Hofmann, Kevin Esposito of Rivulex and Nancy Wallrich.

Information Governance Reference Model (IGRM)

The IGRM team has continued to make strides and improvements on an already terrific model.  Last October, they unveiled version 3.0 of the IGRM.  As their press release noted, “The updated model now includes privacy and security as primary functions and stakeholders in the effective governance of information.”  IGRM continues to be one of the most active and widely participated-in EDRM projects.  This year, the early focus – as quoted from Judge Andrew Peck’s keynote speech at LegalTech this past year – is “getting rid of the junk”.  Project leaders are Aliye Ergulen from IBM, Reed Irvin from Viewpointe and Marcus Ledergerber from Morgan Lewis.

Search

One of the best examples of the new, more agile process for creating deliverables within EDRM comes from the Search team, which released its new draft Computer Assisted Review Reference Model (CARRM), depicting the workflow of a successful computer assisted review project.  The entire model was created in a matter of weeks.  Early focus for the Search project in the coming year includes adjustments to CARRM (based on feedback at the annual meeting).  You can also still send your comments regarding the model to mail@edrm.net or post them on the EDRM site here.  A webinar regarding CARRM is planned for late July.  Kudos to the Search team, including project leaders Dominic Brown of Autonomy and Jay Lieb of kCura, who took unmerciful ribbing for insisting (jokingly, I think) that TIFF files, unlike Generalissimo Francisco Franco, are still alive.  🙂

Jobs

In late January, the Jobs Project announced the release of the EDRM Talent Task Matrix diagram and spreadsheet, which is available in XLSX or PDF format.  As noted in their press release, the Matrix is a tool designed to help hiring managers better understand the responsibilities associated with common eDiscovery roles.  The Matrix maps responsibilities to the EDRM framework, so that associated eDiscovery duties can be assigned to the appropriate parties.  Project leader Keith Tom noted that next steps include surveying EDRM members regarding the Matrix, requesting and co-authoring case studies and white papers, and creating a short video on how to use the Matrix.

Metrics

In today’s session, the Metrics project team unveiled the first draft of the new Metrics model to EDRM participants!  Feedback was provided during the session and the team will make the model available for additional comments from EDRM members over the next week or so, with a goal of publishing for public comments in the next two to three weeks.  The team is also working to create a page to collect Metrics measurement tools from eDiscovery professionals that can benefit the eDiscovery community as a whole.  Project leaders Dera Nevin of TD Bank and Kevin Clark noted that June is “budget calculator month”.

Other Initiatives

As noted yesterday, there is a new project to address standards for working with native files in the different EDRM phases, led by Eric Mandel from Zelle Hofmann, as well as a new initiative to establish collection guidelines, spearheaded by Julie Brown from Vorys.  There is also an effort underway to refocus the XML project as it works to complete version 2.0 of the EDRM XML model.  In addition, there was quite a spirited discussion as to where EDRM is heading as it approaches ten years of existence; it will be interesting to see how the EDRM group continues to evolve over the next year or so.  As you can see, a lot is happening within the EDRM group – there’s a lot more to it than just the base Electronic Discovery Reference Model.

So, what do you think?  Are you a member of EDRM?  If not, why not?  Please share any comments you might have or if you’d like to know more about a particular topic.

Reporting from the EDRM Annual Meeting and a Data Set Update – eDiscovery Trends

The Electronic Discovery Reference Model (EDRM) Project was created in May 2005 by George Socha of Socha Consulting LLC and Tom Gelbmann of Gelbmann & Associates to address the lack of standards and guidelines in the electronic discovery market.  Now, beginning its ninth year of operation with its annual meeting in St. Paul, MN, EDRM is accomplishing more than ever to address those needs.  Here are some highlights from the meeting, and an update regarding the (suddenly heavily talked about) EDRM Data Set project.

Annual Meeting

Twice a year, in May and October, eDiscovery professionals who are EDRM members meet to continue working together on various standards projects.  This will be my eighth year participating in EDRM at some level and, oddly enough, I’m assisting with PR and promotion (how am I doing so far?).  eDiscovery Daily has referenced EDRM and its phases many times in the blog’s 2 1/2-plus year history – this is our 144th post that relates to EDRM!

Some notable observations about today’s meeting:

  • New Participants: More than half the attendees at this year’s annual meeting are attending for the first time.  EDRM is not just a core group of “die-hards”; it continues to find appeal with eDiscovery professionals throughout the industry.
  • Agile Approach: EDRM has adopted an Agile approach to shorten the time to complete and publish deliverables, a change in philosophy that facilitated several notable accomplishments over the past year from working groups including the Model Code of Conduct (MCoC), Information Governance Reference Model (IGRM), Search and Jobs projects (among others).  More on that tomorrow.
  • Educational Alliances: For the first time, EDRM has formed some interesting and unique educational alliances.  In April, EDRM teamed with the University of Florida Levin College of Law to present a day and a half conference entitled E-Discovery for the Small and Medium Case.  And, this June, EDRM will team with Bryan University to provide an in-depth, four-week E-Discovery Software & Applied Skills Summer Immersion Program for Law School Students.
  • New Working Group: A new working group, to be led by Eric Mandel of Zelle Hofmann, was formed to address standards for working with native files in the different EDRM phases.

Tomorrow, we’ll discuss the highlights for most of the individual working groups.  Given the recent amount of discussion about the EDRM Data Set group, we’ll start with that one today!

Data Set

The EDRM Enron Data Set has been around for several years and has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011).  The data in the EDRM Enron PST Data Set files is sourced from the FERC Enron investigation release made available by Lockheed Martin Corporation; it was reconstituted as PST files with attachments for the EDRM Data Set project.  So, in essence, EDRM took data already available in the public domain and made it much more usable.  Initially, the data was made available for download on the EDRM site, then subsequently moved to Amazon Web Services (AWS).

In the past several days, there has been much discussion about the personally identifiable information (“PII”) available within the FERC release (and consequently the EDRM Data Set), including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers.  Consequently, the EDRM Data Set has been taken down from the AWS site.

The Data Set team, led by Michael Lappin of Nuix and Eric Robi of Elluma Discovery, has been working on a process (using predictive coding technology) to identify and remove the PII from the EDRM Data Set.  Discussions about this process began months ago, prior to the recent discussions about the PII contained within the set.  The team has completed this iterative process for V1 of the data set (which contains 1,317,158 items), identifying and removing 10,568 items containing PII, HIPAA-protected and other sensitive information.  This version of the data set will be made available within the EDRM community shortly for peer review testing.  The data set team will then repeat the process for the larger V2 version of the data set (2,287,984 items).  A timetable for republishing both sets should be available soon.  The Data Set team’s efforts should also pay dividends in developing and standardizing processes for identifying and eliminating sensitive data that eDiscovery professionals can use in their own data sets.
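To make the challenge concrete, here is a minimal sketch of what a first-pass PII scan might look like.  To be clear, this is an illustrative assumption on my part: the Data Set team’s actual process uses predictive coding technology, not the simple pattern matching shown here, and the patterns and function names below are hypothetical.

```python
import re

# Hypothetical first-pass PII scan (illustrative only). The EDRM Data Set
# team's actual process used predictive coding technology; simple pattern
# matching like this would be, at best, a starting point.

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")    # e.g., 123-45-6789
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")  # 13-16 digit runs

def luhn_ok(candidate: str) -> bool:
    """Luhn checksum: filters out most digit runs that aren't card numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", candidate)][::-1]
    total = sum(d if i % 2 == 0 else (d * 2 - 9 if d * 2 > 9 else d * 2)
                for i, d in enumerate(digits))
    return total % 10 == 0

def flag_pii(text: str) -> list:
    """Return reasons a document should be quarantined for human review."""
    reasons = []
    if SSN_RE.search(text):
        reasons.append("possible SSN")
    if any(luhn_ok(m) for m in CARD_RE.findall(text)):
        reasons.append("possible credit card number")
    return reasons

# Flagged items go to a review queue rather than being deleted automatically;
# sampling the "clean" pile afterward mirrors the iterative verification
# the Data Set team describes.
print(flag_pii("SSN 123-45-6789, card 4111 1111 1111 1111"))
```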

The team has also implemented a Forensic Files Testing Project site where users can upload their own “modern”, non-copyrighted file samples of the types typically encountered during electronic discovery processing, to provide a more diverse set of data than is currently available within the Enron data set.

So, what do you think?  How has EDRM impacted how you manage eDiscovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

Plaintiffs’ Objections to Defendant’s Use of Keyword Search before Predictive Coding Rejected – eDiscovery Case Law

Is it possible to produce documents for discovery too early?  At least one plaintiff’s group says yes.

In the case In Re: Biomet M2a Magnum Hip Implant Products Liability Litigation (MDL 2391), the Plaintiffs’ Steering Committee in a multidistrict litigation objected to the defendant’s use of keyword searching prior to performing predictive coding and requested that the defendant go back to its original set of 19.5 million documents and repeat the predictive coding without performing keyword searching.  Indiana District Judge Robert L. Miller, Jr. denied the request.

Defendant’s Discovery Efforts to Date

In this dispute over hip implant products, the defendant began producing documents in cases that were eventually centralized, despite (sometimes forceful) requests by plaintiffs’ counsel not to begin document production until the centralization decision was made.  The defendant used keyword culling to reduce the universe of documents and attachments from 19.5 million to 3.9 million; removing duplicates left 2.5 million documents and attachments.  The defendant performed statistical sampling tests, at a 99 percent confidence level, to determine that between 0.55% and 1.33% of the unselected documents would be responsive and (at the same confidence level) that between 1.37% and 2.47% of the original 19.5 million documents were responsive.  The defendant’s approach actually retrieved 16% of the original 19.5 million.  The defendant then performed predictive coding to identify the responsive documents to be produced from the set of 2.5 million.
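For readers who want to see how a sampling estimate like this works, here is a worked sketch using the normal-approximation confidence interval for a proportion.  The sample size and hit count below are assumptions for illustration; the order reports the resulting ranges, not the underlying sample figures.

```python
import math

# Illustrative only: the order reports the resulting percentage ranges, not
# Biomet's actual sample sizes, so the inputs below are assumed.

def proportion_ci(hits: int, n: int, z: float = 2.576) -> tuple:
    """Normal-approximation confidence interval for a proportion.
    z = 2.576 corresponds to a 99 percent confidence level."""
    p = hits / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)

# Suppose reviewers found 38 responsive documents in a random sample of
# 4,000 drawn from the documents the keyword cull excluded:
low, high = proportion_ci(38, 4000)
print(f"Responsive rate in unselected docs: {low:.2%} to {high:.2%}")
# -> roughly 0.55% to 1.35%, in line with the range reported in the order
```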

According to the order, the defendant’s eDiscovery costs “are about $1.07 million and will total between $2 million and $3.25 million.” {emphasis added}  The defendant “invited the Plaintiffs’ Steering Committee to suggest additional search terms and offered to produce the rest of the non-privileged documents from the post-keyword 2.5 million”, but they declined, “believing they are too little to assure proper document production”.

Plaintiffs’ Objections

The Plaintiffs’ Steering Committee objected, claiming that the defendant’s use of keyword searching “has tainted the process” and pointing to an article which “mentioned unidentified ‘literature stating that linear review would generate a responsive rate of 60 percent and key word searches only 20 percent, and [the defendants in the case being discussed] proposed that predictive coding at a 75 percent responsive rate would be sufficient.’” {emphasis added}  They requested that the defendant “go back to its 19.5 million documents and employ predictive coding, with plaintiffs and defendants jointly entering the ‘find more like this’ commands.”  In response to the defendant’s objection that virtually starting over would cost additional millions, the Steering Committee blamed the defendant for spending millions on document production despite being warned not to begin until the cases had been centralized.

Judge’s Ruling

Finding that “[w]hat Biomet has done complies fully with the requirements of Federal Rules of Civil Procedure 26(b) and 34(b)(2)”, Judge Miller noted that “the Steering Committee’s request that Biomet go back to Square One…and institute predictive coding at that earlier stage sits uneasily with the proportionality standard in Rule 26(b)(2)(C).”  Continuing, Judge Miller stated:

“Even in light of the needs of the hundreds of plaintiffs in this case, the very large amount in controversy, the parties’ resources, the importance of the issues at stake, and the importance of this discovery in resolving the issues, I can’t find that the likely benefits of the discovery proposed by the Steering Committee equals or outweighs its additional burden on, and additional expense to, Biomet.”

Judge Miller also rejected the Steering Committee’s position that the defendant couldn’t rely on proportionality arguments because it proceeded with document production while the centralization decision was pending: “The Steering Committee hasn’t argued (and I assume it can’t argue) that Biomet had no disclosure or document identification obligation in any of the cases that were awaiting a ruling on (or even the filing of) the centralization petition.”  As a result, he ruled that the Steering Committee would have to bear the expense of “production of documents that can be identified only through re-commenced processing, predictive coding, review, and production”.

So, what do you think?  Was the judge correct to accept the defendant’s multimodal approach to discovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

Appeals Court Upholds Decision Not to Recuse Judge Peck in Da Silva Moore – eDiscovery Case Law

As reported by IT-Lex, the US Court of Appeals for the Second Circuit rejected the plaintiffs’ request for a writ of mandamus recusing Magistrate Judge Andrew J. Peck from Da Silva Moore v. Publicis Groupe SA.

The entire opinion reads as follows:

“Petitioners, through counsel, petition this Court for a writ of mandamus compelling the recusal of Magistrate Judge Andrew J. Peck. Upon due consideration, it is hereby ORDERED that the mandamus petition is DENIED because Petitioners have not ‘clearly and indisputably demonstrate[d] that [Magistrate Judge Peck] abused [his] discretion’ in denying their district court recusal motion, In re Basciano, 542 F. 3d 950, 956 (2d Cir. 2008) (internal quotation marks omitted) (quoting In re Drexel Burnham Lambert Inc., 861 F.2d 1307, 1312-13 (2d Cir. 1988)), or that the district court erred in overruling their objection to that decision.”

Now, the plaintiffs have been denied in their recusal efforts in three courts.

Since it has been a while, let’s recap the case for those who may have not been following it and may be new to the blog.

Last year, back in February, Judge Peck issued an opinion that made this likely the first case to accept the use of computer-assisted review of electronically stored information (“ESI”).  However, on March 13, District Court Judge Andrew L. Carter, Jr. granted the plaintiffs’ request to submit additional briefing on their February 22 objections to the ruling.  In that briefing (filed on March 26), the plaintiffs claimed that the protocol approved for predictive coding “risks failing to capture a staggering 65% of the relevant documents in this case” and questioned Judge Peck’s relationship with defense counsel and with the selected vendor for the case, Recommind.

Then, on April 5, Judge Peck issued an order in response to the plaintiffs’ letter requesting his recusal, directing the plaintiffs to indicate whether they would file a formal motion for recusal or ask the Court to consider the letter as the motion.  On April 13 (Friday the 13th, that is), the plaintiffs did just that, formally requesting the recusal of Judge Peck (the defendants issued a response in opposition on April 30).  But, on April 25, Judge Carter issued an opinion and order in the case, upholding Judge Peck’s opinion approving computer-assisted review.

Not done, the plaintiffs filed an objection on May 9 to Judge Peck’s rejection of their request to stay discovery pending the resolution of outstanding motions and objections (including the recusal motion, which had yet to be ruled on).  Then, on May 14, Judge Peck issued a stay, stopping defendant MSLGroup’s production of electronically stored information.  On June 15, in a 56-page opinion and order, Judge Peck denied the plaintiffs’ motion for recusal.  Judge Carter ruled on the plaintiffs’ recusal request on November 7, denying the request and stating that “Judge Peck’s decision accepting computer-assisted review … was not influenced by bias, nor did it create any appearance of bias”.

So, what do you think?  Will this finally end the recusal question in this case?  Please share any comments you might have or if you’d like to know more about a particular topic.

Fulbright’s Litigation Trends Survey Shows Increased Litigation, Mobile Device Collection – eDiscovery Trends

According to Fulbright’s 9th Annual Litigation Trends Survey released last month, companies in the United States and United Kingdom continue to deal with, and spend more on, litigation.  From an eDiscovery standpoint, the survey showed an increase in requirements to preserve and collect data from employee mobile devices, a high reliance on self-preservation to fulfill preservation obligations and a decent percentage of organizations using technology assisted review.

Here are some interesting statistics from the report:

PARTICIPANTS

Here is a breakdown of the participants in the survey.

  • There were 392 total participants from the US and UK, 96% of which were either General Counsel (82%) or Head of Litigation (14%).
  • About half (49%) of the companies surveyed were billion-dollar companies with $1 billion or more in gross revenue.  36% of the total companies have revenues of $10 billion or more.

LITIGATION TRENDS

The report showed increases in both the number of cases being encountered by organizations, as well as the total expenditures for litigation.

Increasing Litigation Cases

  • This year, 92% of respondents anticipate either the same amount or more litigation, up from 89% last year.  26% of respondents expect litigation to increase, while 66% expect litigation to stay the same.  Among the larger companies, 33% of respondents expect more disputes, and 94% expect either the same number or an increase.
  • The number of respondents reporting that they had received at least one lawsuit rose to 86% this year, compared with 73% last year.  Those estimating at least 21 lawsuits rose to 33% from 22% last year.
  • Companies facing at least one $20 million lawsuit rose to 31% this year, from 23% the previous year.

Increasing Litigation Costs

  • The percentage of companies spending $1 million or more on litigation has increased for the third year in a row to 54%, up from 51% in 2011 and 46% in 2010, primarily due to a sharp rise in $1 million+ cases in the UK (rising from 38% in 2010 up to 53% in 2012).
  • In the US, 53% of organizations spend $1 million or more on litigation and 17% spend $10 million or more.
  • 33% of larger companies spent $10 million or more on litigation, up sharply from 19% the year before (and 22% in 2010).

EDISCOVERY TRENDS

The report showed an increase in requirements to preserve and collect data from employee mobile devices, a high reliance on self-preservation to fulfill preservation obligations and a decent percentage of organizations using technology assisted review.

Mobile Device Preservation and Collection

  • 41% of companies had to preserve and/or collect data from an employee mobile device because of litigation or an investigation in 2012, up from 32% in 2011.
  • Similar increases were reported by respondents from larger companies (38% in 2011, up to 54% in 2012) and midsized companies (26% in 2011, up to 40% in 2012).  Only respondents from smaller companies reported a drop (from 26% to 14%).

Self-Preservation

  • 69% of companies rely on individuals preserving their own data (i.e., self-preservation) in any of their disputes or investigations.  Larger and mid-sized companies are more likely to utilize self-preservation (73% and 72% respectively) than smaller companies (52%).
  • 41% of companies use self-preservation in all of their matters, and 73% use it for half or more of all matters.
  • When not relying on self-preservation, 72% of respondents say they depend on the IT function to collect all data sources of pertinent custodians.
  • Reasons that respondents gave for not relying on self-preservation included: more cost effective and efficient not to rely on custodians (29%); lack of compliance by custodians (24%); high profile matter (23%); high monetary or other exposure (22%); need to conduct forensics (20%); some or all custodians may have an incentive to improperly delete potentially relevant information (18%); case law does not support self-preservation (14%); and high profile custodian (11%).

Technology Assisted Review

  • 35% of all respondents are using technology assisted review for at least some of their matters.  U.S. companies are more likely to employ technology-assisted review than their U.K. counterparts (40% versus 23%).
  • 43% of larger companies surveyed use technology assisted review, compared with 32% of mid-sized companies and 23% of the smaller companies.
  • Of those companies utilizing technology assisted review, 21% use it in all of their matters and 51% use it for half or more of their matters.

There are plenty more interesting stats and trends in the report, which is free(!).  To download your own copy of the report, click here.

So, what do you think?  Do any of those trends surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Daily Is Thirty! (Months Old, That Is)

Thirty months ago yesterday, eDiscovery Daily was launched.  It’s hard to believe that it has been 2 1/2 years since our first three posts debuted on our first day.  635 posts later, a lot has happened in the industry that we’ve covered.  And, yes, we’re still crazy after all these years for committing to a post each business day, but we haven’t missed a business day yet.  Twice a year, we like to take a look back at some of the important stories and topics during that time.  So, here are just a few of the posts over the last six months you may have missed.  Enjoy!

In addition, Jane Gennarelli has been publishing an excellent series to introduce new eDiscovery professionals to the litigation process and litigation terminology.  Here is the latest post, which includes links to the previous twenty-one posts.

Thanks for noticing us!  We’ve nearly quadrupled our readership since the first six month period and almost septupled (that’s grown 7 times in size!) our subscriber base since those first six months!  We appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful eDiscovery news and analysis.  And, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Five Common Myths About Predictive Coding – eDiscovery Best Practices

During my interviews with various thought leaders (a list of which can be found here, with links to each interview), we discussed various aspects of predictive coding and some of the perceived myths that exist regarding predictive coding and what it means to the review process.  I thought it would be a good idea to recap some of those myths and how they compare to the “reality” (at least as some of us see it).  Or maybe just me.  🙂

1.     Predictive Coding is New Technology

Actually, with all due respect to each of the various vendors that have their own custom algorithm for predictive coding, the technology behind predictive coding as a whole is not new.  Ever heard of artificial intelligence?  Predictive coding, in fact, applies artificial intelligence to the review process.  With all of the acronyms we use to describe predictive coding, here’s one more for consideration: “Artificial Intelligence for Review” or “AIR”.  It may not catch on, but I like it.

Maybe attorneys would be more receptive to it if they understood it as artificial intelligence?  As Laura Zubulake pointed out in my interview with her, “For years, algorithms have been used in government, law enforcement, and Wall Street.  It is not a new concept.”  With that in mind, Ralph Losey predicts that “The future is artificial intelligence leveraging your human intelligence and teaching a computer what you know about a particular case and then letting the computer do what it does best – which is read at 1 million miles per hour and be totally consistent.”

2.     Predictive Coding is Just Technology

Treating predictive coding as just the algorithm that “reviews” the documents is shortsighted.  Predictive coding is a process that includes the algorithm.  Without a sound approach for identifying appropriate example documents for the collection, ensuring educated and knowledgeable reviewers to appropriately code those documents and testing and evaluating the results to confirm success, the algorithm alone would simply be another case of “garbage in, garbage out” and doomed to fail.

As discussed by both George Socha and Tom Gelbmann during their interviews with this blog, EDRM’s Search project has published the Computer Assisted Review Reference Model (CARRM), which has taken steps to define that sound approach.  Nigel Murray also noted that “The people who really understand computer assisted review understand that it requires a process.”  So, it’s more than just the technology.

3.     Predictive Coding and Keyword Searching are Mutually Exclusive

I’ve talked to some people who think that predictive coding and keyword searching are mutually exclusive, i.e., that you wouldn’t perform keyword searching on a case where you plan to use predictive coding.  Not necessarily.  Ralph Losey advocates a “multimodal” approach, describing it as: “more than one kind of search – using predictive coding, but also using keyword search, concept search, similarity search, all kinds of other methods that we have developed over the years to help train the machine.  The main goal is to train the machine.”

4.     Predictive Coding Eliminates Manual Review

Many people think of predictive coding as the death of manual review, with all attorney reviewers being replaced by machines.  Actually, manual review is a part of the predictive coding process in several aspects:

  1. Subject matter knowledgeable reviewers are necessary to perform review to create a training set of documents for the technology;
  2. After the process is performed, both sets (the included and excluded documents) are sampled and the samples are reviewed to determine the effectiveness of the process; and
  3. The resulting responsive set is generally reviewed to confirm responsiveness and also to determine whether the documents are privileged.

Without manual review to train the technology and verify the results, the process would fail.
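To illustrate where those human touch points sit in the workflow, here is a minimal sketch of the train-classify-validate loop.  It is a simplified stand-in, not any vendor’s actual algorithm: the library (scikit-learn), the toy documents and the sample sizes are all assumptions for demonstration.

```python
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Simplified stand-in for a predictive coding workflow; commercial tools
# use their own proprietary algorithms and much larger training sets.

# 1) Subject-matter-knowledgeable reviewers label a training set by hand.
seed_docs = ["q3 revenue forecast attached", "lunch on friday?",
             "draft supply contract for review", "fantasy football picks"]
seed_labels = [1, 0, 1, 0]  # 1 = responsive, 0 = not responsive

vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

# 2) The trained model classifies the rest of the collection.
collection = ["revised contract terms enclosed", "happy hour tonight"]
predicted = model.predict(vectorizer.transform(collection))
included = [d for d, p in zip(collection, predicted) if p == 1]
excluded = [d for d, p in zip(collection, predicted) if p == 0]

# 3) Humans sample BOTH piles: reviewing a random sample of the excluded
# pile estimates how many responsive documents the model missed, and the
# included pile still gets attorney review for responsiveness and privilege.
spot_check = random.sample(excluded, min(1, len(excluded)))
print("Included for attorney review:", included)
print("Excluded docs to spot-check by hand:", spot_check)
```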

5.     Predictive Coding Has to Be Perfect to Be Useful

Detractors of predictive coding note that it can miss plenty of responsive documents and is nowhere near 100% accurate.  In one recent case, the producing party estimated that as many as 31,000 relevant documents may have been missed by the predictive coding process.  However, they also estimated that a much more costly manual review would have missed as many as 62,000 relevant documents.

Craig Ball’s analogy about the two hikers who encounter an angry grizzly bear is apt – the one hiker doesn’t have to outrun the bear, just the other hiker.  Craig notes: “That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.”

So, what do you think?  Do you agree that these are myths?  Please share any comments you might have or if you’d like to know more about a particular topic.

Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 3

This is the tenth (and final) installment of the 2013 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What are your general observations about LTNY this year and how it fits into emerging trends?
  2. If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year, and our interview with him had so much good information that we couldn’t fit it all into a single post.  Wednesday was part 1 and yesterday was part 2.  Today is the third and last part.  A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

What are you working on that you’d like our readers to know about?

I’m really trying to make 2013 the year of distilling an extensive but idiosyncratic body of work that I’ve amassed through years of writing and bring it together into a more coherent curriculum.  I want to develop a no-cost casebook for law students and to structure my work so that it can be more useful for people in different places and phases of their eDiscovery education.  So, I’ll be working on that in the first six or eight months of 2013 as both an academic and a personal project.

I’m also trying to go back to roots and rethink some of the assumptions that I’ve made about what people understand.  It’s frustrating to find lawyers talking about, say, load files when they don’t really know what a load file is; they’ve never looked at one.  They’ve left it to somebody else, so the resolution of difficulties has gone through so many hands and is plagued by so much miscommunication.  I’d like to put some things out there that will enable lawyers, in a non-threatening and accessible way, to gain comfort having a dialog about the fundamentals of eDiscovery that you and I take for granted, so that we don’t have to have this reliance upon vendors for the simplest issues.  I don’t mean that vendors won’t do the work, but I don’t think we should have to bring a technical translator in for every phone call.

There should be a corpus of competence that every litigator brings to the party, enabling them to frame basic protocols and agreements that aren’t merely parroting something that they don’t understand, but enabling them to negotiate about issues in ways that the resolutions actually make sense.  Saying “I won’t give you 500 search terms, but I’ll give you 250” isn’t a rational resolution.  It’s arbitrary.

There are other kinds of cases where you can identify search terms “all the live long day” and they’re really never going to get you that much closer to the documents you want.  The best example in recent years was the Pippins v. KPMG case.  KPMG was arguing that they could use search terms against samples to identify forensically significant information about work day and work responsibility.  That didn’t make any sense to me at all.  The kind of data they were looking for wasn’t going to be easily found by using keyword search.  It was going to require finding data of a certain character and bringing a certain kind of analysis to it, not an objective culling method like search terms.  Search terms have become like the expression “if you have a hammer, the whole world looks like a nail”.  We need to get away from that.

I think a little education made palatable will go a long way.  We need some good solid education and I’m trying to come up with something that people will borrow and build on.  I want it to be something that’s good enough that people will say “let’s just steal his stuff”.  That’s why I put it out there – it’s nice that they credit me and I appreciate it; but if what you really want to do is teach people, you don’t do it for the credit, you do it for the education.  That’s what I’m about, more this year than ever before.

Thanks, Craig, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 2

This is the tenth (and final) installment of the 2013 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What are your general observations about LTNY this year and how it fits into emerging trends?
  2. If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year, and our interview with him had so much good information that we couldn’t fit it all into a single post.  Yesterday was part 1.  Today is part 2, and part 3 will be published in the blog on Friday.  A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

I noticed that you are speaking at a couple of sessions here.  What would you like to tell me about those sessions?

{Interviewed the evening before the show}  I am on a Technology Assisted Review panel with Maura Grossman and Ralph Losey that should be as close to a barrel of laughs as one can have talking about technology assisted review.  It is based on a poker theme – which was actually Matt Nelson’s (of Symantec) idea.  I think it is a nice analogy, because a good poker player is a master or mistress of probabilities, intuitively or overtly performing mental arithmetic that amounts to statistical and probability calculations.  Such calculations are key to quality assurance and quality control in modern review.

We have to be cautious not to require the standards for electronic assessments to be dramatically higher than the standards applied to human assessments.  It is one thing with a new technology to demand more of it to build trust.  That’s a pragmatic imperative.  It is another thing to demand so exalted a level of scrutiny that you essentially void all advantages of the new technology, including the cost savings and efficiencies it brings.  You know the old story about the two hikers that encounter the angry grizzly bear?  They freeze, and then one guy pulls out running shoes and starts changing into them.  His friend says “What are you doing? You can’t outrun a grizzly bear!” The other guy says “I know.  I only have to outrun you”.  That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.

We cannot let the vague uneasiness about the technology cause it to implode.  If we have to essentially examine everything in the discard pile, so that we not only pay for the new technology but also back it up with the old, that’s not going to work.  It will take a few pioneers who get the “arrows in the back” early on, people who spend more to build trust around the technology that is missing at this juncture.  Eventually, people are going to say “I’ve looked at the discard pile for the last three cases and this stuff works.  I don’t need to look at all of that any more.”

Even the best predictive coding systems are not going to be anywhere near 100% accurate.  They start from human judgment, where we’re not even sure what “100% accurate” means in the context of responsiveness and relevance.  There’s no “gold standard”.  Two different qualified people can look at the same document and give different assessments, and approximately 40% of the time, they do.  And the way we decide who’s right is to bring in a third person.  We indulge the idea that the third person is the “topic authority” and what they say goes.  We define their judgment as right; but even their judgments are human.  To err being human, they’re going to make misjudgments based on assumptions, fatigue, inattention, whatever.

So, getting back to the topic at hand, I do think that the focus on quality assurance is going to prompt a larger and long overdue discussion about the efficacy of human review.  We’ve kept human review in this mystical world of work product for a very long time.  Honestly, the rationale for work product doesn’t naturally extend to decisions about responsiveness and relevance, even though most of my colleagues would disagree with me out of hand.  They don’t want anybody messing with privilege or work product.  It’s like religion or gun control – you can’t even start a rational debate.

Things are still so partisan and bitter.  The notions of cooperation, collaboration, transparency, translucency, communication – they’re not embedded yet.  People come to these processes with animosity so deeply seated that you’re not really starting on a level playing field with an assessment of what’s best for our system of justice.  Justice is someone else’s problem.  The players just want to win.  That will be tough to change.

We “dinosaurs” will die off, and we won’t have to wait for the glaciers to advance.  I think we will have some meteoric events that will change the speed at which the dinosaurs die.  Technology assisted review is one.  We’ve seen a meteoric rise in the discussion of the topic, the interest in the topic, and I think it will have a meteoric effect in terms of more rapidly extinguishing very bad and very expensive practices that don’t carry with them any more superior assurance of quality.

More from Craig tomorrow!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!
