Review

For a More Complete and Accurate Review, Be Persistent: eDiscovery Best Practices

Manual document review can be prone to error.  It’s easy to miss highly relevant or privileged documents if you fail to spot the terms that identify them as such.  To help spot those terms, you have to be “persistent”.  And, there’s a new review of CloudNine that you might want to check out!

By “persistent”, I’m talking about persistent highlighting, which is the topic for this week’s eDiscovery Tech Tip of the Week (see what I did there?).  :o)  Let’s face it: failing to spot highly relevant, hot, or privileged terms during document review can lead to important documents being missed or to inadvertent disclosure of privileged information.  Persistent highlighting keeps these important terms highlighted at all times – regardless of search criteria – so they can be more easily spotted during review, which improves the quality of the review process.

When a review platform offers persistent highlighting, there is typically an area where you can identify the terms that you want to always be highlighted.  Once you build that list, those terms will be highlighted anytime you review a document containing them, generally in a color different from the one used to highlight retrieved search terms.
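
To make the concept concrete, here’s a quick illustrative sketch in Python.  The term list and the markup are made up for illustration – they’re not any particular platform’s API:

```python
import re

# Terms that should ALWAYS be highlighted, regardless of the active search
# (hypothetical examples -- in a real platform you'd maintain this list in the UI).
PERSISTENT_TERMS = ["privileged", "attorney-client", "confidential"]

def highlight(text, search_terms, persistent_terms=PERSISTENT_TERMS):
    """Wrap search hits and persistent terms in distinct markers.

    Search hits get <search>...</search>; persistent terms get a
    different marker so reviewers can tell the two apart at a glance.
    """
    for term in search_terms:
        text = re.sub(re.escape(term), lambda m: f"<search>{m.group(0)}</search>",
                      text, flags=re.IGNORECASE)
    for term in persistent_terms:
        text = re.sub(re.escape(term), lambda m: f"<always>{m.group(0)}</always>",
                      text, flags=re.IGNORECASE)
    return text

doc = "This confidential memo discusses the merger."
print(highlight(doc, search_terms=["merger"]))
```

Here, “merger” is marked as a search hit, while “confidential” is marked persistently even though it was not part of the search – which is the whole point.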

Persistent highlighting can help improve the accuracy and completeness of your review and can help reduce potential inadvertent disclosures of privileged information.  To see an example of how Persistent Highlighting is conducted using our CloudNine platform, click here (requires BrightTalk account, which is free).

———————————————

When evaluating an eDiscovery platform, it’s important to check out reviews of the platform so that you can benefit from other users’ perspectives on what they like about a platform and where there are opportunities for improvement.  As we discussed previously, sites like Capterra, G2 Crowd and Gartner Peer Insights enable you to learn about actual client experiences with the platform.  And, earlier this month, we covered this free Buyer’s Guide, which reviews several eDiscovery solutions, including CloudNine, in a variety of product categories.

Now, here’s a new review of our CloudNine platform by industry thought leader Tom O’Connor.  As you may know, Tom is a longtime consultant in the industry who also does some work with CloudNine, participating in our webcasts with me (which has been great fun!) and writing articles.  Now, Tom has written a review of our platform that covers the full range of features, while also identifying some features he would like to see added.  So, I guess I can’t retire yet?  Thanks a lot, Tom!  ;o)  Anyway, here is a link to Tom’s review of CloudNine for your consideration.

So, what do you think?  Do you use persistent highlighting in your review processes?  Please share any comments you might have or if you’d like to know more about a particular topic.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Court Disagrees with Plaintiff’s Contentions that Defendant’s TAR Process is Defective: eDiscovery Case Law

In Winfield, et al. v. City of New York, No. 15-CV-05236 (LTS) (KHP) (S.D.N.Y. Nov. 27, 2017), New York Magistrate Judge Katharine H. Parker, after conducting an in camera review of the defendant’s TAR process and a sample set of documents, granted in part and denied in part the plaintiffs’ motion, ordering the defendant to provide copies of specific documents where the parties disagreed on their responsiveness and a random sample of 300 additional documents deemed non-responsive by the defendant.  Judge Parker denied the plaintiffs’ request for information about the defendant’s TAR process, finding no evidence of gross negligence or unreasonableness in its process.

Case Background

In this dispute over alleged discrimination in the City’s affordable housing program, the parties had numerous disputes over the defendant’s handling of discovery.  The plaintiffs lodged numerous complaints about the pace of discovery and document review, which initially involved only manual linear review of documents, so the Court directed the defendant to complete linear review as to certain custodians and begin using Technology Assisted Review (“TAR”) software for the rest of the collection.  After a dispute over the search terms selected for use, the plaintiffs proposed over 800 additional search terms to be run on certain custodians, most of which (after negotiation) were accepted by the defendant, despite a stated additional cost of $248,000 to review the documents.

The defendant proposed to use its TAR software for this review, but the plaintiffs objected, contending that the defendant had over-designated documents as privileged and non-responsive, using an “impermissibly narrow view of responsiveness” during its review process.  To support their contention, the plaintiffs produced certain documents to the Court that the defendant produced inadvertently (including 5 inadvertently produced slip sheets of documents not produced), which they contended should have been marked responsive and relevant.  As a result, the Court required the defendant to submit a letter for in camera review describing its predictive coding process and training for document reviewers.  The Court also required the defendant to provide a privilege log for a sample set of 80 documents that it designated as privileged in its initial review.  Out of those 80 documents, the defendant maintained its original privilege assertions over only 20 documents, finding 36 of them non-privileged and responsive (and producing them) and another 15 of them non-responsive.

As a result, the plaintiffs filed a motion requesting random samples of several categories of documents and also sought information about the TAR ranking system used by the defendant and all materials submitted by the defendant for the Court’s in camera review relating to predictive coding.

Judge’s Ruling

Judge Parker noted that both parties did “misconstrue the Court’s rulings during the February 16, 2017 conference” and ordered the defendant to “expand its search for documents responsive to Plaintiffs’ document requests as it construed this Court’s prior ruling too narrowly”, indicating that the plaintiffs should meet and confer with the defendant after reviewing the additional production if they “believe that the City impermissibly withheld documents responsive to specific requests”.

As for the plaintiffs’ challenges to the defendant’s TAR process, Judge Parker referenced Hyles v. New York City, where Judge Andrew Peck, referencing Sedona Principle 6, stated the producing party is in the best position to “evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”  Judge Parker also noted that “[c]ourts are split as to the degree of transparency required by the producing party as to its predictive coding process”, citing cases that considered seed sets as work product and other cases that supported transparency of seed sets.  Relying on her in camera review of the materials provided by the defendant, Judge Parker concluded “that the City appropriately trained and utilized its TAR system”, noting that the defendant’s seed set “included over 7,200 documents that were reviewed by the City’s document review team and marked as responsive or non-responsive in order to train the system” and that “the City provided detailed training to its document review team as to the issues in the case.”

As a result, Judge Parker ordered the defendant “to produce the five ‘slip-sheeted’ documents and the 15 NR {non-responsive documents reclassified from privileged} Documents”, “to provide to Plaintiffs a sample of 300 non-privileged documents in total from the HPD custodians and the Mayor’s Office” and to “provide Plaintiffs with a random sample of 100 non-privileged, non-responsive documents in total from the DCP/Banks review population” (after applying the plaintiffs’ search terms and utilizing TAR on that collection).  Judge Parker ordered the parties to meet and confer on any disputes “with the understanding that reasonableness and proportionality, not perfection and scorched-earth, must be their guiding principles.”  Judge Parker denied the plaintiffs’ request for information about the defendant’s TAR process (but “encouraged” the defendant to share information with the plaintiffs) and denied the plaintiffs’ request for access to the defendant’s in camera submissions, finding them protected by the work product privilege.

So, what do you think?  Should TAR ranking systems and seed sets be considered work product or should they be transparent?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case opinion link courtesy of eDiscovery Assistant.


Sure, No Keyword Before TAR, But What About Keyword Instead of TAR?: eDiscovery Best Practices

Last month, we discussed whether to perform keyword search culling before performing Predictive Coding/Technology Assisted Review (TAR) and, like many have concluded before (even a judge in FCA US, LLC v. Cummins, Inc.), we agree that you shouldn’t perform keyword search culling before TAR.  But, should TAR be performed instead of keyword search – in all cases?  Is TAR always preferable to keyword search?

I was asked that question earlier this week by a colleague, so I thought I would relay what I essentially told him.

Many attorneys that I have observed over the years have typically tried to approach keyword search this way: 1) Identify a bunch of potentially responsive terms, 2) string them together with OR operators in between (i.e., {term 1} OR {term 2}, etc.), 3) run the search, 4) add family members (emails and attachments linked to the files with hits) to the results, and 5) begin review.
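
As a toy illustration of that approach (the document IDs and the family map below are hypothetical):

```python
# A sketch of the naive "string everything together with ORs" approach
# described above.
def build_or_query(terms):
    return " OR ".join(f'"{t}"' for t in terms)

def expand_to_families(hit_ids, family_of):
    """Add family members (e.g., the email an attachment with a hit
    belongs to, plus that email's other attachments) to the results."""
    results = set()
    for doc_id in hit_ids:
        results.update(family_of.get(doc_id, {doc_id}))
    return results

print(build_or_query(["recall", "defect", "warranty"]))
# "recall" OR "defect" OR "warranty"

# A1 is an attachment to email E1, which also has attachment A2;
# E2 is a standalone email.
families = {"A1": {"E1", "A1", "A2"}, "E2": {"E2"}}
print(sorted(expand_to_families({"A1", "E2"}, families)))
# ['A1', 'A2', 'E1', 'E2']
```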

If that’s the keyword search methodology you plan to use, then, yes, a sound TAR approach is preferable pretty much every time.  Sure, proportionality concerns can affect the decision, but I would recommend a sound approach over an unsound approach every time.  Unfortunately, that’s the approach a lot of attorneys still use when it comes to keyword search.

However, it’s important to remember that the “A” in TAR stands for “Assisted” and that TAR is not just about the technology, it’s as much about the process that accompanies the technology.  A bad approach to using TAR will generally lead to bad results with the technology, or at least inefficient results.  “Good TAR” includes a sound process for identifying training candidates for the software, reviewing those candidates and repeating the process iteratively until the collection has been classified at a level that’s appropriate to meet the needs of the case.
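
To illustrate what that iterative loop looks like, here’s a deliberately simplified sketch in Python – a crude word-weight scorer standing in for real TAR software, with made-up documents:

```python
from collections import Counter

def train(labeled):
    """labeled: list of (tokens, is_responsive) pairs from reviewed documents.
    Returns a per-word weight: how much more often the word appears in
    responsive training documents than in non-responsive ones (smoothed)."""
    resp, non = Counter(), Counter()
    for tokens, is_responsive in labeled:
        (resp if is_responsive else non).update(set(tokens))
    vocab = set(resp) | set(non)
    return {w: (resp[w] + 1) / (non[w] + 1) for w in vocab}

def score(tokens, weights):
    words = set(tokens)
    return sum(weights.get(w, 1.0) for w in words) / max(len(words), 1)

def tar_round(labeled, unlabeled, batch=2):
    """One iteration: train on what has been reviewed so far, rank the
    unreviewed documents, and return the top-ranked batch for the review
    team to code next.  In practice, this repeats until the collection is
    classified at a level appropriate to the needs of the case."""
    weights = train(labeled)
    ranked = sorted(unlabeled, key=lambda d: score(d, weights), reverse=True)
    return ranked[:batch]

seed = [(["engine", "recall", "defect"], True), (["lunch", "menu"], False)]
unreviewed = [["engine", "defect", "report"], ["holiday", "party"]]
print(tar_round(seed, unreviewed, batch=1))  # the engine/defect document ranks first
```

The point isn’t the scoring math (real TAR software is far more sophisticated) – it’s the loop: train, rank, review, repeat.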

What about keyword search?  “Good keyword search” also includes a sound process for identifying potentially responsive terms, using various mechanisms to refine those terms (which can include variations, at an appropriate level, that can also be responsive), performing a search for each term, testing the result set (to determine if the term is precise enough and not overbroad) and testing what was not retrieved (to determine what, if anything, might have been missed).  We covered some useful resources for testing and sampling earlier this week here.
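
Here’s a simple illustrative sketch of those testing steps – sampling the documents a term retrieved (to estimate precision) and sampling what it didn’t retrieve (to estimate what was missed).  The collections and relevance flags below are made up; in practice, a human reviewer makes that call:

```python
import random

def estimate_rate(population, is_relevant, sample_size, seed=42):
    """Estimate the fraction of relevant documents in a set by reviewing
    a random sample (is_relevant stands in for a human reviewer's call)."""
    random.seed(seed)
    sample = random.sample(population, min(sample_size, len(population)))
    return sum(1 for doc in sample if is_relevant(doc)) / len(sample)

# Hypothetical result sets: 'hits' are documents a candidate term retrieved;
# 'null_set' is everything the term did not retrieve.
hits = [{"relevant": True}] * 80 + [{"relevant": False}] * 20
null_set = [{"relevant": False}] * 95 + [{"relevant": True}] * 5

precision = estimate_rate(hits, lambda d: d["relevant"], 25)    # is the term precise enough?
elusion = estimate_rate(null_set, lambda d: d["relevant"], 25)  # what, if anything, was missed?
```

If precision is too low, the term is overbroad and needs refining; if the null-set sample turns up relevant documents, something is being missed and the terms need to be expanded.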

Speaking of this week, apparently, this is my week for the “wayback machine” on this blog.  In early 2011, I described a defensible search approach for keyword search for which I created an acronym – “STARR”.  Not Ringo or Bart, but Search, Test, Analyze, Revise (if necessary), Repeat (the first four steps until precision and recall are properly balanced).  While you might think that “STARR” sounds a lot like “TAR”, I coined my acronym for the keyword search approach well before the TAR acronym became popular (just sayin’).

Regardless of whether you use STARR or TAR, the key is a sound approach.  Keyword search, if you’re using a sound approach in performing it, can still be an appropriate choice for many cases and document collections.

So, what do you think?  Do you think that keyword search still has a place in eDiscovery?  If not, why not?  Please share any comments you might have or if you’d like to know more about a particular topic.


To Keyword Cull or Not to Keyword Cull? That is the Question: eDiscovery Trends

We’re seeing a lot of discussion about whether to perform keyword searching before predictive coding.  We’ve even seen a recent case where a judge weighed in as to whether TAR with or without keyword searching is preferable.  Now, we have a new article published in the Richmond Journal of Law and Technology that weighs in as well.

In Calling an End to Culling: Predictive Coding and the New Federal Rules of Civil Procedure (PDF version here), Stephanie Serhan, a law student, looks at the 2015 Federal Rules amendments (particularly Rules 1 and 26(b)(1)) as justification for applying predictive coding “at the outset on the entire universe of documents in a case.”  Serhan concludes that doing so is “far more accurate, and is not more costly or time-consuming, especially when the parties collaborate at the outset.”

Serhan discusses the importance of timing to predictive coding and explains the technical difference between predictive coding at the outset of a case vs. predictive coding after performing keyword searches.  One issue with keyword culling that Serhan notes is that it “is not as accurate because the party may lose many relevant documents if the documents do not contain the specified search terms, have typographical errors, or use alternative phraseologies”.  Serhan assumes that those “relevant documents removed by keyword culling would likely have been identified using predictive coding at the outset instead.”

Serhan also takes a look at the impact on efficiency and cost between the two methods and concludes that the “actual cost of predictive coding will likely be substantially equal in both methods since the majority of the costs will be incurred in both methods.”  She also looks at TAR related cases, both before and after the 2015 Rules changes.

More and more people have concluded that predictive coding should be done without keyword culling, and with good reason.  Applying predictive coding to a set unaltered by keywords would likely be not only more accurate, but also more efficient, since keyword searching requires its own methodology that includes testing of results (and of documents not retrieved) before moving on.  Unless there’s a need to limit the volume of collected data because of cost considerations, there is no need to apply keyword culling before predictive coding.

Some culling does make sense: hash-based deduplication, elimination of clearly non-responsive domains, and other steps that remove clearly redundant or non-responsive ESI from the collection.  That’s a different type of culling.
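
For example, hash-based deduplication boils down to keeping one copy per unique content hash.  Here’s a minimal sketch in Python (the document structure is hypothetical):

```python
import hashlib

def dedupe(documents):
    """Keep one copy of each unique document, keyed by the hash of its
    content (hash-based deduplication, as described above)."""
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(doc["content"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = [
    {"id": 1, "content": "Q3 forecast attached."},
    {"id": 2, "content": "Q3 forecast attached."},  # exact duplicate
    {"id": 3, "content": "Revised Q3 forecast attached."},
]
print([d["id"] for d in dedupe(docs)])  # [1, 3]
```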

So, what do you think?  To keyword cull or not to keyword cull?  Please share any comments you might have or if you’d like to know more about a particular topic.


Court Determines TAR Without Keyword Search Culling First is Preferable: eDiscovery Case Law

In FCA US, LLC v. Cummins, Inc., No. 16-12883 (E.D. Mich., Mar. 28, 2017), Michigan District Judge Avern Cohn “rather reluctantly” decided a dispute between the plaintiff and defendant over whether the universe of electronic material subject to Technology Assisted Review (TAR) should first be culled by the use of search terms, agreeing with the plaintiff that “[a]pplying TAR to the universe of electronic material before any keyword search reduces the universe of electronic material is the preferred method.”

Case Background

In this dispute over the allocation of the cost incurred for an auto part that became the subject of a recall, the parties agreed on many issues relating to discovery and particularly electronic discovery.  However, one issue that they couldn’t agree on was whether the universe of electronic material subject to TAR review should first be culled by the use of search terms. The plaintiff took the position that the electronic material subject to TAR review should not first be culled by the use of search terms, while the defendant took the position that a pre-TAR culling is appropriate.

Judge’s Ruling

Noting that the Court decides “rather reluctantly” to rule on the issue, Judge Cohn stated:

“Given the magnitude of the dispute and the substantial matters upon which they agree, the parties should have been able to resolve the discovery issue without the Court as decision maker. Be that as it may, having reviewed the letters and proposed orders together with some technical in-house assistance including a read of The Sedona Conference TAR Case Law Primer, 18 Sedona Con. J. ___ (forthcoming 2017), the Court is satisfied that FCA has the better postion (sic). Applying TAR to the universe of electronic material before any keyword search reduces the universe of electronic material is the preferred method. The TAR results can then be culled by the use of search terms or other methods.”

As a result, Judge Cohn agreed to enter the plaintiff’s proposed order regarding the TAR approach.

So, what do you think?  Should TAR be performed with no pre-search culling beforehand?  Should courts rule on a preferred TAR approach?  Please share any comments you might have or if you’d like to know more about a particular topic.


Here’s a FREE Training Course on How to Do TAR: eDiscovery Best Practices

A lot of people can talk about technology assisted review (TAR), but how many people actually know how to conduct an eDiscovery review using TAR?  Here’s a TAR course that’s designed to show you how to do it.

On Ralph Losey’s excellent e-Discovery Team® blog, he has unveiled the e-Discovery Team’s training course on how to conduct electronic document review enhanced by active machine learning, a type of specialized Artificial Intelligence.  In other words, the training course is designed to teach you how to “do” TAR.

The TAR method that Ralph discusses is called Hybrid Multimodal IST Predictive Coding 4.0 and the Course is composed of Sixteen Classes, which are individual pages on the e-Discovery Team site.  The Classes are here (links to each one are available via the Introduction link below):

  1. First Class: Introduction
  2. Second Class: TREC Total Recall Track
  3. Third Class: Introduction to the Nine Insights Concerning the Use of Predictive Coding in Legal Document Review
  4. Fourth Class: 1st of the Nine Insights – Active Machine Learning
  5. Fifth Class: Balanced Hybrid and Intelligently Spaced Training
  6. Sixth Class: Concept and Similarity Searches
  7. Seventh Class: Keyword and Linear Review
  8. Eighth Class: GIGO, QC, SME, Method, Software
  9. Ninth Class: Introduction to the Eight-Step Work Flow
  10. Tenth Class: Step One – ESI Communications
  11. Eleventh Class: Step Two – Multimodal ECA
  12. Twelfth Class: Step Three – Random Prevalence
  13. Thirteenth Class: Steps Four, Five and Six – Iterate
  14. Fourteenth Class: Step Seven – ZEN Quality Assurance Tests
  15. Fifteenth Class: Step Eight – Phased Production
  16. Sixteenth Class: Conclusion

Ralph notes that “With a lot of hard work you can complete this online training program in a long weekend. After that, this course can serve as a solid reference to consult during your complex document review projects.”

The sixteen classes in this course cover seventeen topics, split into nine insights and eight workflow steps:

  1. Active Machine Learning (aka Predictive Coding)
  2. Concept & Similarity Searches (aka Passive Learning)
  3. Keyword Search (tested, Boolean, parametric)
  4. Focused Linear Search (key dates & people)
  5. GIGO & QC (Garbage In, Garbage Out) (Quality Control)
  6. Balanced Hybrid (man-machine balance with IST)
  7. SME (Subject Matter Expert, typically trial counsel)
  8. Method (for electronic document review)
  9. Software (for electronic document review)
  10. Talk (step 1 – relevance dialogues)
  11. ECA (step 2 – early case assessment using all methods)
  12. Random (step 3 – prevalence range estimate, not control sets)
  13. Select (step 4 – choose documents for training machine)
  14. AI Rank (step 5 – machine ranks documents according to probabilities)
  15. Review (step 6 – attorneys review and code documents)
  16. Zen QC (step 7 – Zero Error Numerics Quality Control procedures)
  17. Produce (step 8 – production of relevant, non-privileged documents)
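
To illustrate the “Random” step above (step 3, a prevalence range estimate rather than a control set), here’s a simple sketch using a normal-approximation confidence interval.  The sample counts are hypothetical:

```python
import math

def prevalence_interval(sample_size, relevant_in_sample, z=1.96):
    """Point estimate and ~95% normal-approximation confidence interval
    for the fraction of relevant documents in the whole collection."""
    p = relevant_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical: reviewers found 30 relevant documents in a random
# sample of 1,000 drawn from the collection.
p, low, high = prevalence_interval(1000, 30)
print(f"prevalence = {p:.1%}, 95% CI roughly {low:.1%} to {high:.1%}")
```

This is only the textbook calculation, not Ralph’s specific method – his course covers the reasoning behind using a prevalence range estimate in far more depth.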

Ralph provides charts to illustrate the insights and steps in his Introduction class.  Ralph notes that they “offer this information for free on this blog to encourage as many people as possible in this industry to get on the AI bandwagon.”  I asked Ralph if he wanted to say anything additional to our readers regarding the course and he told me that he “will be adding homework assignments next at the end of each class.”  I look forward to “diving in” and reviewing the classes on my own ASAP.  Thanks, Ralph!

So, what do you think?   Have you used TAR on a case yet?  If so, how did it go?  Please share any comments you might have or if you’d like to know more about a particular topic.


6,905 Billable Hours for Attorney Review May Not Be Billable if the Reviewer Isn’t Actually an Attorney: eDiscovery News

A contract lawyer for a Pennsylvania plaintiffs’ firm clocked 6,905 hours of work on a shareholder lawsuit against former executives and directors of Sprint Corp. related to its 2005 merger with Nextel.  One problem, however: that attorney had apparently been disbarred for years when he performed the work.

According to an article in the Wall Street Journal (One Lawyer, 6,905 Hours Leads to $1.5 Million Bill in Sprint Suit, written by Joe Palazzolo and Sara Randazzo, subscription required), “Alexander” Silow, a contract lawyer for the Weiser Law Firm PC, clocked 6,905 hours of work during the case.  Averaging about 13 hours a day, Silow reviewed 48,443 documents and alone accounted for $1.5 million, more than a quarter of the requested legal fees, according to court documents.  The fee award had already been cut from the $4.2 million requested down to just $450,000 back in November of last year.

That initial fee reduction was awarded after Kansas District Judge James Vano called the requested amount “unbelievable.” “It seems that the vast amount of work performed on this case was illusory, perhaps done for the purpose of inflating billable hours,” Judge Vano, who sits in Olathe, Kan., wrote in a Nov. 22 opinion.

Silow had been working as a contract attorney for at least eight years when staffing agency Abelson Legal Search placed him at the Weiser firm in 2008, according to a Feb. 3 letter from the firm to Judge Vano.  The firm learned from a third party it declined to name that no one with Silow’s name was listed in a state database of licensed lawyers, Robert B. Weiser, co-founder of the firm, said in the letter.  Silow had presented himself to the firm as “Alexander J. Silow”, but “was in actuality named Jeffrey M. Silow” and confessed he had been disbarred when the firm confronted him, the letter said.  The firm has since ended its relationship with Mr. Silow and alerted authorities, it said.  Pennsylvania’s attorney discipline office confirmed Mr. Silow was disbarred in 1987 but could provide no additional information.

At least one Sprint shareholder has requested that the case be reviewed again by Judge Vano in light of the new allegations.

According to Lester Brickman, an emeritus professor at Benjamin N. Cardozo School of Law in New York who has written about bill padding, plaintiffs’ firms bill for work done by contract attorneys like Mr. Silow at hourly rates of $300 or more when they submit their fee requests, but they typically pay the attorneys $20 to $40 per hour.  Brickman said it is common for firms to staff cases with contract attorneys and direct them to review thousands of documents to run up the fees.  In this case, bill padding and excessive markup appears to have been the least of the firm’s problems.

Thanks to ACEDS for the tip on the story!

Also, yesterday, I thanked our readers for 6 1/2 years of support and readership of the blog.  Today, I want to thank JD Supra and its readership for being named the Readers’ Choice Top Author in eDiscovery (and CloudNine being named the Top Firm) for 2017!  Distribution of our posts via JD Supra has grown our readership greatly over the past year and I really appreciate our partnership with JD Supra and thank all of you for reading our blog, whether it’s via JD Supra or the “old fashioned way” via our site!  Thank you so much!

So, what do you think?   Should firms do more to ensure that the attorneys they use for review are actually licensed attorneys?  Please share any comments you might have or if you’d like to know more about a particular topic.


78 is Great! eDiscovery Daily Is Seventy Eight! (Months Old, That Is)

A new record!  (Get it?)  Seventy eight months ago today (a.k.a., 6 1/2 years), eDiscovery Daily was launched.  It’s hard to believe that it has been 6 1/2 years since our first three posts debuted on our first day, September 20, 2010.  Now, we’re up to 1,656 lifetime posts, and so much has happened in the industry that we’ve covered.

Twice a year, we like to take a look back at some of the important stories and topics during that time.  So, here are just a few of the posts over the last six months you may have missed.  Enjoy!

Thanks, once again, for your support!  Our subscriber base and daily views continue to grow, and we owe it all to you!  Thanks for the interest you’ve shown in the topics!  We will do our best to continue to provide interesting and useful eDiscovery news and analysis.  And, as always, please share any comments you might have or if you’d like to know more about a particular topic!


Jason R. Baron of Drinker Biddle & Reath LLP, Part 2: eDiscovery Trends

This is the fifth of the 2017 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY (aka Legalweek) this year to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Jason R. Baron of Drinker Biddle & Reath LLP.  Jason is a member of Drinker Biddle’s Information Governance and eDiscovery practice and co-chair of the Information Governance Initiative.  An internationally recognized speaker and author on the preservation of electronic documents, Jason previously served as the first Director of Litigation for the U.S. National Archives and Records Administration, and as trial lawyer and senior counsel at the Department of Justice.  He also was a founding co-coordinator of the National Institute of Standards and Technology TREC Legal Track, a multi-year international information retrieval project devoted to evaluating search issues in a legal context.  He served as lead editor of the recently published ABA book, Perspectives on Predictive Coding and Other Advanced Search Methods for the Legal Practitioner.

Jason provided so much good information that we decided to publish his interview in two parts.  The first part of his interview was published on Friday and Craig Ball’s interview originally scheduled for today will be published on Wednesday and Thursday of next week (also in two parts).

Since you mentioned the recent trend we’ve seen toward an emphasis on technology competence for attorneys, I was going to also mention that we’re up to 26 states that have adopted some sort of technology competence requirement, with Florida being the first state to require technology CLE for its attorneys.  Do you think the increased emphasis on technology competence will change the general lack of understanding of technology (and advanced search technologies) within the legal profession?

Doug, I’m happy to say that, in the 36+ years that I’ve practiced law, I’ve never had to meet a CLE requirement – the Massachusetts and DC bars don’t require it!  (Call me lucky.  I’ve also vowed never to take another bar exam.)  But there is clearly a movement toward states’ adoption of the ABA professional rules, including the comment to Rule 1.1 mentioned above.  And in addition to California’s ethics opinion, there are any number of local courts where a great deal of e-discovery competency has been expected of counsel for some time.  (The Seventh Circuit Electronic Discovery Pilot Program, Judge Paul Grimm in Maryland, and state and federal courts in New York have all led the way on this.)  Anyone who practices eDiscovery in a large, complex case is going to have to confront this fact.  Technical competency is also needed in a larger percentage of smaller cases in state and federal court, given that the ability to search Facebook and related apps, as well as GPS devices and other smart technologies, will be increasingly useful for handling smaller cases involving personal injury, divorce or employment law.  You just can’t hide from the world that we are in – we are immersed in ESI and increasingly immersed in algorithms and analytics that affect all of our lives.

So, to be technically competent in e-discovery in 2017, you do need to know what you don’t know.  As the California ethics opinion states, that means either knowing the technical points spelled out in the opinion, or knowing enough to realize you need help from an expert within your firm or a consultant – or stepping aside and bringing in co-counsel.  And, I think that will be a trend line that we will see.  I don’t think we are all required by these opinions to be Maura Grossman or Judge Peck, or that we need to get an advanced degree in information science or data analytics (although it might help!).  We just need to know enough to ask questions about what it takes to do a better job in eDiscovery.

The highest goal for the e-discovery bar will continue to be working in a way that is consistent with Rule 1 and its call for the just, speedy, and inexpensive resolution of litigation.  And, in the information governance arena, we can make ourselves valuable and competent as well, by knowing something about advanced search.  At Drinker Biddle, my colleague Bennett Borden is a partner and Chief Data Scientist of our firm – to my knowledge, the only lawyer who holds those two titles at an AmLaw 100 firm.  Bennett has been on a “soapbox” as well, saying that we can apply what we have learned using analytics in eDiscovery to every field of practice at a law firm, whether it’s mergers and acquisitions or employment law or anything else.  Our practice group is routinely called upon to advise and be part of an ongoing firm-wide discussion of how clients need insight into their large data sets.

We all know where these trend lines are going.  In terms of technology, at least, the world is not going backwards.  We’re not heading to a place where computers are getting less smart – just the opposite.  Whether you agree with me that the pace of change is itself accelerating, or just think change is happening, we are in a world where (for the rest of our lives) we are going to be confronting new apps and new technologies.  And, as lawyers, we need to understand the implications across a range of engagements in all legal domains.

In addition to what we’ve already discussed, what are you working on that you’d like our readers to know about?

A few things come to mind: One is that the Information Governance Initiative celebrated its third anniversary during this year’s Legaltech.  Under Barclay Blair and Bennett Borden’s stewardship, the IGI has now grown to 25+ supporters from the legal tech community.  The IGI is widely recognized as a robust “think tank” providing thought leadership on IG topics.  Aside from white papers and benchmark studies, what we have focused on in the last couple of years is holding a Chief Information Governance Officer (CIGO) summit and, this year, that will take place on May 10 and 11 in Chicago.  As we have in the last two years, we will endeavor to gather 60 or 70 “card-carrying” members of the IG profession – people who are in a leadership role within IG at their respective organizations.  Many of those who come are de facto Chief Information Governance Officers, except that they have some other title on their business card.  This year, we will again have a serious conversation about what it means to be a leader in IG.

I recently wrote an article in Ethical Boardroom (a magazine out of the UK that may not be very well known in the US, but has really good content regarding corporate governance issues) on what I would like to be a theme for 2017 and going forward: how to involve companies’ boards of directors in oversight of information governance issues – essentially deputizing them as fiduciaries of IG in some sense.  Through Sarbanes-Oxley and through the efforts of many companies, board members have developed expertise on cybersecurity issues, and there have been many articles about how they can get involved in that.  But, I think there’s a broader conversation than just data breach issues, and that conversation encompasses IG.  I’ve also been interested in data ethics issues, including moderating a panel at the last annual ACC meeting in San Francisco, so I’ll be talking about algorithmic bias this year as well.

The last thing that I’ll bring up, which is very close to my heart, is that, since 2007, I’ve been asked to lead, with my fellow organizers, the DESI (Discovery of ESI) workshop series at the International Conference on Artificial Intelligence and Law (ICAIL); this year’s edition is DESI VII.  The format of the workshop is that people come and present work that they’ve done, either research papers or even just position papers of 4 or 5 pages.  So, it’s a very easy lift to be part of a very smart community of PhDs and lawyers talking about sophisticated topics.

This year, the workshop will be held on June 12 at King’s College in London.  In prior years, we’ve had workshops in Barcelona, Rome, Beijing, Pittsburgh, San Diego and Palo Alto, as well as once before in London itself.  Now, we’re back in London, and I encourage all of your readers to attend and consider participating by putting in a paper.  Maura Grossman and Gordon Cormack have graciously agreed to be the opening keynote speakers at this year’s workshop, which will be especially focused on identifying and protecting sensitive information in large collections.  This is an eDiscovery problem in complex litigation involving privileged documents, but it’s also a problem for privacy-related materials (like PII and PHI), and a problem that comes up in audits and internal investigations as to what is proprietary and what can be provided.

Filtering content is also a problem in terms of allowing public access to the vast digital archives of government.  In the case of White House email, we’ve been accumulating emails since the 1990s, and there will soon be close to a billion emails in existence at the National Archives.  One cannot, however, walk into the National Archives and see any of those emails – at least not any time soon.  One can walk in and see paper records, but the large and growing collections of emails and certain other forms of electronic documents remain off limits because of the sensitive nature of content scattered throughout the collections.  In fact, it may be many, many decades before the vast bulk of NARA’s email collection is made available to the public.  So, that’s an issue that I’ve been writing and speaking on, and that I trust will be discussed at DESI VII.  I will also be speaking on this subject at CeDEM 2017, an e-Democracy conference being held outside Vienna, Austria this coming May.

Thanks, Jason, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Don’t forget our webcast tomorrow: Best Practices for Effective eDiscovery Searching at noon CST.  Click here to register!


Jason R. Baron of Drinker Biddle & Reath LLP, Part 1: eDiscovery Trends

This is the fifth of the 2017 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY (aka Legalweek) this year to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Jason R. Baron of Drinker Biddle & Reath LLP.  Jason is a member of Drinker Biddle’s Information Governance and eDiscovery practice and co-chair of the Information Governance Initiative.  An internationally recognized speaker and author on the preservation of electronic documents, Jason previously served as the first Director of Litigation for the U.S. National Archives and Records Administration, and as trial lawyer and senior counsel at the Department of Justice.  He also was a founding co-coordinator of the National Institute of Standards and Technology TREC Legal Track, a multi-year international information retrieval project devoted to evaluating search issues in a legal context.  He served as lead editor of the recently published ABA book, Perspectives on Predictive Coding and Other Advanced Search Methods for the Legal Practitioner.

Jason provided so much good information that we decided to publish his interview in two parts.  The remainder of his interview will be published on Monday and Craig Ball’s interview (also in two parts) will be published on Wednesday and Thursday of next week.

What are your observations about LTNY this year and how it compared to other LTNY shows that you have attended?

It certainly has had a different look and feel this year, given that it’s now branded as Legalweek.  (Although some of us old-timers will always refer to it as Legaltech.)  I was very impressed with the lineup of speakers, even though several of the names that have routinely appeared year after year, myself included, were not speaking this year.  Instead, there has been a broader reach that goes beyond eDiscovery, including a whole set of people who are in the data analysis and forensics world.

I really liked the keynote on day one.  I had previously read Andrew McAfee’s book, The Second Machine Age – and I certainly agreed with his observation in the keynote that we are not only in an era of accelerating change, but that the pace of acceleration is itself accelerating.  Part of McAfee’s presentation was about how, up until about a year ago, predictions were that it would take another decade or two for a software program to beat the world’s best Go player.  He showed articles from 2015, from publications like The Wall Street Journal, that talked up the complexity of the game of Go.  He noted how Go is intuitive and progresses in complexity, and the really interesting thing is that, unlike chess, the best Go players in the world really have no idea how they do it.  They simply intuit a winning strategy by looking at the 19 x 19 grid.  And yet, remarkably, this past year, a machine did beat the world’s best Go player – a decade ahead of the predictions!

McAfee went on to highlight another element, true in both Go and chess, which will be increasingly true in every domain in which machines are learning, namely: that machines are filled with surprises.  They not only do better in some domains now than the best human, but they go about doing tasks in ways that are different from how humans do them.  McAfee illustrated a move in Go that the machine made that no human would ever make; it so surprised the best player in the world that he got up out of his chair and walked around the room, as he just couldn’t believe it (in part because it seemed like a move that a novice would make).  And yet the machine won that game.  So, we’re actually at an inflection point where humans are learning from software how to play these games both better and differently (i.e., more like a computer).

Now, you can add to these examples to illustrate a larger point of special interest to the lawtech crowd: that we are closer and closer to experiencing a “Turing test” moment in a number of domains, where it is increasingly difficult to distinguish AI from human responses.  Because we are living in a world where things are happening at such an accelerated pace, it wouldn’t surprise me if, in five years, Legaltech (oops – Legalweek) is mostly about the law of AI and robots – including the ethics of handling extremely smart robots that mimic human behavior and then some.  I am not a believer that we will soon be in peril from a world taken over by super-intelligent machines.  But I do believe that we will be increasingly reliant on software, and that software will perform at a level where the Alexas and Siris of the future will seem to be our buddies, not just limited automated personal assistants.  We won’t even need screens anymore – we’ll simply be giving verbal instructions to these devices.  You already see that increasingly, not only with Amazon Echo’s Alexa but in smart dolls and a range of other products.  But all this also means that Alexa and the other devices are accumulating data (ESI) from the people using them – all of which is grist for the e-discovery mill.  This world of IoT, smart devices and smart analytics is what McAfee and others are talking about: the acceleration in technological change is itself accelerating!

I think all of this means an even more interesting Legaltech of the future.  Predictive coding technology was the hot topic at Legaltech about four years ago, after Judge Peck issued his ruling in Da Silva Moore.  (I think there were a dozen or more panels on technology assisted review and predictive coding that year.)  More recently, we’ve seen a wave of panels on information governance and data analytics – which I plead guilty to being a part of.  As I said, I think we are now looking at a world of smart devices, IoT, AI and robotics that will soon dominate the conversation and raise lots of ethical issues.  Indeed, I just read that in the EU an effort has been initiated to form a committee to look into the ethics of robots and human interaction with robots.  So, we are living in very, very interesting and exciting times.  That’s what you get when you’re living in an exponentially growing world of data.

Last year, there were a few notable case law decisions related to the use of Technology Assisted Review.  How do you think those cases impacted the use and acceptance of TAR?

I think you’re seeing a more sophisticated level of analysis in predictive coding cases on the part of a greater slice of the judiciary; it isn’t simply the same cadre of judges providing the rulings.  There are also new rulings in the UK, Ireland and Australia, and that is all good.  I’m not going to talk in detail about any one case, but I think there is a trend line that can be seen where the lurking question in complex, document-intensive e-discovery cases is whether a party acted reasonably in not using some form of advanced search and review techniques, like technology assisted review.  As Judge Peck said in the Hyles case, we’re not there yet, but it seems to me that’s where the hockey puck is headed.

If I’m right, and the burden will be to explain why one didn’t use advanced search methods, it follows that clients should be demanding the greater efficiencies that can be obtained through such methods. Granted, you have to get past a certain level of financial risk in a case to justify use of advanced search methods.  Obviously, employing keywords and even manual searching in very small collections is still perfectly viable. But when you’re in complex litigation of a certain size, it is unfathomable to me that a major Fortune 500 corporation wouldn’t at least game out the cost of using traditional manual search methods supplemented by keywords, versus the use of some advanced software to supplement those older, “tried and true” methods.  As you know, I am a very big advocate for all of us looking into these issues, not just to benefit clients in eDiscovery but also across all kinds of legal engagements.

I realize I have been an evangelist for advanced search techniques.  So let me just quote, for the record here, a couple of sentences I’ve written as part of an Introduction to the book Perspectives on Predictive Coding and Other Advanced Search Methods for the Legal Practitioner (link above):  “As the book goes to print, there appear to be voices in the profession questioning whether predictive coding has been oversold or overhyped and pointing to resistance in some quarters to wholesale embrace of the types of algorithmics and analytics on display throughout this volume.  Notwithstanding these critics, the editors of this volume remain serene in their certainty that the chapters of this book represent the future of eDiscovery and the legal profession as it will come to be practiced into the foreseeable future by a larger and larger contingent of lawyers.  Of course, for some, the prospect of needing to be technically competent in advanced search techniques may lead to considerations of early retirement. For others, the idea that lawyers may benefit by embracing predictive coding and other advanced technologies is exhilarating.  We hope this book inspires the latter feeling on the part of the reader.”

Since you have mentioned your book, tell us more about its contents.

This book was a labor of love, as no one will be getting any royalties!  Michael Berman originally suggested to me that this volume would be a useful addition to the legal literature, and over the next two-plus years he and I, with the able assistance of Ralph Losey, managed to pull off getting the best minds in the profession to contribute content and to work towards publication.  I think this is a volume that speaks not only to practitioners “inside the bubble” (i.e., at Legaltech or at places like The Sedona Conference®), but also to a larger contingent of lawyers who don’t know about the subject in any depth and wish to learn.  These are lawyers who earnestly want to be technologically competent under ABA Model Rule 1.1, and who are aware of a growing body of bar guidance, including the recent California Bar opinion on e-discovery competence.   I think more and more, especially in complex cases, such competency means being at least aware of emerging, advanced search and document review techniques.  That may not exactly be easy for some lawyers (especially in my age cohort), but I am sure it will be easier for the generation succeeding us. 

As for some specifics: Judge Andrew Peck wrote the book’s Foreword, and Maura Grossman and Gordon Cormack were very generous in not only submitting an expert, original chapter (“A Tour of TAR”), but also allowing us to reprint their glossary of TAR terms.  Phil Favro provided a supplement to his leading article with Judge John Facciola on seed sets and privilege, and Judge Waxse’s important (and controversial) law review article on courts being called upon to apply a Daubert test for advanced search is included.

Most of the 20 chapters in the book are original.  There is a really excellent chapter about antitrust law and predictive coding by Robert Keeling and Jeffrey Sharer.  There is a wonderful chapter on emerging e-discovery standards by Gil Keteltas, Karin Jenson and James Sherer.  Ronni Solomon and her colleagues at King & Spalding wrote a chapter on the defensibility of TAR for a big firm on the defense side.  The late Bill Butterfield and Jeannine Kenney wrote a chapter spelling out, from the plaintiff’s side, considerations about how to use predictive coding in a fair way.  William Hamilton supplied a much-needed chapter discussing predictive coding for small cases.  Vincent Catanzano, Sandra Rampersaud and Samantha Greene contributed a chapter on setting up TAR protocols.  There are several other chapters on information governance, written by Sandy Serkes and Leigh Isaacs, including a reprint of Bennett Borden’s and my law review article, “Finding the Signal in the Noise” (with a new coda).  Part of the book provides perspectives from leading PhDs whom I’ve worked with during the TREC Legal Track and at other workshops, including Doug Oard, Dave Lewis and William Webber.  Bruce Hedin and colleagues at H5 supplied a thought-provoking chapter about standards in the use of advanced search.  Kathryn Hume educates us on deep learning.  Michael Berman and I (with co-authors), and Ralph Losey, each supplied additional articles rounding out the volume.

Although no one expects the book to be a best-seller on Amazon, I really believe the 650 pages of text will be of interest to readers of your column, Doug, and so I do recommend checking it out while supplies last (kidding)!

Part 2 of Jason’s interview will be published on Monday.

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!
