eDiscoveryDaily

Why Is TAR Like a Bag of M&M’s?, Part Two: eDiscovery Best Practices

Editor’s Note: Tom O’Connor is a nationally known consultant, speaker, and writer in the field of computerized litigation support systems.  He has also been a great addition to our webinar program, participating with me on several recent webinars.  Tom has also written several terrific informational overview series for CloudNine, including eDiscovery and the GDPR: Ready or Not, Here it Comes (which we covered as a webcast), Understanding eDiscovery in Criminal Cases (which we also covered as a webcast) and ALSP – Not Just Your Daddy’s LPO.  Now, Tom has written another terrific overview regarding Technology Assisted Review titled Why Is TAR Like a Bag of M&M’s? that we’re happy to share on the eDiscovery Daily blog.  Enjoy! – Doug

Tom’s overview is split into four parts, so we’ll cover each part separately.  The first part was covered on Tuesday.  Here’s part two.

History and Evolution of Defining TAR

Most people would begin the discussion by agreeing with this framing statement made by Maura Grossman and Gordon Cormack in their seminal article, Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review (XVII RICH. J.L. & TECH. 11 (2011)):

Overall, the myth that exhaustive manual review is the most effective—and therefore, the most defensible—approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.

A technology-assisted review process may involve, in whole or in part, the use of one or more approaches including, but not limited to, keyword search, Boolean search, conceptual search, clustering, machine learning, relevance ranking, and sampling.

So, TAR began as a process, and in the early stages of the discussion it was common to refer to various TAR tools under the heading “analytics,” as illustrated by the graphic below from Relativity.

Copyright © Relativity

That general heading was often divided into two main categories (a toy code sketch of one of these techniques appears after the lists below):

Structured Analytics

  • Email threading
  • Near duplicate detection
  • Language detection

Conceptual Analytics

  • Keyword expansion
  • Conceptual clustering
  • Categorization
  • Predictive Coding
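To give a flavor of what one of these tools does under the hood, here is a minimal sketch of near-duplicate detection using word shingles and Jaccard similarity. This is my own toy illustration of the general idea, not how Relativity or any other product actually implements it; commercial tools use far more sophisticated and scalable methods.

```python
# Toy near-duplicate detection: compare documents by the overlap of their
# 3-word "shingles" (illustrative only, not any product's actual method).

def shingles(text, k=3):
    """Return the set of k-word shingles in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

doc1 = "Please review the attached merger agreement before the Friday deadline"
doc2 = "Please review the attached merger agreement before the Monday deadline"

score = jaccard(shingles(doc1), shingles(doc2))
print(f"Near-duplicate score: {score:.2f}")  # higher scores suggest near-duplicates
```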

That definition of Predictive Coding as part of the TAR process held for quite some time. In fact, the current EDRM definition of Predictive Coding still refers to it as:

An industry-specific term generally used to describe a Technology-Assisted Review process involving the use of a Machine Learning Algorithm to distinguish Relevant from Non-Relevant Documents, based on a Subject Matter Expert’s Coding of a Training Set of Documents

But before long, the definition began to erode and TAR started to become synonymous with Predictive Coding. Why?  For several reasons, I believe.

  1. The Grossman-Cormack glossary of 2013 used the phrase “Coding” to define both TAR and PC, and I think various parties then conflated the two. (See No. 2 below)

  2. Continued use of the terms interchangeably. See, e.g., Ralph Losey’s “TARCourse” (which is, by the way, an excellent read), where the very beginning of the first chapter states, “We also added a new class on the historical background of the development of predictive coding.”
  3. Any discussion of TAR involves selecting documents using algorithms, and most attorneys react to math the way the Wicked Witch of the West reacted to water.

Again, Ralph Losey provides a good example.  (I’m not trying to pick on Ralph; he is just such a prolific writer that his examples are everywhere…and deservedly so.) He refers to gain curves, x-axis vs. y-axis, Horvitz-Thompson estimators, recall rates, prevalence ranges and my personal favorite, “word-based tf-idf tokenization strategy” (illustrated in the sketch after this list).

“Danger. Danger. Warning. Will Robinson.”

  4. Marketing: the simple fact is that some vendors sell predictive coding tools. Why talk about other TAR tools when you don’t make them? Easier to call your tool TAR and leave it at that.
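About that fear of math: the vocabulary is usually scarier than the underlying arithmetic. That “word-based tf-idf tokenization strategy,” for example, largely boils down to counting words and giving rare, distinctive ones more weight. Here is a minimal sketch using the open-source scikit-learn library (my own toy example with made-up documents, not drawn from any TAR product):

```python
# Toy tf-idf example (illustrative only): words that appear in every
# document get low weights; rare, distinctive words get high weights.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the widget contract was signed in march",
    "the widget shipment arrived in march",
    "the abyss trademark dispute continues",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)  # rows = documents, columns = words

# Print each word's weight in the third document, highest first.
terms = vectorizer.get_feature_names_out()
weights = tfidf[2].toarray().ravel()
for term, weight in sorted(zip(terms, weights), key=lambda pair: -pair[1]):
    if weight > 0:
        print(f"{term:12s} {weight:.2f}")
```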

The problem became so acute that, by 2015, according to a 2016 ACEDS news article, Maura Grossman and Gordon Cormack had trademarked the terms “Continuous Active Learning” and “CAL”, claiming those terms’ first commercial use on April 11, 2013 and January 15, 2014. In an ACEDS interview earlier that year, Maura stated that “The primary purpose of our patents is defensive; that is, if we don’t patent our work, someone else will, and that could inhibit us from being able to use it. Similarly, if we don’t protect the marks ‘Continuous Active Learning’ and ‘CAL’ from being diluted or misused, they may go the same route as technology-assisted review and TAR.”

So then, what exactly is TAR? Everyone agrees that manual review is inefficient, but nobody can agree on what software the lawyers should use and how. I still prefer to go back to Maura and Gordon’s original definition. We’re talking about a process, not a product.

TAR isn’t a piece of software. It’s a process that can include many different steps, several pieces of software, and many decisions by the litigation team. Ralph calls it the multi-modal approach: a combination of people and computers to get the best result.

In short, analytics are the individual tools. TAR is the process you use to combine the tools you select.  The next consideration, then, is how to make that selection.

We’ll publish Part 3 – Uses for TAR and When to Use or Not Use It – next Tuesday.

So, what do you think?  How would you define TAR?  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © Mars, Incorporated and its Affiliates.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Chris Dale of the eDisclosure Information Project: eDiscovery Trends 2018

This is the eleventh (and final) of the 2018 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year (and some afterward) to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Chris Dale.  Chris is Editor of the eDisclosure Information Project.  Chris qualified as an English solicitor in 1980. He was a litigation partner in London and then a litigation software developer and litigation support consultant before turning to commentary on electronic disclosure / discovery. He runs the eDisclosure Information Project, which disseminates information about the court rules, the problems, and the technology to lawyers and their clients, to judges, and to suppliers. He was a member of Senior Master Whitaker’s Working Party, which drafted Practice Direction 31B and the Electronic Documents Questionnaire. Chris is also a well-known speaker and commentator in the UK, the US and other common law jurisdictions.

{I spoke to Chris during LTNY and this is a rough transcript of our discussion}

Everybody over here in the US is talking about the General Data Protection Regulation (GDPR), what it will mean to American businesses and especially the potential of large fines for lack of compliance.  What are people saying about it in Europe?  Is this as big a deal as everyone is making it out to be?

I’m plotting with somebody to have a conference in London before the implementation date, at which we will not mention the GDPR in the marketing profile.  Whether we’ll get that through the marketing department or the education department I have yet to find out, but there’s no doubt that GDPR is driving a lot of attention, often for the wrong reasons.

You say for the wrong reasons? Why do you say that?

A lot of people are talking about the 4% fines as if they were the only driver that matters. There are lots of people who talk about “citizens” but have not responded to my challenge to find the word “citizens” anywhere in the GDPR. There’s a lot of pig ignorance about precisely what it says and what its terminology is, let alone what its effect is likely to be. That’s quieting down, and the shouters are beginning to shout a bit less about 4% fines. Of course, the fines have to be mentioned, because they are part of the bottom line, and companies like Facebook may well face the very big fines. But as a motive for doing anything about GDPR, the fines should not be the most compelling one for most companies.  It would be good to see people taking a more rounded approach to what the implications are. I interviewed someone this morning who is very good, very knowledgeable about it, and yet he (to my surprise) mentioned the 4% fine.  But when we discussed the fine, it was clear that he didn’t mean everybody is at risk from that. It would be good to see people produce a business case for dealing with GDPR that doesn’t refer either to the 4% fine or to “citizens”, because there’s a lot of nonsense going on about it at the moment. There are an awful lot of people who act as experts on it, but whose first paragraph about it betrays the fact that they haven’t got a clue.  I saw an article promoting a service just recently and the first paragraph had a gross error in it.

So, I guess it’s an understatement to say there are a lot of misconceptions about the GDPR?

Yes. One of the results of that, to some extent anyway, is that companies just throw their hands up in horror and say, “Well, I can’t comply by the due date, so I’ll just hide and pretend it’s going to go away.” That’s the result of hype and what happens when providers raise the stakes. We saw it in Zubulake and we saw it with the federal changes way back. People are thinking, “Well I can’t comply with it anyway, so why bother?” And that is not exactly the right attitude.

There are people who talk about the 72-hour deadline for breach notification as if that meant you have to do everything, produce every last bit of information to a regulator and possibly to the people affected within the 72 hours.  All this hype tends to make a lot of organizations say, “I can’t, I know I don’t comply with that anyway, I’ll just keep my head down, hope they hit somebody else.” Whereas there needs to be a more moderated approach to what needs to be done and what the implications are of not doing it, and a more positive look at what you gain from compliance.

My favorite quotation came from a chap called Patrick Burke, whom you may know. He was in private practice advising on privacy and data protection, and specifically on the GDPR, and I asked him, “What’s your clients’ reaction? Are they in fear of fines?” He said, “No, they just want to keep doing business.” Which is a really good line. Very quickly, the clients of the organizations who haven’t complied are probably going to start putting it into their RFPs. They’ll be asking not just, “Have you complied?” but “Can you indicate what you have done to be consistent with compliance with GDPR?” Those who have to say, “Oh, I don’t know what you’re talking about,” which I’m afraid includes quite a lot, will start losing business.  Possibly the companies who are asking that themselves won’t be compliant or know what it means, but it’ll become one of those tick box items like so many other things, and the inability to give a satisfactory answer will lose business.

If you look at one of the companies in the UK that’s been fined under the present regime, they were fined 400,000 pounds, which sounds like a lot of money until you look at what else they lost – £80 million in direct and indirect costs. It is said that they lost more than 6% of their market share, so you could multiply that £400,000 fine by roughly ten under the new regime and still not scratch the surface of the losses they’ve suffered overall, because they come across as the sort of company that doesn’t look after its customers’ data properly.

The conventional marketing of, “I know GDPR and I can help you through it,” doesn’t scratch the surface, particularly if they start using terminology that doesn’t actually exist in it.  But as a provider, you’re not offering expert guidance through the GDPR; you’re offering the ability to do specific things, like identifying personal data, while leaving it as somebody else’s problem, perhaps, to decide what personal data is and what the risk profile is. Stick resolutely to the provision of services that meet whatever requirements are sought, such as the ability to identify personal data and the ability to adapt, to show what data you’ve got in case you need to do so.

Perhaps that doesn’t matter as long as one gets business from it.  There’s certainly a lot of work coming out of it. Maybe we are at last finding the ROI for information governance that was missing on the first round through IG. Maybe people will begin to realize that if they get rid of their “crap”, they have less of a problem. That’s valuable.  The end result is less crap, or at least a better handle on their crap through data maps and things like that.  And knowing where it is from the moment of its creation and what in it might be offensive – knowing what ought to be deleted.

Or at least confront the decision. “Yes, I ought to delete that because the EU rules say that once it’s no longer serving the purpose for which it was collected it ought to be deleted.” Weighing that against, “Yes, but then I might be in trouble with a US regulator or court.”  It’s about making the informed decision that you’re keeping it or not keeping it, depending on which of those risks you see as the most important.  We will gradually see US courts acknowledge that there’s an EU requirement to delete data – if it contains personal information that’s no longer required for the purpose for which it was collected – and that that is a reason, an excuse if you like, for its non-availability.

That will take three to five years. For a period you’ll have judges who either don’t give a damn what everybody outside the US says or are too uninformed to understand what all this means, with arguments put in front of them by lawyers who neither care nor understand what it all means.  But you’ll reach a point where I think the US courts will acknowledge that there are problems for those who keep EU data in the way it has mostly been kept in the EU. All that will take time, and I hope there’ll be some examples. There’ll be some people who get into some serious difficulties because of failure to comply. We don’t wish bad things for those people, but until we start seeing that, we don’t actually know what target we’re aiming for. Regardless of how well we might think we understand the statutes, until we start seeing how regulators enforce them, we won’t know what to expect.  Of course, whether we get consistency between EU regulators (as is the hope) or whether in fact they all end up with different shades of interpretation will make a difference. It will take time and will be very interesting to see. That probably makes it sound even more daunting than it is.

In addition to GDPR, the Supreme Court decision in the Microsoft Ireland case will also have an impact on privacy rights for data subpoenaed by US law enforcement agencies.  What do you think will happen there and what do you think the impact will be?

I think it’s likely that Judge Francis’ original opinion will be upheld by the Supreme Court. I think it will be upheld because the politics of it is not the Supreme Court’s concern. It will be interesting to see what will happen when the Supreme Court says, “Yes, it’s absolutely fine for US agencies to dip their hands into data stores all over the world even if they don’t know that it’s a US citizen.” That’s a perfectly possible outcome.  What are you going to do then? What’s China going to do? There are all these sorts of political things, which as I say are not strictly the concern of the Supreme Court. What’s the backlash going to be?  Nobody knows.

Regardless of what the decision ultimately is, the CLOUD Act currently before Congress (to amend the Stored Communications Act to allow US federal law enforcement to compel U.S.-based service providers via warrant or subpoena to provide requested data stored on servers regardless of whether they are located within the U.S. or in foreign countries) could make the Supreme Court decision moot.  {Editor’s Note: The CLOUD Act was signed into law as part of the Omnibus Bill in March.}

What would you like our readers to know about things you’re working on?

We have some new civil discovery rules pending in England and Wales, and we have had some cases worthy of comment recently. The main thing is to keep writing – I’m getting 60 more page views per day this year than last year (that is, nearly 22,000 extra page views a year) which suggests a growing interest in this subject.

Part of that, perhaps, is down to the videos which we do, and I am keen to make more use of this medium to get messages across, whether about the rules of England and Wales, the GDPR, or the interesting developments in discovery in the US and worldwide. They are very time-consuming to do properly but are well worth it.

Thanks, Chris, for participating in the interview!

As always, please share any comments you might have or if you’d like to know more about a particular topic!


Why Is TAR Like a Bag of M&M’s?: eDiscovery Best Practices

Editor’s Note: Tom O’Connor is a nationally known consultant, speaker, and writer in the field of computerized litigation support systems.  He has also been a great addition to our webinar program, participating with me on several recent webinars.  Tom has also written several terrific informational overview series for CloudNine, including eDiscovery and the GDPR: Ready or Not, Here it Comes (which we covered as a webcast), Understanding eDiscovery in Criminal Cases (which we also covered as a webcast) and ALSP – Not Just Your Daddy’s LPO.  Now, Tom has written another terrific overview regarding Technology Assisted Review titled Why Is TAR Like a Bag of M&M’s? that we’re happy to share on the eDiscovery Daily blog.  Enjoy! – Doug

Tom’s overview is split into four parts, so we’ll cover each part separately.  Here’s the first part.

Introduction

Over the past year I have asked this question several different ways in blogs and webinars about technology assisted review (TAR). Why is TAR like ice cream? Think Baskin-Robbins. Why is TAR like golf? Think an almost incomprehensible set of rules and explanations. Why is TAR like baseball, basketball or football? Think never-ending arguments about the best team ever.

And now my latest analogy. Why is TAR like a bag of M&M’s?  Because there are multiple colors, with sometimes a new one thrown in; sometimes they have peanuts inside and sometimes chocolate.  And every now and then you get a bag of Reese’s Pieces and think to yourself, “Hmmmm, this is actually better than M&M’s.”

Two recent cases spurred this new rumination on TAR. First came the decision in Winfield, et al. v. City of New York, No. 15-CV-05236 (LTS) (KHP) (S.D.N.Y. Nov. 27, 2017) (covered by eDiscovery Daily here), where Magistrate Judge Parker ordered the parties to meet and confer on any disputes with regard to a TAR process “with the understanding that reasonableness and proportionality, not perfection and scorched-earth, must be their guiding principles.”  More recent is the wonderfully crafted validation protocol (covered by ACEDS here) from Special Master Maura Grossman in the In Re Broiler Chicken Antitrust Litigation matter (Jan. 3, 2018), currently pending in the Northern District of Illinois.

Both of these cases harkened back to Aurora Cooperative Elevator Company v. Aventine Renewable Energy and Independent Living Center of Southern California v. City of Los Angeles, a 2015 case where the court ordered the use of predictive coding after extensive discovery squabbles, and to the 2016 decision in Hyles v. New York City (covered by eDiscovery Daily here), where Judge Peck, in declining to order the parties to use TAR, used the phrase on page 1 of his Order, “TAR (technology assisted review, aka predictive coding) …”.

Which brings me to my main point of discussion. Before we can decide whether or not to use TAR, we have to decide what TAR is.  This discussion will focus on the following topics:

  1. History and Evolution of Defining TAR
  2. Uses for TAR and When to Use or Not Use It
  3. Justification for Using TAR
  4. Conclusions

We’ll publish Part 2 – History and Evolution of Defining TAR – on Thursday.

So, what do you think?  How would you define TAR?  And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Image Copyright © Mars, Incorporated and its Affiliates.


Brad Jenkins of CloudNine: eDiscovery Trends 2018

This is the tenth of the 2018 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year (and some afterward) to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Brad Jenkins.  Brad is CEO of CloudNine.  Brad has over 20 years of experience as an entrepreneur, as well as 18 years leading customer focused companies in the litigation technology arena. Brad consults clients on implementing best practices in litigation document management and the impact of technology on managing discovery. Brad is very involved in eDiscovery educational efforts and currently serves as the President of the Houston ACEDS Chapter.  He’s also my boss!  :o)

What were your impressions of LTNY this year?

Personally, I was in meetings with clients, partners and analysts during most of the conference, so I don’t have a lot of observations to share.  It seemed as though a lot of people with whom I was meeting didn’t spend a lot of time at the conference either.  That seems to have become a trend – a lot of people come to New York for LTNY each year (from a lot of corporations, law firms and providers), but it seems that more and more of them are using the trip to set up meetings because LTNY provides that opportunity to meet in person.  More attendees seem to be spending their time at meetings at hotels near the Hilton instead of actually at the show in the Hilton.

From a CloudNine perspective, we had a booth in the exhibit hall and, from what I understand (and also observed when I stopped by the booth), traffic was generally good.  Though I’ve heard from others that, overall, traffic in the exhibit hall was down again this year and there were fewer providers than last year.  CloudNine also co-hosted the “Drinks with Doug and Mary” happy hour at Ruth’s Chris Steak House on Wednesday, along with ACEDS and our partner Compliance Discovery.  That was a huge success, as we had an overflow of people register for the event and a full house during the happy hour.  One of the best things about Legaltech is the opportunity to network with others in the industry and it’s still one of the best conferences for that.

If you were “king of LTNY” for a year, what changes would you make?

I would consider moving the conference to a larger venue – one that provided a large number of meeting rooms for the meetings that currently take place near the show at nearby hotels, so those meetings could actually be held right there at the show.  Assuming they charged a reasonable rate for the meeting rooms, Legaltech could then keep more of the attendees plugged into the show itself and those people could attend sessions and check out the providers in the exhibit hall much more easily.

CloudNine just recently acquired the eDiscovery product lines (Concordance, LAW PreDiscovery and Early Data Analyzer) from LexisNexis.  Why did CloudNine decide to acquire these products and what does CloudNine intend to do with them?

For CloudNine, the opportunity through this acquisition to offer a hybrid of both on-premise and off-premise solutions simply made sense to us as a way to support customer needs regardless of their eDiscovery task, security, and cost requirements.  This purchase enables us to immediately begin serving the on-premise segment with a portfolio of proven and performing products and it provides us the technology that will enable us to deliver fully integrated and automated solutions that can address off-premise requirements, on-premise requirements and combined on/off-premise (hybrid) requirements through a single provider.

CloudNine has been a user of the purchased product line offerings for more than a decade, so we understand what they can do today and the potential of what they can do in the future.  As an example, our production team uses LAW daily to support our clients’ processing and production needs.  Just as the daily use of our CloudNine SaaS platform enables us to uniquely understand how our customers use the product – because we are using it to accomplish the same tasks they are – our regular use of the acquired products provides that same level of unique understanding of how they are used as well.

As for our plans for the products, CloudNine plans to invest significant resources in the support, development, enhancement, and availability of these products.  They will be offered as solutions available as part of the CloudNine portfolio of offerings.  Our acquisition of these products includes the current development and support teams, so the customers using these products will be working with the same people that have been supporting their use of the products today, helping to ensure a smooth transition at the outset.  Our short-term plans include technology integration of our existing cloud offerings and our new on-premise offerings to support client needs by developing connectors using our Outpost technology to automate and accelerate data transfer.  CloudNine partners will also benefit from the purchase as it will give them additional access to proven eDiscovery offerings that will see that significant investment in development and support that I mentioned.

Our ability to acquire and invest in these products is facilitated by our new capital partners, Peak Rock Capital.  Peak Rock’s principals have extensive experience working with businesses in the technology and software industries and they have considerable experience with these types of transactions, so they will be integral to our plans to support and enhance the products.

We know that customers of these products likely have questions (as do others), so we prepared an FAQ document to help address as many of those common questions as possible.  We have already begun to reach out to existing customers for their feedback on the products and encourage them to reach out to us, as well, to enable us to understand what features and capabilities they would like to see added.  If you’re a recent or current customer of these products, we want to hear from you!

What would you like our readers to know about things you’re working on?

Isn’t that enough? {laughs}  Actually, one personal item to note is my involvement in starting a new chapter of the Association of Certified E-Discovery Specialists (ACEDS) in Houston.  I’m the current President of the Houston chapter and our chapter has been in existence for several months now.  The Houston ACEDS chapter just conducted a panel discussion earlier this month on Technology Assisted Review that was very well received and we look forward to future ACEDS events in Houston.

As for CloudNine, we’re obviously very excited about not only our recent acquisition, but about accomplishments at the company as a whole.  Earlier this year, we announced a new preservation and collection capability for CloudNine and, in just the past few days, CloudNine was highlighted in G2 Crowd’s Spring 2018 Grid Report for eDiscovery as one of the best eDiscovery software solutions based on customer satisfaction.  We’re very proud of the highly positive feedback we get from customers for our CloudNine platform and look forward to extending that to our new product offerings.  We also continue to emphasize eDiscovery education through this blog and through our monthly webcasts on hot topics related to eDiscovery, information governance and cybersecurity.  All of these initiatives continue to further our mission of simplifying the discovery process for our customers and for others in the legal industry.  The more things change, the more our mission of simplifying discovery remains the same.

Thanks, Brad, for participating in the interview!

As always, please share any comments you might have or if you’d like to know more about a particular topic!


Defendant Sanctioned for “Deliberately” Altering a Skype Communication: eDiscovery Case Law

In GoPro, Inc. v. 360Heros, Inc., No. 16-cv-01944-SI (N.D. Cal. March 30, 2018), California District Judge Susan Illston denied the plaintiff’s motion for summary judgment and denied the defendant’s motion in limine to exclude the testimony of the plaintiff’s forensic analysis expert, but granted (in part) the plaintiff’s motion for partial terminating sanctions against the defendant for forging evidence in two Skype conversations, opting for an adverse inference instruction sanction and reimbursement of expenses related to forensic analysis and testimony instead of the terminating sanctions sought.

But first, this week’s eDiscovery Tech Tip of the Week is about Saving Searches.  Documentation is a key component of a sound discovery process. The ability to automatically save the searches that you have performed not only saves time in retrieving those documents later, but it also helps document the process for performing searches should you have to defend your approach in court.  Saving searches is just one component of an overall program for documenting your approach in eDiscovery.

To see an example of how Saving Searches is conducted using our CloudNine platform, click here (requires BrightTalk account, which is free).

Case Background

In this case regarding federal and state trademark infringement and unfair competition, the defendant (in November 2016) produced to the plaintiff two emails (in a single PDF) containing the transcripts of two 2014 Skype conversations between representatives of the plaintiff and defendant. In the Skype conversations as produced by the defendant, the plaintiff referenced the term “abyss” twice (the parties had a dispute over plaintiff’s ABYSS mark).

At his deposition, the defendant representative testified under oath that the PDF document was a true and correct copy of the Skype conversation, stating that he had copied and pasted the Skype conversation into an email, and then sent it to himself. He claimed the only alteration he made to the document was to highlight the two lines of conversation containing the word “abyss”. In response to the plaintiff’s request for the Skype files in their native form, the defendant representative claimed the original Skype conversation was no longer available to him.

As part of its investigation into the defendant’s claims, the plaintiff accessed equipment containing its end of the Skype conversation and found that its Skype records did not contain the two highlighted lines referencing “abyss.” To confirm its findings, GoPro retained a forensic expert (Derek Duarte of Blackstone Discovery) to conduct a forensic analysis, which determined that its representative’s imaged Skype database did not contain the two highlighted lines referencing “abyss”, leading to the motion for partial terminating sanctions.  In response, the defendant claimed that the expert’s results were unverifiable and unreliable because he could not verify that the data on the hard drive contained the same data as it did in 2014.

Judge’s Ruling

In ruling on the motion, Judge Illston ruled, as follows:

“The Court is not persuaded by defendant’s explanation of the suspect document, and concludes on the present record that defendant deliberately altered it in an effort to strengthen its legal position with respect the ABYSS mark. GoPro argues it has been prejudiced because as part of its investigation into 360Heros’ prior use defense, GoPro incurred various expenditures, including having to locate and hire an expert to forensically investigate the Skype chat. Sanctions less drastic than terminating sanctions are available to remedy any potential prejudice to GoPro. Accordingly, the Court finds that sanctions are warranted, and that the appropriate sanctions in this case are twofold: (1) an adverse inference instruction at trial, related to Mr. Kintner’s conduct; and (2) reimbursement to GoPro of the costs incurred in retaining Mr. Duarte, including expenses paid to Mr. Duarte and the cost of attorney time required to locate and retain Mr. Duarte. Plaintiff shall submit its statement of costs so incurred in a sworn document to be filed no later than April 13, 2018.”

Judge Illston also denied the plaintiff’s motion for summary judgment, finding that “defendant raises material issues of fact as to numerous of the factors”.  She also denied the defendant’s motion in limine to exclude the testimony of the plaintiff’s forensic analysis expert, finding the expert to be “qualified to testify on these matters” and his proposed testimony “directly relevant to the authenticity of the disputed Skype conversation.”

So, what do you think?  Did the judge go far enough with her sanctions?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case opinion link courtesy of eDiscovery Assistant.


Law Enforcement Has Found a New Way to Put a Finger on iPhone Evidence: eDiscovery Trends

A dead finger, that is.  Believe it or not, cops are now opening iPhones with dead people’s fingerprints.

A couple of days ago Sharon Nelson (on her excellent Ride the Lightning blog) covered a Forbes article that discussed a suspect who mowed down a group of people in his car, went on a stabbing spree with a butcher’s knife and was shot dead by a police officer on the grounds of Ohio State University.  To try to access the phone and learn more about the assailant’s motives, an FBI agent applied the bloodied body’s index finger to the iPhone found on the deceased suspect.

In that case, it didn’t work, as the iPhone had gone to sleep and, when reopened, required a passcode.  But this technique is working in many other cases.  Separate sources close to local and federal police investigations in New York and Ohio, who asked to remain anonymous as they weren’t authorized to speak on record, said it is now relatively common for fingerprints of the deceased to be pressed onto the scanner of Apple iPhones, devices which have been wrapped in increasingly powerful encryption over recent years. For instance, the technique has been used in overdose cases, said one source. In such instances, the victim’s phone could contain information leading directly to the dealer.

Not surprisingly, there are concerns about whether a warrant should be required. Greg Nojeim, senior counsel and director of the Freedom, Security and Technology Project at the Center for Democracy & Technology, said it’s possible in many cases there would be a valid concern about law enforcement using fingerprints on smartphones without any probable cause. “That’s why the idea of requiring a warrant isn’t out of bounds,” Nojeim added.

Think having an iPhone X that replaces the fingerprint security with facial recognition technology will keep law enforcement at bay?  Think again.  It could be an easier way into iPhones than Touch ID. Marc Rogers, researcher and head of information security at Cloudflare, told Forbes he’d been looking at Face ID in recent months and had discovered it didn’t appear to require the face of a living person to work – apparently the technology can be deceived simply by using photos of open eyes, or even only one open eye, of the suspect.  “In that sense it’s easier to unlock than Touch ID – all you need to do is show your target his or her phone and the moment they glance it unlocks,” he stated.

Or open the eyes of the dead suspect.  Dead men tell no tales?  Maybe they do after all.

So, what do you think?  Should a warrant be required to access phones with fingerprint or facial recognition technology?  Please share any comments you might have or if you’d like to know more about a particular topic.


Maura R. Grossman of the University of Waterloo: eDiscovery Trends 2018

This is the eighth of the 2018 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year (and some afterward) to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Maura R. Grossman.  Maura is a Research Professor in the David R. Cheriton School of Computer Science at the University of Waterloo and principal of Maura Grossman Law.  Previously, she was Of Counsel at Wachtell, Lipton, Rosen & Katz, where she pioneered the use of technology-assisted review (TAR) for electronic discovery.  Maura’s research with Gordon V. Cormack has been cited in cases of first impression in the United States, Ireland, and (by reference) in the United Kingdom and Australia, approving the use of TAR in civil litigation.  Maura has served as a special master in the Southern District of New York and the Northern District of Illinois to assist with issues involving search methodology.  In 2015 and 2016, Maura served as a coordinator of the Total Recall Track at the National Institute of Standards and Technology’s Text Retrieval Conference (TREC); in 2010 and 2011, she served as a coordinator of the TREC Legal Track.  Maura is also an Adjunct Professor at Osgoode Hall Law School at York University and at the Georgetown University Law Center.  Previously, she taught at Columbia Law School, Rutgers Law School—Newark, and Pace Law School.

I thought I’d start by asking about a couple of cases related to Technology-Assisted Review.  The first one I’d be interested in your thoughts about is FCA US v. Cummins, where the judge (Avern Cohn) stated a position that applying TAR without prior keyword culling is the preferred method (although he said he did so rather reluctantly).  What are your thoughts about that decision?  What are your thoughts in general about courts making rulings on preferred TAR approaches?

I don’t believe there is a black or white answer to this question.  I think you have to consider how things might be done in a perfect world, and how they might be done in the real world we live in, which includes time, cost, and burden for ingesting and hosting large volumes of data.  In a perfect world, you wouldn’t perform multiple, sequential culling processes because each one removes potentially relevant information, and that effect is multiplied.  So, if you do keyword culling first—let’s say you do a good job and you get 75% of the relevant data, which would be pretty high for keywords—then, you apply TAR on top of that and, again, you get 75% of the relevant data.  Each step you apply sequentially reduces the number of relevant documents you find, resulting in a combined recall of about 56% of the relevant data.
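{Editor’s aside: to make the compounding arithmetic concrete, here is Maura’s hypothetical as a few lines of Python. The 75% figures are her illustrative numbers, not benchmarks.}

```python
# Sequential culling multiplies recall losses: each step finds only a
# fraction of the relevant documents that survived the step before it.
keyword_recall = 0.75  # hypothetical: keywords retrieve 75% of relevant docs
tar_recall = 0.75      # hypothetical: TAR retrieves 75% of what it is given

combined_recall = keyword_recall * tar_recall
print(f"Combined recall: {combined_recall:.0%}")  # about 56%
```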

In a perfect world, you wouldn’t do that, and you would just put all the data into the TAR system and get 75% recall.  By “all” of the data, I don’t mean the entire enterprise, I mean all of the appropriate data—the appropriate time frame, the appropriate custodians, and so forth.  That’s in a perfect world, if you wanted to maximize your recall, and there were no time and cost considerations.  But, we don’t actually live in that world most of the time.  We live in a world where there generally are time and cost constraints, and per-gigabyte loading and hosting fees. So, often, parties feel the need to reduce the amount of data they are loading into a review platform to reduce processing and hosting fees, and other potential fees as well.  Therefore, parties often want to use keywords first in the real world.  Ultimately, you have to look at the specifics of the matter, including the volume of data, the value and importance of the case, and what’s at stake in the matter.

You also have to look at how effective keywords will be in retrieving information in the particular case at hand.  Keywords may work very well in a case where everything is referred to by its “widget number,” but keywords may not work as well in certain types of fraud cases where parties don’t always know the specific language used, or what the nature of the conspiracy was.  Keywords can be more challenging in those situations.  So, you really have to look at the whole picture.

In FCA v. Cummins, I think what the judge was trying to say was that generally, in a world without any of these other considerations, the best practice would be not to cull first.  I would tend to agree with that from a scientific and technical perspective.  But, that’s not always practical.

Also, I believe it was in Rio Tinto v. Vale where Judge Peck said (in dicta) that in a perfect world, if there were no cost or time or any other considerations, you would take all the data and you would just use the best TAR system you could, and you would be more likely to find the highest number of relevant documents.  But that can drive up costs, and that may not be proportionate in all cases.  So, it’s really a question of proportionality, and what will work best in each situation, and how much time and resources you have to iterate and test the keywords, and other related factors.

Also, as you know in FCA v. Cummins, the judge didn’t really go into much detail; it was a very short decision.  Maybe it was a relatively small data set and loading it all didn’t make much of a difference.  We just don’t know enough about the facts there.

I got the impression that this case might have involved two equally weighted parties, with equal amounts of data, so the judge may have felt that the parties needed to perform TAR the same way and that he was forced to make a decision.  Do you think that would have an impact on whether a court might decide to rule or not?

I think that where the data volumes (and therefore burdens) are symmetric, there tends to be an understanding that what’s good for the goose is good for the gander.  Parties in those circumstances tend to be more circumspect about what they demand because they know they’ll be subject to the same thing in return.  If I’m representing one party and you’re representing the other, and I ask for everything in native form, I’m probably not going to be able to turn around and argue that I don’t want to produce in native, too, unless I have an awfully good reason for that.

So, I do think that changes the landscape a little bit.  Parties tend not to ask the other side for things that are unduly burdensome if they’re going to be forced to provide those same things themselves.  It can be very different when one side is using TAR and the other side isn’t, and when motivations or incentives are not aligned.  That can affect what parties request.

Another case we talked about was Winfield v. City of New York, where one of the key aspects of the plaintiffs’ objections to the defendants’ TAR process was how they had been designating documents as non-responsive.  What are your thoughts about that?  Do you think arguments about the subjectivity of the subject matter experts will come into play more and lead to objections in other cases?

Most of the research that I’ve reviewed and most research that I’ve done with Gordon Cormack has suggested that a few documents coded one way or the other are highly unlikely to make a substantial difference in the outcome—at least for CAL algorithms, and even for most robust SAL algorithms.  I know that people like to say, “garbage in, garbage out,” but I’ve never seen any evidence for the proposition that TAR compounds errors made by reviewers, and there is some evidence that TAR can mitigate reviewers’ errors.  The results of most TAR systems appear to be satisfactory unless there are pretty significant numbers of miscoded documents, for example, in the 20 to 25 percent (or higher) range.  Of course, if you’ve coded all of the documents of a particular kind as “not relevant,” you’ve now taught the algorithm not to find anything like that.  Chances are, though, if you have multiple reviewers, whether contract attorneys or even junior associates, not everything will be marked the same way.  There’s going to be a fair amount of noise in the coding decisions, and that doesn’t seem to have a major impact on the results.

With some of the early TAR systems, a lot of commentators said that it had to be a senior partner, or an expert, who trained the system.  But, most of the research that we’ve done, that Jeremy Pickens at Catalyst has done, and that others have done, suggests that a little noise from junior reviewers or contract attorneys, who may be a bit more generous in their definition of relevance, actually yields a better algorithm than a senior partner, who may have a very narrow view of what’s relevant.  The junior people, who are more generous in their conceptions of relevance, tend to train a better algorithm—meaning a system that will achieve higher recall—in the long run.  So, a little bit of noise actually doesn’t hurt.
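{Editor’s aside: the effect of training noise is easy to explore with a toy experiment. The sketch below is my own illustration, not the Grossman-Cormack or Pickens research: it trains a simple classifier on synthetic data with increasing fractions of deliberately flipped (“miscoded”) training labels and reports recall on a clean test set.}

```python
# Toy experiment: how does label noise in training data affect recall?
# Illustrative only; real TAR systems and review workflows differ.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic "document" feature vectors; the minority class plays "relevant".
X, y = make_classification(n_samples=5000, n_features=50,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for noise in (0.0, 0.10, 0.25, 0.40):
    rng = np.random.default_rng(42)
    flip = rng.random(len(y_train)) < noise       # randomly miscode a fraction
    y_noisy = np.where(flip, 1 - y_train, y_train)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_noisy)
    recall = recall_score(y_test, model.predict(X_test))
    print(f"miscoded labels: {noise:4.0%}  recall: {recall:.2f}")
```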

I wasn’t particularly surprised that in Winfield there were a few documents that were “arguably relevant,” and about which the two sides disagreed on coding.  That’s going to happen in any matter, and that’s not really going to affect the outcome one way or the other, because those documents are marginal in the first place.  Certainly, if someone is systematically marking all the “hot” documents as non-responsive, that will make a difference, but that wasn’t what was going on there.

In Winfield, the Court said the documents were marginal and arguably relevant.  The Court also said it had reviewed the training process in camera and there was nothing wrong with it.  Most of the case law says that a party shouldn’t get discovery-on-discovery unless there’s a material flaw in the adversary’s process.  If you look at the position taken in the Defense of Process paper that The Sedona Conference published, then that should probably have been the end of the discussion.  But the judge in Winfield went one step further and said, “Well, because there’s some evidence that there may have been some disagreements in coding, I’ll permit a sample.”  That’s a little scary to me, because if there was no material deficiency, and we’re talking about a few marginal documents with coding where people disagree, that’s going to occur in every single case.

When you open up the collection to sampling, what happens is that the parties will find more marginal documents where they disagree on the coding.  That often leads to a lot of sparring, as we’ve seen in other cases where the parties disagree about marginal documents and fight about them, and that just drives up cost.  In the long run, those are not the documents that make or break the case.

We’re at a point where more and more people are using TAR, but a lot of people still haven’t really embraced it.  For those who have not yet gotten started with it, what would be your advice on how they could best get started on learning and applying TAR in their cases?

I would suggest they play with it, and try it out on a set of data that they have already completely manually reviewed and thoroughly QC’d, and where they are confident that everything was well done.  Use that to test some of the different tools out there before you have a live matter you want to use it on, so that you don’t have to decide what tools work and don’t work while you are in a crisis mode, when time is of the essence.

It would be helpful to do that homework, and develop a good understanding of the different work flows, and the different tools, and what kinds of data they work better or worse on.  For example, some are better with spreadsheets than others, some are better with OCR text, and others are better with foreign language or short documents.  Ideally, counsel would do that homework beforehand, and know something about the different tools that are available and their pros and cons.

If they haven’t done that, or feel they can’t, then I would encourage people to use it in ways that don’t impact defensibility considerations as much.  For example, they can use it on an internal investigation, or on incoming data, or simply to prioritize their data for review—even if they plan to review it all—so that they can start reviewing the most-likely responsive documents first and work down from there.

There are also many uses for QC, where the algorithm suggests that there may be errors.  Look at the documents that the TAR system gave a high score for relevance that the reviewers coded as “not relevant,” or vice versa.  There are many uses that don’t implicate defensibility where people can still try TAR, see how it works, and get comfortable with it.  Usually after people see how it works and see that it’s effective—if they’re using a tool that actually is effective—it’s not a hard sell after that.  It’s that first step that’s the hardest, and that’s why I encourage people to do the testing before they’re in a critical situation, before they have to go in front of the court and argue whether they can use it, or not use it.
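{Editor’s aside: that QC idea can be as simple as a filter over the review database. A minimal sketch follows; the column names and score thresholds are hypothetical, invented for illustration.}

```python
# Hypothetical QC pass: surface documents where the TAR relevance score
# and the human coding strongly disagree (column names are invented).
import pandas as pd

review = pd.DataFrame({
    "doc_id":    [101, 102, 103, 104, 105],
    "tar_score": [0.95, 0.08, 0.91, 0.12, 0.55],
    "reviewer":  ["not_relevant", "not_relevant", "relevant",
                  "relevant", "not_relevant"],
})

# High model score coded "not relevant", or low score coded "relevant".
suspect = review[
    ((review["tar_score"] >= 0.80) & (review["reviewer"] == "not_relevant")) |
    ((review["tar_score"] <= 0.20) & (review["reviewer"] == "relevant"))
]
print(suspect)  # docs 101 and 104 deserve a second look
```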

What would you like our readers to know about things you’re doing, and what you’re working on?

I continue to do research on TAR tools and processes, and on the evaluation of TAR methods.  Gordon Cormack and I are “heads down” doing a lot of work on those things.  One area that we’ve been addressing recently is the notion that some people have been saying that a CAL process can only be used if you’re actually going to put eyes on every document.  Because of that, some people prefer the SAL approach because it can give them a fixed review set.  There is a method we’ve written about, and for which we’ve filed a patent, called S-CAL.  We’ve been doing a lot more work in that area to help parties get the benefits of CAL, but still be able to have the predictability they want of knowing exactly how many documents they’re going to have to review, so they can know how many reviewers they need, how long the review will take, and what it will cost.  Our aim is to be able to do that using a form of CAL, but also to be able to provide an accurate estimate of recall and precision.

That’s one area of research we’re working on.  I’m also becoming increasingly interested in artificial intelligence and the legal, ethical, and policy issues it implicates.  Last semester, I taught the first course (that I’m aware of) that brought together 18 computer science graduate students, and 15 law students, to explore different areas of artificial intelligence and the legal, ethical, and policy issues associated with them.  For example, we looked at autonomous cars, we looked at autonomous weapons, we looked at relationships with robots, and we looked at what to do about job loss.  We looked at data privacy, and the concentration of vast amounts of personal data in the hands of a small number of private companies.  We looked at predictive policing and use of algorithms to predict recidivism in the criminal justice system, and it was a really, really interesting experience, bringing both of those groups together to do that.  I’ve been focused a little more in that area, as well as continuing my information retrieval research and other research in collaboration with Gordon and my other colleagues and students at the University of Waterloo.  And, of course, Gordon and I work on TAR matters.  I still do consulting, expert work, and serve as a special master, and I really love that part of my job.

Thanks, Maura, for participating in the interview!

As always, please share any comments you might have or if you’d like to know more about a particular topic!

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation.  Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer:  The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine.  eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance.  eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Jason R. Baron of Drinker Biddle & Reath LLP: eDiscovery Trends 2018

This is the seventh of the 2018 Legaltech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year (and some afterward) to get their observations regarding trends at the show and generally within the eDiscovery industry.

Today’s thought leader is Jason R. Baron.  Jason is a member of Drinker Biddle & Reath LLP’s Information Governance and eDiscovery practice and co-chair of the Information Governance Initiative.  An internationally recognized speaker and author on the preservation of electronic documents, Jason previously served as the first Director of Litigation for the U.S. National Archives and Records Administration, and as trial lawyer and senior counsel at the Department of Justice.  He also was a founding co-coordinator of the National Institute of Standards and Technology TREC Legal Track, a multi-year international information retrieval project devoted to evaluating search issues in a legal context.  He served as lead editor of the recently published ABA book, Perspectives on Predictive Coding and Other Advanced Search Methods for the Legal Practitioner.

What were your general observations about LTNY this year?

{Interviewed the last day of the show}

We have come to a moment where artificial intelligence (AI) is being recognized as important for the legal industry. You see it everywhere. Five years ago, we saw the emergence of one form of AI in the guise of technology-assisted review in e-discovery.  Now the moment has arrived for the merger of AI and law more generally — not just for the purpose of more efficiently finding relevant documents in the haystack, but using artificial intelligence techniques across a spectrum of legal contexts. That’s a good thing.

I just finished reading two books that I highly recommend to your readers.  One is by Max Tegmark, called Life 3.0. Another is by the former chess grandmaster of the world, Garry Kasparov, called Deep Thinking.  Both books talk about the rise of AI in our lives. Tegmark has this wonderful illustration of the rising waters of AI, where it now engulfs chess and Go, and is lapping up against more creative intellectual activities including story writing and software development. Whether we’re talking about robots, intelligent agents, or software with predictive powers, we are seeing AI replace tasks carried out both in factories as well as by the professional class.  I would think that over the course of the next five to ten years, we’re going to see at Legalweek a greater and greater focus on AI applications in the law and what that means, including issues surrounding law and ethics.

The “trolley car problem” – the classic hypothetical asking whether one should throw a switch so that a runaway train hits a large gentleman instead of a group of children – is now a real problem faced by the makers of driverless car software.  With driverless cars and taxis, you’re going to see injuries in some cases.  So, there’s the question of liability, i.e., whether the software developer or manufacturer is held to a standard of strict liability, and what kind of ethical considerations are involved.  We’re seeing a world of future hypotheticals coming into being across a whole range of applications.  I think that’s exciting.

In your session at Legaltech regarding the Internet of Things, your panel discussed privacy and ethics. When it comes to mobile devices, Internet of Things devices, and so forth, it certainly seems that a lot of attorneys would prefer not to worry about data on those devices or to collect from them. What do you think is going to be necessary to change that mentality?

I don’t know whether the mentality really has to change, especially in light of the 2015 Amendments that highlight the need for proportionality in discovery. I have always been a fan of iterative processes and tiered eDiscovery, so that you get the good stuff (i.e., the “low hanging fruit”) early on. So now, we’re talking about a whole set of devices that are streaming data and a lot of applications that are out there – and what we discussed in that session, and what I believe, is that courts should be taking a hard look at the need for going after all of these various types of communications and streams of data in the first instance.  In other words, a judge should be saying:  “Why don’t we start with traditional email or text messages, and go on from there in terms of discovery of other apps and other data streams.”

I think the jury is out as to whether data from the Internet of Things is itself going to be at the center of a huge amount of litigation in the near-term. There’s clearly some case law already on personal wearable devices, and there will be litigation about software used in driverless cars.  And there are a bunch of cases in the civil and criminal areas where smart devices or intelligent agents (like Echo) seem to be omnipresent as evidence-gatherers – acting as an artificial “fly on the wall” when bad things happen in apartments or homes.  So we are seeing some case law at the margins, but I’m not sure that there’s going to be a rapid rise in eDiscovery case law with respect to all of these different appliances. The point not to lose sight of, though, is that we still have a large task in handling more traditional forms of documents and ESI, and that these may still be the “low hanging fruit” in many, many cases, without worrying about exotic forms of IoT that may or may not be relevant. Nonetheless, we’re increasingly in a world of smart devices, so to the extent smart devices provide evidence of something that’s going wrong in the world, and there’s a legal case to be had, that kind of data will have to be dealt with.

The bottom line is that competency for lawyers is changing.  It’s not just whether you know the difference between various forms of technology assisted review and whether you’re up on the latest continuous active learning, TAR 3.0, 4.0, or whatever.  It’s not tied to the big case. It’s that you need to be aware that there are sources of data everywhere, in every case. Whether it’s a family law case or a personal injury case or whatever, there may be sources of data beyond what lawyers of a certain age have previously known about or sought.  So, the duty of competence is really just the duty of keeping up with the world around us in 2018 and beyond.

Another big topic at the show has been GDPR.  What are your observations on GDPR and how it’s going to impact, not just how information is handled in the EU, but how American companies are going to work with companies that have information in the EU?

The practice that I joined a few years ago at Drinker Biddle is an Information Governance and eDiscovery group. There’s a separate set of lawyers here who have been, for many years, experts in EU privacy law. It has been quite obvious to me in the last year, in the run-up to GDPR, that these practice groups really need to merge, and that the kinds of questions we are getting from companies with a global footprint about information governance are increasingly entwined with “what do we do about GDPR?”  We will know more after May 25, 2018, of course, when compliance rulings and interpretations are handed down, and fines are levied, in terms of what constitutes best practices under the GDPR.  But in the meantime, I’d say that GDPR-readiness is acting as a driver for US companies to pay more attention to best practices in information governance. So I think it’s a good thing that all of us have gotten a little bit up to speed on GDPR requirements.

I’ll tell you one aspect which may or may not be the sexiest topic in the world, but it’s the world I inhabit: on the issue of record retention, the GDPR actually represents a sea change in the way one goes about thinking about a corporation’s retention obligations. I’ve written about this in Ethical Boardroom and other places. The typical engagement for us as a law firm is being asked to provide advice on harmonizing a global set of record requirements into a schedule with simplified, bigger buckets, coupled with automating processes around electronic content management.

It’s always been the perspective in US records schedules that the retention periods set out in the schedules operate as minimums for purposes of Sarbanes-Oxley, HIPAA, TARP, whatever. You name the vertical and it’s a minimum. For compliance purposes, you have to save data for a certain number of years. If you save it longer, there’s no big penalty in most instances. Well, the GDPR is flipping that long-held assumption.

The specter of having an EU audit where your firm holds petabytes of data that involve potential personal information that has been lying around for a decade or more after a retention period has ended is, shall we say, problematic. It’s not going to affect every company right away in May 2018. But, I would predict that if we’re talking in a year or two or three, some entity out there is going to be fined. Whatever the records schedule says now is a potential landmine for a company, unless it pays stricter attention to ensuring compliance with the retention periods within the schedule.  The environment that I see is one which is probably good for lawyers, because at least at firms like mine, companies are coming to us saying they really haven’t grappled with the disposition of legacy data. They may have some policies in place, but it’s not really automated in a way that results in real deletion. The bottom line: what is needed are defensible deletion policies that are complied with in accordance with records schedules, so as to meet important aspects of the GDPR.
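As a toy illustration of that flip (the record fields below are hypothetical, and this is obviously not legal advice): the same schedule that once only set a floor now effectively sets a ceiling for personal data, so an over-retention check becomes part of compliance.

```python
from datetime import date

def over_retained(records, today=None):
    """Flag records held past their scheduled retention period.

    Under the traditional US 'minimums' view, keeping data longer was
    largely harmless; under the GDPR, over-retained personal data is a
    liability.  Record fields here are hypothetical.
    """
    today = today or date.today()
    return [
        rec for rec in records
        if (today - rec["created"]).days / 365.25 > rec["retention_years"]
    ]

# Hypothetical example: an HR file more than six years past a
# seven-year retention schedule.
records = [{"id": "HR-001", "created": date(2005, 1, 1), "retention_years": 7}]
print(over_retained(records, today=date(2018, 5, 25)))
```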

The last thing I’d say is that, as is well known, the entire subject of privacy represents a paradigm clash between the US and the EU, especially with respect to the concept of the “right to be forgotten.” I actually have been on record for a number of years as being quite sympathetic to the EU perspective – for example, at Georgetown’s 2017 Advanced eDiscovery program, I gave one of the so-called “eD Talks” (sort of like a TED Talk) in which I said that I didn’t wish to be a shill for a future corporate Orwellian state. In that talk, I traced the issues that have animated me for the past 15 years or so about being smart in the eDiscovery space about search.  But I also noted that AI has evolved to the point where we now are using analytics in ways that may be increasingly creepy in terms of surveillance of employees, or the ability to de-anonymize data on consumers.

All of that said, at Georgetown and in other talks I have lobbied for a notion of corporate responsibility in the AI and law space – arguing that there should be something akin to IRBs (human subject review panels), where corporations consider the algorithmic impact on people and the need for greater transparency about what decisions are being made by software.  Beyond algorithmic bias and surveillance, I would bet there are a hundred other types of issues in the space that what I will call an “algorithmic review board” might be called upon to handle.  But in my view there’s some level of corporate responsibility to be met in an increasingly AI era.  So, I think the EU privacy model is one that we should pay attention to in terms of the impact of algorithms on our lives, and what it means to have some sort of zone of privacy that you have meaningfully consented to as an employee or consumer.

You mentioned blockchain, and that was another topic your panel discussed in your session yesterday.  How do you see that unfolding, and what will the impact of blockchain be on the legal industry?

As I said at the session the other day, on the Gartner hype cycle the buzz around blockchains is definitely going up.  Of course, regulation of cryptocurrencies is a very hot topic.  However, blockchain and distributed ledger technologies are not just Bitcoin or ICOs.  Rather, blockchains represent a new way of establishing trust on the internet.  One can imagine endless variations and possibilities of using blockchain applications for good purposes that have nothing to do with cryptocurrencies. You can use distributed ledger technology for record keeping, for supply chains, for any number of applications which are of great interest. There isn’t a day that goes by where I don’t see some article that says “Blockchains will be a disruptive force in ‘such and such’ industry.” Now, is it hype? Some of it may well be, but I think that, at bottom, the idea is exciting: you can hash information and link it together into a chain that is immutable, so that the chain retains authentic pointers to information and you can trust the objects themselves in a way that doesn’t rely on third parties.
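For readers who want a concrete feel for that “hash it into a chain” idea, here is a minimal sketch (a toy, not any production distributed ledger; real blockchains add consensus mechanisms on top of this tamper-evidence layer):

```python
import hashlib

def block_hash(record: str, prev_hash: str) -> str:
    """Hash this record together with the previous block's hash, so
    altering any earlier record changes every hash after it."""
    return hashlib.sha256((prev_hash + record).encode("utf-8")).hexdigest()

# Build a tiny chain of hypothetical record-keeping entries.
records = ["deed recorded", "title transferred", "lien released"]
chain, prev = [], "0" * 64  # arbitrary genesis value
for rec in records:
    prev = block_hash(rec, prev)
    chain.append(prev)

# Tampering with the first record fails to reproduce the original chain.
assert block_hash("deed forged", "0" * 64) != chain[0]
```

The point of the sketch is the chaining: each block’s hash incorporates its predecessor’s, which is what makes the ledger tamper-evident without relying on a trusted third party.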

It’s a very interesting development. You see a lot of interest across industries. There’s still a certain mystery to blockchain. Where are the mining operations? Who’s doing the mining? How do the algorithms work? What is a blockchain’s future when all the tokens have been mined?  I myself have questions about all of that and don’t profess to understand all the details. But, I have been really interested in the potential for these applications, and we’re going to see them talked about more and more. If AI was the primary new thing for Legalweek this year, I think blockchain was also right up there. We’ll see in the future.

I think there’s a wonderful moment here where more lawyers should be involved in at least knowing what the technology is all about and thinking creatively about its applications for the future.

What would you like our readers to know about things you’re working on?

My professional interests are a bit different from most of the people that hang out at Legaltech, mainly due to the fact that I spent 33 years in the government, including at the Justice Department and as Director of Litigation at the National Archives. I still have a passion for how to preserve and how to access public records in digital form. I’ve been very privileged over the last year to give talks in Amsterdam, in Vienna, in Cape Town, in London, and in the US and Canada, all on the subject of how we should be thinking about amassing huge collections of public record archives in digital form, and how to access those records. Paradoxically, you put stuff in digital form with the idea that you’re going to be able to search it easily, compared with boxes of paper. However, it turns out that it’s very difficult to access huge digital collections, especially if they are filled with personally identifiable information (PII) and other forms of sensitive data.  What animates me in the papers that I’ve done at IEEE and at other conferences and forums is the need to apply what we know in the eDiscovery space with respect to AI. Machine learning technologies can be very helpful in extracting sensitive data from large collections, so that a public-use version of the larger collection can be made available in some form, giving people access to huge collections of email or other electronic records that constitute public archives.
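The machine learning techniques Baron describes go well beyond pattern matching, but as a toy illustration of the simplest layer of such a redaction pipeline (the two patterns below are illustrative only and would miss most real-world PII):

```python
import re

# Two illustrative US PII patterns; real systems combine many such rules
# with trained classifiers for names, addresses, and context-dependent PII.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def redact(text: str) -> str:
    """Replace matched PII with labeled placeholders for a public-use copy."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Reach J. Doe at jdoe@agency.gov; SSN 123-45-6789."))
```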

So I intend to devote a fair amount of time going forward on issues concerning the freedom of information aspects of the law. How do we stay informed about what governments are doing? That’s a difficult question in the US and it’s even more difficult around the world. That is of interest to me. I’m very thankful that I work in a law firm that has allowed me the opportunity to pursue that interest, in addition to thinking about matters that actually result in billable hours! {laughs}

Also, the Information Governance Initiative continues apace with its just-published third State of IG Report.  (See www.iginitiative.com.)  Barclay Blair has led the way on that. I think we are seeing a greater penetration in the corporate space of the idea of IG, that there’s a greater maturity, a greater acceptance of IG councils and IG champions. All of that’s good. We have, as we have had for the last three years, a Chief Information Governance Officer (CIGO) Summit in Chicago, which will take place on May 9th and 10th.  As we always have, we gather together for a single summit 60 or 70 individuals who are card-carrying IG people with some kind of title in the space. We talk about leadership.  And the IGI will continue to partner with lots of innovative companies to produce white papers and to have an ongoing conversation about the importance of Information Governance. I’m delighted to be part of that effort.

Thanks, Jason, for participating in the interview!

As always, please share any comments you might have or if you’d like to know more about a particular topic!


Today is the Day for the University of Florida E-Discovery Conference!: eDiscovery Best Practices

The University of Florida E-Discovery Conference is being held today!  And, for the first time, I’m going to be there!  Regardless of where you are, it’s not too late to attend!

The focus of this year’s conference is effectively managing the everyday case, and there will be interesting sessions throughout the day, covering topics ranging from eDiscovery security and data protection, to early assessment of the case and the data, to keywords, TAR and AI (do I need to spell out those acronyms anymore?).  Want to know about eDiscovery of the JFK files?  It’s here.  Want to get judges’ perspectives on sanctions and other eDiscovery issues?  That’s here too.

The panel of speakers is a regular who’s who in eDiscovery, including Craig Ball, George Socha, Kelly Twigger, David Horrigan, Martin Audet, Mary Mack, Rose Jones and Mike Quartararo, as well as US Magistrate Judges John Facciola, James Francis, William Matthewman, Mac McCoy, Amanda Arnold Sansone and Gary Jones, and retired Florida Circuit Court Judge Ralph Artigliere.

I’m on a panel discussion at 9am ET in a session titled Getting Critical Information From The Tough Locations – Cloud, IOT, Social Media, And Smartphones! with Craig, Kelly and Judge Sansone.  We’ll be discussing real solutions for collecting ESI from those difficult locations.  Check it out!

The conference is being conducted in Gainesville, FL on the University of Florida Levin College of Law campus, though I understand it’s a full house.  However, it’s also being livestreamed.  There are Continuing Legal Education (CLE)-accredited sessions all day from 8am to 5:30pm ET, and the conference has been approved by the Florida Bar for 7.5 CLE general credits, 2.0 ethics credits and 3.0 technology credits for attorneys attending the conference. The Florida Bar has also approved 7.5 civil trial certification credits.  So, this is a great opportunity to get those needed CLE credits!

Also, E-Discovery CareerFest was conducted yesterday.  And the Law School E-Discovery Core Curriculum Consortium (composed of law professors teaching electronic discovery courses at their respective law schools) will host its first in-person workshop, focusing on curriculum development, on Friday, March 30th from 9am to 12pm ET.

Click here to register for the conference – it’s only $99 for livestream attendance.  And, if you’re a currently enrolled student (in an ABA-accredited law school, an accredited E-Discovery graduate program or an accredited paralegal program), it’s free(!), either in person or livestreamed.  It’s also free(!) for university or college faculty, professional staff, judicial officials, clerks and employees of government bodies and agencies.  Come check it out!

So, what do you think?  Are you attending the conference today, in person or via livestream?  If not, why not?  Please share any comments you might have or if you’d like to know more about a particular topic.
