Analysis

Jason R. Baron of Drinker Biddle & Reath LLP: eDiscovery Trends

This is the third of the 2016 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year to get their observations regarding trends at the show and generally within the eDiscovery industry.  Unlike previous years, some of the questions posed to each thought leader were tailored to their position in the industry, so we have dispensed with the standard questions we normally ask all thought leaders.

Today’s thought leader is Jason R. Baron.  An internationally recognized speaker and author on the preservation of electronic documents, Jason is a member of Drinker Biddle’s Information Governance and eDiscovery practice and also a member of the leadership team for the Information Governance Initiative.  Jason previously served as Director of Litigation for the U.S. National Archives and Records Administration (NARA) and as trial lawyer and senior counsel at the Department of Justice.  He was a founding co-coordinator of the National Institute of Standards and Technology TREC Legal Track, a multi-year international information retrieval project devoted to evaluating search issues in a legal context.  He also founded the international DESI (Discovery of Electronically Stored Information) workshop series, bringing together lawyers and academics to discuss cutting-edge issues in eDiscovery.

What are your general observations about LTNY this year and how it fits into emerging trends?

It’s clear to me that there has been a maturing of the market for the kind of analytics software that some of us have been evangelizing about in the eDiscovery space for some time.  This year, it was noticeable that there weren’t 27 sessions devoted to technology assisted review in e-discovery cases!  However, in place of that narrower focus, there were any number of sessions on analytics and applying analytics to a broader segment of the legal space, which I applaud.

Also, I think there was an acknowledgement that, from the perspective of Information Governance, there is an analytics play to be had.  With bigger and bigger data sets, companies need to face the fact that both employees and customers generate huge amounts of data and they need to make sure that they understand and have visibility into that data.  So, the tools that evolved for purposes of eDiscovery are perfectly suitable – with tweaks – to cover a variety of legal purposes, and we’re seeing that play out at LegalTech.

At LTNY, you were one of the panelists on the Thursday keynote addressing issues such as private servers, bring-your-own-device (BYOD) and other organizational challenges for managing data created by individual employees.  What do organizations, such as government entities and corporations, need to do to manage personal data more effectively?

Well, I’m glad you asked me that.  The session that I had the privilege of speaking on (with Judge Scheindlin and Edward McMahon as fellow panelists and Professor Dan Capra moderating) was all about what I call “shadow IT,” which is a phenomenon that is closely related to but distinct from BYOD.  In the past decade or so, we all have been empowered to simply go to the Internet and use whatever cool apps are out there, like Google Docs and Dropbox, to facilitate communicating, doing work and “parking” documents.  We routinely go out and communicate on Gmail and other commercial services.  All of these activities, to the extent that they involve communications that relate to business or the work of governments, are what I consider “shadow IT” in nature because they are not controlled by a traditional IT department in a corporation or agency.

So, maybe a decade ago, if there was a Rule 34 request, you were pretty much assured that all of the relevant material could be gathered by a state-of-the-art IT custodian performing a collection effort against individual accounts on an official system.  That’s no longer absolutely the case.  Today, you need to ask follow-up questions as to where individuals are parking their documents and where they are communicating outside the “official” channel for doing so.

In government, there are well-known, long-standing rules for what constitutes a Federal record, including email.  There is an expectation on the part of the public – and there should be an expectation on the part of government officials – that records created about government business will be made available in response to Freedom of Information Act (FOIA) requests. (Indeed, at least some of those records will be preserved as permanent records in the National Archives of the United States.)  So, it is incumbent on government officials to make sure that they follow the rules – and the rules for government are different from those for the private sector.  A clear statute in place since 2014 says that anytime you’re communicating about government business on a private commercial network, you need to either “cc” or forward that message within twenty days to an official record keeping system.  This isn’t the place to get into what the regulations were prior to 2014 and how that plays out in the political realm, but our panel did cover the general topic of the responsibility of officials to make sure that their communications about government business are, in fact, captured in an official system somewhere.

Also, for some time, I have been a very big advocate of email archiving and capture technologies generally, so that we don’t lose history and don’t lose a broad swath of government records that are otherwise not going to be captured if you simply leave it to individuals themselves to take steps to preserve. 

The problem of shadow IT is one that is equally of concern in the private sector, because high-level corporate officials in various verticals are sometimes governed by strict email archiving requirements (e.g., SEC and FINRA rules).  So senior people also need to be aware that, if they’re communicating about covered topics outside of the usual channels, they need to take additional steps to make sure that those communications are properly archived.

These issues are only emerging now and it’s probably only going to get “worse”!  In my view, the issues are going to be more complex in the future with more apps, more platforms, more devices and more opportunities for “end runs” around the traditional IT department.

In the case NuVasive v. Madsen Medical, the Court recently vacated an adverse inference instruction sanction previously applied against the plaintiff because of the amendment to Rule 37(e).  Do you see that as a trend for other cases, and do you expect that other parties that have been sanctioned will file motions to have their sanctions reconsidered?

There are some subtle provisions as to when courts will or will not apply the new rules to existing cases.  But, beyond that, I have been watching with great interest the number of decisions that have been handed down applying the new provisions of Rule 37, and doing so in a way that suggests that courts will continue to be quite active in monitoring what is happening in discovery – imposing severe sanctions where appropriate and applying some sort of curative measures when there isn’t the requisite level of intent.  So, I think there may have been a greater level of judicial activity than was anticipated in the immediate period since December 1, when the rules changed.  It seems clear to most observers in the space that we’re going to have dozens and dozens of decisions in 2016 that apply the new rules, and we will get to see the patterns emerging pretty quickly.

What are you working on that you’d like our readers to know about?

I think the exciting work of the Information Governance Initiative (IGI) continues to push smart conversations in the space about how corporations can get a handle on their data.  We had a very successful IGI summit, known as the Chief Information Governance Officers (CIGO) summit, in Chicago last year.  We’re going to have the second CIGO summit in May of this year again in Chicago and we’re looking forward to that.  We also have any number of activities that we’re planning to do in terms of retreats, dinners and boot camps, etc. I think IG is still an emerging discipline that should be of great interest to many corporate actors who don’t have a good handle on their existing workflows, policies and programs about data – whether it’s data breach or data reduction or data archiving or data analytics.  I feel very privileged to be part of a group of individuals at the IGI that are really doing some serious thinking about these types of topics.

I must say I was surprised by Monica Bay at LegalTech, who pulled me in at the last moment to be a judge at the second “Shark Tank” session held there – where I felt a little like one of the three judges on “America’s Got Talent,” looking at the individual entrepreneurs who were giving presentations.  But, as the session progressed (and as recorded by David Horrigan, who was live-tweeting the session), it seemed very clear to me that maybe it’s time for me to retire!  I say so because of the profusion of disruptive technologies in the space, whether smart contracts or dialing up lawyers over the web, all of which heavily suggests that our current business models are going to be disrupted in due course, and maybe very soon!  There are simply a lot of exciting technologies in the space for which the CodeX people are fostering a platform.  In the end, I confess to being quite happy that Monica pulled me in, and I would urge your readership to pay attention to what CodeX is doing.  I believe there is a conference coming up (CodeX FutureLaw 2016) on May 20, which is focusing on how technology is changing the landscape of the legal profession and the impact of those changes.

Thanks, Jason, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

George Socha of Socha Consulting LLC: eDiscovery Trends

This is the second of the 2016 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscovery Daily interviewed several thought leaders at LTNY this year to get their observations regarding trends at the show and generally within the eDiscovery industry.  Unlike previous years, some of the questions posed to each thought leader were tailored to their position in the industry, so we have dispensed with the standard questions we normally ask all thought leaders.

Today’s thought leader is George Socha.  A litigator for 16 years, George is President of Socha Consulting LLC, offering services as an electronic discovery expert witness, special master and advisor to corporations, law firms and their clients, and legal vertical market software and service providers in the areas of electronic discovery and automated litigation support. George has also been co-author of the leading survey on the electronic discovery market, The Socha-Gelbmann Electronic Discovery Survey; in 2011, he and Tom Gelbmann converted the Survey into Apersee, an online system for selecting eDiscovery providers and their offerings.  In 2005, he and Tom Gelbmann launched the Electronic Discovery Reference Model project to establish standards within the eDiscovery industry – today, the EDRM model has become a standard in the industry for the eDiscovery life cycle and there are nine active projects with over 300 members from 81 participating organizations.  George has a J.D. from Cornell Law School and a B.A. from the University of Wisconsin – Madison.

What are your general observations about LTNY this year and about emerging eDiscovery trends overall?

{Interviewed the first morning of LTNY, so the focus of the question to George was more about his expectations for the show and also about general industry trends}.

This is the largest legal technology trade show of the year so it’s going to be a “who’s who” of people in the hallways.  It will be an opportunity for service and software providers to roll out their new “fill in the blank”.  It will be great to catch up with folks that I only get to see once a year as well as folks that I get to see a lot more than that.  And, yet again, I don’t expect any dramatic revelations on the exhibit floor or in any of the sessions.

We continue to hear two recurring themes:  the market is consolidating and eDiscovery has become a commodity.  I still don’t see either of these actually happening.  Consolidation would be if some providers were acquiring others and no new providers were coming along to fill in the gaps, or if a small number of providers were taking over a huge share of the market.  Instead, as quickly as one provider acquires another, two, three or more new providers pop up, often with new ideas they hope will gain traction.  In terms of dominating the market, there has been some consolidation on the software side, but as to services providers, the market continues to look more like law firms than like accounting firms.

In terms of commoditization, I think we still have a market where people want to pay “K-mart, off the rack” prices for “Bespoke” suits.  That reflects continued intense downward pressure on prices.  It does not suggest, however, that the e-discovery market has begun to approximate, for example, the markets for corn, oil or generic goods.  E-discovery services and software are not yet fungible – with little meaningful difference between them other than price.  I have heard no discussion of “e-discovery futures.”  And providers and consumers alike still seem to think that brand, levels of project management, and variations in depth and breadth of offerings matter considerably.

Given that analytics happens at various places throughout the eDiscovery life cycle, is it time to consider tweaking the EDRM model to reflect a broader scope of analysis?

The question always is, “what should the tweak look like?”  The questions I ask in return are “What’s there that should not be there?”, “What should be there that is not?” and “What should be re-arranged?”  One common category of suggested tweaks is the one meant to change the EDRM model to look more like one particular person’s or organization’s workflow.  This keeps coming up even though the model was never meant to be a workflow – it is a conceptual framework to help break one unitary item into a set of more discrete components that you can examine in comparison to each other and also in isolation.

A second set of tweaks focuses on adding more boxes to the diagram.  Why, we get asked, don’t we have a box called Early Case Assessment, and another called Legal Hold, and another called Predictive Coding, and so on. With activities like analytics, you can take the entire EDRM diagram and drop it inside any one of those boxes or in that circle.  Those concepts already are present in the current diagram.  If, for example, you took the entire EDRM diagram and dropped it inside the Identification box, you could call that Early Case Assessment or Early Data Assessment.  There was discussion early on about whether there should be a box for “Search”, but Search is really an Analysis function – there’s a home for it there.

A third set of suggested tweaks centers on eliminating elements from the diagram.  Some have proposed that we combine the processing and review boxes into a single box – but the rationale they offer is that because they offer both those capabilities there no longer is a need to show separate boxes for the separate functions.

What are you working on that you’d like our readers to know about?

First, we would like to invite current and prospective members to join us on April 18 for our Spring meeting which will be at the ACEDS conference this year.  The conference is from April 18 through April 20, with the educational portion of the conference slated for the 19th and 20th.

For several years at the conference, ACEDS has given out awards honoring eDiscovery professionals.  To congratulate this year’s winners we will be giving them one-year individual EDRM memberships.

On the project side, one of the undertakings we are working on is “SEAT-1,” a follow up to our eMSAT-1 (eDiscovery Maturity Self-Assessment Test).  SEAT-1 will be a self-assessment test specifically for litigation groups and law firms.  The test is intended to enable them to better assess where they are at, how they are doing and where they want to be.  We are also working on different ways to deliver our budget calculators.  It’s too early to provide details on that, but we’re hoping to be able to provide more information soon.

Finally, in the past year we have begun to develop and deliver member-only resources.  We published a data set for members only, and we put up a new section of the EDRM site with a comprehensive collection of information about the changes to the Federal rules.  This year, we will be working on additional resources available to just our members.

Thanks, George, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!


2015 eDiscovery Case Law Year in Review, Part 3

As we noted yesterday and Tuesday, eDiscovery Daily published 89 posts related to eDiscovery case decisions and activities over the past year, covering 72 unique cases!  Yesterday, we looked back at cases related to disputes about discovery, eDiscovery cost reimbursement and issues related to privilege and confidentiality assertions.  Today, let’s take a look back at cases related to cooperation issues, social media and mobile phone discovery, technology assisted review and the first part of the cases relating to sanctions and spoliation.

We grouped those cases into common subject themes and will review them over the next few posts.  Perhaps you missed some of these?  Now is your chance to catch up!

COOPERATION

Why can’t we all just get along?  There were several instances where parties couldn’t agree and had to kick issues up to the court for resolution; here are four such cases:

Judge Shows Her Disgust via “Order on One Millionth Discovery Dispute”: In Herron v. Fannie Mae, et al., DC District Judge Rosemary M. Collyer issued an order titled “Order on One Millionth Discovery Dispute” where she decided that “[c]ontrary to its usual practice, the Court will rule immediately, in writing” on the latest discovery disputes between the plaintiff and defendant.

Court’s “New and Simpler Approach to Discovery” Identifies Search Terms for Plaintiff to Use: In Armstrong Pump, Inc. v. Hartman, New York Magistrate Judge Hugh B. Scott granted in part the defendant’s motion to compel discovery responses and fashioned a “new and simpler approach” to discovery, identifying thirteen search terms/phrases for the plaintiff to use when searching its document collection.

Court Agrees to Allow Defendant to Use Search Terms to Identify ESI to Preserve: In You v. Japan, California District Judge William Alsup granted the defendant’s motion to limit preservation of articles to those that contain one of several relevant search terms, as long as the defendant’s proposal was amended to include one additional term requested by the plaintiffs.

Court Orders Defendant to Supplement Data Used for Statistical Sampling: In United States ex rel Guardiola v. Renown Health, Nevada Magistrate Judge Valerie P. Cooke agreed with the relator’s contention that the data used to finalize the relator’s proposed statistical sampling plan was incomplete due to how data was identified within one of two billing systems used by the defendant. As a result, she ordered the defendant to “EXPEDITIOUSLY PRODUCE” the additional data (and, yes, she used all caps).

SOCIAL MEDIA

Requests for social media data in litigation continue, so here are three cases related to requests for social media data:

Court Rejects Defendants’ Motion Seeking Limitless Access to Plaintiff’s Facebook Account: In the class action In re Milo’s Kitchen Dog Treats Consolidated Cases, Pennsylvania Magistrate Judge Maureen P. Kelly denied the defendants’ Motion to Compel Unredacted Facebook Data File and Production of Username and Password, disagreeing that the discovery of one highly relevant Facebook entry meant the defendants were “somehow entitled to limitless access to her Facebook account”. Judge Kelly did order the plaintiff to produce the previously produced redacted Facebook pages to the Court unredacted so that an in camera inspection could be conducted to confirm that the redacted information was truly privileged.

Plaintiff’s Motion to Quash Subpoena of Text Messages Granted by Court: In Burdette v. Panola County, Mississippi Magistrate Judge S. Allan Alexander granted the plaintiff’s Motion to Quash Subpoena where the defendant subpoenaed the plaintiff’s text messages and call log records from his mobile provider.

When Claiming Workplace Injury, Facebook Posts Aren’t Handy, Man: In Newill v. Campbell Transp. Co., Pennsylvania Senior District Judge Terrence F. McVerry ruled on the plaintiff’s motion in limine on miscellaneous matters by allowing the defendant to introduce Facebook posts into evidence that related to the plaintiff’s physical capabilities, but not those that related to his employability.

TECHNOLOGY ASSISTED REVIEW

Believe it or not, we only covered one technology assisted review case last year, at least officially – though we did cover it twice.  Here is the case:

Judge Peck Wades Back into the TAR Pits with ‘Da Silva Moore Revisited’: In Rio Tinto Plc v. Vale S.A., New York Magistrate Judge Andrew J. Peck approved the proposed protocol for technology assisted review (TAR) presented by the parties, but made it clear to note that “the Court’s approval ‘does not mean. . . that the exact ESI protocol approved here will be appropriate in all [or any] future cases that utilize [TAR].’”  Later on, Judge Peck assigned a well-respected industry thought leader as special master to the case.

SPOLIATION / SANCTIONS

I’ll bet that you won’t be surprised that, once again, the topic with the largest number of eDiscovery case law decisions is sanctions and spoliation.  Of the 72 cases we covered this past year, 39 percent of them (28 total cases) related to sanctions and spoliation issues.  Sometimes requests for sanctions are granted, sometimes they’re not.  Here are the first ten cases:

Appeals Court Upholds Default Judgment Sanctions for Defendant’s Multiple Discovery Violations: In Long Bay Management Co., Inc. et. al. v. HAESE, LLC et. al., the Appeals Court of Massachusetts found that the judge had not abused her discretion in ordering default sanctions and assessing damages, and ordered that the plaintiffs could submit a petition for appellate attorneys’ fees incurred in responding to the appeal.

Court Grants Defendants’ Motion to Exclude Plaintiff’s Use of Spoliation Evidence: In West v. Talton, Georgia District Judge C. Ashley Royal granted the defendants’ Motion in Limine to exclude all evidence and argument regarding spoliation, reserving its ruling on the remaining issues in the Motion in Limine.

Not Preserving Texts Results in Adverse Inference Sanctions for Plaintiff: In NuVasive, Inc. v. Madsen Med., Inc., California Chief District Judge Barry Ted Moskowitz granted the defendants’ motion for adverse inference sanctions for failure to preserve text messages from four custodial employees that were key to the case.

Court States that Duty to Meet and Confer is Not an “Empty Formality”, Denies Request for Sanctions: In Whitesell Corporation v. Electrolux Home Products, Inc. et. al., Georgia District Judge J. Randal Hall denied the plaintiff’s motion for sanctions against the defendant for identifying a deponent that (according to the plaintiff) had no particularized information regarding the defendant’s efforts to produce documents, stating that he was “unimpressed” by the plaintiff’s effort to confer on the matter and stating that the “duty-to-confer is not an empty formality”.

Despite Failure to Implement a Litigation Hold, Defendant Escapes Sanctions: In Flanders v. Dzugan et. al., despite the fact that the defendant failed to implement a litigation hold, Pennsylvania District Judge Nora Barry Fischer denied the plaintiff’s Motion for Sanctions alleging the defendants failed to preserve evidence relevant to the case, finding that the plaintiff “cannot show any evidence was actually lost or destroyed”, “cannot show that if evidence was lost or destroyed, it would have been beneficial to his case” and “[n]or can Plaintiff show bad faith”.

Court Denies Motion for Sanctions Against Veterinary Hospital for Spoliation of ESI: In Grove City Veterinary Service, LLC et. al. v. Charter Practices International, LLC, Oregon Magistrate Judge John V. Acosta concluded that the plaintiffs had not met their burden of showing that they were entitled to sanctions for spoliation of evidence based on the deletion of one of the veterinarians’ archived work emails.

Defendant Gets Summary Judgment, Not Dismissal, Due to Plaintiff’s Wiping of Hard Drive: In Watkins v. Infosys, Washington District Judge John C. Coughenour denied the defendant’s Motion for the Sanction of Dismissal but granted the defendant’s Motion for Summary Judgment against the plaintiff for spoliation of data due to her use of “Disk Wiping” software to delete ESI.

Court Rules that State Agency is Not Responsible for Emails Deleted via the Retention Policy of Another State Agency: In Wandering Dago, Inc. v. N.Y. State Office of Gen. Servs., New York Magistrate Judge Randolph F. Treece denied the plaintiff’s request for sanctions, stating that “neither the individual Defendants nor their Attorney had a duty to preserve” the emails of the Deputy Secretary of Gaming and Racing to the President of the New York Racing Authority (“NYRA”).

Apparently, in Discovery, Delta is Not Ready When You Are and It Has Cost Them Millions: A few years ago, we covered a case law decision in the Delta/Air Tran Baggage Fee Antitrust Litigation, where Delta was ordered to pay plaintiff attorney’s fees and costs for eDiscovery issues in that litigation. Apparently, Delta’s difficulties in this case have continued, as they have been ordered this week to pay over $2.7 million in sanctions for failing to turn over ESI, to go along with more than $4.7 million in sanctions for earlier discovery violations.

Court Denies Request for Sanctions for Routine Deletion of Files of Departed Employees: In Charvat et. al. v. Valente et. al., Illinois Magistrate Judge Mary M. Rowland denied the plaintiff’s request for spoliation sanctions for the defendant’s admitted destruction of computer files belonging to two departed employees, finding that the plaintiff did not provide any evidence that the defendant acted in bad faith.

Tomorrow, we will cover the remaining cases relating to sanctions and spoliation.  Stay tuned!


So, what do you think?  Did you miss any of these?  Please share any comments you might have or if you’d like to know more about a particular topic.


Mo’ Data, Mo’ Data, Mo’ Data from EDRM: eDiscovery Trends

It didn’t take long for EDRM to deliver on its promise of an advanced data set.  Back in August, EDRM announced the release of the first of its “Micro Datasets”, designed for eDiscovery data testing and process validation.  That first data set was small; this new one is MUCH bigger.

The initial August offering was a 136.9 MB zip file containing the latest versions of everything from Microsoft Office and Adobe Acrobat files to image files containing EDRM specific work product files and data from public websites to uncommon formats including .mbox email storage files and .gz archive files.  On Monday, EDRM announced the release of a new 5.7 GB Micro Dataset. As before, this new EDRM dataset was assembled to meet eDiscovery data testing and process validation needs of software and tool providers, litigation support organizations, law firms and educational organizations and is sourced from publicly available data and free from copyright restrictions.

Designed to support exception handling exercises and advanced testing, the files in the new dataset have various levels of corruption, and the dataset contains a duplicate set of files that are encrypted.  The file types in the set include:

  • A variety of .csv files
  • Websites and web pages
  • Adobe Acrobat files
  • Graphic files and photographs
  • Public census data
  • Microsoft Office files
  • Audio files
  • 4 email boxes with shared correspondence, threads and attachments
  • Multiple Encase .e01 files containing data from a phone and another data source
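A dataset like this lends itself to simple exception-handling drills.  As a rough illustration (the function and layout here are hypothetical, not part of any EDRM tooling), a sketch like the following could inventory a downloaded dataset by file extension and flag files that fail even a basic read attempt, which is where deliberately corrupted files would surface:

```python
from collections import Counter
from pathlib import Path

def inventory(root):
    """Tally files by extension and flag any that cannot be read --
    a first-pass exception-handling exercise over a test dataset."""
    counts, unreadable = Counter(), []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Normalize extensions so .CSV and .csv count together
        counts[path.suffix.lower() or "(no extension)"] += 1
        try:
            with open(path, "rb") as f:
                f.read(4096)  # probe only the first block
        except OSError:
            unreadable.append(str(path))
    return counts, unreadable
```

From there, a real validation workflow would go further, for example attempting format-specific parsing or detecting the encrypted duplicates, but even this level of pass/fail triage is useful for exercising a processing pipeline.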

This new EDRM Micro Dataset is available exclusively to EDRM members. Current EDRM members have been notified by email with instructions for file downloading (I just downloaded my copy yesterday and look forward to delving into it this week).  So, if you’re interested in joining EDRM, there has never been a better time!  Organizations and individuals interested in EDRM membership will find information at https://www.edrm.net/join/.

“The EDRM Dataset team has done outstanding work in advancing the industry with the development of advanced datasets that better reflect the types of data anomalies and challenges faced by e-discovery professionals today,” said George Socha, co-founder of EDRM. “EDRM members will benefit greatly from their work, in addition to the education, guidelines and latest in industry best practices provided to members.”

Five years after the Enron data set was converted to Outlook by the EDRM Data Set team (in November of 2010) we’re beginning to have some new dataset options.  We may actually someday see an eDiscovery product demo without Enron data!

So, what do you think?  Are you looking forward to checking out the new data set?  Please share any comments you might have or if you’d like to know more about a particular topic.


Here’s a Look at How and Where Legal Departments are Utilizing Data Analytics: eDiscovery Trends

Let’s face it, data analytics are everywhere.  It’s no longer just Netflix suggesting movie choices based on your viewing history or Amazon suggesting your next purchase; companies everywhere are using data analytics to drive their business processes in various departments, including their legal departments.  But how are in-house legal departments actually using data analytics capabilities?  Here’s a new study that offers some answers.

The Coalition of Technology Resources for Lawyers (CTRL), an industry education and research group committed to the development of practical and proactive guidance for lawyers as they attempt to leverage various technologies in practice, commissioned the Information Governance Initiative (IGI) to conduct a survey regarding in-house legal departments’ use of data analytics across six use cases.  Those use cases are: 1) eDiscovery/Other Investigations, 2) Legal Matter Management, Billing, & Budgeting, 3) Information Governance, 4) Outcome Analysis or Risk Assessment, 5) Contract Review and 6) Selection of Outside Counsel. Data Analytics in the Legal Community 2015-2016 Trends is the resulting report prepared by CTRL based on that study.

While the study doesn’t identify the number of participants, it does note that around two-thirds of survey respondents were attorneys, most holding senior-level positions. The remaining one-third of respondents were non-attorneys, including IT, analytics, and other professionals within, or providing support for, the in-house legal team.

Perhaps not surprisingly, eDiscovery/Other Investigations was the use case with the highest percentage of utilization of data analytics; in fact, it was the only use case for which a majority of legal departments (56%) reported that they were using data analytics.  Legal departments reported that their top three uses for data analytics in this area were culling and early case assessment (each cited by 72.4% of respondents using analytics for eDiscovery) and relevancy review (71.1%); these were the only uses cited by more than 70% of respondents. In addition, 71% of legal departments indicated that their spending on analytics for eDiscovery would increase or stay the same next year.

As for Information Governance (IG), it was the third most common use case, with almost one-third of legal departments using analytics.  Respondents using data analytics for IG indicated that they used it most to “facilitate defensible disposition” and to “facilitate compliance with records policies or other requirements” (each cited by 77.4% of those respondents).

The free eight page report is available here.

So, what do you think?  Does your legal department use data analytics?  If so, for what?  Please share any comments you might have or if you’d like to know more about a particular topic.


Court Orders Defendant to Supplement Data Used for Statistical Sampling: eDiscovery Case Law

In United States ex rel Guardiola v. Renown Health, No. 3:12-cv-00295-LRH-VPC (D. Nev. Sep. 1, 2015), Nevada Magistrate Judge Valerie P. Cooke agreed with the relator’s contention that the data used to finalize the relator’s proposed statistical sampling plan was incomplete due to how data was identified within one of two billing systems used by the defendant.  As a result, she ordered the defendant to “EXPEDITIOUSLY PRODUCE” the additional data (and, yes, she used all caps).

Case Background

In this qui tam action under the False Claims Act (for which we covered a previous ruling here), the court had already held, in November 2014, that statistical sampling of claims was appropriate to save costs by enabling the parties to avoid examining every potential claim.  As the relator (the person bringing the qui tam action on behalf of the United States) attempted to finalize her proposed sampling plan, a dispute developed over the meaning of a zero-day stay at the defendant’s facilities.

The dispute arose because one of the defendant’s two billing systems recorded the patient’s registration time, rather than the time the patient actually began receiving inpatient medical care, as the admit time.  As a result, claims were falling out of the zero-day stay population, which was defined as less than 24 hours from patient admit time to discharge time.  When reviewing the initial data for sampling, the relator was surprised to find fewer claims than she expected (which lowered her potential recovery in the case) and later learned that this was due to how the billing system determined the admit time.  So she requested that additional data be produced.  The defendant objected, arguing that the relator sought “at this late hour” to acquire more data and alter the definition of a zero-day stay to include it.

Judge’s Ruling

Noting that “[t]he question of relevancy should be construed liberally and with common sense and discovery should be allowed unless the information sought has no conceivable bearing on the case”, Judge Cooke stated:

“The time-adjusted data is discoverable, for it is indisputably relevant. Evidence is relevant when ‘it has any tendency to make a fact more or less probable than it would be without the evidence’ and ‘the fact is of consequence in determining the action.’…Relator has adequately explained the basis for her belief that the time-adjusted claims properly fall within the data universe for zero-day stays, based upon the guidelines for an inpatient stay and the problem with the Siemens’ ‘admit time.’”

Judge Cooke also noted that the defendant “retain[s] the right, and will have the opportunity, to question or attack the reliability of” the expert and the statistical sampling process.

Judge Cooke also considered whether her November order allowing for statistical sampling permitted the inclusion of the time-adjusted data in the sampling plan.  Based on the definition of a zero-day stay as “a hospital stay of less than 24 hours” (from time of admission), she ruled that “the November order permits inclusion of the time-adjusted claims.”  As a result, she ordered the defendant to “EXPEDITIOUSLY PRODUCE data consistent with relator’s proposal to include the time-adjusted claims” and for the parties to meet and confer to determine the plan for producing the data and finalizing the statistical sampling plan.

So, what do you think?  Was inclusion of the additional data appropriate?  Please share any comments you might have or if you’d like to know more about a particular topic.


Organize Your Collection by Message Thread to Save Costs During Review: eDiscovery Best Practices

This topic came up recently with a client, so I thought it was timely to revisit…

Not only is insanity doing the same thing over and over again and expecting a different result, but in eDiscovery review, it can be even worse when you do get a different result.

One of the biggest challenges when reviewing electronically stored information (ESI) is identifying duplicates so that your reviewers aren’t reviewing the same files again and again.  Not only does that drive up costs unnecessarily, but it could lead to problems if the same file is categorized differently by different reviewers (for example, inadvertent production of a duplicate of a privileged file if it is not correctly categorized).

There are a few ways to identify duplicates.  Exact duplicates (files with exactly the same content in the same file format) can be identified through hash values, which serve as a digital fingerprint of a file’s content.  MD5 and SHA-1 are the most popular hashing algorithms; files with matching hash values are exact duplicates and can be removed from the review population.  Since the same emails are often sent to multiple parties and the same files are stored on different drives, deduplication through hashing can save considerable review costs.
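As a simple illustration of how hash-based deduplication works (a hypothetical sketch, not how any particular eDiscovery product implements it), here’s how file contents can be fingerprinted and duplicates flagged in Python:

```python
import hashlib


def file_hash(path, algorithm="md5", chunk_size=65536):
    """Compute a hash (a "digital fingerprint") of a file's contents."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large ESI files don't have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def deduplicate(paths):
    """Keep the first file seen for each unique hash; flag the rest as duplicates."""
    seen = {}
    duplicates = []
    for path in paths:
        digest = file_hash(path)
        if digest in seen:
            duplicates.append((path, seen[digest]))  # (duplicate, original)
        else:
            seen[digest] = path
    return list(seen.values()), duplicates
```

Swapping `"md5"` for `"sha1"` (or `"sha256"`) changes only the algorithm; the matching logic is identical.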

Sometimes, files are exact (or nearly exact) duplicates in content but not in format.  One example is a Word document published to an Adobe PDF file – the content is the same, but the file format is different, so the hash value will be different.  Near-deduplication can be used to identify files where most or all of the content matches so they can be verified as duplicates and eliminated from review.
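Near-deduplication engines vary by vendor, but one common underlying idea is to compare the extracted text of two files using overlapping word sequences (“shingles”) and a similarity score.  The sketch below is purely illustrative (the function names and the 0.8 threshold are assumptions, not any product’s actual settings):

```python
def shingles(text, k=3):
    """Break normalized text into overlapping k-word sequences ("shingles")."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard_similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two texts' shingle sets: 1.0 means identical content."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    return len(a & b) / len(a | b) if a | b else 1.0


def near_duplicates(text_a, text_b, threshold=0.8):
    """Flag a pair as near-duplicates when their similarity meets the threshold."""
    return jaccard_similarity(text_a, text_b) >= threshold
```

Because the comparison is on extracted text, a Word document and the PDF published from it score as near-identical even though their hash values differ.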

Another way to identify duplicative content is through message thread analysis.  Many email messages are part of a larger discussion, which could be just between two parties or include a number of parties.  Reviewing each email in the discussion thread individually would result in much of the same information being reviewed over and over again.  Instead, message thread analysis pulls those messages together and enables them to be reviewed as an entire discussion.  That includes any side conversations within the discussion that may or may not be related to the original topic (e.g., a side discussion about lunch plans or last night’s episode of The Walking Dead).
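Commercial tools typically thread emails using message headers such as Message-ID and In-Reply-To; as a simplified illustration of the grouping idea (all names below are hypothetical), here’s a sketch that groups messages by normalized subject line:

```python
import re
from collections import defaultdict


def normalize_subject(subject):
    """Strip reply/forward prefixes so 'RE: RE: Budget' and 'Budget' match."""
    return re.sub(r"^\s*((re|fw|fwd)\s*:\s*)+", "", subject,
                  flags=re.IGNORECASE).strip().lower()


def group_by_thread(messages):
    """Group message dicts (with 'subject' and 'sent' keys) into threads, oldest first."""
    threads = defaultdict(list)
    for msg in messages:
        threads[normalize_subject(msg["subject"])].append(msg)
    for msgs in threads.values():
        msgs.sort(key=lambda m: m["sent"])  # chronological order within each thread
    return dict(threads)
```

With messages grouped this way, a reviewer can read each thread as one conversation rather than as a pile of overlapping individual emails.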

CloudNine’s review platform (shameless plug warning!) is one example of an application that provides a mechanism for message thread analysis of Outlook emails that pulls the entire thread into one conversation for review in a popup window.  By doing so, you can focus your review on the last emails in each conversation to see what is said without having to review each email.

With message thread analysis, you can minimize review of duplicative information within emails, saving time and cost and ensuring consistency in the review.

So, what do you think?  Does your review tool support message thread analysis?   Please share any comments you might have or if you’d like to know more about a particular topic.


Here’s a New Dataset Option, Thanks to EDRM: eDiscovery Trends

For several years, the Enron data set (converted to Outlook by the EDRM Data Set team back in November of 2010) has been the only viable set of public domain data available for testing and demonstration of eDiscovery processing and review applications.  Chances are, if you’ve seen a demo of an eDiscovery application in the last few years, it was using Enron data.  Now, the EDRM Data Set team has begun to offer some new dataset options.

Yesterday, EDRM announced the release of the first of its “Micro Datasets.”  As noted in the announcement, the datasets are designed for eDiscovery data testing and process validation. Software vendors, litigation support organizations, law firms and others may use these smaller sets to qualify support, test speed and accuracy in indexing and search, and conduct more forensically oriented analytics exercises throughout the eDiscovery workflow.

The initial offering is a 136.9 MB zip file containing the latest versions of everything from Microsoft Office and Adobe Acrobat files to image files, as well as EDRM-specific work product files and data from public websites. There are even some uncommon formats, including .mbox email storage files and .gz archive files!  The EDRM Dataset group has scoured the internet and found usable, freely available data at universities, government sites and elsewhere, a selection of which is included in the zip file.

The first EDRM Micro Dataset zip file is available now for download here.  While it’s an initial small set, EDRM has promised “advanced” data sets to come.  Those advanced data sets, to be released in the near future, will be available exclusively to EDRM members.  Members will be notified by email with instructions for file downloading.   Organizations interested in EDRM membership will find information at https://www.edrm.net/join/.  Now, there is more reason than ever to join!

So, what do you think?  Are you tired of using the Enron data set and looking forward to alternatives?   If so, today is your lucky day!  Please share any comments you might have or if you’d like to know more about a particular topic.


Got Problems with Your eDiscovery Processes? “SWOT” Them Away: eDiscovery Best Practices

Having recently helped a client put one of these together, it seemed appropriate to revisit this topic…

Understanding the internal and external challenges that your organization faces allows it to approach ongoing and future discovery more strategically.  A “SWOT” analysis is a tool that can be used to develop that understanding.

A “SWOT” analysis is a structured planning method used to evaluate the Strengths, Weaknesses, Opportunities, and Threats associated with a specific business objective, which can be a specific project or all of the activities of a business unit.  It involves specifying that objective and identifying the internal and external factors that are favorable and unfavorable to achieving it.  The SWOT analysis is broken down as follows:

  • Strengths: characteristics of the business or project that give it an advantage over others;
  • Weaknesses: characteristics that place the business or project at a disadvantage relative to others;
  • Opportunities: elements in the environment that the project could exploit to its advantage;
  • Threats: elements in the environment that could cause trouble for the business or project.

“SWOT”, get it?

From an eDiscovery perspective, a SWOT analysis enables you to take an objective look at how your organization handles discovery issues – what you do well and where you need to improve – and the external factors that can affect how your organization addresses its discovery challenges.  The SWOT analysis enables you to assess how your organization handles each phase of the discovery process – from Information Governance to Presentation – to evaluate where your strengths and weaknesses exist so that you can capitalize on your strengths and implement changes to address your weaknesses.

How solid is your information governance program?  How well does your legal department communicate with IT?  How well formalized is your coordination with outside counsel and vendors?  Do you have a formalized process for implementing and tracking litigation holds?  These are examples of questions you might ask about your organization and, based on the answers, identify your organization’s strengths and weaknesses in managing the discovery process.

However, if you only look within your organization, that’s only half the battle.  You also need to look at external factors and how they affect your organization in its handling of discovery issues.  Trends such as the growth of social media, and changes to state or federal rules addressing handling of electronically stored information (ESI) need to be considered in your organization’s strategic discovery plan.

Having worked through the strategic analysis process with several organizations over a number of years, I find that the SWOT analysis is a useful tool for summarizing where the organization currently stands with regard to managing discovery, which naturally identifies areas for improvement that can be addressed.

So, what do you think?  Has your organization performed a SWOT analysis of your discovery process?   Please share any comments you might have or if you’d like to know more about a particular topic.

Graphic source: Wikipedia.


Here are a Few Common Myths About Technology Assisted Review: eDiscovery Best Practices

A couple of years ago, after my annual LegalTech New York interviews with various eDiscovery thought leaders (a list of which can be found here, with links to each interview), I wrote a post about some of the perceived myths that exist regarding Technology Assisted Review (TAR) and what it means to the review process.  After a recent discussion with a client where their misperceptions regarding TAR were evident, it seemed appropriate to revisit this topic and debunk a few myths that others may believe as well.

  1. TAR is New Technology

Actually, with all due respect to the various vendors that each have their own custom algorithm for TAR, the technology behind TAR as a whole is not new.  Ever heard of artificial intelligence?  TAR, in fact, applies artificial intelligence to the review process.  With all of the acronyms we use to describe TAR, here’s one more for consideration: “Artificial Intelligence for Review,” or “AIR.”  It may not catch on, but I like it (much to my disappointment, it didn’t)…

Maybe attorneys would be more receptive to it if they understood it as artificial intelligence.  As Laura Zubulake pointed out in my interview with her, “For years, algorithms have been used in government, law enforcement, and Wall Street.  It is not a new concept.”  With that in mind, Ralph Losey predicts that “The future is artificial intelligence leveraging your human intelligence and teaching a computer what you know about a particular case and then letting the computer do what it does best – which is read at 1 million miles per hour and be totally consistent.”

  2. TAR is Just Technology

Treating TAR as just the algorithm that “reviews” the documents is shortsighted.  TAR is a process that includes the algorithm.  Without a sound approach for identifying appropriate example documents for the collection, ensuring educated and knowledgeable reviewers to appropriately code those documents and testing and evaluating the results to confirm success, the algorithm alone would simply be another case of “garbage in, garbage out” and doomed to fail.  In a post from last week, we referenced Tom O’Connor’s recent post where he quoted Maura Grossman, probably the most recognized TAR expert, who stated that “TAR is a process, not a product.”  True that.

  3. TAR and Keyword Searching are Mutually Exclusive

I’ve talked to some people who think that TAR and keyword searching are mutually exclusive, i.e., that you wouldn’t perform keyword searching on a case where you plan to use TAR.  Not necessarily.  Ralph Losey continues to advocate a “multimodal” approach, describing it as: “more than one kind of search – using TAR, but also using keyword search, concept search, similarity search, all kinds of other methods that we have developed over the years to help train the machine.  The main goal is to train the machine.”

  4. TAR Eliminates Manual Review

Many people (including the New York Times) think of TAR as the death of manual review, with all attorney reviewers being replaced by machines.  Actually, manual review is a part of the TAR process in several aspects, including: 1) Subject matter knowledgeable reviewers are necessary to perform review to create a training set of documents for the technology, 2) After the process is performed, both sets (the included and excluded documents) are sampled and the samples are reviewed to determine the effectiveness of the process, and 3) The resulting responsive set is generally reviewed to confirm responsiveness and also to determine whether the documents are privileged.  Without manual review to train the technology and verify the results, the process would fail.
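As a rough illustration of the sampling step described above (a simplified sketch under assumed names, not a statistically rigorous protocol or any vendor’s actual method), here’s how manually reviewing a random sample of the machine-excluded set can be used to estimate how many responsive documents the process missed:

```python
import random


def sample_for_validation(excluded_docs, sample_size, seed=42):
    """Draw a random sample from the machine-excluded set for manual review."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(excluded_docs, min(sample_size, len(excluded_docs)))


def elusion_rate(sample_codings):
    """Fraction of the excluded-set sample that reviewers coded responsive (1) vs. not (0)."""
    return sum(sample_codings) / len(sample_codings)


def estimated_missed(excluded_count, sample_codings):
    """Project the sample's elusion rate across the entire excluded set."""
    return round(excluded_count * elusion_rate(sample_codings))
```

In practice, the sample size and the resulting confidence interval matter a great deal; the point here is simply that manual review of samples is built into the TAR process, not replaced by it.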

  5. TAR Has to Be Perfect to Be Useful

Detractors of TAR note that it can miss plenty of responsive documents and is nowhere near 100% accurate.  In one recent case, the producing party estimated that as many as 31,000 relevant documents may have been missed by the TAR process.  However, they also estimated that a much more costly manual review would have missed as many as 62,000 relevant documents.

Craig Ball’s analogy about the two hikers who encounter the angry grizzly bear is apt: one hiker doesn’t have to outrun the bear, just the other hiker.  Craig notes: “That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.”

So, what do you think?  Do you agree that these are myths?  Can you think of any others?  Please share any comments you might have or if you’d like to know more about a particular topic.
