Analysis

Three “C”s, Cowboys, Cannibals and Craig (Ball) – eDiscovery Best Practices

They say that a joke is only new if you haven’t heard it before. In that vein, an article about eDiscovery is only new if you haven’t read it before. Craig Ball is currently revisiting some topics that he covered ten years ago with an updated look, making them appropriate for 1) people who weren’t working in eDiscovery ten years ago (which is probably a lot of you), 2) people who haven’t read the articles previously and 3) people who have read the articles previously, but haven’t seen his updated takes.  In other words, everybody.

So far, Craig has published three revisited articles to his terrific Ball in your court blog. They are:

Starting Over, which sets the stage for the series, and covers The DNA of Data, which was the very first Ball in your court column (when it was still in print form). This article discusses how electronic evidence isn’t going away and how technological advances have rendered claims of inaccessible data mostly moot.

Unclear on the Concept (originally published in Law Technology News in May of 2005), which discusses some of the challenges of early concept searching and related tools (when terms like “predictive coding” and “technology assisted review” hadn’t even entered our lexicon yet). Craig also pokes fun at himself for noting back then how he read Alexander Solzhenitsyn and Joyce Carol Oates in grade school. 🙂

Cowboys and Cannibals (originally published in Law Technology News in June of 2005), which discusses the need for a new email “sheriff” in town (not to be confused with U.S. Magistrate Judge John Facciola in this case) to classify emails for easier retrieval. Back then, we didn’t know just how big the challenge of Information Governance would become. His updated take concludes as follows:

“What optimism exists springs from the hope that we will move from the Wild West to Westworld, that Michael Crichton-conceived utopia where robots are gunslingers. The technology behind predictive coding will one day be baked into our IT apps, and much as it serves to protect us from spam today, it will organize our ESI in the future.”

That day is coming, hopefully sooner rather than later. And, you have to love a blog post that references Westworld, which was a terrific story and movie back in the 70s (wonder why nobody has remade that one yet?).

eDiscovery Daily has revisited topics several times as well, especially some of the topics we covered in the early days of the blog, when we didn’t have nearly as many followers. It’s new if you haven’t read it, right? I look forward to future posts in Craig’s series.

So, what do you think? How long have you been reading articles about eDiscovery? Please share any comments you might have or if you’d like to know more about a particular topic.

Image © Metro Goldwyn Mayer

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscoveryDaily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

2014 eDiscovery Case Law Year in Review, Part 3

As we noted yesterday and the day before, eDiscoveryDaily published 93 posts related to eDiscovery case decisions and activities over the past year, covering 68 unique cases! Yesterday, we looked back at cases related to eDiscovery cost sharing and reimbursement, fee disputes and production format disputes. Today, let’s take a look back at cases related to privilege and inadvertent disclosures, requests for social media, cases involving technology assisted review and the case of the year – the ubiquitous Apple v. Samsung dispute.

We grouped those cases into common subject themes and will review them over the next few posts. Perhaps you missed some of these? Now is your chance to catch up!

PRIVILEGE / INADVERTENT DISCLOSURES

There were a couple of cases related to privilege issues, including one where privilege was upheld when the plaintiff purchased the defendant’s seized computer at auction! Here are two cases where disclosure of privileged documents was addressed:

Privilege Not Waived on Defendant’s Seized Computer that was Purchased by Plaintiff at Auction: In Kyko Global Inc. v. Prithvi Info. Solutions Ltd., Washington Chief District Judge Marsha J. Pechman ruled that the defendants did not waive their attorney-client privilege over the computer of one of the defendants that the plaintiffs purchased at public auction, denied the defendants’ motion to disqualify the plaintiffs’ counsel for purchasing the computer, and ordered the plaintiffs to provide the defendants with a copy of the hard drive within three days so that the defendants could review it for privilege and provide the plaintiffs with a privilege log within seven days of the transfer.

Plaintiff Can’t “Pick” and Choose When it Comes to Privilege of Inadvertent Disclosures: In Pick v. City of Remsen, Iowa District Judge Mark W. Bennett upheld the magistrate judge’s order directing the destruction of an inadvertently-produced privileged document, an email from defense counsel to some of the defendants, after affirming the magistrate judge’s analysis of the five-step analysis to determine whether privilege was waived.

SOCIAL MEDIA

Requests for social media data in litigation continue, though there were not as many disputes over it as in years past (at least, not with cases we covered). Here are three cases related to social media data:

Plaintiff Ordered to Produce Facebook Photos and Messages as Discovery in Personal Injury Lawsuit: In Forman v. Henkin, a Motion to Compel was granted in part for a defendant who requested authorization to obtain records of the plaintiff’s private postings to Facebook.

Plaintiff Ordered to Re-Open Social Media Account for Discovery: In Chapman v. Hiland Operating, LLC, while noting that he was “skeptical” that reactivating the plaintiff’s Facebook account would produce any relevant, noncumulative information, North Dakota Magistrate Judge Charles S. Miller ordered the plaintiff to “make a reasonable, good faith attempt” to reactivate her Facebook account.

Order for Financial Records and Facebook Conversations Modified Due to Privacy Rights: In Stallings v. City of Johnston City, Illinois Chief District Judge David R. Herndon modified an earlier order by a magistrate judge in response to the plaintiff’s appeal, claiming that the order violated the privacy rights of the plaintiff, and of minor children with whom the plaintiff had held conversations on Facebook.

TECHNOLOGY ASSISTED REVIEW

Technology assisted review continued to be discussed and debated between parties in 2014, with some disputes involving how technology assisted review would be conducted as opposed to whether it would be conducted at all. Courts continued to endorse technology assisted review and predictive coding, even going so far as to suggest the use of it in one case. Here are six cases involving the use of technology assisted review in 2014:

Court Rules that Unilateral Predictive Coding is Not Progressive: In Progressive Cas. Ins. Co. v. Delaney, Nevada Magistrate Judge Peggy A. Leen determined that the plaintiff’s unannounced shift from the agreed-upon discovery methodology to a predictive coding methodology for privilege review was not cooperative. Therefore, the plaintiff was ordered to produce documents that met agreed-upon search terms without conducting a privilege review first.

Court Rules in Dispute Between Parties Regarding ESI Protocol, Suggests Predictive Coding: In a dispute over ESI protocols in FDIC v. Bowden, Georgia Magistrate Judge G. R. Smith approved the ESI protocol from the FDIC and suggested the parties consider the use of predictive coding.

Court Sides with Defendant in Dispute over Predictive Coding that Plaintiff Requested: In In re Bridgepoint Educ., Inc., Securities Litigation, California Magistrate Judge Jill L. Burkhardt ruled that expanding the scope of discovery by nine months was unduly burdensome, despite the plaintiff’s request for the defendant to use predictive coding to fulfill its discovery obligation. She also approved the defendants’ method of using search terms to identify responsive documents for the three individual defendants already reviewed, directing the parties to meet and confer regarding the additional search terms the plaintiffs requested.

Though it was “Switching Horses in Midstream”, Court Approves Plaintiff’s Predictive Coding Plan: In Bridgestone Americas Inc. v. Int’l Bus. Mach. Corp., Tennessee Magistrate Judge Joe B. Brown, acknowledging that he was “allowing Plaintiff to switch horses in midstream”, nonetheless ruled that the plaintiff could use predictive coding to search documents for discovery, even though keyword search had already been performed.

Court Approves Use of Predictive Coding, Disagrees that it is an “Unproven Technology”: In Dynamo Holdings v. Commissioner of Internal Revenue, Texas Tax Court Judge Ronald Buch ruled that the petitioners “may use predictive coding in responding to respondent’s discovery request” and if “after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time”.

Court Opts for Defendant’s Plan of Review including TAR and Manual Review over Plaintiff’s TAR Only Approach: In Good v. American Water Works, West Virginia District Judge John T. Copenhaver, Jr. granted the defendants’ motion for a Rule 502(d) order that merely encouraged the incorporation and employment of time-saving computer-assisted privilege review over the plaintiffs’ proposal disallowing linear privilege review altogether.

APPLE V. SAMSUNG

Every now and then, there is a case that just has to be covered. Whether it be for the eDiscovery related issues (e.g., adverse inference sanction, inadvertent disclosures, eDiscovery cost reimbursement), the fact that billions of dollars were at stake or the fact that the case earned its own “gate” moniker, the Apple v. Samsung case demanded attention. Here are the six posts (just from 2014; we have more in previous years) about this case:

Quinn Emanuel Sanctioned for Inadvertent Disclosure, Samsung Escapes Sanction: California Magistrate Judge Paul S. Grewal has now handed down an order on motions for sanctions against Samsung and the Quinn Emanuel law firm in the never-ending Apple v. Samsung litigation for the inadvertent disclosure of confidential agreements that Apple had with Nokia, Ericsson, Sharp and Philips – now widely referred to as “patentgate”.

Apple Can’t Mention Inadvertent Disclosure in Samsung Case: Back in January, Quinn Emanuel Urquhart & Sullivan LLP was sanctioned for their inadvertent disclosure in the Apple v. Samsung litigation (commonly referred to as “patentgate”). California Magistrate Judge Paul S. Grewal handed down an order on motions for sanctions against Quinn Emanuel (in essence) requiring the firm to “reimburse Apple, Nokia, and their counsel for any and all costs and fees incurred in litigating this motion and the discovery associated with it”. Many felt that Samsung and Quinn Emanuel got off lightly. Now, Apple can’t even mention the inadvertent disclosure in the upcoming Samsung trial.

Apple Wins Another $119.6 Million from Samsung, But It’s Only 6% of What They Requested: Those of you who have been waiting for significant news to report from the Apple v. Samsung litigation, your wait is over! As reported last week in The Recorder, a California Federal jury ordered Samsung on Friday to pay Apple $119.6 million for infringing three of Apple’s iPhone patents. However, the award was a fraction of the nearly $2.2 billion Apple was requesting.

Samsung and Quinn Emanuel Ordered to Pay Over $2 Million for “Patentgate” Disclosure: Remember the “patentgate” disclosure last year (by Samsung and their outside counsel firm of Quinn Emanuel Urquhart & Sullivan LLP) of confidential agreements that Apple had with Nokia? Did you think they were going to avoid having to pay for that disclosure? The answer is no.

Court Refuses to Ban Samsung from Selling Products Found to Have Infringed on Apple Products: Apple may have won several battles with Samsung, including ultimately being awarded over $1 billion in verdicts, as well as a $2 million sanction for the inadvertent disclosure by Samsung’s outside counsel firm (Quinn Emanuel Urquhart & Sullivan LLP) commonly known as “patentgate”. But Samsung may have won the war with the court’s refusal to ban Samsung from selling products that were found to have infringed on Apple products.

Apple Recovers Part, But Not All, of its Requested eDiscovery Costs from Samsung: Apple won several battles with Samsung, including ultimately being awarded over $1 billion in verdicts, as well as a $2 million sanction for the inadvertent disclosure by Samsung’s outside counsel firm (Quinn Emanuel Urquhart & Sullivan LLP) commonly known as “patentgate”, but ultimately may have lost the war when the court refused to ban Samsung from selling products that were found to have infringed on Apple products. Now, they’re fighting over relative chicken feed: the few million dollars that Apple sought to recover in eDiscovery costs.

Tomorrow, we will cover cases related to the most common theme of the year (three guesses and the first two don’t count). Stay tuned!

So, what do you think? Did you miss any of these? Please share any comments you might have or if you’d like to know more about a particular topic.


EDRM Updates Statistical Sampling Applied to Electronic Discovery Guide – eDiscovery Trends

Over two years ago, we covered EDRM’s initial announcement of a new guide called Statistical Sampling Applied to Electronic Discovery.  Now, they have announced an updated version of the guide.

The release of EDRM’s Statistical Sampling Applied to Electronic Discovery, Release 2, announced last week and published on the EDRM website, is open for public comment until January 9, 2015, after which any input received will be reviewed and considered for incorporation before the updated materials are finalized.

As EDRM notes in their announcement, “The updated materials provide guidance regarding the use of statistical sampling in e-discovery. Much of the information is definitional and conceptual and intended for a broad audience. Other materials (including an accompanying spreadsheet) provide additional information, particularly technical information, for e-discovery practitioners who are responsible for developing further expertise in this area.”

The expanded Guide comprises ten sections (most of which have several sub-sections), as follows:

  1. Introduction
  2. Estimating Proportions within a Binary Population
  3. Acceptance Sampling
  4. Sampling in the Context of the Information Retrieval Grid – Recall, Precision and Elusion
  5. Seed Set Selection in Machine Learning
  6. Guidelines and Considerations
  7. Additional Guidance on Statistical Theory
  8. Calculating Confidence Levels, Confidence Intervals and Sample Sizes
  9. Acceptance Sampling
  10. Examples in the Accompanying Excel Spreadsheet
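The metrics named in Section 4 reduce to simple ratios once documents sampled from the retrieved and discarded sets have been coded. As a quick illustration, here is a minimal Python sketch (the counts are hypothetical, purely for illustration, and are not from the guide):

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Compute recall, precision and elusion from coded counts.

    tp / fp: responsive / non-responsive documents in the retrieved set
    fn / tn: responsive / non-responsive documents in the discarded (null) set
    """
    recall = tp / (tp + fn)      # share of all responsive documents that were retrieved
    precision = tp / (tp + fp)   # share of retrieved documents that are responsive
    elusion = fn / (fn + tn)     # share of discarded documents that are responsive
    return recall, precision, elusion

# Hypothetical coded counts: 100 documents retrieved, 900 discarded
recall, precision, elusion = retrieval_metrics(tp=80, fp=20, fn=20, tn=880)
```

Elusion is the one most often estimated by sampling the discarded set, since it indicates how much responsive material a search or review process left behind.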

The guide ranges from introductory explanations of basic statistical terms (such as sample size, margin of error and confidence level) to more advanced concepts such as the binomial distribution and the hypergeometric distribution.  Bring your brain.
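For instance, the standard sample-size formula for estimating a proportion ties those basic terms together and fits in a few lines of Python (the z-scores are standard normal values; using p = 0.5 as the worst case is a common conservative default, not something prescribed by the guide):

```python
import math

# Standard z-scores for common confidence levels
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(confidence, margin_of_error, p=0.5):
    """Minimum sample size to estimate a proportion at the given
    confidence level and margin of error.  p = 0.5 maximizes
    p * (1 - p), giving the most conservative (largest) sample."""
    z = Z_SCORES[confidence]
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

# 95% confidence, +/- 5% margin of error -> 385 documents,
# essentially regardless of how large the collection is
n = sample_size(0.95, 0.05)
```

This is why "385 documents" shows up so often in sampling discussions: it is the 95%/±5% worst case for a large collection.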

As Section 10 indicates, there is also an accompanying Excel spreadsheet which can be downloaded from the same page, EDRM Statistics Examples 20141023.xlsm, which implements relevant calculations supporting Sections 7, 8 and 9. The spreadsheet was developed using Microsoft Excel 2013 and is an .xlsm file, meaning that it contains VBA code (macros), so you may have to adjust your security settings in order to view and use it.  You’ll also want to read the guide first (especially Sections 7 through 10), as the Excel workbook is a bit cryptic.

Comments can be posted at the bottom of the EDRM Statistical Sampling page, emailed to the group at mail@edrm.net, or submitted via their comment form here.

One thing that I noticed is that the old guide, from April of 2012, is still on the EDRM site.  It might be a good idea to archive that page to avoid confusion with the new guide.

So, what do you think?  Do you perform statistical sampling to verify results within your eDiscovery process?  Please share any comments you might have or if you’d like to know more about a particular topic.


Court Opts for Defendant’s Plan of Review including TAR and Manual Review over Plaintiff’s TAR Only Approach – eDiscovery Case Law

 

In Good v. American Water Works, 2:14-01374 (S.D. W. Vir. Oct. 29, 2014), West Virginia District Judge John T. Copenhaver, Jr. granted the defendants' motion for a Rule 502(d) order that merely encouraged the incorporation and employment of time-saving computer-assisted privilege review over the plaintiffs’ proposal disallowing linear privilege review altogether.

Case Background

In this class action litigation involving the Freedom Industries chemical spill, the parties met and conferred, agreeing on all but one discovery issue: privilege review and 502(d) clawbacks.  The defendants proposed that the Rule 502(d) order merely encourage the incorporation and employment of computer-assisted privilege review, while the plaintiffs proposed that the order “limit privilege review to what a computer can accomplish, disallowing linear (aka ‘eyes on’) privilege review altogether”.

The plaintiffs would agree only to a pure quick peek/claw-back arrangement, which would place never-reviewed, never privilege-logged documents in their hands as quickly as physically possible at the expense of any opportunity for care on the part of a producing party to protect a client's privileged and work product protected information.  On the other hand, the defendants did not wish to forego completely the option to manually review documents for privilege and work product protection. 

The plaintiffs argued that if they were to proceed with a manual privilege review, then only 502(b) protection – the inadvertent waiver rule – should apply, and not 502(d) protection, which offers more expansive protection against privilege waivers.

Judge’s Ruling

Judge Copenhaver noted that “[t]he defendants have chosen a course that would allow them the opportunity to conduct some level of human due diligence prior to disclosing vast amounts of information, some portion of which might be privileged. They also appear to desire a more predictable clawback approach without facing the uncertainty inherent in the Rule 502(b) factoring analysis. Nothing in Rule 502 prohibits that course. And the parties need not agree in order for that approach to be adopted”.

Therefore, despite the fact that the plaintiffs were “willing to agree to an order that provides that the privilege or protection will not be waived and that no other harm will come to the Defendants if Plaintiffs are permitted to see privileged or work product protected documents”, Judge Copenhaver ruled that “[i]nasmuch as defendants' cautious approach is not prohibited by the text of Rule 502, and they appear ready to move expeditiously in producing documents in the case, their desired approach is a reasonable one.”  As a result, he entered their proposed Rule 502(d) order, “with the expectation that the defendants will marshal the resources necessary to assure that the delay occasioned by manual review of portions of designated categories will uniformly be minimized so that disclosure of the entirety of even the most sensitive categories is accomplished quickly.”

So, what do you think?  Should the defendants have retained the right to manual review or should the plaintiffs’ proposed approach have been adopted?  Please share any comments you might have or if you’d like to know more about a particular topic.


How Mature is Your Organization in Handling eDiscovery? – eDiscovery Best Practices

A new self-assessment resource from EDRM helps you answer that question.

A few days ago, EDRM announced the release of the EDRM eDiscovery Maturity Self-Assessment Test (eMSAT-1), the “first self-assessment resource to help organizations measure their eDiscovery maturity” (according to their press release linked here).

As stated in the press release, eMSAT-1 is a downloadable Excel workbook containing 25 worksheets (actually 27 worksheets when you count the Summary sheet and the List sheet of valid choices at the end) organized into seven sections covering various aspects of the e-discovery process. Complete the worksheets and the assessment results are displayed in summary form at the beginning of the spreadsheet.  eMSAT-1 is the first of several resources and tools being developed by the EDRM Metrics group, led by Clark and Dera Nevin, with assistance from a diverse collection of industry professionals, as part of an ambitious Maturity Model project.

The seven sections covered by the workbook are:

  1. General Information Governance: Contains ten questions to answer regarding your organization’s handling of information governance.
  2. Data Identification, Preservation & Collection: Contains five questions to answer regarding your organization’s handling of these “left side” phases.
  3. Data Processing & Hosting: Contains three questions to answer regarding your organization’s handling of processing, early data assessment and hosting.
  4. Data Review & Analysis: Contains two questions to answer regarding your organization’s handling of search and review.
  5. Data Production: Contains two questions to answer regarding your organization’s handling of production and protecting privileged information.
  6. Personnel & Support: Contains two questions to answer regarding your organization’s hiring, training and procurement processes.
  7. Project Conclusion: Contains one question to answer regarding your organization’s processes for managing data once a matter has concluded.

Each question is a separate sheet, with five answers ranked from 1 to 5 to reflect your organization’s maturity in that area (with descriptions to associate with each level of maturity).  The default value for each question is 1.  The five answers are:

  • 1: No Process, Reactive
  • 2: Fragmented Process
  • 3: Standardized Process, Not Enforced
  • 4: Standardized Process, Enforced
  • 5: Actively Managed Process, Proactive

Once you answer all the questions, the Summary sheet shows your overall average, as well as your average for each section.  It’s an easy workbook to use, with input areas defined by cells in yellow.  The whole workbook is editable, so perhaps the next edition could lock down the calculated cells.  Nonetheless, the workbook is intuitive and provides a nice exercise for an organization to grade its level of eDiscovery maturity.
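The scoring behind the Summary sheet is simple averaging, which is easy to mirror outside of Excel. A hypothetical sketch in Python (the section names echo the seven areas above, the answer values are invented, and treating the overall score as a plain average of all 25 answers is our assumption about how the workbook calculates it):

```python
# Hypothetical eMSAT-1 answers (1-5), keyed by section
answers = {
    "Information Governance": [3, 2, 4, 3, 2, 3, 1, 2, 3, 2],
    "Identification, Preservation & Collection": [3, 3, 2, 4, 3],
    "Processing & Hosting": [2, 3, 3],
    "Review & Analysis": [4, 3],
    "Production": [3, 3],
    "Personnel & Support": [2, 2],
    "Project Conclusion": [1],
}

# Average per section, as shown on the Summary sheet
section_averages = {name: sum(s) / len(s) for name, s in answers.items()}

# Overall average across all 25 questions
all_scores = [score for scores in answers.values() for score in scores]
overall = sum(all_scores) / len(all_scores)
```

An organization scoring mostly 2s and 3s, as in this made-up example, would land in the "Fragmented Process" to "Standardized Process, Not Enforced" range.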

You can download a copy of the eMSAT-1 Excel workbook from here, as well as get more information on how to use it (the page also describes how to provide feedback to make the next iterations even better).

The EDRM Maturity Model Self-Assessment Test is the fourth release in recent months by the EDRM Metrics team. In June 2013, the new Metrics Model was released; in November 2013, a supporting glossary of terms for the Metrics Model was published; and in November 2013, the EDRM Budget Calculators project kicked off (with four calculators covered by us here, here, here and here).  They’ve been busy.

So, what do you think?  How mature is your organization in handling eDiscovery?  Please share any comments you might have or if you’d like to know more about a particular topic.


Court Approves Use of Predictive Coding, Disagrees that it is an “Unproven Technology” – eDiscovery Case Law

 

In Dynamo Holdings v. Commissioner of Internal Revenue, Docket Nos. 2685-11, 8393-12 (U.S. Tax Ct. Sept 17, 2014), Texas Tax Court Judge Ronald Buch ruled that the petitioners “may use predictive coding in responding to respondent's discovery request” and if “after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time”.

The cases involved various transfers from one entity to a related entity where the respondent determined that the transfers were disguised gifts to the petitioner's owners and the petitioners asserted that the transfers were loans.

The respondent requested that the petitioners produce the electronically stored information (ESI) contained on two specified backup storage tapes or simply produce the tapes themselves. The petitioners asserted that it would "take many months and cost at least $450,000 to do so", requesting that the Court deny the respondent's motion as a "fishing expedition" in search of new issues that could be raised in these or other cases. Alternatively, the petitioners requested that the Court let them use predictive coding to efficiently and economically identify the non-privileged information responsive to the respondent's discovery request.  The respondent opposed the petitioners' request to use predictive coding, calling it "unproven technology", and added that the petitioners could simply give him access to all data on the two tapes and preserve the right (through a "clawback agreement") to later claim that some or all of the data is privileged.

Judge Buch called the request to use predictive coding “somewhat unusual” and stated that “although it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery… Yet that is, in essence, what the parties are asking the Court to consider – whether document review should be done by humans or with the assistance of computers. Respondent fears an incomplete response to his discovery. If respondent believes that the ultimate discovery response is incomplete and can support that belief, he can file another motion to compel at that time.”

With regard to the respondent’s categorization of predictive coding as “unproven technology”, Judge Buch stated “We disagree. Although predictive coding is a relatively new technique, and a technique that has yet to be sanctioned (let alone mentioned) by this Court in a published Opinion, the understanding of e-discovery and electronic media has advanced significantly in the last few years, thus making predictive coding more acceptable in the technology industry than it may have previously been. In fact, we understand that the technology industry now considers predictive coding to be widely accepted for limiting e-discovery to relevant documents and effecting discovery of ESI without an undue burden.”

As a result, Judge Buch ruled that “[p]etitioners may use predictive coding in responding to respondent's discovery request. If, after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time.”

So, what do you think?  Should predictive coding have been allowed in this case?  Please share any comments you might have or if you’d like to know more about a particular topic.


Good Processing Requires a Sound Process – Best of eDiscovery Daily

Home at last!  Today, we are recovering from our trip, after arriving back home one day late and without our luggage.  Satan, thy name is Lufthansa!  Anyway, for these past two weeks (except for Jane Gennarelli’s Throwback Thursday series), we have been re-publishing some of our more popular and frequently referenced posts.  Today’s post is a topic that comes up often with our clients.  Enjoy!  New posts next week!

As we discussed Wednesday, working with electronic files in a review tool is NOT simply a matter of loading the files and getting started.  Electronic files are diverse and can represent a whole collection of issues to address in order to process them for loading.  To address those issues effectively, processing requires a sound process.

eDiscovery providers like (shameless plug warning!) CloudNine Discovery process electronic files regularly to enable their clients to work with those files during review and production.  As a result, we are aware of the information that must be provided by the client to ensure that the resulting processed data meets their needs, and we have created an EDD processing spec sheet to gather that information before processing.  Examples of information we collect from our clients:

  • Do you need de-duplication?  If so, should it be performed at the case or the custodian level?
  • Should Outlook emails be extracted in MSG or HTM format?
  • What time zone should we use for email extraction?  Typically, it’s the local time zone of the client or Greenwich Mean Time (GMT).  If you don’t think that matters, consider this example.
  • Should we perform Optical Character Recognition (OCR) for image-only files that don’t have corresponding text?  If we don’t OCR those files, responsive documents could be missed during searching.
  • If any password-protected files are encountered, should we attempt to crack those passwords or log them as exception files?
  • Should the collection be culled based on a responsive date range?
  • Should the collection be culled based on key terms?
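
The time-zone question above trips people up more than any other.  Here’s a minimal sketch (with a made-up timestamp) showing how the same email can carry two different sent dates depending on the extraction time zone:

```python
# A hypothetical email sent at 11:30 PM US Central time on January 31st...
from datetime import datetime, timezone, timedelta

central = timezone(timedelta(hours=-6))
sent = datetime(2014, 1, 31, 23, 30, tzinfo=central)

# ...is already February 1st when expressed in GMT.
sent_gmt = sent.astimezone(timezone.utc)

print(sent.date())      # 2014-01-31
print(sent_gmt.date())  # 2014-02-01
```

If the case involves a responsive date range cutoff of January 31st, the time-zone choice alone determines whether that email is in or out.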

Those are some general examples for native processing.  If the client requests creation of image files (many still do, despite the well documented advantages of native files), there are a number of additional questions we ask regarding the image processing.  Some examples:

  • Generate as single-page TIFF, multi-page TIFF, text-searchable PDF or non-text-searchable PDF?
  • Should color images be created when appropriate?
  • Should we generate placeholder images for unsupported or corrupt files that cannot be repaired?
  • Should we create images of Excel files?  If so, we proceed to ask a series of questions about formatting preferences, including orientation (portrait or landscape), scaling options (auto-size columns or fit to page), printing gridlines, printing hidden rows/columns/sheets, etc.
  • Should we endorse the images?  If so, how?
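
To give a feel for how those answers add up, here’s a hypothetical sketch of an imaging spec captured as a structured request.  The field names are purely illustrative, not any particular vendor’s format:

```python
# Illustrative imaging spec: the answers to the questions above, as data.
image_spec = {
    "format": "single_page_tiff",   # or multi_page_tiff, pdf_searchable, pdf_image_only
    "color_when_appropriate": True,
    "placeholder_for_unsupported": True,
    "image_excel_files": True,
    "excel_options": {
        "orientation": "landscape",
        "scaling": "fit_to_page",
        "print_gridlines": False,
        "print_hidden": False,
    },
    "endorsement": {"position": "bottom_right", "template": "{bates_number}"},
}

def validate_spec(spec):
    """Check that the spec answers the minimum set of imaging questions."""
    required = {"format", "color_when_appropriate", "placeholder_for_unsupported"}
    missing = required - spec.keys()
    if missing:
        raise ValueError(f"unanswered spec questions: {sorted(missing)}")
    return True

validate_spec(image_spec)  # raises if any required question went unanswered
```

The point of a spec sheet, in other words, is exactly what `validate_spec` does: make sure no question goes unanswered before processing starts.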

Those are just some examples.  Questions about print format options for Excel, Word and PowerPoint take up almost a full page by themselves – there are a lot of formatting options for those files and we identify default parameters that we typically use.  Don’t get me started.

We also ask questions about load file generation (if the data is not being loaded into our own review tool, OnDemand®), including what load file format is preferred and parameters associated with the desired load file format.

This isn’t a comprehensive list of questions we ask, just a sample to illustrate how many decisions must be made to effectively process electronic data.  Processing data is not just a matter of feeding native electronic files into the processing tool and generating results, it requires a sound process to ensure that the resulting output will meet the needs of the case.

So, what do you think?  How do you handle processing of electronic files?  Please share any comments you might have or if you’d like to know more about a particular topic.

P.S. – No hamsters were harmed in the making of this blog post.


The Files are Already Electronic, How Hard Can They Be to Load? – Best of eDiscovery Daily

Come fly with me!  Today we are winding our way back home from Paris, by way of Frankfurt.  For the next two weeks except for Jane Gennarelli’s Throwback Thursday series, we will be re-publishing some of our more popular and frequently referenced posts.  Today’s post is a topic that relates to a question that I get asked often.  Enjoy!

Since hard copy discovery became electronic discovery, I’ve worked with a number of clients who expect that working with electronic files in a review tool is simply a matter of loading the files and getting started.  Unfortunately, it’s not that simple!

Back when most discovery was paper based, the usefulness of the documents was understandably limited.  Documents were paper and they all required conversion to image to be viewed electronically, optical character recognition (OCR) to capture their text (though not 100% accurately) and coding (i.e., data entry) to capture key data elements (e.g., author, recipient, subject, document date, document type, names mentioned, etc.).  It was a problem, but it was a consistent problem – all documents needed the same treatment to make them searchable and usable electronically.

Though electronic files are already electronic, that doesn’t mean that they’re ready for review as is.  They don’t just represent one problem, they can represent a whole collection of problems.  For example:

  • Container files (such as ZIP archives and Outlook PST files) have to be extracted, with emails and their attachments kept associated with each other.
  • Password-protected or corrupt files may not open at all without special handling.
  • Image-only files (such as scanned PDFs) have no searchable text until OCR is performed.
  • Duplicates of the same file often exist across multiple custodians.
  • Metadata such as sent dates can vary depending on the time zone applied during extraction.

These are just a few examples of why working with electronic files for review isn’t necessarily straightforward.  Of course, when processed correctly, electronic files include considerable metadata that provides useful information about how and when the files were created and used, and by whom.  They’re way more useful than paper documents.  So, it’s still preferable to work with electronic files instead of hard copy files whenever they are available.  But, despite what you might think, that doesn’t make them ready to review as is.
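
As a rough illustration of that metadata point, even the file system alone tracks information about an electronic file that a banker’s box of paper never could.  A minimal Python sketch, using a throwaway temporary file:

```python
# Even without opening a file, the file system records its size and
# timestamps -- metadata a paper document simply doesn't carry.
import os
import tempfile
from datetime import datetime, timezone

with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as f:
    f.write(b"draft memo")
    path = f.name

info = os.stat(path)
print("size (bytes):", info.st_size)
print("last modified:", datetime.fromtimestamp(info.st_mtime, tz=timezone.utc))
os.remove(path)
```

And that’s just the file system; application-level metadata (author, track changes, embedded objects, email headers) goes far deeper.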

So, what do you think?  Have you encountered difficulties or challenges when processing electronic files?  Please share any comments you might have or if you’d like to know more about a particular topic.


Is Technology Assisted Review Older than the US Government? – eDiscovery Trends

A lot of people consider Technology Assisted Review (TAR) and Predictive Coding (PC) to be new technology.  We attempted to debunk that myth last year after our third annual thought leader interview series by summarizing comments from some of the thought leaders, who noted that TAR and PC really just apply artificial intelligence to the review process.  But the foundation for TAR may go way farther back than you might think.

In the BIA blog post Technology Assisted Review: It’s not as new as you think it is, Robin Athlyn Thompson and Brian Schrader take a look at the origins of at least one theory behind TAR: the “Naive Bayes classifier”, which is based on a theorem that was essentially introduced to the public in 1812.  But the theorem itself existed quite a bit earlier than that.

Bayes’s theorem is named after Rev. Thomas Bayes (who died in 1761), who first showed how to use new evidence to update beliefs. He lived so long ago that no widely accepted portrait of him is known to exist.  His friend Richard Price edited and presented the work in 1763, after Bayes’s death, as An Essay towards solving a Problem in the Doctrine of Chances.  Bayes’s theorem remained largely unknown until it was independently rediscovered and further developed by Pierre-Simon Laplace, who first published the modern formulation in his 1812 Théorie analytique des probabilités (Analytic Theory of Probabilities).
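
To make the connection to document review concrete, here is a toy sketch of a Naive Bayes “relevance” classifier: the same basic math Bayes and Laplace worked out, applied to sorting documents.  The training documents and labels are invented for illustration, and real TAR tools are far more sophisticated:

```python
# Toy Naive Bayes: P(label | words) is proportional to
# P(label) * product of P(word | label) over the document's words.
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs; returns per-label word counts."""
    counts = {"relevant": Counter(), "not_relevant": Counter()}
    labels = Counter()
    for text, label in docs:
        labels[label] += 1
        counts[label].update(text.lower().split())
    return counts, labels

def classify(text, counts, labels):
    total = sum(labels.values())
    best, best_score = None, -math.inf
    for label in labels:
        vocab = sum(counts[label].values()) + len(counts[label]) + 1
        score = math.log(labels[label] / total)
        for word in text.lower().split():
            # "Add one" (Laplace) smoothing for unseen words --
            # fittingly, another of Laplace's contributions.
            score += math.log((counts[label][word] + 1) / vocab)
        if score > best_score:
            best, best_score = label, score
    return best

docs = [
    ("contract breach damages", "relevant"),
    ("breach of the licensing contract", "relevant"),
    ("lunch menu friday", "not_relevant"),
    ("office lunch party", "not_relevant"),
]
counts, labels = train(docs)
print(classify("contract damages claim", counts, labels))  # relevant
print(classify("friday lunch", counts, labels))            # not_relevant
```

Two hundred years on, the core idea is unchanged: use the words you’ve already labeled to update your belief about the documents you haven’t.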

Thompson and Schrader go on to discuss more recent uses of artificial intelligence algorithms to map trends, including Amazon’s More Like This functionality, which recommends other items that you may like based on previous purchases.  That technology has been around for nearly two decades – can you believe it’s been that long? – and is one of the key factors in Amazon’s success over that time.

So, don’t scoff at the use of TAR because it’s “new technology”; that thinking is “naïve”.  Some of the foundational statistical theories for TAR go further back than the birth of our country.

So, what do you think?  Has your organization used technology assisted review on a case yet?  Please share any comments you might have or if you’d like to know more about a particular topic.


Though it was “Switching Horses in Midstream”, Court Approves Plaintiff’s Predictive Coding Plan – eDiscovery Case Law

In Bridgestone Americas Inc. v. Int’l Bus. Mach. Corp., No. 3:13-1196 (M.D. Tenn. July 22, 2014), Tennessee Magistrate Judge Joe B. Brown, acknowledging that he was “allowing Plaintiff to switch horses in midstream”, nonetheless ruled that the plaintiff could use predictive coding to search documents for discovery, even though keyword search had already been performed.

In this case, where the plaintiff sued the defendant over a $75 million computer system that it claimed threw its “entire business operation into chaos”, the plaintiff requested that the court allow the use of predictive coding in reviewing over two million documents.  The defendant objected, noting that the request was an unwarranted change to the original case management order, which did not include predictive coding, and that it would be unfair to use predictive coding after an initial screening had been done with keyword search terms.

Judge Brown conducted a lengthy telephone conference with the parties on June 25 and began the analysis in his order by observing that “[p]redictive coding is a rapidly developing field in which the Sedona Conference has devoted a good deal of time and effort to, and has provided various best practices suggestions”, also noting that “Magistrate Judge Peck has written an excellent article on the subject and has issued opinions concerning predictive coding.”  “In the final analysis”, Judge Brown continued, “the uses of predictive coding is a judgment call, hopefully keeping in mind the exhortation of Rule 26 that discovery be tailored by the court to be as efficient and cost-effective as possible.”

As a result, noting that “we are talking about millions of documents to be reviewed with costs likewise in the millions”, Judge Brown permitted the plaintiff “to use predictive coding on the documents that they have presently identified, based on the search terms Defendant provided.”  Judge Brown acknowledged that he was “allowing Plaintiff to switch horses in midstream”, so “openness and transparency in what Plaintiff is doing will be of critical importance.”

This case has similar circumstances to Progressive Cas. Ins. Co. v. Delaney, where that plaintiff also desired to shift from the agreed upon discovery methodology for privilege review to a predictive coding methodology.  However, in that case, the plaintiff did not consult with either the court or the requesting party regarding their intentions to change review methodology and the plaintiff’s lack of transparency and lack of cooperation resulted in the plaintiff being ordered to produce documents according to the agreed upon methodology.  It pays to cooperate!

So, what do you think?  Should the plaintiff have been allowed to shift from the agreed upon methodology or did the volume of the collection warrant the switch?  Please share any comments you might have or if you’d like to know more about a particular topic.
