
eDiscovery Trends: Sampling within eDiscovery Software

Those of you who have been following this blog since early last year may remember that we published a three-part series on testing your eDiscovery searches using sampling (as part of the “STARR” approach discussed on this blog about a year ago).  We discussed how to determine the appropriate sample size to test your search, using a sample size calculator (freely available on the web).  We also discussed how to make sure the sample is randomly selected (again referencing a freely available site for generating the random set).  We even walked through an example of how you can test and refine a search using sampling, saving tens of thousands of dollars in review costs with defensible results.

It’s even better when the eDiscovery ECA or review software you’re using handles that process for you, instead of requiring you to go to all of these external sites to manually size and generate your random sample set.  The latest version of FirstPass®, powered by Venio FPR™, does exactly that.  Version 3.5.1.2 of FirstPass introduces a sampling module with a wizard that walks you through creating a sample set to review to test your searches.  What could be easier?

The wizard begins with a dialog for selecting the sampling population.  You can choose documents associated with one or more tags, documents in saved search results, documents from one or more selected custodians, or all documents in the database.  When choosing tags, you can select documents with ANY of the selected tags, ALL of the selected tags, or even documents NOT in the selected tags (enabling you, for example, to test the documents not tagged as responsive to confirm that responsive documents weren’t missed by your search).

You can then specify your confidence level (e.g., 95%) and confidence interval (a.k.a. margin of error, e.g., 4%) using slider bars.  As you slide the bars to the desired levels, the application shows you how they affect the size of the sample to be retrieved.  You then name the sample, describe its purpose, and indicate whether you want to view the sample set immediately, tag it, or place it into a folder.  Once you’ve identified the preferred option for handling your sample set, the wizard displays a summary form of your choices.  When you click the Finish button, it creates the sample and shows you a form summarizing what it did.  Then, if you chose to view the sample set immediately, it displays the sample set (if not, you can retrieve the tag or folder containing your sample set).
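
For those curious about the math behind those sliders, here is a minimal sketch (in Python) of how a sample size calculator typically arrives at its number and how a random set of document IDs can then be drawn.  It assumes the standard formula for a proportion at a worst-case p = 0.5 with a finite population correction; the population size and document IDs are hypothetical, and this illustrates the general approach rather than FirstPass’s actual implementation.

```python
import math
import random
from statistics import NormalDist

def sample_size(population: int, confidence: float = 0.95,
                margin_of_error: float = 0.04, p: float = 0.5) -> int:
    """Sample size for a proportion, with a finite population correction."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)        # ~1.96 for 95% confidence
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2    # infinite-population estimate
    n = n0 / (1 + (n0 - 1) / population)                  # adjust for the finite population
    return math.ceil(n)

# Hypothetical population of 100,000 documents returned by a search
doc_ids = list(range(1, 100_001))
n = sample_size(len(doc_ids))          # ~597 documents at 95% confidence, +/- 4%
sample = random.sample(doc_ids, n)     # randomly selected IDs to tag and review
```

Tightening the margin of error or raising the confidence level increases the sample size, which is exactly what the wizard’s sliders show interactively.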

Managing this process within the software saves the considerable time otherwise spent outside the application identifying the sample size and creating a randomly selected set of IDs, then going back into the application to retrieve and tag those items as belonging to the sample set (which is how I used to do it).  The end result is simplified and streamlined.

So, what do you think?  Is sample set generation within the ECA or review tool a useful feature?  Please share any comments you might have or if you’d like to know more about a particular topic.

Full disclosure: I work for CloudNine Discovery, which provides SaaS-based eDiscovery review applications FirstPass® (for first pass review) and OnDemand® (for linear review and production).

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Our 2012 Predictions


Yesterday, we evaluated what others are saying and noted popular eDiscovery prediction trends for the coming year.  It’s interesting to identify the common trends among the prognosticators, as well as the unique predictions.

But we promised our own predictions for today, so here they are.  One of the nice things about writing and editing a daily eDiscovery blog is that it forces you to stay abreast of what’s going on in the industry.  Based on the numerous stories we’ve read (many of which we’ve also written about), and in David Letterman “Top 10” fashion, here are our eDiscovery predictions for 2012:

  • Still More ESI in the Cloud: Frankly, this is like predicting “the Sun will be hot in 2012”.  Given the predictions of cloud growth by Forrester and Gartner, it seems inevitable that organizations will continue to migrate more data and applications to “the cloud”.  Even if some organizations continue to resist the cloud movement, they still have to address the continued growth in business use of social media sites (which, last I checked, are based in the cloud).  It’s inevitable.
  • More eDiscovery Technology in the Cloud As Well: We will continue to see more cloud offerings for eDiscovery technology, ranging from information governance to preservation and collection to review and production.  With the need for corporations to share potentially responsive ESI with one or more outside counsel firms, experts and even opposing counsel, cloud-based Software-as-a-Service (SaaS) applications are a logical choice for sharing that information effortlessly without having to buy software and hardware or provide the infrastructure to do so.  Every year at LegalTech, there seem to be a few more eDiscovery cloud providers, and this year should be no different.
  • Self-Service in the Cloud: So, organizations are seeing the benefits of the cloud not only for storing ESI, but also for managing it during Discovery.  It’s the cost-effective alternative.  But organizations are also demanding the control of a desktop application within their eDiscovery applications.  Loading your own data, adding your own users and maintaining their rights, and creating your own data fields are just a few of the capabilities that organizations expect to handle themselves.  And more providers are responding to those needs.  That trend will continue this year.
  • Technology Assisted Review: This was the most popular prediction among the pundits we reviewed.  The amount of data in the world continues to explode: there were an estimated 988 exabytes in the whole world as of 2010, and Cisco predicts that IP traffic over data networks will reach 4.8 zettabytes (each zettabyte is 1,000 exabytes) by 2015.  That’s nearly five times the data in five years.  Even in smaller cases, there’s simply too much data not to use technology to get through it all.  Whether it’s predictive coding, conceptual clustering or some other technology, it’s required to enable attorneys to manage the review more effectively and efficiently.
  • Greater Adoption of eDiscovery Technology for Smaller Cases: Since each gigabyte of data can contain between 50,000 and 100,000 pages, a “small” case of 4 GB (or two max-size PST files in Outlook® 2003) can still be 300,000 pages or more.  As “small” cases are no longer that small, attorneys are forced to embrace eDiscovery technology for the smaller cases as well.  And eDiscovery providers are taking note.
  • Continued Focus on International eDiscovery:  So, cases are larger and there’s more data in the cloud, which leads to more cases where Discovery of ESI internationally becomes an issue.  In December, The Sedona Conference® issued the Public Comment Version of The Sedona Conference® International Principles on Discovery, Disclosure & Data Protection: Best Practices, Recommendations & Principles for Addressing the Preservation & Discovery of Protected Data in U.S. Litigation, illustrating how important an issue this is becoming for eDiscovery.
  • Prevailing Parties Awarded eDiscovery Costs: Shifting to the courtroom, we have started to see more cases where the prevailing party is awarded their eDiscovery costs as part of their award.  As organizations have pushed for more proportionality in the Discovery process, courts have taken it upon themselves to impose that proportionality through taxing the “losers” for reimbursement of costs, causing prevailing defendants to say: “Sue me and lose?  Pay my costs!”.
  • Continued Efforts and Progress on Rules Changes: Speaking of proportionality, there will be continued efforts and progress on changes to the Federal Rules of Civil Procedure as organizations push for clarity on preservation and other obligations to attempt to bring spiraling eDiscovery costs under control.  It will take time, but progress will be made toward that goal this year.
  • Greater Price/Cost Control Pressure on eDiscovery Services: In the meantime, while waiting for legislative relief, organizations will expect the cost for eDiscovery services to be more affordable and predictable.  In order to accommodate larger amounts of data, eDiscovery providers will need to offer simplified and attractive pricing alternatives.
  • Big Player Consolidation Continues, But Plenty of Smaller Players Available: In 2011, we saw HP acquire Autonomy and Symantec acquire Clearwell, continuing a trend of acquisitions of the “big players” in the industry.  This trend will continue, but there is still plenty of room for the “little guy” as smaller providers have been pooling resources to compete, creating an interesting dichotomy in the industry of few big and many small providers in eDiscovery.

So, what do you think?  Care to offer your own predictions?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Trends: 2012 Predictions – By The Numbers

With a nod to Nick Bakay, “It’s all so simple when you break things down scientifically.”

The late December/early January time frame is always when various people in eDiscovery make their annual predictions as to what trends to expect in the coming year.  I know what you’re thinking – “oh no, not another set of eDiscovery predictions!”  However, at eDiscovery Daily, we do things a little bit differently.  We like to take a look at other predictions and see if we can spot some common trends among those before offering some of our own (consider it the ultimate “cheat sheet”).  So, as I did last year, I went “googling” for 2012 eDiscovery predictions, and organized the predictions into common themes.  I found eDiscovery predictions here, here, here, here, here, here and Applied Discovery.  Oh, and also here, here and here.  Ten sets of predictions in all!  Whew!

A couple of quick comments: 1) Not all of these are from the original sources, but the links above attribute the original sources when they are re-prints.  If I have failed to accurately attribute the original source for a set of predictions, please feel free to comment.  2) This is probably not an exhaustive list of predictions (I have other duties in my “day job”, so I couldn’t search forever), so I apologize if I’ve left anybody’s published predictions out.  Again, feel free to comment if you’re aware of other predictions.

Here are some of the common themes:

  • Technology Assisted Review: Nine out of ten “prognosticators” (up from two out of seven last year) predicted greater emphasis on and adoption of technology-assisted approaches.  While some equate technology assisted review with predictive coding, other technology approaches such as conceptual clustering are also increasing in popularity.  Clearly, as the amount of data associated with the typical litigation rises dramatically, technology is playing a greater role in enabling attorneys to manage the review more effectively and efficiently.
  • eDiscovery Best Practices Combining People and Technology: Seven out of ten “augurs” also had predictions related to various themes associated with eDiscovery best practices, especially processes that combine people and technology.  Some have categorized it as a “maturation” of the eDiscovery process, with corporations becoming smarter about eDiscovery and integrating it into core business practices.  We’ve had numerous posts regarding eDiscovery best practices in the past year; click here for a selection of them.
  • Social Media Discovery: Six “pundits” forecasted a continued growth in sources and issues related to social media discovery.  Bet you didn’t see that one coming!  For a look back at cases from 2011 dealing with social media issues, click here.
  • Information Governance: Five “soothsayers” presaged various themes related to the promotion of information governance practices and programs, ranging from a simple “no more data hoarding” to an “emergence of Information Management platforms”.  For our posts related to Information Governance and management issues, click here.
  • Cloud Computing: Five “mediums” (but are they happy mediums?) predict that ESI and eDiscovery will continue to move to the cloud.  Frankly, given the predictions of cloud growth by Forrester and Gartner, I’m surprised that there were only five predictions.  Perhaps predicting growth of the cloud has become “old hat”.
  • Focus on eDiscovery Rules / Court Guidance: Four “prophets” (yes, I still have my thesaurus!) expect courts to provide greater guidance on eDiscovery best practices in the coming year via a combination of case law and pilot programs/model orders to establish expectations up front.
  • Complex Data Collection: Four “psychics” also predicted that data collection will continue to become more complex as data sources abound, the custodian-based collection model comes under stress and self-collection gives way to more automated techniques.

The “others receiving votes” category (three predicting each of these) included cost shifting and increased awards of eDiscovery costs to the prevailing party in litigation, flexible eDiscovery pricing and predictable or reduced costs, continued focus on international discovery, and continued debate on potential new eDiscovery rules.  Two each predicted continued consolidation of eDiscovery providers, de-emphasis on the use of backup tapes, de-emphasis on the use of eMail, multi-matter eDiscovery management (to leverage knowledge gained in previous cases), risk assessment/statistical analysis and more single-platform solutions.  And one predicted more action on eDiscovery certifications.

Some interesting predictions.  Tune in tomorrow for ours!

So, what do you think?  Care to offer your own “hunches” from your crystal ball?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Year in Review: eDiscovery Case Law, Part 4


As we noted over the past three days, eDiscovery Daily has published 65 posts related to eDiscovery case decisions and activities over the past year, covering 50 unique cases!  Yesterday, we looked back at cases related to discovery of social media.  Today, one final set of cases to review.

We grouped those cases into common subject themes and have been reviewing them over these past few posts.  Perhaps you missed some of them?  Now is your chance to catch up!

SANCTIONS / SPOLIATION

Behold the king!  I’ll bet you won’t be surprised that the topic with the largest number of eDiscovery case law decisions (by far!) is sanctions and spoliation.  Late in 2010, eDiscovery Daily reported on a Duke Law Journal article indicating that sanctions were at an all-time high, and the number of cases with sanction awards remains high.

Of the 50 cases we covered this past year, over a third (17 cases) related to sanctions and spoliation issues.  Here they are.  And, as you’ll see from the first case (and a few others), requested sanctions are not always granted.  Then again, sometimes both sides get sanctioned!

No Sanctions for Scrubbing Computers Assumed to be Imaged.  In this case, relevant data was lost when computers were scrubbed and sold by the defendants with the permission of the court-appointed Receiver, based on the Receiver’s mistaken belief that all relevant computers had been imaged and his instruction to the defendants to scrub all computers before selling them.  Because of the loss of this data, defendants filed a motion for spoliation sanctions for what they described as “the FTC’s bad-faith destruction of Defendants’ computer systems.”  Was the motion granted?

Spoliate Evidence, Don’t Go to Jail, but Pay a Million Dollars.  Defendant Mark Pappas, President of Creative Pipe, Inc., was ordered by Magistrate Judge Paul W. Grimm to "be imprisoned for a period not to exceed two years, unless and until he pays to Plaintiff the attorney's fees and costs". However, ruling on the defendants’ appeal, District Court Judge Marvin J. Garbis declined to adopt the order regarding incarceration, stating it was not "appropriate to Order Defendant Pappas incarcerated for future possible failure to comply with his obligation to make payment…". So, how much was he ordered to pay?  Now we know.  That decision was affirmed here.

Deliberately Produce Wrong Cell Phone, Get Sanctioned.  In this case, the plaintiff originally resisted production of a laptop and a cell phone for examination, but ultimately produced a laptop and cell phone. The problem with that production? After examination, it was determined that neither device was in use during the relevant time period and that the actual devices used during that time frame were no longer in the plaintiff’s possession. When asked to explain why this was not disclosed initially, the plaintiff’s attorney said he was torn between his “competing duties” of protecting his client and candor to the court.  Really?

Destroy Data, Pay $1 Million, Lose Case.  A federal judge in Chicago levied sanctions against Rosenthal Collins Group LLC and granted a default judgment to the defendant for misconduct in a patent infringement case, also ordering the Chicago-based futures broker’s counsel to pay “the costs and attorneys fees incurred in litigating this motion.”  The plaintiff’s agent had modified metadata related to relevant source code and wiped several relevant disks and devices prior to their production, and the court found that counsel participated in “presenting misleading, false information, materially altered evidence and willful non-compliance with the Court’s orders.”

Conclusion of Case Does Not Preclude Later Sanctions.  In this products liability case that had been settled a year earlier, the plaintiff sought to re-open the case and requested sanctions alleging the defendant systematically destroyed evidence, failed to produce relevant documents and committed other discovery violations in bad faith. As Yogi Berra would say, “It ain’t over ‘til it’s over”.

Written Litigation Hold Notice Not Required.  The Pension Committee case was one of the most important cases of 2010 (or any year, for that matter). So, perhaps it’s not surprising that it is becoming frequently cited by those seeking sanctions for failure to issue a written litigation hold. In this case, the defendant cited Pension Committee, arguing that plaintiff’s failure to issue a written litigation hold and subsequent failure to produce three allegedly relevant emails allowed for a presumption that relevant evidence was lost, thereby warranting spoliation sanctions.  Was the court’s ruling consistent with Pension Committee?

No Sanctions Ordered for Failure to Preserve Backups.  A sanctions motion has been dismissed by the U.S. District Court of Texas in a recent case involving electronic backups and email records, on the grounds that there was no duty to preserve backup tapes and no bad faith in overwriting records.

Discovery Violations Result in Sanctions Against Plaintiff and Counsel.  Both the plaintiff and plaintiff's counsel have been ordered to pay sanctions for discovery abuses in a lawsuit in Washington court that was dismissed with prejudice on June 8, 2011.

Meet and Confer is Too Late for Preservation Hold.  A US District Court in Indiana ruled on June 28 in favor of a motion for an Order to Secure Evidence in an employment discrimination lawsuit. The defendant had given the plaintiff reason to believe that emails and other relevant documents might be destroyed prior to the Rule 26(f) meeting between the parties or the Rule 16(b) discovery conference with the court. As a result, the plaintiff formally requested a litigation hold on all potentially relevant documents, which was approved by US Magistrate Judge Andrew Rodovich.

Court Orders Sanctions in Response to "Callous and Careless Attitude" of Defendant in Discovery.  A Special Master determined that multiple discovery failures on the part of the defendant in an indemnity action were due to discovery procedures "wholly devoid of competence, yet only once motivated by guile". Accordingly, the court ordered sanctions against the defendant and also ordered the defendant to pay all costs associated with its discovery failures, including plaintiff's attorney fees and costs.

Court Upholds Sanctions for Intentional Spoliation of Unallocated Space Data.  The Supreme Court of Delaware recently upheld the sanctions against the defendant for wiping the unallocated space on his company’s computer system, despite a court order prohibiting such destruction. In this case, Arie Genger, CEO of Trans-Resources, Inc., argued that sanctions against him were unreasonable and made a motion for the court to overturn its previous decision regarding spoliation of discovery materials. Instead, after due process, the court upheld its earlier decision.

Sanctions for Spoliation, Even When Much of the Data Was Restored.  A Virginia court recently ordered sanctions against the defendant in a case of deliberate spoliation of electronic discovery documents. In this case, the defendant was found to have committed spoliation "in bad faith" in a manner that constituted a "violation of duty… to the Court and the judicial process."

"Untimely" Motion for Sanctions for Spoliation Denied.  A recent ruling by the US District Court of Tennessee has denied a motion for sanctions for spoliation on the grounds that the motion was "untimely." In this case, the plaintiff argued that the defendants' admitted failure to preserve evidence "warrants a harsh penalty," but the court found in favor of the defense that the motion was untimely.

Defendant Sanctioned for Abandonment and Sale of Server; Defendants' Counsel Unaware of Spoliation.  An Illinois District Court ordered heavy sanctions against the defense for spoliation "willfully and in bad faith" of documents stored on a server, in a case revolving around damages sought for breach of loan agreements.

Facebook Spoliation Significantly Mitigates Plaintiff’s Win.  In this case with both social media and spoliation issues, monetary sanctions were ordered against the plaintiff and his counsel for significant discovery violations. Those violations included intentional deletion of pictures on the plaintiff’s Facebook page as instructed by his Counsel as well as subsequent efforts to cover those instructions up, among others.

Lilly Fails to Meet its eDiscovery Burden, Sanctions Ordered.  In this case, a Tennessee District court found that “Lilly failed to take reasonable steps to preserve, search for, and collect potentially relevant information, particularly electronic data, after its duty to preserve evidence was triggered by being served with the complaint.” As a result, the court ordered sanctions against Lilly. How far did the court go with those sanctions?

Court Grants Adverse Inference Sanctions Against BOTH Sides.  Have you ever seen the video where two boxers knock each other out at the same time? That’s similar to what happened in this case. In this case, the court addressed the parties’ cross motions for sanctions, ordering an adverse inference for the defendants’ failure to preserve relevant video surveillance footage, as well as an adverse inference for the plaintiff’s failure to preserve relevant witness statements. The court also awarded defendants attorneys’ fees and costs and ordered re-deposition of several witnesses at the plaintiff’s expense due to other plaintiff spoliation findings.

Next week, we will begin looking ahead at 2012 and expected eDiscovery trends for the coming year.

So, what do you think?  Of all of the cases that we have recapped over the past four days, which case do you think was the most significant?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Plaintiff Not Required to Review Millions of Pages of Unallocated Space


While plaintiff “should have known better than to agree to search terms” that arguably resulted in the recovery of 65 million pages of documents from unallocated space files for plaintiff to review for privilege, a magistrate judge in I-Med Pharma, Inc. v. Biomatrix, Inc., No. 03-3677 (DRD) (D.N.J. Dec. 9, 2011) properly excused plaintiff from its stipulation to produce such documents after reviewing them for privilege.

Plaintiff alleged that defendants breached a distribution agreement relating to eye-drops after one of the defendants was acquired by another defendant. A stipulation among the parties provided for a keyword search by defendants’ expert of plaintiff’s computer network, servers, and related storage devices using English and French terms, including “claim”, “revenue*”, and “profit*”. The search resulted in over 64 million hits just in unallocated space of plaintiff’s computer systems.

District Judge Dickinson Debevoise affirmed a magistrate judge’s order excusing plaintiff from a privilege review of the estimated equivalent of 65 million documents in the unallocated space that contained an agreed search term. Judge Debevoise stated his concern over the cost of such a review:

“A privilege review of 65 million documents is no small undertaking. Even if junior attorneys are engaged, heavily discounted rates are negotiated, and all parties work diligently and efficiently, even a cursory review of that many documents will consume large amounts of attorney time and cost millions of dollars.”

Judge Debevoise rejected defendant’s suggestion that plaintiff could simply review documents with the word “privileged” and produce everything else:

“Even when dealing with intact files, potentially privileged information may often be found in emails, memoranda, presentations, or other documents that are not explicitly flagged as privileged or confidential. And since the data searched here is likely to contain fragmented or otherwise incomplete documents, it is entirely possible for privileged material to be found without its original identifying information.”

Defendants had not shown that relevant, non-duplicate information likely would be found in the unallocated space, according to the court. Thus, plaintiff should have known better than to agree on the search terms, but requiring a privilege review of the results would not be fair or just. Judge Debevoise added a list of factors that parties should consider in evaluating reasonableness of search terms:

“In evaluating whether a set of search terms are reasonable, a party should consider a variety of factors, including: (1) the scope of documents searched and whether the search is restricted to specific computers, file systems, or document custodians; (2) any date restrictions imposed on the search; (3) whether the search terms contain proper names, uncommon abbreviations, or other terms unlikely to occur in irrelevant documents; (4) whether operators such as "and", "not", or "near" are used to restrict the universe of possible results; (5) whether the number of results obtained could be practically reviewed given the economics of the case and the amount of money at issue.”
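
To make factor (4) a bit more concrete, here is a rough, self-contained sketch (not the search tool used in the case; “near” implementations vary by platform) showing how adding a proximity operator shrinks the universe of hits compared to a bare keyword like “claim”.  The two document fragments are invented for illustration.

```python
import re

def near(term_a: str, term_b: str, distance: int = 5) -> re.Pattern:
    """Rough 'near' operator: term_a within `distance` words of term_b, in either order."""
    gap = rf"\W+(?:\w+\W+){{0,{distance}}}"
    return re.compile(rf"\b{term_a}\b{gap}\b{term_b}\b|\b{term_b}\b{gap}\b{term_a}\b",
                      re.IGNORECASE)

docs = [
    "The profit claim in the Q3 revenue memo is attached.",
    "We claim no expertise in eye-drop chemistry.",
]
broad = [d for d in docs if re.search(r"\bclaim\b", d, re.IGNORECASE)]
narrow = [d for d in docs if near("claim", "revenue").search(d)]
print(len(broad), len(narrow))   # 2 hits for the bare term, 1 for "claim" near "revenue"
```

Layering on date and custodian restrictions (factors 1 and 2) narrows the universe further, which is one reason to sample and check hit counts before stipulating to a term list.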

So, what do you think?  Did common sense prevail or should the plaintiff have been held to the agreement?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required).


eDiscovery Case Law: Award for Database Costs Reversed Due to Cost Sharing Agreement


An award of costs of $938,957.72, including the winning party’s agreed half share of the cost of a database ($234,702.43), was reversed in Synopsys, Inc. v. Ricoh Co. (In re Ricoh Co. Patent Litigation), No. 2011-1199 (Fed. Cir. Nov. 23, 2011). While the cost of the database could have been taxed to the losing party, the agreement between the parties on cost sharing controlled the ultimate taxation of costs.

After almost seven years of litigation, Synopsys obtained summary judgment and a declaration in Ricoh’s action against seven Synopsys customers that a Ricoh software patent on integrated circuits had not been infringed. During the litigation, Ricoh and Synopsys were unable to agree on a form of production of Synopsys email with its customers, and Ricoh suggested using an electronic discovery company to compile and maintain a database of the email. Synopsys agreed to the use of the company’s services and to pay half the cost of the database. After Synopsys obtained summary judgment, the district court approved items in the Synopsys bill of costs totaling $938,957.72, including $234,702.43 for Synopsys’ half share of the cost of the database and another $234,702.43 for document production costs.

On appeal of the taxation of costs, the court agreed that 28 U.S.C.S. § 1920 provided for recovery of the cost of the database, which was used to produce email in its native format. According to the court, “electronic production of documents can constitute ‘exemplification’ or ‘making copies’ under section 1920(4).” However, the parties had entered into an agreement on splitting the cost of the database, and nothing in the 14-page agreement or the communications regarding it indicated that the agreement was anything other than a final agreement on the costs of the database. Faced with “scant authority from other circuits as to whether a cost-sharing agreement between parties to litigation is controlling as to the ultimate taxation of costs,” the court concluded that the parties’ cost-sharing agreement was controlling. It reversed the district court’s award of $234,702.43 for Synopsys’ half share of the cost of the database.

The court also reversed and remanded the award of an additional $234,702.43 for document production costs because those costs were not adequately documented. For example, many of the invoices simply stated “document production” and did not indicate shipment to opposing counsel. The court stated that the “document production” phrase “does not automatically signify that the copies were produced to opposing counsel.”

So, what do you think?  Should the agreement between parties have superseded the award?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required).


eDiscovery Trends: Bennett Borden


This is the second of our Holiday Thought Leader Interview series.  I interviewed several thought leaders to get their perspectives on various eDiscovery topics.

Today's thought leader is Bennett B. Borden. Bennett is the co-chair of Williams Mullen’s eDiscovery and Information Governance Section. Based in Richmond, Va., his practice is focused on Electronic Discovery and Information Law. He has published several papers on the use of predictive coding in litigation. Bennett is not only an advocate for predictive coding in review, but has reorganized his own litigation team to more effectively use advanced computer technology to improve eDiscovery.

You have written extensively about the ways that the traditional, or linear, review process is broken. Most of our readers understand the issue, but how well has the profession at large grappled with it? Are the problems well understood?

The problem with the expense of document review is well understood, but how to solve it is less well known. Fortunately, there is some great research being done by both academics and practitioners that is helping shed light on both the problem and the solution. In addition to the research we’ve written about in The Demise of Linear Review and Why Document Review is Broken, some very informative research has come out of the TREC Legal Track and subsequent papers by Maura R. Grossman and Gordon V. Cormack, as well as by Jason R. Baron, the eDiscovery Institute, Douglas W. Oard and Herbert L. Roitblat, among others.  Because of this important research, the eDiscovery bar is becoming increasingly aware of how document review and, more importantly, fact development can be more effective and less costly through the use of advanced technology and artful strategy. 

You are a proponent of computer-assisted review. Is computer search technology truly mature? Is it a defensible strategy for review?

Absolutely. In fact, I would argue that computer-assisted review is actually more defensible than traditional linear review.  By computer-assisted review, I mean the utilization of advanced search technologies beyond mere search terms (e.g., topic modeling, clustering, meaning-based search, predictive coding, latent semantic analysis, probabilistic latent semantic analysis, Bayesian probability) to more intelligently address a data set. These technologies, to a greater or lesser extent, group documents based upon similarities, which allows a reviewer to address the same kinds of documents in the same way.

Computers are vastly superior to humans in quickly finding similarities (and dissimilarities) within data. And, the similarities that computers are able to find have advanced beyond mere content (through search terms) to include many other aspects of data, such as correspondents, domains, dates, times, location, communication patterns, etc. Because the technology can now recognize and address all of these aspects of data, the resulting groupings of documents are more granular and internally cohesive.  This means that the reviewer makes fewer and more consistent choices across similar documents, leading to a faster, cheaper, better and more defensible review.
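
For readers who want a feel for the grouping Bennett describes, here is a toy sketch using off-the-shelf Python libraries (TF-IDF text features plus k-means clustering).  It groups documents by content similarity only, whereas the commercial tools he references also factor in correspondents, dates, communication patterns and other metadata; the four sample “documents” are invented for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Re: revenue forecast for Q3",
    "Q3 revenue numbers attached for review",
    "Lunch on Friday?",
    "Friday lunch moved to noon",
]
features = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# Documents with similar content receive the same cluster label, so a reviewer
# can make one consistent call per group instead of four isolated decisions.
for label, doc in zip(labels, docs):
    print(label, doc)
```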

How has the use of computer-assisted review and predictive coding changed the way you tackle a case? Does it let you deploy your resources in new ways?

I have significantly changed how I address a case as both technology and the law have advanced. Although there is a vast amount of data that might be discoverable in a particular case, less than 1 percent of that data is ever used in the case or truly advances its resolution. The resources I deploy focus on identifying that 1 percent, and avoiding the burden and expense largely wasted on the 99 percent. Part of this is done through developing, negotiating and obtaining reasonable and iterative eDiscovery protocols that focus on the critical data first. EDiscovery law has developed at a rapid pace and provides the tools to develop and defend these kinds of protocols. An important part of these protocols is the effective use of computer-assisted review.

Lately there has been a lot of attention given to the idea that computer-assisted review will replace attorneys in litigation. How much truth is there to that idea? How will computer-assisted review affect the role of attorneys?

Technology improves productivity, reducing the time required to accomplish a task. This is no less true of computer-assisted review. The 2006 amendments to the Federal Rules of Civil Procedure caused a massive increase in the number of attorneys devoted to the review of documents. As search technology and the review tools that employ them continue to improve, the demand for attorneys devoted to review will obviously decline.

But this is not a bad thing. Traditional linear document review is horrifically tedious and boring, and it is not the best use of legal education and experience. Fundamentally, litigators develop facts and apply the law to those facts to determine a client’s position and advise the client to act accordingly. Computer-assisted review allows us to get at the most relevant facts more quickly, reducing both the scope and duration of litigation. This is what lawyers should be focused on accomplishing, and computer-assisted review can help them do so.

With the rise of computer-assisted review, do lawyers need to learn new skills? Do lawyers need to be computer scientists or statisticians to play a role?

Lawyers do not need to be computer scientists or statisticians, but they certainly need to have a good understanding of how information is created, how it is stored, and how to get at it. In fact, lawyers who do not have this understanding, whether alone or in conjunction with advisory staff, are simply not serving their clients competently.

You’ve suggested that lawyers involved in computer-assisted review enjoy the work more than in the traditional manual review process. Why do you think that is?

I think it is because the lawyers are using their legal expertise to pursue lines of investigation and develop the facts surrounding them, as opposed to simply playing a monotonous game of memory match. Our strategy of review is to use very talented lawyers to address a data set using technological and strategic means to get to the facts that matter. While doing so our lawyers uncover meaning within a huge volume of information and weave it into a story that resolves the matter. This is exciting and meaningful work that has had significant impact on our clients’ litigation budgets.

How is computer-assisted review changing the competitive landscape? Does it provide an opportunity for small firms to compete that maybe didn’t exist a few years ago?

We live in the information age, and lawyers, especially litigators, fundamentally deal in information. In this age it is easier than ever to get to the facts that matter, because more facts (and more granular facts) exist within electronic information. The lawyer who knows how to get at the facts that matter is simply a more effective lawyer. The information age has fundamentally changed the competitive landscape. Small companies are able to achieve immense success through the skillful application of technology. The same is true of law firms. Smaller firms that consciously develop and nimbly utilize the technological advantages available to them have every opportunity to excel, perhaps even more so than larger, highly-leveraged firms. It is no longer about size and head-count, it’s about knowing how to get at the facts that matter, and winning cases by doing so.

Thanks, Bennett, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: Jason R. Baron


This is the first of the Holiday Thought Leader Interview series.  I interviewed several thought leaders to get their perspectives on various eDiscovery topics.

Today’s thought leader is Jason R. Baron. Jason has served as the National Archives' Director of Litigation since May 2000 and has been involved in high-profile cases for the federal government. His background in eDiscovery dates to the Reagan Administration, when he helped retain backup tapes containing Iran-Contra records from the National Security Council as the Justice Department’s lead counsel. Later, as director of litigation for the U.S. National Archives and Records Administration, Jason was assigned a request to review documents pertaining to tobacco litigation in U.S. v. Philip Morris.

He currently serves as The Sedona Conference Co-Chair of the Working Group on Electronic Document Retention and Production. Baron is also one of the founding coordinators of the TREC Legal Track, a search project organized through the National Institute of Standards and Technology to evaluate search protocols used in eDiscovery. This year, Jason was awarded the Emmett Leahy Award for Outstanding Contributions and Accomplishments in the Records and Information Management Profession.

You were recently awarded the prestigious Emmett Leahy Award for excellence in records management. Is it unusual that a lawyer wins such an award? Or is the job of the litigator and records manager becoming inextricably linked?

Yes, it was unusual: I am the first federal lawyer to win the Emmett Leahy award, and only the second lawyer to have done so in the 40-odd years that the award has been given out. But my career path in the federal government has been a bit unusual as well: I spent seven years working as lead counsel on the original White House PROFS email case (Armstrong v. EOP), followed by more than a decade worrying about records-related matters for the government as Director of Litigation at NARA. So with respect to records and information management, I long ago passed at least the Malcolm Gladwell test in "Outliers" where he says one needs to spend 10,000 hours working on anything to develop a level of "expertise."  As to the second part of your question, I absolutely believe that to be a good litigation attorney these days one needs to know something about information management and eDiscovery — since all evidence is "born digital" and lots of it needs to be searched for electronically. As you know, I also have been a longtime advocate of a greater linking between the fields of information retrieval and eDiscovery.

In your acceptance speech you spoke about the dangers of information overload and the possibility that it will make it difficult for people to find important information. How optimistic are you that we can avoid this dystopian future? How can the legal profession help the world avoid this fate?

What I said was that in a world of greater and greater retention of electronically stored information, we need to leverage artificial intelligence and specifically better search algorithms to keep up in this particular information arms race. Although Ralph Losey teased me in a recent blog post that I was being unduly negative about future information dystopias, I actually am very optimistic about the future of search technology assisting in triaging the important from the ephemeral in vast collections of archives. We can achieve this through greater use of auto-categorization and search filtering methods, as well as having a better ability in the future to conduct meaningful searches across the enterprise (whether in the cloud or not). Lawyers can certainly advise their clients on how to practice good information governance to accomplish these aims.

You were one of the founders of the TREC Legal Track research project. What do you consider that project’s achievement at this point?

The initial idea for the TREC Legal Track was to get a better handle on evaluating various types of alternative search methods and technologies, to compare them against a “baseline” of how effective lawyers were in relying on more basic forms of keyword searching. The initial results were a wake-up call, showing lawyers that sole reliance on simple keywords and Boolean strings sometimes results in a large quantity of relevant evidence going missing. But during the half-decade of research that has now gone into the track, something else of perhaps even greater importance has emerged from the results, namely: we have a much better understanding now of what a good search process looks like, which includes a human in the loop (known in the Legal Track as a topic authority) evaluating, on an ongoing, iterative basis, what automated search software kicks out by way of initial results. The biggest achievement, however, may simply be the continued existence of the TREC Legal Track itself, still going in its 6th year in 2011 and still producing important research results, on an open, non-proprietary platform, that are fully reproducible and that benefit both the legal profession and the information retrieval academic world. While I stepped away after 4 years from further active involvement in the Legal Track as a coordinator, I continue to be highly impressed with the work of the current track coordinators, led by Professor Doug Oard at the University of Maryland, who has remained at the helm since the very beginning.

To what extent has TREC’s research proven the reliability of computer-assisted review in litigation? Is there a danger that the profession assumes the reliability of computer-assisted review is a settled matter?

The TREC Legal Track results I am most familiar with through calendar year 2010 have shown computer-assisted review methods finding in some cases on the order of 85% of relevant documents (a .85 recall rate) per topic while only producing 10% false positives (a .90 precision rate). Not all search methods have had these results, and there has been in fact a wide variance in success achieved, but these returns are very promising when compared with historically lower rates of recall and precision across many information retrieval studies. So the success demonstrated to date is highly encouraging. Coupled with these results has been additional research reported by Maura Grossman & Gordon Cormack, in their much-cited paper Technology-Assisted Review in EDiscovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, which makes the case for the greater accuracy and efficiency of computer-assisted review methods.
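
For readers newer to these metrics, a quick worked example with hypothetical counts (chosen only to match the rates Jason cites) shows how recall and precision are computed:

```python
relevant_in_collection = 1_000   # hypothetical: documents that are truly relevant
retrieved = 944                  # hypothetical: documents the system returned
true_positives = 850             # retrieved documents that are actually relevant

recall = true_positives / relevant_in_collection   # 0.85: 85% of relevant documents found
precision = true_positives / retrieved             # ~0.90: roughly 10% false positives
print(f"recall={recall:.2f}, precision={precision:.2f}")
```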

Other research conducted outside of TREC, most notably by Herbert Roitblat, Patrick Oot and Anne Kershaw, also points in a similar direction (as reported in their article Mandating Reasonableness in a Reasonable Inquiry). All of these research efforts buttress the defensibility of technology-assisted review methods in actual litigation, in the event of future challenges. Having said this, I do agree that we are still in the early days of using many of the newer predictive types of automated search methods, and I would be concerned about courts simply taking on faith the results of past research as being applicable in all legal settings. There is no question, however, that the use of predictive analytics, clustering algorithms, and seed sets as part of technology-assisted review methods is saving law firms money and time in performing early case assessment and for multiple other purposes, as reported in a range of eDiscovery conferences and venues, and I of course support all of these good efforts.

You have discussed the need for industry standards in eDiscovery. What benefit would standards provide?

Ever since I served as Co-Editor in Chief of The Sedona Conference Commentary on Achieving Quality in eDiscovery (2009), I have been thinking about what a good process for conducting eDiscovery looks like. That paper focused on project management, sampling, and imposing various forms of quality controls on collection, review, and production. The question is whether a good eDiscovery process is capable of being fit into a maturity model of sorts, and whether it might be useful to consider whether vendors and law firms would benefit from having their in-house eDiscovery processes audited and certified as meeting some common baseline of quality. To this end, the DESI IV workshop (“Discovery of ESI”), held in Pittsburgh last June as part of the Thirteenth International AI and Law Conference (ICAIL 2011), had as its theme exploring what types of model standards could be imposed on the eDiscovery discipline, so that we all would be able to work from some common set of benchmarks. Some 75 people attended and 20-odd papers were presented. I believe the consensus in the room was that we should be pursuing further discussions as to what an ISO 9001-type quality standard would look like as applied to the specific eDiscovery sector, much as other industry verticals have their own ISO standards for quality. Since June, I have been in touch with some eDiscovery vendors that have actually undergone an audit process to achieve ISO 9001 certification. This is an area where no consensus has yet emerged as to the path forward, but I will be pursuing further discussions with DESI workshop attendees in the coming months and promise to report back in this space as to what comes of these efforts.

What sort of standards would benefit the industry? Do we need standards for pieces of the eDiscovery process, like a defensible search standard, or are you talking about a broad quality assurance process?

DESI IV started by concentrating on what would constitute a defensible search standard; however, it became clear at the workshop and over the course of the past few months that we need to think bigger, looking across the eDiscovery life cycle at what constitutes best practices through automation and other means. We need to remember, however, that eDiscovery is a very young discipline, as we’re only five years out from the 2006 Rules Amendments. I don’t have all the answers, by any means, on what would constitute an acceptable set of standards, but I like to ask questions and believe in a process of continuous, lifelong learning. As I said, I promise I’ll let you know what success has been achieved in this space.

Thanks, Jason, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Best Practices: When is it OK to Produce without Linear Review?


At eDiscovery Daily, the title of our daily post usually reflects some eDiscovery news and/or analysis that we are providing our readers.  However, based on a comment I received from a colleague last week, I thought I would ask a thought-provoking question for this post.

There was an interesting post in the EDD Update blog a few days ago entitled Ediscovery Production Without Review, written by Albert Barsocchini, Esq.  The post noted that due to “[a]dvanced analytics, judicial acceptance of computer aided coding, claw back/quick-peek agreements, and aggressive use of Rule 16 hearings”, many attorneys are choosing to produce responsive ESI without spending time and money on a final linear review.

A colleague of mine sent me an email with a link to the post and stated, “I would not hire a firm if I knew they were producing without a doc by doc review.”

Really?  What if:

  • You collected the equivalent of 10 million pages* and still had 1.2 million potentially responsive pages after early data assessment/first pass review? (reducing the population by 88%, which is a very high culling percentage in most cases)
  • And your review team could review 60 pages per hour, requiring 20,000 hours to complete the responsiveness review?
  • And their average rate was a very reasonable $75 per hour to review, resulting in a total cost of $1.5 million to perform a doc by doc review?
  • And you had a clawback agreement in place so that you could claw back any inadvertently produced privileged files?

“Would you insist on a doc by doc review then?”, I asked.

Let’s face it, $1.5 million is a lot of money.  That may seem like an inordinate amount to spend on linear review, and the data volume in some large cases may be so great that an effective argument can be made for relying on technology to identify the files to produce.
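
For what it’s worth, here is the back-of-the-envelope arithmetic behind the bullets above, using the same hypothetical figures (they are illustrative, not benchmarks):

```python
collected_pages = 10_000_000
responsive_pages = 1_200_000                               # after ECA / first pass review
culled = 1 - responsive_pages / collected_pages            # 0.88, i.e., 88% culled

pages_per_hour = 60
hourly_rate = 75                                           # dollars per reviewer hour

review_hours = responsive_pages / pages_per_hour           # 20,000 hours
review_cost = review_hours * hourly_rate                   # $1,500,000

print(f"{culled:.0%} culled, {review_hours:,.0f} review hours, ${review_cost:,.0f}")
```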

On the other hand, if you’re a company like Google and you inadvertently produce a document in a case potentially worth billions of dollars, $1.5 million doesn’t seem nearly as big an amount to spend given the risk associated with potential mistakes.  Also, as the Google case and this case illustrate, there are no guarantees with regard to the ability to claw back inadvertently produced files.  The cost of linear review will, especially in larger cases, need to be weighed against the potential risk of not conducting that review, so the organization can determine the best approach for them.

So, what do you think?  Do you produce in cases where not all of the responsive documents are reviewed before production? Are there criteria that you use to determine when to conduct or forego linear review?  Please share any comments you might have or if you’d like to know more about a particular topic.

*I used pages in the example to provide a frame of reference to which most attorneys can relate.  While 10 million pages may seem like a large collection, at an average of 50,000 pages per GB, that is only 200 total GB.  Many laptops and desktops these days have a drive that big, if not larger.  Depending on your review approach, most, if not all, original native files would probably never be converted to a standard paginated document format (i.e., TIFF or PDF).  So, it is unlikely that the total page count of the collection would ever be truly known.


eDiscovery Best Practices: Production is the “Ringo” of the eDiscovery Phases


Since eDiscovery Daily debuted over 14 months ago, we’ve covered a lot of case law decisions related to eDiscovery: 65 posts related to case law to date, in fact.  We’ve covered cases associated with sanctions for failure to preserve data, issues associated with incomplete collections, inadequate searching methodologies, and inadvertent disclosures of privileged documents, among other things.  We’ve noted that 80% of the costs associated with eDiscovery are in the Review phase and that the volume of data, and the sources from which to retrieve it (including social media and “cloud” repositories), are growing exponentially.  Most of the “press” associated with eDiscovery ranges from the “left side of the EDRM model” (i.e., Information Management, Identification, Preservation, Collection) through the stages that prepare materials for production (i.e., Processing, Review and Analysis).

All of those phases lead to one inevitable stage in eDiscovery: Production.  Yet, few people talk about the actual production step.  If Preservation, Collection and Review are the “John”, “Paul” and “George” of the eDiscovery process, Production is “Ringo”.

It’s the final crucial step in the process, and if it’s not handled correctly, all of the due diligence spent in the earlier phases could mean nothing.  So, it’s important to plan for production up front and to apply a number of quality control (QC) checks to the actual production set to ensure that the production process goes as smoothly as possible.

Planning for Production Up Front

When discussing production requirements with opposing counsel, it’s important to ensure that those requirements make sense, not only from a legal standpoint but from a technical standpoint as well.  Involve support and IT personnel in the process of deciding those parameters, as they will be the people who have to meet them.  Issues to be addressed include, but are not limited to:

  • Format of production (e.g., paper, images or native files);
  • Organization of files (e.g., organized by custodian, legal issue, etc.);
  • Numbering scheme (e.g., Bates labels for images, sequential file names for native files);
  • Handling of confidential and privileged documents, including log requirements and stamps to be applied;
  • Handling of redactions;
  • Format and content of production log;
  • Production media (e.g., CD, DVD, portable hard drive, FTP, etc.).

I was involved in a case recently where opposing counsel requested an unusual production format in which the names of the files would be the subject lines of the emails being produced (for example, “Re: Completed Contract, dated 12/01/2011”).  Two issues with that approach: 1) the proposed format only addressed emails, and 2) Windows file names don’t support certain characters, such as colons (:) or slashes (/).  I provided that feedback to the attorneys so that they could address it with opposing counsel and hopefully agree on a revised format that made more sense.  So, let the tech folks confirm the feasibility of the production parameters.
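
As an illustration of the kind of detail the tech folks catch, here is a hypothetical sketch (not any particular tool’s behavior; the Bates prefix and file extension are made up) of sanitizing a subject-line-based file name for Windows while applying a sequential Bates-style number:

```python
import re

WINDOWS_INVALID = r'[<>:"/\\|?*]'   # characters not allowed in Windows file names

def production_filename(prefix: str, number: int, subject: str, ext: str = ".msg") -> str:
    """Hypothetical naming scheme: sequential Bates-style number plus a sanitized subject."""
    safe_subject = re.sub(WINDOWS_INVALID, "_", subject).strip()
    return f"{prefix}{number:07d} - {safe_subject}{ext}"

print(production_filename("ABC", 1, "Re: Completed Contract, dated 12/01/2011"))
# ABC0000001 - Re_ Completed Contract, dated 12_01_2011.msg
```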

The workflow throughout the eDiscovery process should also keep in mind the end goal of meeting the agreed-upon production requirements.  For example, if you’re producing native files with metadata, you may need to take appropriate steps to keep the metadata intact during the collection and review process so that it is not inadvertently changed. For some file types, metadata is changed merely by opening the file, so it may be necessary to collect the files in a forensically sound manner and to conduct review using copies of the files, keeping the originals intact.
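
One common way to demonstrate that the originals stayed intact is to record a cryptographic hash and last-modified timestamp for each collected file, then verify the files against that manifest after review.  A minimal sketch follows; the folder name is hypothetical, and real collections would typically rely on forensic tooling rather than an ad hoc script.

```python
import hashlib
import os
from pathlib import Path

def fingerprint(path: Path) -> dict:
    """Record a hash and modification time so any later change can be detected."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return {"file": str(path), "sha256": digest, "mtime": os.path.getmtime(path)}

# Hypothetical folder of collected originals; review copies are made elsewhere,
# and the originals are re-verified against this manifest before production.
manifest = [fingerprint(p) for p in Path("collected_originals").rglob("*") if p.is_file()]
```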

Tomorrow, we will talk about preparing the production set and performing QC checks to ensure that the ESI being produced to the requesting party is complete and accurate.

So, what do you think?  Have you had issues with production planning in your cases?  Please share any comments you might have or if you’d like to know more about a particular topic.