EDRM Archives

eDiscovery Best Practices: Determining Appropriate Sample Size to Test Your Search

April 1, 2011

We’ve talked about searching best practices quite a bit on this blog. One part of searching best practices (as part of the “STARR” approach I described in an earlier post) is to test your search results (both the result set and the files not retrieved) to determine whether the search you performed is effective at maximizing both precision and recall to the extent possible, so that you retrieve as many responsive files as possible without having to review too many non-responsive files. One question I often get is: how many files do you need to review to test the search?

If you remember from statistics class in high school or college, statistical sampling is choosing a percentage of the results population at random for inspection to gather information about the population as a whole. This saves considerable time, effort and cost over reviewing every item in the results population and enables you to state, with a given “confidence level”, that your sample reflects the characteristics of the population. Statistical sampling is used for everything from exit polls that predict elections to marketing surveys that poll customers on brand popularity, and it is a generally accepted method of drawing conclusions about an overall results population. You can sample a small portion of a large set to obtain a 95% or 99% confidence level in your findings (with a margin of error, of course).

So, does that mean you have to find your old statistics book and dust off your calculator or (gasp!) slide rule? Thankfully, no.

There are several sites that provide sample size calculators to help you determine an appropriate sample size, including this one. You’ll simply need to identify a desired confidence level (typically 95% to 99%), an acceptable margin of error (typically 5% or less) and the population size.

So, if you perform a search that retrieves 100,000 files and you want a sample size that provides a 99% confidence level with a margin of error of 5%, you’ll need to review 660 of the retrieved files to achieve that level of confidence in your sample (only 383 files if a 95% confidence level will do). If 1,000,000 files were not retrieved, you would only need to review 664 of the not retrieved files to achieve that same level of confidence (99%, with a 5% margin of error) in your sample. As you can see, the sample size doesn’t need to increase much when the population gets really large and you can review a relatively small subset to understand your collection and defend your search methodology to the court.
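If you would rather see the arithmetic than trust a web calculator, here is a minimal sketch in Python. It assumes the calculator uses the standard sample size formula for a proportion (with the worst-case 50% response distribution) plus a finite population correction, rounded up to a whole file. That is my assumption about how such calculators work, and the function name `sample_size` is made up for this illustration.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.99, margin=0.05, proportion=0.5):
    """Estimate how many files to sample (assumed formula, not a specific calculator's)."""
    # Two-sided z-score for the requested confidence level (about 2.576 for 99%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Finite population correction, then round up to a whole file.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


print(sample_size(100_000, 0.99))    # retrieved files, 99% confidence
print(sample_size(100_000, 0.95))    # retrieved files, 95% confidence
print(sample_size(1_000_000, 0.99))  # files not retrieved, 99% confidence
```

Under those assumptions, the three calls print 660, 383 and 664, which match the figures quoted above. Note how little the answer moves when the population grows tenfold: the correction term shrinks toward zero as the population gets large.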

On Monday, we will talk about how to randomly select the files to review for your sample. Same bat time, same bat channel!

So, what do you think? Do you use sampling to test your search results? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Forbes on the Rise of Predictive Coding

March 28, 2011

First the New York Times with an article about eDiscovery, now Forbes. Who’s next, The Wall Street Journal? 😉

Forbes published a blog post entitled E-Discovery And the Rise of Predictive Coding a few days ago. Written by Ben Kerschberg, Founder of Consero Group LLC, it gets into some legal issues and considerations regarding predictive coding that are interesting. For some background on predictive coding, check out our December blog posts, here and here.

First, the author provides a very brief history of document review, starting with bankers boxes and WordPerfect and “[a]fter an interim phase best characterized by simple keyword searches and optical character recognition”, it evolved to predictive coding. OK, that’s like saying that Gone with the Wind started with various suitors courting Scarlett O’Hara and after an interim phase best characterized by the Civil War, marriage and heartache, Rhett says to Scarlett, “Frankly, my dear, I don’t give a damn.” A bit of an oversimplification of how review has evolved.

Nonetheless, the article gets into a couple of important legal issues raised by predictive coding. They are:

Satisfying Reasonable Search Requirements: Whether counsel can utilize the benefits of predictive coding and still meet legal obligations to conduct a reasonable search for responsive documents under the federal rules. The question is, what constitutes a reasonable search under Federal Rule 26(g)(1)(A), which requires that the responding attorney attest by signature that “with respect to a disclosure, it is complete and correct as of the time it is made”?
Protecting Privilege: Whether counsel can protect attorney-client privilege for their client when a privileged document is inadvertently disclosed. Federal Rule of Evidence 502 provides that a court may order that a privilege or protection is not waived by disclosure if the disclosure was inadvertent and the holder of the privilege took reasonable steps to prevent disclosure. Again, what’s reasonable?

The author concludes that the use of predictive coding is reasonable, because it a) makes document review more efficient by providing only those documents to the reviewer that have been selected by the algorithm; b) makes it more likely that responsive documents will be produced, saving time and resources; and c) refines relevant subsets for review, which can then be validated statistically.

So, what do you think? Does predictive coding enable attorneys to satisfy these legal issues? Is it reasonable? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Does Size Matter?

March 25, 2011

I admit it, with a title like “Does Size Matter?”, I’m looking for a few extra page views…. 😉

I frequently get asked how big an ESI collection needs to be to benefit from eDiscovery technology. In a recent case with one of my clients, the client had a fairly small collection – only about 4 GB. But, when a judge ruled that they had to start conducting depositions in a week, they needed to review that data in a weekend. Without FirstPass™, powered by Venio FPR™, to cull the data and OnDemand® to manage the linear review, they would not have been able to make that deadline. So, they clearly benefited from the use of eDiscovery technology in that case.

But, if you’re not facing a tight deadline, how large does your collection need to be for the use of eDiscovery technology to provide benefits?

I recently conducted a webinar regarding the benefits of First Pass Review – aka Early Case Assessment, or a more accurate term (as George Socha points out regularly), Early Data Assessment. One of the topics discussed in that webinar was the cost of review for each gigabyte (GB). Extrapolated from an analysis conducted by Anne Kershaw a few years ago (and published in the Gartner report E-Discovery: Project Planning and Budgeting 2008-2011), here is a breakdown:

Estimated Cost to Review All Documents in a GB:

Pages per GB: 75,000
Pages per Document: 4
Documents Per GB: 18,750
Review Rate: 50 documents per hour
Total Review Hours: 375
Reviewer Billing Rate: $50 per hour

Total Cost to Review Each GB: $18,750

Notes: The number of pages per GB can vary widely. Estimates tend to range from 50,000 to 100,000 pages per GB, so 75,000 pages (18,750 documents) seems an appropriate average. 50 documents reviewed per hour is considered to be a fast review rate and $50 per hour is considered to be a bargain price. eDiscovery Daily provided an earlier estimate of $16,650 per GB based on assumptions of 20,000 documents per GB and 60 documents reviewed per hour – the assumptions may change somewhat, but, either way, the cost for attorney review of each GB could be expected to range from at least $16,000 to $18,000, possibly more.

Advanced culling and searching capabilities of First Pass Review tools like FirstPass can enable you to cull out 70-80% of most collections as clearly non-responsive without having to conduct attorney review on those files. If you have merely a 2 GB collection and assume the lowest review cost above of $16,000 per GB, the use of a First Pass Review tool to cull out 70% of the collection can save $22,400 in attorney review costs. Is that worth it?
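The arithmetic behind these figures is simple enough to rerun with your own assumptions. Here is a minimal sketch in Python; the function names `review_cost_per_gb` and `culling_savings` are made up for this illustration, and the inputs are just the estimates quoted above, not measurements from any particular case.

```python
def review_cost_per_gb(pages_per_gb, pages_per_doc, docs_per_hour, hourly_rate):
    """Cost of attorney review for one GB under the stated assumptions."""
    documents = pages_per_gb / pages_per_doc   # e.g. 75,000 / 4 = 18,750 documents
    hours = documents / docs_per_hour          # e.g. 18,750 / 50 = 375 hours
    return hours * hourly_rate


def culling_savings(collection_gb, cost_per_gb, culled_fraction):
    """Review cost avoided by culling a fraction of the collection before review."""
    return collection_gb * cost_per_gb * culled_fraction


# Estimate from the breakdown above: 75,000 pages, 4 pages per document,
# 50 documents per hour, $50 per hour.
print(review_cost_per_gb(75_000, 4, 50, 50))

# A 2 GB collection at $16,000 per GB with 70% culled out.
print(round(culling_savings(2, 16_000, 0.70), 2))
```

Plug in a slower review rate or a higher billing rate and the per-GB cost climbs quickly, which is why the estimates above are best treated as a floor.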

So, what do you think? Do you use eDiscovery technology for only the really large cases or ALL cases? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Is Disclosure of Search Terms Required?

March 24, 2011

I read a terrific article a couple of days ago from the New York Law Journal via Law Technology News entitled Search Terms Are More Than Mere Words, that had some interesting takes about the disclosure of search terms in eDiscovery. The article was written by David J. Kessler, Robert D. Owen, and Emily Johnston of Fulbright & Jaworski. The primary emphasis of the article was with regard to the forced disclosure of search terms by courts.

In the age of “meet and confer”, it has become much more common for parties to agree to exchange search terms in a case to limit costs and increase transparency. However, as the authors correctly note, search terms reflect counsel’s strategy for the case and, therefore, work product. Their position is that courts should not force disclosure of search terms and that disclosure of terms is “not appropriate under the Federal Rules of Civil Procedure”. The article provides a compelling argument as to why forced disclosure is not appropriate and provides some good case cites where courts have accepted or rejected requests to compel provision of search terms. I won’t try to recap them all here – check out the article for more information.

So, should disclosure of search terms be generally required? If not, what does that mean in terms of utilizing a defensible approach to searching?

Personally, I agree with the authors that forced disclosure of search terms is generally not appropriate, as it does reflect strategy and work product. However, there is an obligation for each party to preserve, collect, review and produce all relevant materials to the best of their ability (that are not privileged, of course). Searching is an integral part of that process. And, the article does note that “chosen terms may come under scrutiny if there is a defect in the production”, though “[m]ere speculation or unfounded accusations” should not lead to a requirement to disclose search terms.

With that said, the biggest component of most eDiscovery collections today is email, and that email often reflects discussions between parties in the case. In these cases, it’s much easier for opposing counsel to identify legitimate defects in the production because they have some of the same correspondence and documents and can often easily spot discrepancies in the production set. If they identify legitimate omissions from the production, those omissions could cause the court to call into question your search procedures. Therefore, it’s important to follow a defensible approach to searching (such as the “STARR” approach I described in an earlier post) to be able to defend yourself if those questions arise. Demonstrating a defensible approach to searching will offer the best chance to preserve your right to protect search terms that reflect your case strategy as work product.

So, what do you think? Do you think that forced disclosure of search terms is appropriate? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: What is “Reduping?”

March 22, 2011

As emails are sent out to multiple custodians, deduplication (or “deduping”) has become a common practice to eliminate multiple copies of the same email or file from the review collection, saving considerable review costs and ensuring consistency by not having different reviewers apply different responsiveness or privilege determinations to the same file (e.g., one copy of a file designated as privileged while the other is not may cause a privileged file to slip into the production set). Deduping can be performed either across custodians in a case or within each custodian.

Everyone who works in electronic discovery knows what “deduping” is. But how many of you know what “reduping” is? Here’s the answer:

“Reduping” is the process of re-introducing duplicates back into the population for production after completing review. There are a couple of reasons why a producing party may want to “redupe” the collection after review:

Deduping Not Requested by Receiving Party: As opposing parties in many cases still don’t conduct a meet and confer or discuss specifications for production, they may not have discussed whether or not to include duplicates in the production set. In those cases, the producing party may choose to produce the duplicates, giving the receiving party more files to review and driving up their costs. The attitude of the producing party can be “hey, they didn’t specify, so we’ll give them more than they asked for.”
Receiving Party May Want to See Who Has Copies of Specific Files: Sometimes, the receiving party does request that “dupes” are identified, but only within custodians, not across them. In those cases, it’s because they want to see who had a copy of a specific email or file. However, the producing party still doesn’t want to review the duplicates (because of increasing costs and the possibility of inconsistent designations), so they review a deduped collection and then redupe after review is complete.

Many review applications support the capability for reduping. For example, FirstPass™, powered by Venio FPR™, suppresses the duplicates from review, but applies the same tags to the duplicates of any files tagged during first pass review. When it’s time to export the collection, to either move the potentially responsive files on to linear review (in a product like OnDemand®) or straight to production, the user can decide at that time whether or not to export the dupes. Those dupes have the same designations as the primary copies, ensuring consistency in handling them downstream.
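To make the mechanics concrete, here is a minimal sketch in Python of the general idea: suppress duplicates for review, then carry the primary copy's tag over to its duplicates at export time. This is not how FirstPass or any other product implements it. The function names, file names and tags are hypothetical, and the sketch hashes raw content with SHA-256, whereas real tools typically hash a combination of email metadata fields and content.

```python
import hashlib
from collections import defaultdict


def dedupe(files):
    """Group files by content hash; the first file in each group is the primary."""
    groups = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    return list(groups.values())


def redupe(groups, tags, include_dupes=True):
    """Build the export set, copying each primary's tag to its duplicates."""
    export = {}
    for group in groups:
        primary = group[0]
        # Either export every copy or only the primary that was reviewed.
        members = group if include_dupes else [primary]
        for name in members:
            export[name] = tags[primary]
    return export


# Hypothetical collection: the same email held by two custodians, plus a memo.
files = {
    "smith/budget.msg": b"Q1 budget attached",
    "jones/budget.msg": b"Q1 budget attached",
    "smith/memo.doc": b"Privileged memo to counsel",
}
groups = dedupe(files)

# Reviewers tag only the primary copy in each group.
tags = {"smith/budget.msg": "responsive", "smith/memo.doc": "privileged"}
print(redupe(groups, tags, include_dupes=True))
```

Because the duplicate inherits its tag from the primary rather than from a separate review decision, the two copies cannot end up with conflicting designations, and the choice of whether to export the dupes can wait until export time.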

So, what do you think? Does your review tool support “reduping”? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Read Inadvertent Email, Get Disqualified from Case

March 18, 2011

Lesson of the day: If you receive an inadvertently sent privileged email, read it and don’t disclose that you received it, you can get kicked off the case.

In Terraphase Engineering, Inc., et al. v. Arcadis, U.S., Inc, the court disqualified defendant’s in-house and outside counsel for their handling of a disputed privileged email that was inadvertently sent by the plaintiffs’ counsel to the defendant and shared with defendant’s outside counsel. For more information regarding this case, check out this Law Technology News article.

When a group of employees left Arcadis to form a competing company, relations between the two soured quickly and led to litigation. Just prior to filing their lawsuit, the plaintiffs’ attorney sent a strategy email to his clients, which contained an attachment that, according to the former employees, included “Plaintiffs’ privileged recitation of background and comments to and from legal counsel.” Unfortunately for the attorney (or maybe fortunately, as it turned out), the email system’s auto-complete function (which completes a saved email address as soon as you begin entering it) entered an old Arcadis email address for one of the employees, which wasn’t caught before sending. The email and the attachment went directly to Arcadis, which had been monitoring the plaintiffs’ email accounts since they resigned from the company.

Arcadis’ in-house counsel read the email and the attached document and apparently shared the email with their general counsel and Arcadis’ outside counsel (Gordon & Rees, LLP), neither of whom notified the plaintiffs’ attorney that they had received the email. Arcadis’ counterclaim contained certain information that caused the plaintiffs to suspect that Arcadis and its counsel had reviewed their privileged communications, and Arcadis, when confronted, acknowledged that it had received the email and agreed to destroy all copies, but refused to identify who reviewed the email. Eventually, the plaintiffs filed a motion for a protective order to disqualify Arcadis’ counsel and prevent Arcadis from using the email or the attachment during the case, arguing that attorneys are prohibited from using privileged material that they receive from an opposing party, and are under an ethical obligation to immediately notify the opposing party when such information is received.

Arcadis opposed the motion, arguing that in-house and outside counsel only conducted a cursory review of the email and attachment, and stated that it was not privileged because it was sent “unsolicited” to the plaintiff’s work e-mail, in which he had no reasonable expectation of privacy. Arcadis also argued that, because the information itself was not privileged and would be disclosed during discovery, the plaintiffs would suffer no irreparable harm. And, since there was no active litigation between the parties when Arcadis received the email, they argued that the rules of professional conduct did not apply.

The court rejected Arcadis’ arguments and ruled for the plaintiffs, disqualifying Arcadis’ outside counsel and the in-house counsel who reviewed the emails, also ruling that Arcadis’ general counsel must be “removed from all aspects of the day-to-day management of the case, including . . . making any substantive or strategic decisions with regard to the case.” Arcadis was also ordered to dismiss its counterclaim and the plaintiffs were awarded their costs and fees in connection with bringing the motion against Arcadis.

A copy of the order can be found here.

So, what do you think? Have you ever been burned by an inadvertently sent email? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Deliberately Produce Wrong Cell Phone, Get Sanctioned

March 17, 2011

In Moreno v. Ostly, No. A127780, (Cal. Ct. App. Feb. 22, 2011), the California Court of Appeal affirmed the trial court’s award of monetary sanctions imposed against the plaintiff and her law firm in the amount of $13,500 for counsel and plaintiff’s discovery misconduct related to the preservation of text messages.

The plaintiff sued her former law firm employer alleging sexual harassment, retaliation and failure to pay back wages. She claimed that a partner at the firm “forced himself on her sexually” on a daily basis and that she was fired when she notified the partner that she wished to sever the “intimate aspect of their relationship.” In discovery, defendants sought copies of relevant e-mails and text messages between the plaintiff and the partner. After the parties' meet and confer efforts failed, the court ordered the plaintiff to produce her personal computer and cell phone for inspection. The inspection revealed that the cell phone produced was different from the one plaintiff had during her course of employment. When questioned regarding the discrepancy, plaintiff’s counsel responded that the defendants would have to undertake further discovery efforts to determine what happened to the relevant equipment. The plaintiff’s attorney conceded that many of the text messages on the prior phone had been used against the defendants before the EEOC, but had not been preserved prior to the disposal of the cell phone.

The defendants filed a motion for terminating and monetary sanctions or, in the alternative, a willful suppression of evidence jury instruction. The trial court awarded monetary sanctions, finding the plaintiff and her counsel deliberately withheld the fact that the plaintiff failed to preserve her cell phone data, causing opposing counsel and the court to expend unnecessary resources. The court found plaintiff’s counsel’s conduct willful and his explanation citing a conflict between the duty of loyalty to the client and the duty of candor to opposing counsel and the court “not very credible.”

The Court of Appeal concluded the trial court's award of monetary sanctions was supported by substantial evidence, and was well within the discretion of the court.

So, what do you think? Are you aware of any other blatant examples of evasive discovery? Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: eDiscovery Case Law Update, by Littler Mendelson P.C.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Despite What NY Times Says, Lawyers Not Going Away

March 14, 2011

There was a TV commercial in the mid-80’s where a soap opera actor delivered the line “I’m not a doctor, but I play one on TV”. Can you remember the product it was advertising (without clicking on the link)? If so, you win the trivia award of the day! 😉

I’m a technologist who has been working in litigation support and eDiscovery for over twenty years. If you’ve been reading eDiscovery Daily for a while, you’ve probably noticed that I’ve written several posts regarding significant case law as it pertains to eDiscovery. I often feel that I should offer a disclaimer before each of these posts saying “I’m not a lawyer, but I play one on the Web”. As the disclaimer at the bottom of the page states, these posts aren’t meant to provide legal advice. My intention is merely to identify cases that may be of interest to our readers, provide a basic recap of them and leave it at that. As Clint Eastwood once said, “A man’s got to know his limitations”.

A few days ago, The New York Times published an article entitled Armies of Expensive Lawyers, Replaced by Cheaper Software which discussed how, using ‘artificial intelligence, “e-discovery” software can analyze documents in a fraction of the time for a fraction of the cost’ (extraneous comma in the title notwithstanding). The article goes on to discuss linguistic and sociological techniques for retrieval of relevant information and discusses how the Enron Corpus, available in a number of forms, including through EDRM, has enabled software providers to make great strides in analytical capabilities using this large base of data to use in testing. It also discusses whether this will precipitate a march to the unemployment line for scores of attorneys.

A number of articles and posts since then have offered commentary as to whether that will be the case. Technology tools will certainly reduce document populations significantly, but, as the article noted, “[t]he documents that the process kicks out still have to be read by someone”. Not only that, the article still makes the assumption that people too often make with search technology – that it’s a “push a button and get your answer” approach to identifying relevant documents. But, as has been noted in several cases and also here on this blog, searching is an iterative process where sampling the search results is recommended to confirm that the search maximizes recall and precision to the extent possible. Who do you think is going to perform that sampling? Lawyers – that’s who (working with technologists like me, of course!). And, some searches will require multiple iterations of sampling and analysis before the search is optimized.

Therefore, while the “armies” of lawyers may not need nearly as many members of the infantry, they will still need plenty of corporals, sergeants, captains, colonels and generals. And, for those entry-level reviewing attorneys that no longer have a place on review projects? Well, we could always use a few more doctors on TV, right? 😉

So, what do you think? Are you a review attorney that has been impacted by technology – positively or negatively? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Spoliate Evidence, Don’t Go to Jail, but Pay a Million Dollars

March 11, 2011

As previously referenced in eDiscovery Daily, defendant Mark Pappas, President of Creative Pipe, Inc., was ordered by Magistrate Judge Paul W. Grimm to “be imprisoned for a period not to exceed two years, unless and until he pays to Plaintiff the attorney's fees and costs that will be awarded to Plaintiff as the prevailing party pursuant to Fed. R. Civ. P. 37(b)(2)(C).” Judge Grimm found that “Defendants…deleted, destroyed, and otherwise failed to preserve evidence; and repeatedly misrepresented the completeness of their discovery production to opposing counsel and the Court.”

However, ruling on the defendants’ appeal, District Court Judge Marvin J. Garbis declined to adopt the order regarding incarceration, stating: “[T]he court does not find it appropriate to Order Defendant Pappas incarcerated for future possible failure to comply with his obligation to make payment of an amount to be determined in the course of further proceedings.”

So, how much is he ordered to pay? Now we know.

On January 24, 2011, Judge Grimm entered an order awarding a total of $1,049,850.04 in “attorney’s fees and costs associated with all discovery that would not have been un[der]taken but for Defendants' spoliation, as well as the briefings and hearings regarding Plaintiff’s Motion for Sanctions.” Judge Grimm explained, “the willful loss or destruction of relevant evidence taints the entire discovery and motions practice.” So, the court found that “Defendants’ first spoliation efforts corresponded with the beginning of litigation” and that “Defendants’ misconduct affected the entire discovery process since the commencement of this case.”

As a result, the court awarded $901,553.00 in attorney’s fees and $148,297.04 in costs. Those costs included $95,969.04 for the Plaintiff’s computer forensic consultant that was “initially hired . . . to address the early evidence of spoliation by Defendants and to prevent further destruction of data”. The Plaintiff’s forensic consultant also provided processing services and participated in the preparation of plaintiff’s search and collection protocol, which the court found “pertained to Defendants’ spoliation efforts.”

So, what do you think? Will the defendant pay? Or will he be subject to possible jail time yet again? Please share any comments you might have or if you’d like to know more about a particular topic.

Working Successfully with eDiscovery and Litigation Support Service Providers: Introduction

March 10, 2011

If you work in a law firm or a corporate legal department, there will be times when you turn to a service provider to help with handling discovery materials – regardless of the technology and staff resources that you have. You might look to a service provider to handle work that your department doesn’t do. Or maybe your own resources are tied up and you just need more capacity.

Very often, service providers become key members of the litigation team, and critical to the team’s success. There is, however, a lot that can go wrong – just with the slightest miscommunication. It is, therefore, important that you have an effective plan in place for engaging service providers when you need help, for working effectively with service provider project staff, and for seamlessly incorporating the work product into the case workflow.

There are dozens – if not hundreds – of service providers to choose from for any given task on any given case. Where do you start? How do you find the one that’s right for your case? How do you communicate effectively with that service provider? How do you ensure high-quality work that’s delivered on time and within budget? We’ll be answering all of these questions in this blog series. We’re going to cover:

Evaluating and Selecting a Service Provider
Preventing Problems and Monitoring Work
Establishing and Managing a Preferred Service Provider Program in Your Firm
Types of Service Providers and Questions to Ask Each Type

In the next post in this series, we’ll start with what you should be looking for when you select a service provider.

What has been your experience with service provider work? Do you have good or bad experiences you can tell us about? Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.
