Electronic Discovery

eDiscovery Trends: 2011 eDiscovery Errors Survey

 

As noted on Friday in Legal IT Professionals, LDM Global announced the results of its 2011 eDiscovery Errors survey. The company asked a selection of industry professionals their views on which errors they experienced most often during the discovery process. Results were collected from across the USA, Europe and Australia.

According to Scott Merrick, LDM Global Marketing Director and survey author, “Our goal was to find out what the real, day to day issues and problems are around the discovery process.”  He also noted that “Of particular interest was the ongoing challenge of good communication. Technology has not solved that challenge and it remains at the forefront of where mistakes are made.”

The respondents of the survey were broken down into the following groups: Litigation Support Professionals 47%, Lawyers 30%, Paralegals 11%, IT Professionals 9% and Others 3%.  Geographically, the United States and Europe had 46% of the respondents each, with the remaining 8% of respondents coming from Australia.  LDM Global did not identify the total number of respondents to the survey.

For each question about errors, respondents were asked to classify the error as “frequently occurs”, “occasionally occurs”, “not very common” or “never occurs”.  Based on responses, the most common errors are:

  • Failure to Effectively Communicate across Teams: 50% of the respondents identified this error as one that frequently occurs
  • An Inadequate Data Retention Policy: 47% of the respondents identified this error as one that frequently occurs
  • Not Collecting all Pertinent Data: 41% of the respondents identified this error as one that frequently occurs
  • Failure to Perform Critical Quality Control (i.e., sampling): 40% of the respondents identified this error as one that frequently occurs
  • Badly Thought Out, or Badly Implemented, Policy: 40% of the respondents identified this error as one that frequently occurs

Perhaps one of the most surprising results is that only 14% of respondents identified spoliation of evidence, or the inability to preserve relevant emails, as an error that frequently occurs.  So, why are there so many cases in which sanctions have been issued for that very issue?  Interesting…

For complete survey results, go to LDMGlobal.com.

So, what do you think?  What are the most common eDiscovery errors that your organization has encountered?   Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Sedona Conference Database Principles

 

A few months ago, eDiscovery Daily posted about discovery of databases and how few legal teams understand database discovery and know how to handle it.  We provided a little pop quiz to test your knowledge of databases, with the answers here.

Last month, The Sedona Conference® Working Group on Electronic Document Retention & Production (WG1) published the Public Comment Version of The Sedona Conference® Database Principles – Addressing the Preservation & Production of Databases & Database Information in Civil Litigation to provide guidance and recommendations to both requesting and producing parties to simplify discovery of databases and information derived from databases.  You can download the publication here.

As noted in the Executive Overview of the publication, some of the issues that make database discovery so challenging include:

  • More enterprise-level information is being stored in searchable data repositories, rather than in discrete electronic files,
  • The diverse and complicated ways in which database information can be stored has made it difficult to develop universal “best-practice” approaches to requesting and producing information stored in databases,
  • Retention guidelines that make sense for archival databases (databases that add new information without deleting past records) rapidly break down when applied to transactional databases where much of the system’s data may be retained for a limited time – as short as thirty days or even thirty seconds.

The commentary is broken into three primary sections:

  • Section I: Introduction to databases and database theory,
  • Section II: Application of The Sedona Principles, designed for all forms of ESI, to discovery of databases,
  • Section III: Proposal of six new Principles that pertain specifically to databases with commentary to support the Working Group’s recommendations.  The principles are stated as follows:
    • Absent a specific showing of need or relevance, a requesting party is entitled only to database fields that contain relevant information, not the entire database in which the information resides or the underlying database application or database engine.
    • Due to differences in the way that information is stored or programmed into a database, not all information in a database may be equally accessible, and a party’s request for such information must be analyzed for relevance and proportionality.
    • Requesting and responding parties should use empirical information, such as that generated from test queries and pilot projects, to ascertain the burden to produce information stored in databases and to reach consensus on the scope of discovery.
    • A responding party must use reasonable measures to validate ESI collected from database systems to ensure completeness and accuracy of the data acquisition.
    • Verifying information that has been correctly exported from a larger database or repository is a separate analysis from establishing the accuracy, authenticity, or admissibility of the substantive information contained within the data.
    • The way in which a requesting party intends to use database information is an important factor in determining an appropriate format of production.

To submit a public comment, you can download a public comment form here, complete it and fax (yes, fax) it to The Sedona Conference® at 928-284-4240.  You can also email a general comment to them at tsc@sedona.net.

eDiscovery Daily will be delving into this document in more detail in future posts.  Stay tuned!

So, what do you think?  Do you have a need for guidelines for database discovery?   Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Conclusion of Case Does Not Preclude Later Sanctions

In Green v. Blitz U.S.A., Inc., (E.D. Tex. Mar. 1, 2011), the defendant in a product liability action that had been settled over a year earlier was sanctioned for “blatant discovery abuses” prior to the settlement. Defendant was ordered to add $250,000 to its settlement with plaintiff, to provide a copy of the court’s order to every plaintiff in every lawsuit against defendant for the past two years or else forfeit an additional $500,000 “purging” sanction, and to include the order in its first responsive pleading in every lawsuit for the next five years in which defendant became involved.

Defendant, a manufacturer of gasoline containers, was named in several product liability lawsuits, including this case in which plaintiff alleged that her husband’s death was caused in part by the lack of a flame arrestor on defendant’s gas cans. The jury in plaintiff’s case returned a verdict for defendant after counsel for defendant argued that “science shows” that flame arrestors did not work. The case was settled after the jury verdict for an undisclosed amount, but two years later, counsel for plaintiff sought sanctions and to have the case reopened after learning in another case against defendant that while the gas can lawsuits were underway, defendant had been instructing its employees to destroy email.

The court described defendant’s failure to implement a litigation hold as gas can cases were filed. A single employee met with other employees to ask them to look for documents, but he did not have any electronic searches made for documents and he did not consult with defendant’s information technology department on how to retrieve electronic documents.

The court held that defendant willfully violated the discovery order in the case by not producing key documents such as a handwritten note indicating a desire to install flame arrestors on gas cans and an email noting that the technology for flame arrestors existed given the common use of flame arrestors in the marine industry. “Any competent electronic discovery effort would have located this email,” according to the court, through a key word search. Defendant’s employee in charge of discovery did not conduct a key word search and, despite acknowledging that he was as computer “illiterate as they get,” did not seek help from defendant’s information technology department, which was routinely sending out instructions to employees to delete email and rotating backup tapes every two weeks while the litigation was underway.

The court declined to reopen the case since it had been closed for a year. However, based on its knowledge of the confidential settlement of the parties, the court ordered defendant to pay plaintiff an additional $250,000 as a civil contempt sanction to match the minimum amount that the settlement would have been if plaintiff had been provided documents withheld by defendant. The court also ordered a “civil purging sanction” of $500,000 which defendant could avoid upon showing proof that a copy of the court’s decision had been provided to every plaintiff in a lawsuit against defendant for the past two years. The court added a requirement that defendant include a copy of the court’s opinion in its first pleading in any lawsuit for the next five years in which defendant became a party.

As Yogi Berra would say, “It ain’t over ‘til it’s over”.

So, what do you think?  Should cases be re-opened after they’re concluded for discovery violations?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required).  For eDiscovery news and best practices, check out the Applied Discovery Blog here.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Working Successfully with eDiscovery and Litigation Support Service Providers: Checking References – General Suggestions

 

Yesterday, we talked about the importance of checking references when considering eDiscovery and Litigation Support Service Providers.  In the next blogs in this series, I’m going to suggest some questions you may consider asking when doing a reference check.  First, however, let me make some general suggestions:

  • When you ask a vendor for references, ask for clients that had a project similar in size and scope to yours.  You want to speak with people who had similar requirements regarding services and schedules, and that had document or data collections similar in size and format to yours.
  • When you ask a vendor for references, let them know that you’d like to speak with two different types of people:
    • End-users of the vendor’s work product
    • The vendor’s main, day-to-day point of contact
  • Don’t call a reference out of the blue.  Make an initial contact by email to introduce yourself and to schedule a call.  If the call is scheduled, you are more likely to get better attention and more time. 
  • Be prepared with a list of specific questions.  Don’t call and ask only general questions like “Were you satisfied with the quality?” and “Did they meet your deadlines?”
  • Try to engage the person you are speaking with in conversation.  Don’t settle for yes and no answers.  When someone responds “yes” or “no” to a question, ask them to provide details.  This may uncover information that is important to you or it may trigger additional questions that you should ask.
  • Always send follow-up emails to the people with whom you speak thanking them for their time and information. 

Next we’ll talk about specific questions you can consider asking when checking references, so stay tuned.

How do you approach checking references?  Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.

Working Successfully with eDiscovery and Litigation Support Service Providers: Is Checking References Important?

 

Over the years, I’ve been asked many times to serve as a reference for vendors with which I’ve worked.  And, I’ve taken many reference-check phone calls.  More often than not, those calls were less efficient and productive than they could have been — because they weren’t planned and good questions were not asked.   In the next blogs in this series I’ll make some suggestions for doing an effective reference check.

First, recognize that checking references is very important.  Yes, it is almost certain that a vendor will direct you to clients that are satisfied, so you know — to an extent — what to expect.  Even so, you need to speak with them.  There are a few reasons for this:

  • The clients provided to you as references might have had different priorities than you do.  They may be satisfied because the vendor performed well in an area that was most important to them.  That same client, however, may be able to shed light on “minor problems” that in your case could be “major problems”.  Of course, this assumes that you ask the right questions.
  • The clients provided as references may be inexperienced in eDiscovery and litigation support, and therefore not a good judge of the vendor’s work.  Clients like this may be satisfied because they had a good relationship with the vendor staff and nothing blew up in their faces.  That doesn’t necessarily mean that the work was done well or cost-efficiently.  When you speak with references, you can get a feel for their level of experience and knowledge, and be able to determine, therefore, whether their good experience with the vendor is truly indicative of high-quality and cost-effective work.
  • The clients provided as references may not have worked with the vendor on a case that was similar in scope to yours, or they may not have had requirements similar to yours.  This, too, can be discerned with the right questions.

In the next posts in this blog series, I’ll suggest an approach to checking references and give you examples of questions that can uncover the information you need when doing a reference check.

Do you get valuable information when you check references?  Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.

eDiscovery Best Practices: Your ESI Collection May Be Larger Than You Think

 

Here’s a sample scenario: You identify custodians relevant to the case and collect files from each.  Roughly 100 gigabytes (GB) of Microsoft Outlook email PST files and loose “efiles” is collected in total from the custodians.  You identify a vendor to process the files to load into a review tool, so that you can perform first pass review and, eventually, linear review and produce the files to opposing counsel.  After processing, the vendor sends you a bill – and they’ve charged you to process over 200 GB!!  What happened?!?

Did the vendor accidentally “double-bill” you?  That would be great – but no.  There’s a much more logical explanation and, unfortunately, you may wind up paying a lot more to process these files than you expected.

Many of the files in most ESI collections are stored in what are known as “archive” or “container” files.  For example, as noted above, Outlook emails are typically saved for each custodian in a personal storage (.PST) file format, which is an expanding container file. For most custodians, all of their email (and the corresponding attachments, if present) resides in a few PST files.  The scanned size for the PST file is the size of the file on disk.

Did you ever see one of those vacuum bags that you store clothes in and then suck all the air out of so that the clothes won’t take up as much space?  The PST file is like one of those vacuum bags – it typically stores the emails and attachments in a compressed format to save space.  When the emails and attachments are processed into a review tool, they are expanded to their normal size.  This expanded size can be 1.5 to 2 times larger than the scanned size (or more).  And, that’s what many vendors will bill on – the expanded size.

There are other types of archive container files that compress their contents – .zip and .rar files are two examples of compressed container files.  These files are used not only to compress files for storage on hard drives, but also to compact or group a set of files when transmitting them, usually in – you guessed it – email.  With email comprising a majority of most ESI collections and the popularity of other archive container files for compressing file collections, the expanded size of your collection may be considerably larger than it appears when stored on disk.  It’s important to be prepared for that and know your options when processing that data, so you can effectively anticipate those processing costs.
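
If you want a rough, concrete feel for how much a container file can expand, one simple check is to compare an archive’s size on disk with the total uncompressed size of its contents.  Below is a minimal sketch using Python’s standard library against a .zip file; the file name is just a placeholder for illustration, and PST files would require a specialized library, but the idea is the same.

```python
import zipfile

def expansion_estimate(zip_path):
    """Compare a .zip archive's stored (compressed) size to the expanded size of its contents."""
    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()
        compressed = sum(info.compress_size for info in infos)  # size as stored in the archive
        expanded = sum(info.file_size for info in infos)        # size once extracted/processed
    return compressed, expanded

# Hypothetical archive name, used here only for illustration
compressed, expanded = expansion_estimate("custodian_efiles.zip")
print(f"Stored on disk:          {compressed:,} bytes")
print(f"Expanded for processing: {expanded:,} bytes")
print(f"Expansion factor:        {expanded / compressed:.1f}x")
```

Running a check like this on a few representative archives before sending a collection out for processing can help you anticipate the billed (expanded) volume rather than the on-disk volume.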

So, what do you think?  Have you ever been surprised by processing costs of your ESI?   Please share any comments you might have or if you’d like to know more about a particular topic.

Working Successfully with eDiscovery and Litigation Support Service Providers: Dotting the I’s and Crossing the T’s

 

Yesterday, we talked about information to include in a Request for Proposal (RFP) to request eDiscovery and litigation support services.  Before moving forward with a service provider for a project, there are a few due diligence steps you should take to protect yourself and your case-sensitive information.

First, it may be appropriate to ask the service provider to verify that it does not have a conflict of interest.  For many eDiscovery services, this step may not be necessary.  If, however, you are asking a service provider to assist with substantive consultative help, you want to ensure that – at a minimum – it is not providing similar services to the other side in the litigation.

Once you’ve established that there is no conflict, you want to protect case information that you provide to the vendor – information in the form of communication and information in the documents and data.  Require that the vendor sign a Non-Disclosure Agreement (NDA) before communicating or transmitting sensitive and confidential information.

And finally, you and the vendor should both sign off on a Service Level Agreement (SLA) that clearly defines the work to which you’ve agreed.  A Service Level Agreement should include — at a minimum:

  • A complete description of each service to be performed
  • A complete description of each deliverable
  • A description of agreed upon performance levels (guarantees and warranties provided by the service provider; this may be in the form of quality assurance guarantees, system availability and downtime, and so on).

In addition, a service level agreement might include the following information:

  • Pricing for services
  • Billing information
  • Contact information

One other important “due diligence” step is checking references.  We’ll cover that in the next posts in this series.  I’ll give you some suggestions for doing an effective reference check that will get at the information you need to know.

What due diligence steps do you take with a service provider?  Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.

Working Successfully with eDiscovery and Litigation Support Service Providers: Information to Provide in an RFP

 

Open, two-way communication with a service provider is absolutely critical to a successful project.  It needs to start early, even before a project starts.  For many projects, it starts with the Request for Proposal (RFP).  Your goal with an RFP is to get good information from a vendor: information on pricing, information on schedule, information on approach, and information on deliverables.  To give you complete, accurate information, they need information from you.

Include this information in your RFPs:

  • Information about your Firm/Organization (location, key contacts)
  • Information about the Case (the party you represent, the case schedule)
  • Information about the Proposal Submission Process (contact information for the person who can answer questions about the RFP; contact information for those to whom the proposal should be submitted; the date the proposal is due; in what form the proposal should be delivered; any requirements you have regarding the format of the proposal)
  • Description of the Services you will Require
  • Information about the Scope of the Project  (the size of a document/data collection, types and characteristics of the documents/data)
  • Information on the Deliverable to the Vendor (when documents/data will be available to the vendor; in what form they will be delivered)
  • Description of the Deliverables you Require (formats, media, data elements, etc.)
  • Date by which the Project must be Completed (and include interim milestone dates if that’s appropriate)
  • Description of your Planned Participation in the Project (will you participate in training?  will you be onsite for any portion of the project?)
  • Description of your Preferred Method of Communication with the Service Provider
  • Description of your Requirements regarding Status Reports (how often do you require them? what information should be included?  to whom should they be submitted?)

Later in this blog series, we’ll discuss what questions you should ask in a proposal for several types of eDiscovery services.

What information do you provide to a service provider in an RFP?  Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.

eDiscovery Best Practices: Testing Your Search Using Sampling

Friday, we talked about how to determine an appropriate sample size to test your search results as well as the items NOT retrieved by the search, using a site that provides a sample size calculator.  Yesterday, we talked about how to make sure the sample size is randomly selected.

Today, we’ll walk through an example of how you can test and refine a search using sampling.

TEST #1: Let’s say we’re at an oil company and we’re looking for documents related to oil rights.  To try to be as inclusive as possible, we will search for “oil” AND “rights”.  Here is the result:

  • Files retrieved with “oil” AND “rights”: 200,000
  • Files NOT retrieved with “oil” AND “rights”: 1,000,000

Using the sample size calculator site we identified before, we determine a sample size of 662 for the retrieved files and 664 for the non-retrieved files to achieve a 99% confidence level with a margin of error of 5% (a quick sketch of that calculation appears after the Test #1 results below).  We then use the random number generator site we discussed yesterday to select the items and proceed to review each item in the retrieved and NOT retrieved sets to determine responsiveness to the case.  Here are the results:

  • Retrieved Items: 662 reviewed, 24 responsive, 3.6% responsive rate.
  • NOT Retrieved Items: 664 reviewed, 661 non-responsive, 99.5% non-responsive rate.

Nearly every item in the NOT retrieved category was non-responsive, which is good.  But, only 3.6% of the retrieved items were responsive, which means our search was WAY over-inclusive.  At that rate, 192,800 out of 200,000 files retrieved will be NOT responsive and will be a waste of time and resources to review.  Why?  Because, as we determined during the review, almost every published and copyrighted document in our oil company contains the phrase “All Rights Reserved” and will be retrieved.
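
As an aside, the sample sizes used in these tests (662 and 664 here, and 461 and 595 in the tests below) can be reproduced with the standard formula for estimating a proportion at a 99% confidence level and a 5% margin of error, adjusted with a finite population correction.  Here’s a minimal sketch of that calculation; it assumes the usual worst-case proportion of 0.5 and a z-score of 2.576 for 99% confidence, which reproduces the numbers quoted above.

```python
import math

def sample_size(population, z=2.576, margin=0.05, p=0.5):
    """Sample size for estimating a proportion (99% confidence when z=2.576),
    with a finite population correction applied."""
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)   # infinite-population sample size (~664)
    n = n0 / (1 + (n0 - 1) / population)          # correct for the finite population
    return math.ceil(n)

print(sample_size(200_000))    # 662  (Test #1 retrieved files)
print(sample_size(1_000_000))  # 664  (Test #1 NOT retrieved files)
print(sample_size(1_500))      # 461  (Test #2 retrieved files)
print(sample_size(5_700))      # 595  (Test #3 retrieved files)
```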

TEST #2: Let’s try again.  This time, we’ll conduct a phrase search for “oil rights” (which requires those words as an exact phrase).  Here is the result:

  • Files retrieved with “oil rights”: 1,500
  • Files NOT retrieved with “oil rights”: 1,198,500

This time, we determine a sample size of 461 for the retrieved files and (again) 664 for the NOT retrieved files to achieve a 99% confidence level with a margin of error of 5%.  Even though we still have a sample size of 664 for the NOT retrieved files, we generate a new list of random numbers to review those items, as well as the 461 randomly selected retrieved items.  Here are the results:

  • Retrieved Items: 461 reviewed, 435 responsive, 94.4% responsive rate.
  • NOT Retrieved Items: 664 reviewed, 523 non-responsive, 78.8% non-responsive rate.

Nearly every item in the retrieved category was responsive, which is good.  But, only 78.8% of the NOT retrieved items were non-responsive, which means over 20% of the NOT retrieved items were actually responsive to the case (we also failed to retrieve 8 of the items identified as responsive in the first iteration).  So, now what?

TEST #3: If you saw this previous post, you know that proximity searching is a good alternative for finding hits that are close to each other without requiring the exact phrase.  So, this time, we’ll conduct a proximity search for “oil within 5 words of rights”.  Here is the result:

  • Files retrieved with “oil within 5 words of rights”: 5,700
  • Files NOT retrieved with “oil within 5 words of rights”: 1,194,300

This time, we determine a sample size of 595 for the retrieved files and (once again) 664 for the NOT retrieved files, generating a new list of random numbers for both sets of items.  Here are the results:

  • Retrieved Items: 595 reviewed, 542 responsive, 91.1% responsive rate.
  • NOT Retrieved Items: 664 reviewed, 655 non-responsive, 98.6% non-responsive rate.

Over 90% of the items in the retrieved category were responsive AND nearly every item in the NOT retrieved category was non-responsive, which is GREAT.  Also, all but one of the items previously identified as responsive was retrieved.  So, this is a search that appears to maximize recall and precision.

Had we proceeded with the original search, we would have reviewed 200,000 files – 192,800 of which would have been NOT responsive to the case.  By testing and refining, we only had to review 8,815 files – the 3,710 sample files reviewed plus the remaining retrieved items from the third search (5,700 – 595 = 5,105) – most of which ARE responsive to the case.  We saved tens of thousands of dollars in review costs while still retrieving most of the responsive files, using a defensible approach.
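
For those keeping score, the review-count arithmetic behind those figures looks like this (a quick sketch using the numbers from the three tests above):

```python
# Sample items reviewed in each iteration (retrieved + NOT retrieved samples)
test1 = 662 + 664
test2 = 461 + 664
test3 = 595 + 664
samples_reviewed = test1 + test2 + test3                  # = 3,710

# Remaining retrieved items from the third search that still need review
remaining_retrieved = 5_700 - 595                         # = 5,105

total_reviewed = samples_reviewed + remaining_retrieved   # = 8,815
avoided = 200_000 - total_reviewed                        # files never reviewed vs. the original search
print(total_reviewed, avoided)                            # 8815 191185
```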

Keep in mind that this is a simple example — we’re not taking into account misspellings and other variations we may want to include in our criteria.

So, what do you think?  Do you use sampling to test your search results?   Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: A “Random” Idea on Search Sampling

 

Friday, we talked about how to determine an appropriate sample size to test your search results as well as the items NOT retrieved by the search, using a site that provides a sample size calculator.  Today, we’ll talk about how to make sure the sample size is randomly selected.

A randomly selected sample gives each file an equal chance of being reviewed and eliminates the chance of bias being introduced into the sample which might skew the results.  Merely selecting the first or last x number of items (or any other group) in the set may not reflect the population as a whole – for example, all of those items could come from a single custodian.  To ensure a fair, defensible sample, it needs to be selected randomly.

So, how do you select the numbers randomly?  Once again, the Internet helps us out here.

One site, Random.org, has a random integer generator which will randomly generate whole numbers.  You simply need to supply the number of random integers that you need to be generated, the starting number and ending number of the range within which the randomly generated numbers should fall.  The site will then generate a list of numbers that you can copy and paste into a text file or even a spreadsheet.  The site also provides an Advanced mode, which provides options for the numbers (e.g., decimal, hexadecimal), output format and how the randomization is ‘seeded’ (to generate the numbers).

In the example from Friday, you would provide 660 as the number of random integers to be generated, with a starting number of 1 and an ending number of 100,000, to get a list of random numbers for testing a search that yielded 100,000 files with hits (and 664, 1 and 1,000,000, respectively, to get a list of numbers to test the non-hits).  You could then paste the numbers into a spreadsheet, sort them, retrieve the files at those positions in the result set, and review each of them to determine whether they reflect the intent of the search.  You’ll then have a good sense of how effective your search was, based on the random sample.  And, probably more importantly, using that random sample to test your search results will be a highly defensible method to verify your approach in court.
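
If you’d rather generate the list programmatically instead of through the website, the same selection can be sketched in a few lines of Python; the sample sizes and ranges below are the ones from Friday’s example.  Note that random.sample draws without replacement, so you won’t get duplicate positions.

```python
import random

# 660 unique random positions between 1 and 100,000 for the files with hits
hit_sample = sorted(random.sample(range(1, 100_001), 660))

# 664 unique random positions between 1 and 1,000,000 for the files without hits
non_hit_sample = sorted(random.sample(range(1, 1_000_001), 664))

print(hit_sample[:10])      # first few positions to pull from the hit set
print(non_hit_sample[:10])  # first few positions to pull from the non-hit set
```

Either way, the key is that every file in each set has an equal chance of being selected, which is what makes the sample defensible.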

Tomorrow, we'll walk through a sample iteration to show how the sampling will ultimately help us refine our search.

So, what do you think?  Do you use sampling to test your search results?   Please share any comments you might have or if you’d like to know more about a particular topic.