eDiscovery Daily Blog

When Preparing Production Sets, Quality is Job 1: eDiscovery Throwback Thursdays

Here’s the latest post in our Throwback Thursdays series, where we revisit some of the eDiscovery best practice posts we have covered over the years and discuss whether any of those recommended best practices have changed since we originally covered them.

This post was originally published on December 2, 2011 – when eDiscovery Daily was a little more than a year old – and continues the two-part series we began two Thursdays ago.  Several updates have been applied, so take note!  Enjoy!

OK, I admit I stole the “Quality is Job 1” line from an old Ford commercial ;o)

Last time, we talked about addressing parameters of production up front to ensure that those requirements make sense and avoid foreseeable production problems well before the production step.  Today, we will talk about quality control (QC) mechanisms to make sure that the production is complete and accurate.

Quality Control Checks

There are a number of checks that can and should be performed on the production set, prior to producing it to the requesting party.  Here are some examples:

  • File Counts: The most obvious check you can perform is to ensure that the count of files matches the count of documents or pages you have identified to be produced. However, depending on the production, there may be multiple file counts to check:
    • Image Files: If you have agreed with opposing counsel to produce images for all documents, then there will be a count of images to confirm. If you’re producing multi-page image files (typically PDF or TIFF), the count of images should match the count of documents being produced.  If you’re producing single-page image files (usually TIFF), then the count should match the number of pages being produced.  One notable exception that has become common since this post was originally written: many image productions now also include native files for Excel and (in some cases) PowerPoint files.  Excel files are often not formatted for printing, so they don’t print well, and many parties want to see the underlying formulas, so those files are produced natively.  Even then, a placeholder image is still produced for each Excel file, so the number of images should still match the number of documents or pages (if producing single-page image files, count each Excel placeholder image as one page).
    • Text Files: When producing image files, you’ll also usually be producing searchable text files, which will generally be multi-page and should match the number of documents being produced. If there are files with no text in them, you typically still produce a placeholder to indicate as such so that opposing counsel is aware that there was no text to produce.
    • Native Files: Native files (if produced) are, of course, at the document level, so confirm the correct count of native files being produced. This applies to partial native file productions as well: if you are producing images with natives for Excel files, make sure the total number of native files matches the number of Excel files you were expecting to produce.
    • Subset Counts: If the documents are being produced in a certain organized manner (e.g., a folder for each custodian), it’s a good idea to identify subset counts at those levels and verify those counts as well. Not only does this provide an extra level of count verification, but it helps to find the problem more quickly if the overall count is off.
    • Verify Counts on Final Production Location: If you’re verifying counts of the production set before copying it to the final production location (which, these days, is typically either an FTP site or a hard drive), you will need to verify those counts again after copying to ensure that all files made it to the final location.
  • Sampling of Results: Unless the production is very small, it’s impractical to open every last file to be produced to confirm that it is correct. However, you can still employ accepted statistical sampling procedures (such as those described here and here for searching) to identify an appropriate sample size, randomly select that sample, and open each sampled file to confirm that the correct files were selected, that hash values of produced native files match those of the original source versions, that images are clear, and that text files contain the correct text.
  • Redacted Files: If any redacted files are being produced, each of these (not just a sample subset) should be reviewed to confirm that redactions of privileged or confidential information made it into the produced image, text, and native files. Many review platforms overlay redactions that must be burned into the images at production time, so it’s easy for mistakes in the process to leave those redactions out, or for the redactions not to be carried forward to the text or native files.  It’s very important to check them all.
  • Inclusion of Logs: Depending on agreed upon parameters, the production may include log files such as:
    • Production Log: Listing of all files being produced, with an agreed upon list of metadata fields to identify those files.
    • Privilege Log: Listing of responsive files not being produced because of privilege (and possibly confidentiality as well). This listing often identifies the privilege being asserted for each file in the privilege log.
    • Exception Log: Listing of files that could not be produced because of a problem with the file. Producing these logs is less common, but could be necessary if questions come up about the comprehensiveness of the production.
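As a rough illustration of the count checks above, here is a minimal Python sketch. The IMAGES, TEXT, and NATIVES folder names and the multi-page (one-file-per-document) layout are hypothetical assumptions for illustration; actual production structures vary by agreement and platform:

```python
from pathlib import Path

def count_production_files(production_root, expected_docs, expected_natives):
    """Tally produced files by type and compare them to expected counts.

    Assumes a hypothetical layout with IMAGES, TEXT, and NATIVES
    subfolders, where image and text files are multi-page (one file
    per document).  Returns the counts and a list of discrepancies.
    """
    root = Path(production_root)
    counts = {
        "images": sum(1 for p in (root / "IMAGES").rglob("*") if p.is_file()),
        "text": sum(1 for p in (root / "TEXT").rglob("*") if p.is_file()),
        "natives": sum(1 for p in (root / "NATIVES").rglob("*") if p.is_file()),
    }
    problems = []
    if counts["images"] != expected_docs:
        problems.append(f"image count {counts['images']} != expected {expected_docs}")
    if counts["text"] != expected_docs:
        problems.append(f"text count {counts['text']} != expected {expected_docs}")
    if counts["natives"] != expected_natives:
        problems.append(f"native count {counts['natives']} != expected {expected_natives}")
    return counts, problems
```

Per the final-location check above, a script like this would be run once against the staged production set and again after copying to the FTP site or hard drive, confirming that the two runs agree.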

Each production has different parameters, so the QC requirements will differ as well.  These are examples, then, not necessarily a comprehensive list of all potential QC checks to perform.
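The sampling-and-hash verification described above can be sketched in Python as well. The flat folder layout, the convention of matching produced natives to source files by file name, and the sample size are illustrative assumptions, not a prescribed workflow:

```python
import hashlib
import random
from pathlib import Path

def sha256_of(path):
    """Compute the SHA-256 hash of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def sample_and_verify(source_dir, produced_dir, sample_size, seed=None):
    """Randomly sample produced native files and confirm each one's
    hash matches the original source copy with the same file name.

    Returns the names of any sampled files that are missing from the
    source folder or whose hashes do not match.
    """
    produced = sorted(p for p in Path(produced_dir).glob("*") if p.is_file())
    rng = random.Random(seed)
    sample = rng.sample(produced, min(sample_size, len(produced)))
    mismatches = []
    for p in sample:
        src = Path(source_dir) / p.name
        if not src.exists() or sha256_of(src) != sha256_of(p):
            mismatches.append(p.name)
    return mismatches
```

Setting the seed makes the sample reproducible, which can help if the QC check needs to be re-run or documented later.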

So, what do you think?  Can you think of other appropriate QC checks to perform on production sets?  If so, please share them!  And please share any other comments you might have, or let us know if you’d like to know more about a particular topic.

Sponsor: This blog is sponsored by CloudNine, which is a data and legal discovery technology company with proven expertise in simplifying and automating the discovery of data for audits, investigations, and litigation. Used by legal and business customers worldwide including more than 50 of the top 250 Am Law firms and many of the world’s leading corporations, CloudNine’s eDiscovery automation software and services help customers gain insight and intelligence on electronic data.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.