eDiscovery Daily Blog

Self-Collecting? Don’t Forget to Check for Image Only Files – eDiscovery Best Practices

Yesterday, we talked about the importance of tracking chain of custody in order to be able to fight challenges of electronically stored information (ESI) by opposing parties.  Today, let’s talk about a common mistake that organizations make when collecting their own files to turn over for discovery purposes.

I’ve worked with a number of attorneys who have turned over the collection of potentially responsive files to the individual custodians of those files, or to someone in the organization responsible for collecting those files (typically, an IT person).  Self-collection by custodians, unless managed closely, can be a wildly inconsistent process (at best).  In some cases, those attorneys have instructed those individuals to perform various searches to turn “self-collection” into “self-culling”.  Self-culling can cause at least two issues:

  1. You have to go back to the custodians and repeat the process if additional search terms are identified.
  2. Potentially responsive image-only files will be missed with self-culling.

Unless search terms are agreed to by the parties up front, it’s not unusual to identify additional searches to be performed.  Even when there is up-front agreement, terms can often be renegotiated during the case.  It’s also common to have a number of image-only files within any collection, especially if the custodians frequently scan executed documents or use fax software to receive documents from other parties.  In those cases, image-only PDF or TIFF files can often make up as much as 20% of the collection.  Because these files contain no searchable text, they will typically be missed when custodians are asked to perform “self-culling” by running their own searches of their data.
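To make the problem concrete, here is a minimal Python sketch of why a keyword search skips image-only files.  It is an illustration only: the CollectedFile type, the keyword_hits and needs_ocr helpers, and the sample file names are all hypothetical, and the text field simply stands in for whatever text a search or processing tool can extract from a file (empty when the file is a scan or fax with no text layer).

```python
from dataclasses import dataclass


@dataclass
class CollectedFile:
    """Hypothetical record for one collected file."""
    name: str
    text: str  # text a tool could extract; empty for image-only files


def keyword_hits(files, term):
    """Return names of files whose extractable text contains the term."""
    term = term.lower()
    return [f.name for f in files if term in f.text.lower()]


def needs_ocr(files):
    """Return names of files with no extractable text (image-only candidates)."""
    return [f.name for f in files if not f.text.strip()]


# Toy collection (assumed data, not from any real matter).
collection = [
    CollectedFile("memo.docx", "Draft of the supply agreement with Acme"),
    CollectedFile("signed_agreement.pdf", ""),  # scanned executed copy, no text layer
    CollectedFile("fax_0412.tif", ""),          # received through fax software
    CollectedFile("lunch.msg", "Lunch on Friday?"),
]

hits = keyword_hits(collection, "agreement")
ocr_queue = needs_ocr(collection)

print("Keyword hits:", hits)
print("Needs OCR before searching:", ocr_queue)
```

In this toy collection, the search for “agreement” finds the draft memo but not the scanned, signed agreement, because the scan has no text to match against.  Flagging the files with no extractable text is what lets them be run through OCR and searched along with everything else.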

For these reasons, I usually advise against self-culling by custodians and also don’t recommend that IT perform self-culling, unless they have the ability to process that data to identify image-only files and perform Optical Character Recognition (OCR) to capture text from them.  If your IT department has the capabilities and experience to do so (and the process and chain of custody are well documented), then that’s great.  However, many internal IT departments don’t have either the capabilities or the expertise, in which case it’s best to collect all potentially responsive files from the custodians and turn them over to a qualified eDiscovery provider to perform the culling (performing OCR as needed to include responsive image-only files in the resulting responsive document set).  With the full data set available, there is also no need to go back to the custodians to collect additional data (unless the case requires supplemental productions).

So, what do you think?  Do you self-collect data for discovery purposes?  If so, how do you account for image-only files?  Please share any comments you might have or let us know if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.