eDiscovery Daily Blog

If Your Documents Are Not Logical, Discovery Won’t Be Either – eDiscovery Best Practices

Scanning may no longer be cool, but it’s still necessary.  Electronic discovery still typically includes a paper component.  When it comes to paper, how documents are identified is critical to how useful they will be.  Here’s an example.

Your client collects hard copy documents from various custodians related to the case and organizes them into folders.  In one of the folders is a one page fax cover sheet attached to a two page letter, as well as an unrelated report and four different contracts, each 15-20 pages.  The entire folder is scanned as a single document, as either a TIFF or PDF file.

Only the letter is retrieved in a search as responsive to the case.  But, because it is contained within a “document” containing 70 to 80 other pages, you wind up reviewing 70 to 80 unrelated pages that you would not otherwise have to review.  It complicates production as well: how do you produce partial “documents”?  Also, if the non-responsive report and contracts have duplicates in the collection, you can’t effectively de-dupe them out of the review population because they’re all combined into one file.

It happens more often than you think.  It also can happen – sometimes quite often – with the scanned documents that the other side produces to you.  So, how do you get the documents into a more logical and usable organization?

Logical Document Determination (or LDD) is a process that some eDiscovery providers (including – shameless plug warning! – CloudNine Discovery) offer.  It’s a process where each image page in a scanned document set is reviewed and the “logical document breaks” (i.e., each page that starts a new document) are identified.  Then, the documents are re-assembled based on those logical document breaks.
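The re-assembly step can be sketched in a few lines of Python. This is only an illustration, not any provider's actual tooling: it assumes a reviewer has already recorded the page numbers where new documents start, the function name `split_into_logical_documents` is hypothetical, and the page counts for the report and contracts are made up to fit the folder example above (with the fax cover sheet and letter treated as one document).

```python
def split_into_logical_documents(page_count, break_pages):
    """Group pages 1..page_count into (first_page, last_page) ranges.

    break_pages holds the 1-based page numbers that a reviewer marked
    as the first page of a new logical document.
    """
    # Page 1 always starts a document, even if the reviewer did not flag it.
    starts = sorted(set(break_pages) | {1})
    if starts[0] < 1 or starts[-1] > page_count:
        raise ValueError("break page outside the scanned range")
    documents = []
    for i, first in enumerate(starts):
        # A document ends on the page before the next break,
        # or on the last scanned page.
        last = starts[i + 1] - 1 if i + 1 < len(starts) else page_count
        documents.append((first, last))
    return documents


# Hypothetical 78-page folder: fax cover sheet plus letter (pages 1-3),
# a report (4-8), then four contracts of 16 to 19 pages each.
folder_breaks = [1, 4, 9, 25, 43, 60]
for first, last in split_into_logical_documents(78, folder_breaks):
    print(f"pages {first}-{last}")
```

Each resulting page range can then be exported as its own TIFF or PDF, so the three page letter stands alone instead of being buried in the folder scan.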

Once the documents are logically organized, other processes, like Optical Character Recognition (OCR) and clustering (including near duplicate identification), can then be performed at the appropriate document level, and the smaller, more precise, unitized documents can be indexed for searching.  Instead of retrieving a 70-80 page “document” composed of several logical documents, your search will retrieve the two page letter that is actually responsive, making your review and production processes more efficient.
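To see why de-duplication only works once documents are unitized, here is a rough Python sketch. It assumes OCR text is already available for each page, and it uses an exact SHA-256 hash of normalized text as a simple stand-in for whatever duplicate detection a real review platform applies. Scanned copies rarely OCR identically, which is exactly why near duplicate identification exists. All function names and sample text are hypothetical.

```python
import hashlib


def document_hash(pages):
    """Hash the lowercased, whitespace-normalized text of a list of pages."""
    text = " ".join(" ".join(pages).lower().split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_duplicates(documents):
    """Return names of documents whose hash matches an earlier document."""
    seen = {}
    duplicates = []
    for name, pages in documents.items():
        digest = document_hash(pages)
        if digest in seen:
            duplicates.append(name)
        else:
            seen[digest] = name
    return duplicates


letter = ["Fax cover sheet", "Dear Ms. Smith,", "Sincerely, J. Doe"]
report = ["Quarterly report", "Summary of results"]

# Scanned as whole folders: the shared report is buried, so nothing matches.
as_folders = {"folder_a": letter + report, "folder_b": report + ["Other memo"]}

# After unitization: the two copies of the report hash identically.
as_documents = {
    "a_letter": letter,
    "a_report": report,
    "b_report": report,
    "b_memo": ["Other memo"],
}

print(find_duplicates(as_folders))    # []
print(find_duplicates(as_documents))  # ['b_report']
```

The same report appears in both folders, but only the unitized collection lets it be recognized as a duplicate and dropped from the review population.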

LDD is typically priced per page reviewed for logical document breaks, and prices can vary depending on the volume of pages to be reviewed and where the work is being performed (there are providers in the US and overseas).  While it’s a manual process, it’s well worth it if your collection of imaged documents is poorly defined.

So, what do you think? Have you ever received a collection of poorly organized image files? If so, did you use Logical Document Determination to organize them properly?  Please share any comments you might have or let us know if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.
