eDiscovery Daily Blog

eDiscovery Best Practices: Message Thread Review Saves Costs and Improves Consistency


Insanity is doing the same thing over and over again and expecting a different result.  But, in ESI review, it can be even worse when you get a different result.

One of the biggest challenges when reviewing ESI is identifying duplicates so that your reviewers aren’t reviewing the same files again and again.  Not only does repeated review drive up costs unnecessarily, but it can also lead to problems if different reviewers categorize the same file differently (for example, a privileged file could be inadvertently produced if one of its duplicates is not correctly categorized).

Of course, there are a number of ways to identify duplicates.  Exact duplicates (files with exactly the same content in the same file format) can be identified through hash values, which serve as a digital fingerprint of a file’s content.  MD5 and SHA-1 are the most popular hashing algorithms for this purpose: files that produce the same hash value are exact duplicates, so all but one copy can be removed from the review population.  Since the same emails are often sent to multiple parties and the same files are often stored on different drives, deduplication through hashing can save considerable review costs.
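To make the hashing idea concrete, here is a minimal Python sketch that groups files by the hash of their content.  The function names and the in-memory dictionary of file contents are assumptions made for illustration; a real review platform would read files from the collection and would typically hash emails on selected fields rather than raw bytes.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes, algorithm: str = "md5") -> str:
    """Return the hex digest ("digital fingerprint") of a file's content."""
    return hashlib.new(algorithm, data).hexdigest()


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose content hashes match.

    `files` maps a (hypothetical) file name to its raw bytes.
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        groups[content_hash(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Within each group that comes back, one copy would be kept for review and the rest suppressed.  Passing `algorithm="sha1"` switches the fingerprint to SHA-1 without changing the grouping logic.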

Sometimes, files are not exact duplicates but contain the same (or almost the same) information.  One example is a Word document published to an Adobe PDF file – the content is the same, but the file format is different, so the hash value will be different.  Near-deduplication can be used to identify files where most or all of the content matches so they can be verified as duplicates and eliminated from review.
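Near-duplicate detection can be done in many ways, and commercial tools do not always disclose their methods.  As one illustrative approach (an assumption, not the technique of any particular product), the Python sketch below compares the extracted text of two files using overlapping word sequences ("shingles") and a Jaccard similarity score.  The 0.8 threshold is an arbitrary example value.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Split text into lowercase words and return the set of k-word sequences."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(text_a: str, text_b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a), shingles(text_b)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    """Flag a pair for verification when similarity meets the threshold."""
    return jaccard(text_a, text_b) >= threshold
```

Because the comparison runs on extracted text rather than raw bytes, a Word document and its PDF rendition can score as a match even though their hash values differ.  Flagged pairs are still candidates to be verified, not automatic removals.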

Then, there is message thread analysis.  Most email messages are part of a larger discussion, which could be between just two parties or could include a number of parties.  Reviewing each email in the discussion thread individually would result in much of the same information being reviewed over and over again.  Instead, message thread analysis pulls those messages together and enables them to be reviewed as an entire discussion.  That includes any side conversations within the discussion that may or may not be related to the original topic (e.g., a side discussion about lunch plans or last night’s episode of American Idol).
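As a rough illustration of how messages can be pulled together into discussions, the Python sketch below groups messages by following reply links back to the first message in the thread.  It assumes each message carries an identifier and a pointer to the message it replies to (similar to the standard Message-ID and In-Reply-To email headers); the dictionary keys `id` and `in_reply_to` are hypothetical names chosen for this example.

```python
from collections import defaultdict


def thread_root(msg_id, parent_of):
    """Follow reply links upward until reaching a message with no known parent."""
    seen = set()  # guards against malformed reply loops
    while parent_of.get(msg_id) is not None and msg_id not in seen:
        seen.add(msg_id)
        msg_id = parent_of[msg_id]
    return msg_id


def group_into_threads(messages):
    """Map each thread's root id to the ids of all messages in that thread.

    `messages` is a list of dicts with an 'id' and an 'in_reply_to'
    (None for the message that started the discussion).
    """
    parent_of = {m["id"]: m["in_reply_to"] for m in messages}
    threads = defaultdict(list)
    for m in messages:
        threads[thread_root(m["id"], parent_of)].append(m["id"])
    return dict(threads)
```

A side conversation is simply a second reply to the same earlier message, so it lands in the same thread as the main discussion and can be reviewed alongside it.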

FirstPass®, powered by Venio FPR™, is one example of an application that provides a mechanism for message thread analysis of Outlook emails, pulling the entire thread into one conversation for review as one big “tree”.  The “tree” representation lets you see all of the conversations within the discussion and focus your review on the last emails in each conversation, so you can see what was said without having to review each email individually.  Side conversations are “branches” of the tree, and FirstPass enables you to tag individual messages, specific branches or the entire tree as responsive, non-responsive, privileged or some other designation.  Also, because of the way that Outlook tracks emails in the thread, FirstPass identifies messages that are missing from the collection with a red X, enabling you to investigate whether additional collection is needed and to avoid potential spoliation claims.
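The two ideas in that description, reviewing the last email in each branch and spotting messages missing from the collection, can be sketched generically in Python.  This is not how FirstPass is implemented, and it does not use Outlook’s own thread tracking data; it assumes the same hypothetical `id` and `in_reply_to` fields as a simple stand-in for whatever reply information the review tool has available.

```python
def last_messages(messages):
    """Return ids of messages that nobody replied to (the end of each branch)."""
    replied_to = {m["in_reply_to"] for m in messages if m["in_reply_to"]}
    return sorted(m["id"] for m in messages if m["id"] not in replied_to)


def missing_messages(messages):
    """Return ids that are replied to but are not in the collection."""
    collected = {m["id"] for m in messages}
    return sorted(
        {
            m["in_reply_to"]
            for m in messages
            if m["in_reply_to"] and m["in_reply_to"] not in collected
        }
    )
```

Any id returned by `missing_messages` plays the role of the red X: a message the thread refers to that was never collected, which may be worth investigating.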

With message thread analysis, you can minimize review of duplicative information within emails, saving time and cost and ensuring consistency in the review.

So, what do you think?  Does your review tool support message thread analysis?  Please share any comments you might have, or let us know if you’d like to know more about a particular topic.