eDiscovery Daily Blog

First Pass Review: Fuzzy Searching Your Opponent’s Data

Yesterday, we talked about the use of First Pass Review (FPR) applications (such as FirstPass™, powered by Venio FPR™) to not only conduct first pass review of your own collection, but also to analyze your opponent’s ESI production. One way to analyze that data is through synonym searching to find variations of your search terms to increase the possibility of finding the terminology used by your opponents.

Fuzzy Searching

Another type of analysis is the use of fuzzy searching. Attorneys know what terms they’re looking for, but those terms may not often be spelled correctly. Also, opposing counsel may produce a number of image only files that require Optical Character Recognition (OCR), which is usually not 100% accurate.

FirstPass supports “fuzzy” searching, which is a mechanism by finding alternate words that are close in spelling to the word you’re looking for (usually one or two characters off). FirstPass will display all of the words – in the collection – close to the word you’re looking for, so if you’re looking for the term “petroleum”, you can find variations such as “peroleum”, “petoleum” or even “petroleom” – misspellings or OCR errors that could be relevant. Then, simply select the variations you wish to include in the search. Fuzzy searching is the best way to broaden your search to include potential misspellings and OCR errors and FirstPass provides a terrific capability to select those variations to review additional potential “hits” in your collection.

Tomorrow, I’ll talk about the use of domain categorization to quickly identify potential inadvertent disclosures and weed out non-responsive files produced by your opponent, based on the domain of the communicators. Hasta la vista, baby!  🙂

In the meantime, what do you think? Have you used fuzzy searching to find misspellings or OCR errors in an opponent’s produced ESI? Please share any comments you might have or if you’d like to know more about a particular topic.

print