eDiscovery Searching: A Great Example of Why Search Results Need to Be Tested
So, I got a chuckle out of one of the stories that both Pinhawk and Google (and probably others, as well) highlighted in their eDiscovery story listings last week:
A+E, Discovery get ready to roll out
The story is about two of the biggest players in global TV, A+E Networks and Discovery Networks, rolling out their channels into India and Latin America, respectively. The article goes on to discuss the challenges of launching channels in markets that have differing requirements and span several languages and dialects.
This story has nothing to do with eDiscovery.
Why did it wind up in the list of eDiscovery stories returned by these two services? Because the story title “A+E, Discovery get ready to roll out” registered as a hit for “e-Discovery”. Many search engines are set to ignore punctuation when searching, so a search for “e-Discovery” actually looks like a search for “e Discovery” to the search engine (keep in mind that searches are also usually case insensitive). The same treatment applies to the documents being searched, so a title of “A+E, Discovery get ready to roll out” could be viewed by a search engine as “a e discovery get ready to roll out”, causing the document to be considered a “hit” for “e discovery”.
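To make the mechanics concrete, here is a minimal Python sketch of that behavior. It is not how Pinhawk, Google, or any particular search engine is actually implemented; it simply assumes an engine that lowercases text, treats punctuation as spaces, and matches the query words as a consecutive phrase. The function names `normalize` and `phrase_hit` are made up for this illustration.

```python
import re

def normalize(text):
    """Lowercase the text, treat every run of non-alphanumeric
    characters as a space, and split into words (assumed behavior)."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()

def phrase_hit(query, document):
    """Return True if the query's words appear consecutively in the document."""
    q, d = normalize(query), normalize(document)
    if not q:
        return False  # an empty query matches nothing
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

title = "A+E, Discovery get ready to roll out"

print(normalize(title))
# ['a', 'e', 'discovery', 'get', 'ready', 'to', 'roll', 'out']

print(phrase_hit("e-Discovery", title))  # True: "e" is followed by "discovery"
print(phrase_hit("eDiscovery", title))   # False: "ediscovery" is a single word
```

Note that, under these assumptions, the unhyphenated term “eDiscovery” would not have matched the title at all, which is exactly the kind of difference that only shows up when you test your search terms.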
This is just one example of why searches can retrieve unexpected results, and why a defensible search process (such as the “STARR” approach outlined here) that involves testing and refining searches is vital to maximizing your search recall and precision.
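For readers who want to put numbers on recall and precision while testing a search, the standard definitions can be sketched in a few lines of Python. The document IDs below are invented purely for illustration, and the function name `precision_recall` is hypothetical; in practice the “relevant” set would come from a human review of a sample.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved documents that are relevant.
    Recall: share of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_hits = retrieved & relevant
    precision = len(true_hits) / len(retrieved) if retrieved else 0.0
    recall = len(true_hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Made-up example: the search returned four documents, one of them a
# false hit (like the TV story), and it missed two relevant documents.
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc1", "doc2", "doc3", "doc5", "doc6"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.75 0.6
```

Refining the search and measuring again shows whether a change actually removed false hits or recovered missed documents.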
BTW, this can happen with any search engine, so it’s not a reflection on either Pinhawk or Google. Both are excellent resources that can occasionally retrieve non-relevant results, just like any other “web crawling” service.
So, what do you think? Did you see this story crop up in the eDiscovery listings? Have you encountered similar examples of search anomalies? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.
