eDiscovery Daily Blog

eDiscovery Case Law: Judge Scheindlin Says “No” to Self-Collection, “Yes” to Predictive Coding


When most people think of the horrors of Friday the 13th, they think of Jason Voorhees.  When US Immigration and Customs thinks of Friday the 13th horrors, do they think of Judge Shira Scheindlin?

As noted in Law Technology News (Judge Scheindlin Issues Strong Opinion on Custodian Self-Collection, written by Ralph Losey, a previous thought leader interviewee on this blog), New York District Judge Scheindlin issued a decision last Friday (July 13) addressing the adequacy of searching and self-collection by government entity custodians in response to Freedom of Information Act (FOIA) requests.  As Losey notes, this is her fifth decision in National Day Laborer Organizing Network et al. v. United States Immigration and Customs Enforcement Agency, et al., including one that was later withdrawn.

Regarding the defendant’s question as to “why custodians could not be trusted to run effective searches of their own files, a skill that most office workers employ on a daily basis” (i.e., self-collect), Judge Scheindlin responded as follows:

“There are two answers to defendants' question. First, custodians cannot 'be trusted to run effective searches,' without providing a detailed description of those searches, because FOIA places a burden on defendants to establish that they have conducted adequate searches; FOIA permits agencies to do so by submitting affidavits that 'contain reasonable specificity of detail rather than merely conclusory statements.' Defendants' counsel recognize that, for over twenty years, courts have required that these affidavits 'set [ ] forth the search terms and the type of search performed.' But, somehow, DHS, ICE, and the FBI have not gotten the message. So it bears repetition: the government will not be able to establish the adequacy of its FOIA searches if it does not record and report the search terms that it used, how it combined them, and whether it searched the full text of documents.”

“The second answer to defendants' question has emerged from scholarship and caselaw only in recent years: most custodians cannot be 'trusted' to run effective searches because designing legally sufficient electronic searches in the discovery or FOIA contexts is not part of their daily responsibilities. Searching for an answer on Google (or Westlaw or Lexis) is very different from searching for all responsive documents in the FOIA or e-discovery context.”

“Simple keyword searching is often not enough: 'Even in the simplest case requiring a search of on-line e-mail, there is no guarantee that using keywords will always prove sufficient.' There is increasingly strong evidence that '[k]eyword search[ing] is not nearly as effective at identifying relevant information as many lawyers would like to believe.' As Judge Andrew Peck — one of this Court's experts in e-discovery — recently put it: 'In too many cases, however, the way lawyers choose keywords is the equivalent of the child's game of 'Go Fish' … keyword searches usually are not very effective.'”

Regarding search best practices and predictive coding, Judge Scheindlin noted:

“There are emerging best practices for dealing with these shortcomings and they are explained in detail elsewhere. There is a 'need for careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or keywords to be used to produce emails or other electronically stored information.' And beyond the use of keyword search, parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents.”

“Through iterative learning, these methods (known as 'computer-assisted' or 'predictive' coding) allow humans to teach computers what documents are and are not responsive to a particular FOIA or discovery request and they can significantly increase the effectiveness and efficiency of searches. In short, a review of the literature makes it abundantly clear that a court cannot simply trust the defendant agencies' unsupported assertions that their lay custodians have designed and conducted a reasonable search.”

Losey notes that “A classic analogy is that self-collection is equivalent to the fox guarding the hen house. With her latest opinion, Schiendlin [sic] includes the FBI and other agencies as foxes not to be trusted when it comes to searching their own email.”

So, what do you think?  Will this become another landmark decision by Judge Scheindlin?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.