
Court Disagrees with Plaintiff’s Contentions that Defendant’s TAR Process is Defective: eDiscovery Case Law

In Winfield, et al. v. City of New York, No. 15-CV-05236 (LTS) (KHP) (S.D.N.Y. Nov. 27, 2017), New York Magistrate Judge Katharine H. Parker, after conducting an in camera review of the defendant’s TAR process and a sample set of documents, granted in part and denied in part the plaintiffs’ motion, ordering the defendant to provide copies of specific documents where the parties disagreed on their responsiveness and a random sample of 300 additional documents deemed non-responsive by the defendant.  Judge Parker denied the plaintiffs’ request for information about the defendant’s TAR process, finding no evidence of gross negligence or unreasonableness in its process.

Case Background

In this dispute over alleged discrimination in the City’s affordable housing program, the parties had several disputes over the defendant’s handling of discovery in the case.  The plaintiffs lodged numerous complaints about the pace of discovery and document review, which initially involved only manual linear review of documents, so the Court directed the defendant to complete linear review as to certain custodians and begin using Technology Assisted Review (“TAR”) software for the rest of the collection.  After a dispute over the search terms selected for use, the plaintiffs proposed over 800 additional search terms to be run on certain custodians, most of which (after negotiation) were accepted by the defendant (despite a stated additional cost of $248,000 to review the documents).

The defendant proposed to use its TAR software for this review, but the plaintiffs objected, contending that the defendant had over-designated documents as privileged and non-responsive, using an “impermissibly narrow view of responsiveness” during its review process.  To support their contention, the plaintiffs submitted to the Court certain documents that the defendant had produced inadvertently (including five inadvertently produced slip sheets for documents not produced), which they contended should have been marked responsive and relevant.  As a result, the Court required the defendant to submit a letter for in camera review describing its predictive coding process and training for document reviewers.  The Court also required the defendant to provide a privilege log for a sample set of 80 documents that it designated as privileged in its initial review.  Of those 80 documents, the defendant maintained its original privilege assertions over only 20, finding 36 of them non-privileged and producing them as responsive, and producing another 15 of them as non-responsive.

As a result, the plaintiffs filed a motion requesting random samples of several categories of documents and also sought information about the TAR ranking system used by the defendant and all materials submitted by the defendant for the Court’s in camera review relating to predictive coding.

Judge’s Ruling

Judge Parker noted that both parties did “misconstrue the Court’s rulings during the February 16, 2017 conference” and ordered the defendant to “expand its search for documents responsive to Plaintiffs’ document requests as it construed this Court’s prior ruling too narrowly”, indicating that the plaintiffs should meet and confer with the defendant after reviewing the additional production if they “believe that the City impermissibly withheld documents responsive to specific requests”.

As for the plaintiffs’ challenges to the defendant’s TAR process, Judge Parker referenced Hyles v. New York City, where Judge Andrew Peck, referencing Sedona Principle 6, stated the producing party is in the best position to “evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”  Judge Parker also noted that “[c]ourts are split as to the degree of transparency required by the producing party as to its predictive coding process”, citing cases that considered seed sets as work product and other cases that supported transparency of seed sets.  Relying on her in camera review of the materials provided by the defendant, Judge Parker concluded “that the City appropriately trained and utilized its TAR system”, noting that the defendant’s seed set “included over 7,200 documents that were reviewed by the City’s document review team and marked as responsive or non-responsive in order to train the system” and that “the City provided detailed training to its document review team as to the issues in the case.”

As a result, Judge Parker ordered the defendant “to produce the five ‘slip-sheeted’ documents and the 15 NR [non-responsive documents reclassified from privileged] Documents”, “to provide to Plaintiffs a sample of 300 non-privileged documents in total from the HPD custodians and the Mayor’s Office” and to “provide Plaintiffs with a random sample of 100 non-privileged, non-responsive documents in total from the DCP/Banks review population” (after applying the plaintiffs’ search terms and utilizing TAR on that collection).  Judge Parker ordered the parties to meet and confer on any disputes “with the understanding that reasonableness and proportionality, not perfection and scorched-earth, must be their guiding principles.”  Judge Parker denied the plaintiffs’ request for information about the defendant’s TAR process (but “encouraged” the defendant to share information with the plaintiffs) and denied the plaintiffs’ request for access to the defendant’s in camera submissions, finding them protected by the work product privilege.

So, what do you think?  Should TAR ranking systems and seed sets be considered work product or should they be transparent?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case opinion link courtesy of eDiscovery Assistant.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Sometimes, the Data You Receive Isn’t Ready to Rock and Roll: eDiscovery Best Practices

Having just encountered a similar situation with one of my clients, I thought this was a topic worth revisiting.  Just because data is produced to you doesn’t mean that data is ready to “rock and roll”.

Here’s a case in point: I once worked with a client that received a multi-part production from the other side (via another party involved in the litigation, per agreement between the parties) that included image files, OCR text files and metadata (yes, the dreaded “load file” production).  The files that my client received were produced over several months to several other parties in the litigation.  The production contained numerous emails, each of which (of course) included an email sent date.  Can you guess which format the email sent date was provided in?  Here are some choices (using today’s date and 1:00 PM as an example):

  • 09/11/2017 13:00:00
  • 9/11/2017 1:00 PM
  • September 11, 2017 1:00 PM
  • Sep-11-2017 1:00 PM
  • 2017/09/11 13:00:00

The answer: all of them.

Because there were several productions to different parties with (apparently) different format agreements, my client didn’t have the option to request that the data be reproduced in a standard format.  Not only that, the name of the produced metadata field wasn’t consistent between productions – in about 15 percent of the documents the producing party named the field email_date_sent; in the rest, it was simply named date_sent.
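To make the field-name problem concrete, here’s a minimal sketch of coalescing the two differently named fields into one (the field names email_date_sent and date_sent are from the productions described above; the document records as simple dictionaries are an illustrative assumption, not the actual load file structure):

```python
def coalesce_sent_date(record):
    """Return the email sent date from whichever metadata field a
    given production used, or None if neither field is present."""
    # ~15 percent of documents used "email_date_sent"; the rest used "date_sent"
    return record.get("email_date_sent") or record.get("date_sent")

# Two records from different productions, each with a different field name
rows = [
    {"docid": "A001", "email_date_sent": "09/11/2017 13:00:00"},
    {"docid": "B002", "date_sent": "September 11, 2017 1:00 PM"},
]
for row in rows:
    row["sent_date_raw"] = coalesce_sent_date(row)
```

After this step, every record carries its (still unstandardized) sent date in one consistent field, which is the precondition for the date conversion described below.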

What a mess, right?

If you know how to fix this issue, then – congrats! – you can probably stop reading.  Our client (both then and recently) didn’t know how.  Fortunately, at CloudNine, there are plenty of computer “geeks” to address problems like this (including me).

In the example above, we had to standardize the dates into a single format in a single field.  We used SQL queries to consolidate the data into one field, then used string commands and regular expressions to re-parse dates that didn’t fit a standard SQL date format into a correct one.  For example, the date 2017/09/11 was re-parsed into 09/11/2017.
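The re-parsing step can be sketched in Python rather than SQL (a simplification of what we actually did; the format list mirrors the date variants shown above):

```python
from datetime import datetime

# Candidate formats observed across the productions (see the list above)
FORMATS = [
    "%m/%d/%Y %H:%M:%S",   # 09/11/2017 13:00:00
    "%m/%d/%Y %I:%M %p",   # 9/11/2017 1:00 PM
    "%B %d, %Y %I:%M %p",  # September 11, 2017 1:00 PM
    "%b-%d-%Y %I:%M %p",   # Sep-11-2017 1:00 PM
    "%Y/%m/%d %H:%M:%S",   # 2017/09/11 13:00:00
]

def normalize_sent_date(raw):
    """Try each known format in turn; return the date in one standard
    format (MM/DD/YYYY HH:MM:SS, 24-hour), or None if unparseable."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).strftime("%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue
    return None  # flag for manual review rather than guessing
```

Every variant of the example date above normalizes to the same string, 09/11/2017 13:00:00; anything that matches none of the known formats comes back as None so it can be reviewed by hand instead of being silently mis-parsed.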

Getting the dates into a standard format in a single field not only enabled us to load that data successfully into the CloudNine platform, it also enabled us to then identify (in combination with other standard email metadata fields) duplicates in the collection based on those metadata fields.  As a result, we were able to exclude a significant percentage of the emails as duplicates, which wouldn’t have been possible before the data was converted and standardized.
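The deduplication step can be sketched as grouping on the normalized metadata.  The field names here (sent_date, from, to, subject) are illustrative stand-ins for whatever combination of standard email metadata fields is agreed on, not the actual production fields:

```python
def dedupe_emails(rows):
    """Keep the first document seen for each unique combination of
    normalized metadata; later matches are treated as duplicates.
    Assumes sent dates have already been standardized into one field."""
    seen = set()
    keep, dupes = [], []
    for row in rows:
        key = (row["sent_date"], row["from"], row["to"], row["subject"])
        if key in seen:
            dupes.append(row)
        else:
            seen.add(key)
            keep.append(row)
    return keep, dupes

# The same email produced twice (e.g., in two different productions)
rows = [
    {"sent_date": "09/11/2017 13:00:00", "from": "a@x.com",
     "to": "b@y.com", "subject": "Hello"},
    {"sent_date": "09/11/2017 13:00:00", "from": "a@x.com",
     "to": "b@y.com", "subject": "Hello"},
]
keep, dupes = dedupe_emails(rows)
```

Note that this only works after the date conversion: two copies of the same email with sent dates in different formats would produce different keys and never match.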

Over the years, I’ve seen many examples where data (either from our side or the other side) needs to be converted.  It happens more often than you might think.  When it does, it’s good to work with a solutions provider that has several “geeks” on its team who can provide that service.  Sometimes, having data that’s ready to “rock and roll” takes some work.

So, what do you think?  Have you received productions that needed conversion?  If so, what did you do?  Please share any comments you might have or if you’d like to know more about a particular topic.
