eDiscoveryDaily Archives

eDiscovery Case Law: Counsel, The Inadvertent Disclosure "Buck" Stops With You

August 3, 2012

Here is yet another case of inadvertently disclosed privileged documents. In Blythe v. Bell, 2012 NCBC 42, North Carolina Business Superior Court Judge James L. Gale denied a motion for an order compelling the return of privileged documents inadvertently disclosed by the defendants, ruling that privilege had been waived on those documents.

In this case, the defendants produced 3.5 million documents on two hard drives which were ultimately determined to contain approximately 1,700 potentially privileged documents (the documents were to or from the outside counsel’s domain, an easy criterion for identifying potentially privileged documents). The defendants contracted with an outside consultant (Computer Ants) to obtain, process, and search their eMails for responsive documents. For their part, the plaintiffs questioned whether Computer Ants was sufficiently qualified as an expert in electronic discovery to reasonably justify Defendants’ reliance on it to protect against the production of privileged information. Prior to establishing Computer Ants, the owner (Thomas Scott) had worked as a truck driver, a Bass Pro Shop Security Manager, a respiratory therapist, and a financial auditor for a retail seller. He had “never provided any forensic computer services in the context of a lawsuit” nor had ever “been engaged as a computer expert or provided an opinion in any legal proceeding”. Sounds as if the plaintiffs had a legitimate concern.
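To show just how easy that domain criterion is to apply, here is a minimal Python sketch. It is only an illustration, not the process used in this case: the domain `outsidecounsel.example`, the sample addresses and the function name are all hypothetical, and a match only flags a message for attorney review rather than deciding whether it is privileged.

```python
from email.utils import getaddresses

# Hypothetical outside-counsel domain, used for illustration only.
COUNSEL_DOMAINS = {"outsidecounsel.example"}


def is_potentially_privileged(sender, recipients, counsel_domains=COUNSEL_DOMAINS):
    """Return True if any address on the message belongs to a counsel domain."""
    # getaddresses parses header-style strings into (display name, address) pairs.
    for _name, address in getaddresses([sender, *recipients]):
        domain = address.rpartition("@")[2].lower()
        if domain in counsel_domains:
            return True
    return False


# Two made-up messages: (sender, list of recipients).
messages = [
    ("ceo@client.example", ["Jane Doe <jdoe@OutsideCounsel.example>"]),
    ("ceo@client.example", ["sales@vendor.example"]),
]
flagged = [is_potentially_privileged(s, r) for s, r in messages]
print(flagged)
```

A filter like this is only a first pass. It would presumably still need the kind of review, sampling or other quality assurance that the court found missing here.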

Judge Gale used a five-factor balancing test previously used in Morris v. Scenera Research, LLC, which considers: “(1) the reasonableness of the precautions taken to prevent inadvertent disclosure; (2) the number of inadvertent disclosures; (3) the extent of the disclosures; (4) any delay in measures taken to rectify the disclosures; and (5) the overriding interests of justice.”

Judge Gale noted that “[o]ne federal district court characterizes the need for advance efforts to protect against waiver as ‘paramount.’” However, the defendants produced “the hard drives prepared by Computer Ants without any review or sampling or other quality assurance effort to assess whether the consultant’s efforts had been successful in eliminating privileged communications. Defendants admit that they relied exclusively on ‘this contractor and this procedure’ to filter out documents potentially subject to the attorney-client privilege.”

Since “the multi-factor balancing test applied by the federal courts on this record is controlled by the first factor”, Judge Gale, while noting that the “court takes no pleasure in finding the waiver of attorney-client privilege”, nonetheless had no choice but to do so based on the first factor alone.

So, what do you think? How do you evaluate your eDiscovery provider to ensure their qualifications? What precautions do you take to prevent inadvertent disclosure? Please share any comments you might have or if you’d like to know more about a particular topic.

Source: JD Supra, via Brooks, Pierce, McLendon, Humphrey & Leonard LLP

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery History: Zubulake’s e-Discovery

August 2, 2012

In the 22 months since this blog began, we have published 133 posts related to eDiscovery case law. When discussing the various case opinions that involve decisions regarding eDiscovery, it’s easy to forget that there are real people impacted by these cases and that the story of each case goes beyond just whether they preserved, collected, reviewed and produced electronically stored information (ESI) correctly. A new book, by the plaintiff in the most famous eDiscovery case ever, provides the “backstory” that goes beyond the precedent-setting opinions of the case, detailing her experiences through the events leading up to the case, as well as over three years of litigation.

Laura A. Zubulake, the plaintiff in the Zubulake vs. UBS Warburg case, has written a new book: Zubulake's e-Discovery: The Untold Story of my Quest for Justice. It is the story of the Zubulake case – which resulted in one of the largest jury awards in the US for a single plaintiff in an employment discrimination case – as told by the author, in her words. As Zubulake notes in the Preface, the book “is written from the plaintiff’s perspective – my perspective. I am a businessperson, not an attorney. The version of events and opinions expressed are portrayed by me from facts and circumstances as I perceived them.” It’s a “classic David versus Goliath story” describing her multi-year struggle against her former employer – a multi-national financial giant.

Zubulake begins the story by developing an understanding of the Wall Street setting in which she worked for over twenty years and the growing importance of email in communications within that work environment. It continues through a timeline of the allegations and the evidence that supported those allegations leading up to her filing of a discrimination claim with the Equal Employment Opportunity Commission (EEOC) and her subsequent dismissal from the firm. This Allegations & Evidence chapter is particularly enlightening to those who may be familiar with the landmark opinions but not the underlying evidence and how the evidence to prove her case came together through the various productions (including the court-ordered productions from backup tapes). The story continues through the filing of the case and the beginning of the discovery process and proceeds through the events leading up to each of the landmark opinions (with a separate chapter devoted to each of Zubulake I, III, IV and V), then subsequently through trial, the jury verdict and the final resolution of the case.

Throughout the book, Zubulake relays her experiences, successes, mistakes, thought processes and feelings during the events and the difficulties and isolation of being an individual plaintiff in a three-year litigation process. She also weighs in on the significance of each of the opinions, including one ruling by Judge Shira Scheindlin that may not have had as much impact on the outcome as you might think. For those familiar with the opinions, the book provides the “backstory” that puts the opinions into perspective; for those not familiar with them, it’s a comprehensive account of an individual who fought for her rights against a large corporation and won. Everybody loves a good “David versus Goliath story”, right?

The book is available at Amazon and also at CreateSpace. Look for my interview with Laura regarding the book in this blog next week.

So, what do you think? Are you familiar with the Zubulake opinions? Have you read the book? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: The Number of Pages in Each Gigabyte Can Vary Widely

July 31, 2012

A while back, we talked about how the average number of pages in each gigabyte is approximately 50,000 to 75,000 pages and that each gigabyte effectively culled out can save $18,750 in review costs. But, did you know just how widely the number of pages per gigabyte can vary?

The “how many pages” question comes up a lot and I’ve seen a variety of answers. Michael Recker of Applied Discovery posted an article to their blog last week titled Just How Big Is a Gigabyte?, which provides some perspective based on the types of files contained within the gigabyte, as follows:

“For example, e-mail files typically average 100,099 pages per gigabyte, while Microsoft Word files typically average 64,782 pages per gigabyte. Text files, on average, consist of a whopping 677,963 pages per gigabyte. At the opposite end of the spectrum, the average gigabyte of images contains 15,477 pages; the average gigabyte of PowerPoint slides typically includes 17,552 pages.”

Of course, each GB of data is rarely just one type of file. Many emails include attachments, which can be in any of a number of different file formats. Collections of files from hard drives may include Word, Excel, PowerPoint, Adobe PDF and other file formats. So, estimating page counts with any degree of precision is somewhat difficult.
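If you do know the rough makeup of a collection, the per-type averages quoted above can at least be combined into a weighted estimate. The Python below is a minimal sketch of that arithmetic: the averages come from the Applied Discovery quote, while the 10 GB collection and its mix are invented for illustration. It also assumes each gigabyte is a single file type, which (as noted) is rarely true once attachments are involved.

```python
# Average pages per gigabyte by file type, as quoted from the Applied Discovery post.
PAGES_PER_GB = {
    "email": 100_099,
    "word": 64_782,
    "text": 677_963,
    "image": 15_477,
    "powerpoint": 17_552,
}


def estimate_pages(gb_by_type):
    """Estimate total pages as the sum of (gigabytes x average pages per GB)."""
    return round(sum(gb * PAGES_PER_GB[kind] for kind, gb in gb_by_type.items()))


# Hypothetical 10 GB collection; the mix is an assumption, not real data.
collection = {"email": 6, "word": 3, "powerpoint": 1}
total_pages = estimate_pages(collection)
total_gb = sum(collection.values())
print(f"{total_pages:,} pages, about {total_pages / total_gb:,.0f} pages per GB")
```

Changing the mix (say, swapping the Word files for images) moves the estimate considerably, which is the point: the size of the collection alone does not tell you the page count.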

In fact, the same exact content ported into different applications can be a different size in each file, due to the overhead required by each application. To illustrate this, I decided to conduct a little (admittedly unscientific) study using yesterday’s one page blog post about the Apple/Samsung litigation. I decided to put the content from that page into several different file formats to illustrate how much the size can vary, even when the content is essentially the same. Here are the results:

Text File Format (TXT): Created by performing a “Save As” on the web page for the blog post to text – 10 KB;
HyperText Markup Language (HTML): Created by performing a “Save As” on the web page for the blog post to HTML – 36 KB, over 3.5 times larger than the text file;
Microsoft Excel 2010 Format (XLSX): Created by copying the contents of the blog post and pasting it into a blank Excel workbook – 128 KB, nearly 13 times larger than the text file;
Microsoft Word 2010 Format (DOCX): Created by copying the contents of the blog post and pasting it into a blank Word document – 162 KB, over 16 times larger than the text file;
Adobe PDF Format (PDF): Created by printing the blog post to PDF file using the CutePDF printer driver – 211 KB, over 21 times larger than the text file;
Microsoft Outlook 2010 Message Format (MSG): Created by copying the contents of the blog post and pasting it into a blank Outlook message, then sending that message to myself, then saving the message out to my hard drive – 221 KB, over 22 times larger than the text file.

The Outlook example was probably the least representative of a typical email – most emails don’t have several embedded graphics in them (with the exception of signature logos) – and most are typically much shorter than yesterday’s blog post (which also included the side text on the page as I copied that too). Still, the example hopefully illustrates that a “page”, even with the same exact content, will be different sizes in different applications. As a result, to estimate the number of pages in a collection with any degree of accuracy, it’s important to understand not only the size of the data collection, but also its makeup.
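For anyone who wants to check the multiples in the list above, here is a short Python sketch that recomputes them from the reported file sizes. The sizes are the ones from my unscientific test. The "copies per GB" column is my own extrapolation and assumes binary units (1 GB = 1,048,576 KB); it should not be read as a real-world page count, since one blog post is hardly a representative page.

```python
# File sizes in KB for the same one-page blog post, taken from the list above.
SIZES_KB = {"TXT": 10, "HTML": 36, "XLSX": 128, "DOCX": 162, "PDF": 211, "MSG": 221}

KB_PER_GB = 1024 * 1024  # assumption: binary units (1 GB = 1,048,576 KB)

baseline = SIZES_KB["TXT"]
for fmt, size_kb in SIZES_KB.items():
    ratio = size_kb / baseline            # size relative to the text file
    copies_per_gb = KB_PER_GB // size_kb  # whole copies of this file in one GB
    print(f"{fmt:<5} {size_kb:>4} KB  {ratio:>5.1f}x text  ~{copies_per_gb:,} copies per GB")
```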

So, what do you think? Was this example useful or highly flawed? Or both? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: On the Eve of Trial with Apple, Samsung is Dealt Adverse Inference Sanction

July 30, 2012

In Apple Inc. v. Samsung Elecs. Co., Case No.: C 11-1846 LHK (PSG) (N.D. Cal.), California Magistrate Judge Paul S. Grewal stated last week that jurors can draw an “adverse inference” from Samsung’s automatic deletion of emails that Apple requested in pre-trial discovery.

Two of the world’s dominant smartphone makers are locked into lawsuits against each other all over the globe as they fiercely compete in the exploding mobile handset market. Both multinationals have brought their best weapons available to the game, with Apple asserting a number of technical and design patents along with trade dress rights. Samsung is, in return, asserting its “FRAND” (“Fair, Reasonable and Non-Discriminatory”) patents against Apple. The debate rages online about whether a rectangular slab of glass should be able to be patented and whether Samsung is abusing its FRAND patents.

As for this case, Samsung’s proprietary “mySingle” email system is at the center of this discussion. In this web-based system, which Samsung has argued is in line with Korean law, every two weeks any emails not manually saved will automatically be deleted. Unfortunately, failure to turn “off” the auto-delete function resulted in spoliation of evidence as potentially responsive emails were deleted after the duty to preserve began.
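To make the “off-switch” idea concrete, here is a minimal Python sketch of a two-week auto-delete pass with a litigation hold flag. This is purely hypothetical and is not a description of how mySingle works: the class, function and parameter names are all invented, and the only detail taken from the case is the two-week window and the exemption for manually saved emails.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RETENTION = timedelta(days=14)  # the two-week window described above


@dataclass
class Email:
    subject: str
    received: datetime
    saved: bool = False  # True if the user manually saved the message


def purge(mailbox, now, litigation_hold=False):
    """Return the emails that survive one auto-delete pass."""
    if litigation_hold:
        # The "off-switch": nothing is deleted while a hold is in effect.
        return list(mailbox)
    return [m for m in mailbox if m.saved or now - m.received <= RETENTION]


now = datetime(2012, 7, 30)
mailbox = [
    Email("old, unsaved", now - timedelta(days=30)),
    Email("old, saved", now - timedelta(days=30), saved=True),
    Email("recent", now - timedelta(days=3)),
]
print([m.subject for m in purge(mailbox, now)])
print([m.subject for m in purge(mailbox, now, litigation_hold=True)])
```

Without the hold, the old unsaved message is gone after the pass. With the hold switched on, it is preserved, which is the behavior the duty to preserve calls for.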

Judge Grewal had harsh words in his order, noting the trouble Samsung has faced in the past:

“Samsung’s auto-delete email function is no stranger to the federal courts. Over seven years ago, in Mosaid v. Samsung, the District of New Jersey addressed the “rolling basis” by which Samsung email was deleted or otherwise rendered inaccessible. Mosaid also addressed Samsung’s decision not to flip an “off-switch” even after litigation began. After concluding that Samsung’s practices resulted in the destruction of relevant emails, and that “common sense dictates that [Samsung] was more likely to have been threatened by that evidence,” Mosaid affirmed the imposition of both an adverse inference and monetary sanctions.

Rather than building itself an off-switch—and using it—in future litigation such as this one, Samsung appears to have adopted the alternative approach of “mend it don’t end it.” As explained below, however, Samsung’s mend, especially during the critical seven months after a reasonable party in the same circumstances would have reasonably foreseen this suit, fell short of what it needed to do”.

The trial starts today and while no one yet knows how the jury will rule, Judge Grewal’s instructions to the jury regarding the adverse inference certainly won’t help Samsung’s case:

“Samsung has failed to prevent the destruction of relevant evidence for Apple’s use in this litigation. This is known as the “spoliation of evidence.”

I instruct you, as a matter of law, that Samsung failed to preserve evidence after its duty to preserve arose. This failure resulted from its failure to perform its discovery obligations.

You also may presume that Apple has met its burden of proving the following two elements by a preponderance of the evidence: first, that relevant evidence was destroyed after the duty to preserve arose. Evidence is relevant if it would have clarified a fact at issue in the trial and otherwise would naturally have been introduced into evidence; and second, the lost evidence was favorable to Apple.

Whether this finding is important to you in reaching a verdict in this case is for you to decide. You may choose to find it determinative, somewhat determinative, or not at all determinative in reaching your verdict.”

Here are some other cases with adverse inference sanctions previously covered by the blog, including this one, this one, this one and this one.

So, what do you think? Will the “adverse inference” order decide this case? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Twitter to Appeal Decision in People v. Harris

July 27, 2012

As reported by The Wall Street Journal, Twitter plans to appeal a court order requiring the company to produce messages posted by Malcolm Harris, an Occupy Wall Street activist facing criminal charges. He was one of more than 700 people arrested last October when demonstrators marched onto the Brooklyn Bridge roadway.

Back in April, Harris tried to quash a subpoena seeking production of his Tweets and his Twitter account user information in his New York criminal case. That request was rejected, so Twitter then sought to quash the subpoena itself, claiming that the order to produce the information imposed an “undue burden” on Twitter and even forced it to “violate federal law”.

On June 30, in People v. Harris, 2011NY080152, New York Criminal Court Judge Matthew Sciarrino Jr. ruled that Twitter must produce tweets and user information of Harris, noting: “If you post a tweet, just like if you scream it out the window, there is no reasonable expectation of privacy. There is no proprietary interest in your tweets, which you have now gifted to the world. This is not the same as a private email, a private direct message, a private chat, or any of the other readily available ways to have a private conversation via the internet that now exist…Those private dialogues would require a warrant based on probable cause in order to access the relevant information.”

Judge Sciarrino indicated that his decision was “partially based on Twitter's then terms of service agreement. After the April 20, 2012 decision, Twitter changed its terms and policy effective May 17, 2012. The newly added portion states that: ‘You Retain Your Right To Any Content You Submit, Post Or Display On Or Through The Service.’” So, it would be interesting to see if the same ruling would be applied for “tweets” and other information posted after that date.

“We're appealing the Harris decision,” wrote Benjamin Lee, Twitter's lead litigator. “It doesn't strike the right balance between the rights of users and the interests of law enforcement”.

Martin Stolar, the attorney representing Harris, praised Twitter's decision. "Privacy interests in the information age are a special category which has to be freshly looked at by the courts," he said in a statement. "We are pleased that Twitter sees the far-reaching implications of the ruling against Mr. Harris and against Twitter."

So, what do you think? Will Twitter succeed in its appeal? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Review Attorneys, Are You Smarter than a High Schooler?

July 26, 2012

Review attorneys are taking a beating these days. So much attention is being focused on technology assisted review, and the latest study on its cost-effectiveness (when compared to manual review) was released just this month. There is also the very detailed and well known white paper study written by Maura Grossman and Gordon Cormack (Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review), which notes not only the cost-effectiveness of technology assisted review but also that it was actually more accurate.

The latest study, from information scientist William Webber (and discussed in this Law Technology News article by Ralph Losey) seems to indicate that trained reviewers don’t provide any better review accuracy than a pair of high schoolers that he selected with “no legal training, and no prior e-discovery experience, aside from assessing a few dozen documents for a different TREC topic as part of a trial experiment”. In fact, the two high schoolers did better! He also notes that “[t]hey worked independently and without supervision or correction, though one would be correct to describe them as careful and motivated.” His conclusion?

“The conclusion that can be reached, though, is that our assessors were able to achieve reliability (with or without detailed assessment guidelines) that is competitive with that of the professional reviewers — and also competitive with that of a commercial e-discovery vendor.”

Webber also cites two other studies with similar results and notes “All of this raises the question that is posed in the subject of this post: if (some) high school students are as reliable as (some) legally-trained, professional e-discovery reviewers, then is legal training a practical (as opposed to legal) requirement for reliable first-pass review for responsiveness? Or are care and general reading skills the more important factors?”

I have a couple of observations about the study. Keep in mind, I’m not an attorney (and don’t play one on TV), but I have worked with review teams on several projects and have observed the review process and how it has been conducted in a real world setting, so I do have some real-world basis for my thoughts:

Two high schoolers is not a significant sample size: I’ve worked on several projects where some reviewers are really productive and others are highly unproductive to the point of being useless. It’s difficult to determine a valid conclusion on the basis of two non-legal reviewers in his study and four non-legal reviewers in one of the studies that Webber cites.
Review is typically an iterative process: In my experience, most legal reviews that I’ve seen start with detailed instructions and training provided to the reviewers, followed up with regular (daily, if not more frequent) changes to instructions to reflect information gathered during the review process. Instructions are refined as the review commences and more information is learned about the document collection. Since Webber noted that “[t]hey worked independently and without supervision or correction”, it doesn’t appear that his review test was conducted in this manner. This makes it less of a real world scenario, in my opinion.

I also think some reviews especially benefit from a first pass review with legally trained reviewers (for example, a reviewer who understands intellectual property laws is going to understand potential IP issues better than someone who hasn’t had the training in IP law). Nonetheless, these studies are bound to “fan the flames” of debate regarding the effectiveness of manual attorney review (even more than they already are).

So, what do you think? Do you think his study is valid? Or do you have other concerns about the conclusions he has drawn? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Conduct Yourself Ethically with EDRM’s Model Code of Conduct

July 24, 2012

The Electronic Discovery Reference Model (EDRM) has made numerous contributions to the eDiscovery industry since it was founded in 2005, with the EDRM diagram (above) having become a universally accepted standard – rare in our industry – to reflect the eDiscovery life cycle. Last year, we noted the introduction of the EDRM Model Code of Conduct (MCoC), which focuses on the ethical duties of service providers associated with five key principles and also provides a corollary for each principle to illustrate ethical duties of their clients. Now, your organization can subscribe to the MCoC to demonstrate its commitment to conducting itself in an ethical manner.

As the invitation email (to subscribe) from EDRM notes, “The MCoC was drafted by members of the EDRM MCoC Project and reflects years of exhaustive dialogue and a wide array of viewpoints representative of the interests of corporations, law firms and service providers…[It] is designed to promote predictability and stability in the legal industry for both providers and consumers of electronic discovery products and services.”

You can read the code online here, or download it as a 22 page PDF file here. To voluntarily subscribe to the MCoC, you can register on the EDRM website here. Identify your organization, provide information for an authorized representative and answer four verification questions (truthfully, of course) to affirm your organization’s commitment to the spirit of the MCoC, and your organization is in! You can also provide a logo for EDRM to include when adding you to the list of subscribing organizations. As of this writing, there are 39 subscribing organizations listed here, including CloudNine Discovery, the company I work for (and sponsor of this blog, in case you haven’t noticed).

So, what do you think? Is your organization subscribed? If not, what’s stopping you? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Need to Catch Up on Trends Over the Last Six Weeks? Take a Time Capsule.

July 23, 2012

I try to set aside some time over the weekend to catch up on my reading and keep abreast of developments in the industry, and although that’s sometimes easier said than done, I stumbled across an interesting compilation of legal technology information from my friend Christy Burke and her team at Burke & Company. On Friday, Burke & Company released The Legal Technology Observer (LTO) Time Capsule on Legal IT Professionals. LTO was a six-week concentrated collection of essays, articles, surveys and blog posts providing expert practical knowledge about legal technology, eDiscovery, and social media for legal professionals.

The content has been formatted into a PDF version and is available for free download here. As noted in their press release, Burke & Company's bloggers, including Christy, Melissa DiMercurio, Ada Spahija and Taylor Gould, as well as many distinguished guest contributors, set out to examine the trends, topics and perspectives that are driving today's legal technology world for 6 weeks from June 6 to July 12. They did so with the help of many of the industry's most respected experts, and LTO acquired more than 21,000 readers in just 6 weeks. Nice job!

The LTO Time Capsule covers a wide range of topics related to legal technology. Several topics have an impact on eDiscovery, some of which included thought leaders previously interviewed on this blog (links to our previous interviews with them below), including:

The EDRM Speaks My Language: Written by – Ada Spahija, Communications Specialist at Burke and Company LLC; Featuring – Experts George Socha and Tom Gelbmann.
Learning to Speak EDRM: Written by – Ada Spahija, Communications Specialist at Burke and Company LLC; Featuring – Experts George Socha and Tom Gelbmann.
Predictive Coding: Dozens of Names, No Definition, Lots of Controversy: Written by – Sharon D. Nelson, Esq. and John W. Simek.
Social Media 101 for Law Firms – Don’t Get Left Behind: Written by – Ada Spahija, Communications Specialist at Burke and Company LLC; Featuring – Kerry Scott Boll of JustEngage.
Results of Social Media 101 Snap-Poll: Written by – Ada Spahija, Communications Specialist at Burke and Company LLC.
Getting up to Speed with eDiscovery: Written by – Taylor Gould, Communications Intern at Burke and Company LLC; Featuring – Browning Marean, Senior Counsel at DLA Piper, San Diego.
LTO Interviews Craig Ball to Examine the Power of Computer Forensics: Written by – Melissa DiMercurio, Account Executive at Burke and Company LLC; Featuring – Expert Craig Ball, Trial Lawyer and Certified Computer Forensic Examiner.
LTO Asks Bob Ambrogi How a Lawyer Can Become a Legal Technology Expert: Written by – Melissa DiMercurio, Account Executive at Burke and Company LLC; Featuring – Bob Ambrogi, Practicing Lawyer, Writer and Media Consultant.
LTO Interviews Jeff Brandt about the Mysterious Cloud Computing Craze: Written by – Taylor Gould, Communications Intern at Burke and Company LLC; Featuring – Jeff Brandt, Editor of PinHawk Law Technology Daily Digest.
Legal Technology Observer eDiscovery in America – A Legend in the Making: Written by – Christy Burke, President of Burke and Company LLC; Featuring – Barry Murphy, Analyst with the eDJ Group and Contributor to eDiscoveryJournal.com.
IT-Lex and the Sedona Conference® Provide Real Help to Learn eDiscovery and Technology Law: Written by – Christy Burke, President of Burke and Company LLC.

These are just some of the topics, particularly those that have an impact on eDiscovery. To check out the entire list of articles, click here to download the report.

So, what do you think? Do you need a quick resource to catch up on your reading? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Judge Scheindlin Says “No” to Self-Collection, “Yes” to Predictive Coding

July 20, 2012

When most people think of the horrors of Friday the 13th, they think of Jason Voorhees. When US Immigration and Customs Enforcement thinks of Friday the 13th horrors, do they think of Judge Shira Scheindlin?

As noted in Law Technology News (Judge Scheindlin Issues Strong Opinion on Custodian Self-Collection, written by Ralph Losey, a previous thought leader interviewee on this blog), New York District Judge Scheindlin issued a decision last Friday (July 13) addressing the adequacy of searching and self-collection by government entity custodians in response to Freedom of Information Act (FOIA) requests. As Losey notes, this is her fifth decision in National Day Laborer Organizing Network et al. v. United States Immigration and Customs Enforcement Agency, et al., including one that was later withdrawn.

Regarding the defendant’s question as to “why custodians could not be trusted to run effective searches of their own files, a skill that most office workers employ on a daily basis” (i.e., self-collect), Judge Scheindlin responded as follows:

“There are two answers to defendants' question. First, custodians cannot 'be trusted to run effective searches,' without providing a detailed description of those searches, because FOIA places a burden on defendants to establish that they have conducted adequate searches; FOIA permits agencies to do so by submitting affidavits that 'contain reasonable specificity of detail rather than merely conclusory statements.' Defendants' counsel recognize that, for over twenty years, courts have required that these affidavits 'set [ ] forth the search terms and the type of search performed.' But, somehow, DHS, ICE, and the FBI have not gotten the message. So it bears repetition: the government will not be able to establish the adequacy of its FOIA searches if it does not record and report the search terms that it used, how it combined them, and whether it searched the full text of documents.”

“The second answer to defendants' question has emerged from scholarship and caselaw only in recent years: most custodians cannot be 'trusted' to run effective searches because designing legally sufficient electronic searches in the discovery or FOIA contexts is not part of their daily responsibilities. Searching for an answer on Google (or Westlaw or Lexis) is very different from searching for all responsive documents in the FOIA or e-discovery context.”

“Simple keyword searching is often not enough: 'Even in the simplest case requiring a search of on-line e-mail, there is no guarantee that using keywords will always prove sufficient.' There is increasingly strong evidence that '[k]eyword search[ing] is not nearly as effective at identifying relevant information as many lawyers would like to believe.' As Judge Andrew Peck — one of this Court's experts in e-discovery — recently put it: 'In too many cases, however, the way lawyers choose keywords is the equivalent of the child's game of 'Go Fish' … keyword searches usually are not very effective.'”

Regarding search best practices and predictive coding, Judge Scheindlin noted:

“There are emerging best practices for dealing with these shortcomings and they are explained in detail elsewhere. There is a 'need for careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or keywords to be used to produce emails or other electronically stored information.' And beyond the use of keyword search, parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents.”

“Through iterative learning, these methods (known as 'computer-assisted' or 'predictive' coding) allow humans to teach computers what documents are and are not responsive to a particular FOIA or discovery request and they can significantly increase the effectiveness and efficiency of searches. In short, a review of the literature makes it abundantly clear that a court cannot simply trust the defendant agencies' unsupported assertions that their lay custodians have designed and conducted a reasonable search.”

Losey notes that “A classic analogy is that self-collection is equivalent to the fox guarding the hen house. With her latest opinion, Schiendlin [sic] includes the FBI and other agencies as foxes not to be trusted when it comes to searching their own email.”

So, what do you think? Will this become another landmark decision by Judge Scheindlin? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.

eDiscovery Trends: How Many Requests for User Information is Twitter Receiving? It’s Transparent.

July 19, 2012

As illustrated in the example we posted Tuesday, Twitter continues to receive requests from government agencies for user information (often related to litigation). How many are they receiving? Now, you can find out, simply by clicking on their new Transparency Report page to see the number of requests they have received.

Starting with the first six months of this year, Twitter’s report will be issued every six months and provides information in three areas:

Government requests received for user information;
Government requests received to withhold content; and
DMCA takedown notices received from copyright holders.

Twitter provides a table for each category. For the two government request categories, it breaks the requests down by country. In the User Information Requests table, it’s notable that, out of 849 total user information requests for the first half of 2012, 679 came from US government entities (we’re so litigious!). Twitter also provides the percentage of requests for which some or all information was produced and a count of the users/accounts specified. Here are some observations:

There were 849 total user information requests for the first half of 2012, 679 coming from US government entities. The only other countries that had more than 10 requests were: Japan (98), Canada (11) and the United Kingdom (11).
Information was produced in 63% of those requests overall and in 75% of US requests. Interestingly enough, only 20% of Japan’s 98 requests resulted in information being produced.
Those 849 requests specified 1,181 user accounts, with the 679 US requests specifying 948 of them.
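For readers who want to check the proportions behind these observations, here is a minimal sketch in Python. It uses only the counts quoted above; the variable names are illustrative and are not part of Twitter’s report.

```python
# Counts quoted above from Twitter's report for the first half of 2012.
# Variable names are hypothetical, chosen only for this illustration.
total_requests = 849
us_requests = 679
total_accounts = 1181
us_accounts = 948

# Fraction of all user information requests that came from US entities.
us_share = us_requests / total_requests

# Average number of user accounts named per request, overall and for the US.
accounts_per_request = total_accounts / total_requests
us_accounts_per_request = us_accounts / us_requests

print(f"US share of requests: {us_share:.1%}")
print(f"Accounts per request (all countries): {accounts_per_request:.2f}")
print(f"Accounts per request (US only): {us_accounts_per_request:.2f}")
```

By this arithmetic, US entities account for roughly four out of every five requests, and a typical request names somewhat more than one account.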

Twitter notes that their report is inspired by Google’s own Transparency Report (click here to see their Transparency Report page and here to see user data requests they receive from government agencies and courts for a selected six-month period, starting with July through December 2009). Early versions of Google’s report don’t show the percentage of user data requests they complied with or the number of users or accounts about which data was requested. But, it’s interesting to note that since Google began tracking requests, they have risen from more than 12,539 in July through December 2009 to more than 18,257 in July through December 2011, a 46% rise in two years. It will be interesting to see if the number of Twitter requests rises in a similar fashion. I’m betting yes.
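The 46% figure can be verified with a quick calculation. The sketch below assumes the two Google counts quoted above are exact values (the report describes them as lower bounds), and `percent_change` is a hypothetical helper written for this illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0

# Google user data request counts quoted above (treated here as exact values).
google_jul_dec_2009 = 12_539
google_jul_dec_2011 = 18_257

rise = percent_change(google_jul_dec_2009, google_jul_dec_2011)
print(f"Rise in requests: {rise:.1f}%")
```

The result rounds to the 46% cited above.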

Of course, there’s a protocol to follow if you’re a government entity or law enforcement organization requesting private information from Twitter, as we detailed back in April.

So, what do you think? Is this useful information? Would you have expected more or fewer information requests to Twitter? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.
