
eDiscovery Best Practices: The Number of Pages in Each Gigabyte Can Vary Widely

 

A while back, we talked about how the average gigabyte contains approximately 50,000 to 75,000 pages and how each gigabyte effectively culled out can save $18,750 in review costs.  But did you know just how widely the number of pages per gigabyte can vary?

The “how many pages” question comes up a lot and I’ve seen a variety of answers.  Michael Recker of Applied Discovery posted an article to their blog last week titled Just How Big Is a Gigabyte?, which provides some perspective based on the types of files contained within the gigabyte, as follows:

“For example, e-mail files typically average 100,099 pages per gigabyte, while Microsoft Word files typically average 64,782 pages per gigabyte. Text files, on average, consist of a whopping 677,963 pages per gigabyte. At the opposite end of the spectrum, the average gigabyte of images contains 15,477 pages; the average gigabyte of PowerPoint slides typically includes 17,552 pages.”

Of course, a gigabyte of data rarely consists of just one type of file.  Many emails include attachments, which can be in any number of different file formats.  Collections of files from hard drives may include Word, Excel, PowerPoint, Adobe PDF and other file formats.  So, estimating page counts with any degree of precision is difficult.

In fact, the exact same content ported into different applications can result in a different file size in each, due to the overhead each application adds.  To illustrate this, I conducted a little (admittedly unscientific) study using yesterday’s one-page blog post about the Apple/Samsung litigation, putting the content from that page into several different file formats to show how much the size can vary, even when the content is essentially the same.  Here are the results:

  • Text File Format (TXT): Created by performing a “Save As” on the web page for the blog post to text – 10 KB;
  • HyperText Markup Language (HTML): Created by performing a “Save As” on the web page for the blog post to HTML – 36 KB, over 3.5 times larger than the text file;
  • Microsoft Excel 2010 Format (XLSX): Created by copying the contents of the blog post and pasting it into a blank Excel workbook – 128 KB, nearly 13 times larger than the text file;
  • Microsoft Word 2010 Format (DOCX): Created by copying the contents of the blog post and pasting it into a blank Word document – 162 KB, over 16 times larger than the text file;
  • Adobe PDF Format (PDF): Created by printing the blog post to PDF file using the CutePDF printer driver – 211 KB, over 21 times larger than the text file;
  • Microsoft Outlook 2010 Message Format (MSG): Created by copying the contents of the blog post and pasting it into a blank Outlook message, then sending that message to myself, then saving the message out to my hard drive – 221 KB, over 22 times larger than the text file.

The Outlook example was probably the least representative of a typical email – most emails don’t have several embedded graphics in them (with the exception of signature logos) – and most are typically much shorter than yesterday’s blog post (which also included the side text on the page, as I copied that too).  Still, the example hopefully illustrates that a “page”, even with the exact same content, will be a different size in different applications.  As a result, to estimate the number of pages in a collection with any degree of accuracy, it’s important to understand not only the size of the data collection, but also its makeup.
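For readers who want to turn those file-type averages into a working estimate, here’s a minimal sketch of a weighted page-count calculation.  It assumes the Applied Discovery pages-per-gigabyte averages quoted above and a purely hypothetical collection makeup; it illustrates the arithmetic, not a substitute for examining your own collection.

```python
# Rough page-count estimate for a mixed collection, weighted by file type.
# Pages-per-gigabyte averages are the Applied Discovery figures quoted above;
# the collection makeup below is purely hypothetical.

PAGES_PER_GB = {
    "email": 100_099,
    "word": 64_782,
    "text": 677_963,
    "image": 15_477,
    "powerpoint": 17_552,
}

# Hypothetical makeup of a 10 GB collection (gigabytes per file type).
collection_gb = {
    "email": 6.0,
    "word": 2.0,
    "powerpoint": 1.5,
    "image": 0.5,
}

estimated_pages = sum(
    gb * PAGES_PER_GB[file_type] for file_type, gb in collection_gb.items()
)

print(f"Estimated pages: {estimated_pages:,.0f}")
# The result depends heavily on the mix -- shift a few gigabytes between
# email and images and the estimate moves by hundreds of thousands of pages.
```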

So, what do you think?  Was this example useful or highly flawed?  Or both?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: eDiscovery Work is Growing in Law Firms and Corporations

 

There was an article in Law Technology News last Friday (Survey Shows Surge in E-Discovery Work at Law Firms and Corporations, written by Monica Bay) that discussed the findings of a survey released by The Cowen Group, indicating that eDiscovery work in law firms and corporations is growing considerably.  Eighty-eight law firm and corporate law department professionals responded to the survey.

Some of the key findings:

  • 70 percent of law firm respondents reported an increase in workload for their litigation support and eDiscovery departments (compared to 42 percent in the second quarter of 2009);
  • 77 percent of corporate law department respondents reported an increase in workload for their litigation support and eDiscovery departments;
  • 60 percent of respondents anticipate increasing their internal capabilities for eDiscovery;
  • 55 percent of corporate and 62 percent of firm respondents said they "anticipate outsourcing a significant amount of eDiscovery to third-party providers” (some organizations expect to both increase internal capabilities and outsource);
  • 50 percent of the firms believe they will increase technology spending in the next three months (compared to 31 percent of firms in 2010);
  • 43 percent of firms plan to add people to their litigation support and eDiscovery staff in the next 3 months, compared to 32 percent in 2011;
  • Noting that “corporate legal departments are under increasing pressure to ‘do more with less in-house to keep external costs down’”, the survey found that only 12 percent of corporate respondents anticipate increasing headcount and 30 percent will increase their technology spend in the next six months;
  • In the past year, 49 percent of law firms and 23 percent of corporations have used Technology Assisted Review/Predictive Coding technology through a third-party service provider – an additional 38 percent have considered using it;
  • As for TAR/Predictive Coding in-house, 30 percent of firms have an in-house tool, and an additional 35 percent are considering making the investment.

As managing partner David Cowen notes, “Cases such as Da Silva Moore, Kleen, and Global Aerospace, which have hit our collective consciousness in the past three months, affect the investments in technology that both law firms and corporations are making.”  He concludes the Executive Summary of the report with this advice: “Educate yourself on the latest evolving industry trends, invest in relationships, and be an active participant in helping your executives, your department, and your clients ‘do more with less’.”

So, what do you think?  Do any of those numbers and trends surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: The Da Silva Moore Case Has Class (Certification, That Is)

 

As noted in an article written by Mark Hamblett in Law Technology News, Judge Andrew Carter of the U.S. District Court for the Southern District of New York has granted conditional class certification in the Da Silva Moore v. Publicis Groupe & MSL Group case.

In this case, women employees of the advertising conglomerate Publicis Groupe and its U.S. subsidiary, MSL, have accused their employer of company-wide discrimination, pregnancy discrimination, and a practice of keeping women at entry-level positions with few opportunities for promotion.

Judge Carter concluded that “Plaintiffs have met their burden by making a modest factual showing to demonstrate that they and potential plaintiffs together were victims of a common policy or plan that violated the law. They submit sufficient information that because of a common pay scale, they were paid wages lower than the wages paid to men for the performance of substantially equal work. The information also reveals that Plaintiffs had similar responsibilities as other professionals with the same title. Defendants may disagree with Plaintiffs' contentions, but the Court cannot hold Plaintiffs to a higher standard simply because it is an EPA action rather [than] an action brought under the FLSA.”

“Courts have conditionally certified classes where the plaintiffs have different job functions,” Judge Carter noted, indicating that “[p]laintiffs have to make a mere showing that they are similarly situated to themselves and the potential opt-in members and Plaintiffs here have accomplished their goal.”

This is just the latest development in this test case for the use of computer-assisted coding to search electronic documents for responsive discovery. On February 24, Magistrate Judge Andrew J. Peck of the U.S. District Court for the Southern District of New York issued an opinion approving the use of computer-assisted review of electronically stored information (“ESI”), likely making this the first case to accept its use.  However, on March 13, District Court Judge Andrew L. Carter, Jr. granted plaintiffs’ request to submit additional briefing on their February 22 objections to the ruling.  In that briefing (filed on March 26), the plaintiffs claimed that the protocol approved for predictive coding “risks failing to capture a staggering 65% of the relevant documents in this case” and questioned Judge Peck’s relationship with defense counsel and with the selected vendor for the case, Recommind.

Then, on April 5, Judge Peck issued an order in response to Plaintiffs’ letter requesting his recusal, directing plaintiffs to indicate whether they would file a formal motion for recusal or ask the Court to consider the letter as the motion.  On April 13, (Friday the 13th, that is), the plaintiffs did just that, by formally requesting the recusal of Judge Peck (the defendants issued a response in opposition on April 30).  But, on April 25, Judge Carter issued an opinion and order in the case, upholding Judge Peck’s opinion approving computer-assisted review.

Not done, the plaintiffs filed an objection on May 9 to Judge Peck's rejection of their request to stay discovery pending the resolution of outstanding motions and objections (including the recusal motion, which at that point had yet to be ruled on).  Then, on May 14, Judge Peck issued a stay, stopping defendant MSLGroup's production of electronically stored information.  Finally, on June 15, Judge Peck, in a 56-page opinion and order, denied the plaintiffs’ motion for recusal.

So, what do you think?  What will happen in this case next?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Best Practices: When Litigation Hits, The First 7 to 10 Days is Critical

When a case is filed, several activities must be completed within a short period of time (often within the first seven to ten days after filing) to enable you to assess the scope of the case, determine where the key electronically stored information (ESI) is located, and decide whether to proceed with the case or attempt to settle with opposing counsel.  Here are several of the key early activities that can assist in deciding whether to litigate or settle the case.

Activities:

  • Create List of Key Employees Most Likely to have Documents Relevant to the Litigation: To estimate the scope of the case, it’s important to begin preparing the list of key employees who may have potentially responsive data.  Information such as name, title, eMail address, phone number, office location and where each employee’s information is stored on the network makes it possible to proceed quickly when issuing hold notices and collecting their data.
  • Issue Litigation Hold Notice and Track Results: The duty to preserve begins when you anticipate litigation; however, if litigation could not be anticipated prior to the filing of the case, it is certainly clear once the case is filed that the duty to preserve has begun.  Hold notices must be issued ASAP to all parties that may have potentially responsive data.  Once the hold is issued, you need to track and follow up to ensure compliance.  Here are a couple of recent posts regarding issuing hold notices and tracking responses.
  • Interview Key Employees: As quickly as possible, interview key employees to identify potential locations of responsive data in their possession as well as other individuals they can identify that may also have responsive data so that those individuals can receive the hold notice and be interviewed.
  • Interview Key Department Representatives: Certain departments, such as IT, Records or Human Resources, may have specific data responsive to the case.  They may also have certain processes in place for regular destruction of “expired” data, so it’s important to interview them to identify potentially responsive sources of data and stop routine destruction of data subject to litigation hold.
  • Inventory Sources and Volume of Potentially Relevant Documents: Potentially responsive data can be located in a variety of sources, including: shared servers, eMail servers, employee workstations, employee home computers, employee mobile devices, portable storage media (including CDs, DVDs and portable hard drives), active paper files, archived paper files and third-party sources (consultants and contractors, including cloud storage providers).  Hopefully, the organization has already created a data map before litigation to identify the location of sources of information, which facilitates that process.  It’s important to get a high-level sense of the total population to begin to estimate the effort required for discovery.
  • Plan Data Collection Methodology: Determining how each source of data is to be collected also affects the cost of the litigation.  Are you using internal resources, outside counsel or a litigation support vendor?  Will the data be collected via an automated collection system or manually?  Will employees “self-collect” any of their own data?  Answers to these questions will impact the scope and cost of not only the collection effort, but the entire discovery effort.

These activities can result in a data map of potentially responsive information and a “probable cost of discovery” spreadsheet (based on the initial estimated scope compared to past cases at the same stage) that will help in determining whether to proceed to litigate the case or attempt to settle with the other side.
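To illustrate how the data map can feed a “probable cost of discovery” estimate, here’s a minimal sketch.  The per-gigabyte review figure is the rough $18,750 estimate referenced in earlier posts; the source volumes and culling rate are entirely hypothetical and would come from your own data map and past matters.

```python
# Back-of-the-envelope "probable cost of discovery" estimate built from an
# early data map.  All source volumes and the culling rate below are
# hypothetical; the per-gigabyte review cost is the rough $18,750 figure
# this blog has used in earlier posts, not a quote for any actual case.

REVIEW_COST_PER_GB = 18_750

# Hypothetical data map: potentially responsive volume (GB) by source.
data_map_gb = {
    "eMail servers": 40,
    "shared servers": 25,
    "employee workstations": 15,
    "portable media": 5,
}

culling_rate = 0.60  # assume 60% culled via deduplication and filtering

total_gb = sum(data_map_gb.values())
reviewable_gb = total_gb * (1 - culling_rate)
estimated_review_cost = reviewable_gb * REVIEW_COST_PER_GB

print(f"Collected volume:  {total_gb} GB")
print(f"Volume for review: {reviewable_gb:.0f} GB")
print(f"Estimated review:  ${estimated_review_cost:,.0f}")
```

Comparing that estimate against past cases at the same stage is what turns the data map into a litigate-or-settle decision aid.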

So, what do you think?  How quickly do you decide whether to litigate or settle?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Costs, Outside Counsel and Vendor Performance Chief Among GC Concerns

 

eDiscovery Solutions Group (eDSG) recently conducted a survey of Global 250 General Counsel on various aspects of their eDiscovery processes and concerns regarding eDiscovery.  The results were summed up in a post on the blog The eDiscovery Paradigm Shift, written by Charles Skamser.  With a little over half of the organizations responding (127 out of 250, or 51%), the post noted some interesting findings with regard to how organizations handle various eDiscovery tasks and their concerns regarding the process overall.

eDiscovery Services

According to the survey, organizations are (not surprisingly) still highly dependent on outside counsel for eDiscovery services, with over half of the organizations (51%) relying on them for eDiscovery collections and Early Case Assessment (ECA) services and 43% relying on them for document review services.  Organizations rely on third party forensics groups 35% of the time for eDiscovery collections and rely on Legal Process Outsource (LPO) providers 29% of the time for ECA services and 43% of the time for document review services.  Organizations handle ECA internally 20% of the time and handle collection and review 13% of the time each.

The author notes surprise that 51% of the respondents identified outside counsel for their ECA and wonders if there was confusion among respondents about the term “LPO” and whether it applied to litigation service providers.  The term “ECA” might have been confusing as well – to many in the legal profession it means estimating risk (in terms of time and cost to proceed with the case instead of settling) rather than analysis of the data.

Frustrations and Pet Peeves

eDSG also asked the respondents about their top frustrations and top pet peeves over the past 12 months (respondents could select more than one in each category).  Top frustrations were “Cost of eDiscovery not declining as rapidly as expected” (95%) and “Increase in the Amount of ESI” (90%).  Also notable are the respondents that are frustrated with “Dealing with eDiscovery Software Vendors” (80%) and “Outside Counsel Not Providing Adequate Support for eDiscovery Requirements” (75%).  Sounds like most of the respondents have multiple frustrations!

Top pet peeves were “Outside Counsel and LPOs Knowingly Low Balling Cost Estimates” (80%) and “eDiscovery Cost Overruns”, “LPOs dropping the ball on eDiscovery Projects” and “Anyone that states that litigation is now all about technology” (all at 75%).  Also, 65% of respondents find eDiscovery vendor sales people “annoying”.  🙂

Concerns

With regard to the next 12 months, eDSG asked the respondents about their top concerns going forward (again, respondents could select more than one).  Top concerns were “Managing the Cost of eDiscovery” (a perfect 100%) and “Collaboration between internal stakeholders” (91%).  Other concerns included “Education and Training of Staff” (79%) and “Understanding the Impact of Social Media” (75%).

Summary

A link to the blog post with more information and survey results is available here.  Based on the responses, most organizations outsource their eDiscovery activities to either outside counsel or litigation support vendors; yet, many of them don’t appear to be happy with the results their outsource providers are giving them.  It sounds like there’s lots of room for improvement.  The cost of eDiscovery appears to be both the biggest frustration and the biggest concern of in-house counsel personnel going forward.

So, what do you think?  Did any of these survey results surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Case Law: Judge Peck Denies Recusal Motion in Da Silva Moore

 

It’s been a few weeks since we heard anything from the Da Silva Moore case.  If you’ve been living under a rock the past few months, Magistrate Judge Andrew J. Peck of the U.S. District Court for the Southern District of New York issued an opinion in this case in February making it one of the first cases to accept the use of computer-assisted review of electronically stored information (“ESI”).  However, the plaintiffs objected to the ruling, questioned Judge Peck’s relationship with defense counsel and with the selected vendor for the case, Recommind, and ultimately formally requested the recusal of Judge Peck.  For links to all of the recent events in the case that we’ve covered, click here.

Last Friday, in a 56-page opinion and order, Judge Peck denied the plaintiffs’ motion for recusal.  The opinion and order reviewed the past several contentious months and rejected the plaintiffs’ arguments for recusal in the following areas:

Participation in conferences discussing the use of predictive coding:

“I only spoke generally about computer-assisted review in comparison to other search techniques…The fact that my interest in and knowledge about predictive coding in general overlaps with issues in this case is not a basis for recusal.”

“To the extent plaintiffs are complaining about my general discussion at these CLE presentations about the use of predictive coding in general, those comments would not cause a reasonable objective observer to believe I was biased in this case. I did not say anything about predictive coding at these LegalTech and other CLE panels that I had not already said in my Search, Forward article, i.e., that lawyers should consider using predictive coding in appropriate cases. My position was the same as plaintiffs’ consultant . . . . Both plaintiffs and defendants were proposing using predictive coding in this case.  I did not determine which party’s predictive coding protocol was appropriate in this case until the February 8, 2012 conference, after the panels about which plaintiffs complain.”

“There are probably fewer than a dozen federal judges nationally who regularly speak at ediscovery conferences. Plaintiffs' argument that a judge's public support for computer-assisted review is a recusable offense would preclude judges who know the most about ediscovery in general (and computer-assisted review in particular) from presiding over any case where the use of predictive coding was an option, or would preclude those judges from speaking at CLE programs. Plaintiffs' position also would discourage lawyers from participating in CLE programs with judges about ediscovery issues, for fear of subsequent motions to recuse the judge (or disqualify counsel).”

Relationship with defense counsel Ralph Losey:

“While I participated on two panels with defense counsel Losey, we never had any ex parte communication regarding this lawsuit. My preparation for and participation in ediscovery panels involved only ediscovery generally and the general subject of computer-assisted review. Losey's affidavit makes clear that we have never spoken about this case, and I confirm that. During the panel discussions (and preparation sessions), there was absolutely no discussion of the details of the predictive coding protocol involved in this case or with regard to what a predictive coding protocol should look like in any case. Plaintiffs' assertion that speaking on an educational panel with counsel creates an appearance of impropriety is undermined by Canon 4 of the Judicial Code of Conduct, which encourages judges to participate in such activities.”

Relationship with Recommind, the selected vendor in the case:

“The panels in which I participated are distinguishable. First, I was a speaker at educational conferences, not an audience member. Second, the conferences were not one-sided, but concerned ediscovery issues including search methods in general. Third, while Recommind was one of thirty-nine sponsors and one of 186 exhibitors contributing to LegalTech's revenue, I had no part in approving the sponsors or exhibitors (i.e., funding for LegalTech) and received no expense reimbursement or teaching fees from Recommind or LegalTech, as opposed to those companies that sponsored the panels on which I spoke. Fourth, there was no "pre-screening" of MSL's case or ediscovery protocol; the panel discussions only covered the subject of computer-assisted review in general.”

Perhaps it is no surprise that Judge Peck denied the recusal motion.  Now, the question is: will District Court Judge Andrew L. Carter, Jr. weigh in?

So, what do you think?  Should Judge Peck recuse himself in this case or does he provide an effective argument that recusal is unwarranted?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

State eDiscovery Rules: Pennsylvania Supreme Court Amends eDiscovery Rules, Rejects Federal Rules

 

Last week, the Pennsylvania Supreme Court adopted amendments to the rules on how discovery of electronically stored information is handled in the state.  However, the chairwoman of Pennsylvania’s Civil Procedural Rules Committee, Diane W. Perer, expressly rejected federal law on the subject in her explanatory comment, stating that, despite the adoption of the term “electronically stored information,” “there is no intent to incorporate federal jurisprudence surrounding the discovery of electronically stored information.”  Instead, “[t]he treatment of such issues is to be determined by traditional principles of proportionality under Pennsylvania law”.

The explanatory comment also discusses the “Proportionality Standard” and its application to electronic discovery, as well as “Tools for Addressing Electronically Stored Information”.  When it comes to proportionality, Pennsylvania courts are required to consider:

“(i) the nature and scope of the litigation, including the importance and complexity of the issues and the amounts at stake;

(ii) the relevance of electronically stored information and its importance to the court’s adjudication in the given case;

(iii) the cost, burden, and delay that may be imposed on the parties to deal with electronically stored information;

(iv) the ease of producing electronically stored information and whether substantially similar information is available with less burden; and

(v) any other factors relevant under the circumstances.”

When it comes to tools for addressing ESI, the comment stated that "[p]arties and courts may consider tools such as electronic searching, sampling, cost sharing and non-waiver agreements to fairly allocate discovery burdens and costs. When using non-waiver agreements, parties may wish to incorporate those agreements into court orders to maximize protection vis-à-vis third parties."

The amendments affect rules 4009.1, 4009.11, 4009.12, 4009.21, 4009.23, and 4011.  For example, in Rule 4009.1, the court added the phrase "electronically stored information" to the "production of documents and things" a party may request. It also added a subsection that a party requesting ESI "may specify the format in which it is to be produced and a responding party or person not a party may object."  If no format is requested, the rule states the ESI can be produced in the form in which it is typically maintained.

In some cases, the amendments affect only the notes, not the substance of the rule itself.  For example, in a note to Rule 4009.11 regarding the request for production of documents and things, the court said a request for ESI should be "as specific as possible."

So, what do you think?  Was it necessary for Pennsylvania to distance itself from the Federal rules, or was it a good idea?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Where Does the Money Go? RAND Provides Some Answers

 

The RAND Corporation, a nonprofit research and analysis institution, recently published a new 159-page report on understanding eDiscovery costs entitled Where the Money Goes: Understanding Litigant Expenditures for Producing Electronic Discovery, by Nicholas M. Pace and Laura Zakaras, that has some interesting findings and recommendations.  To obtain either a paperback copy or download a free eBook of the report, click here.

For the study, the authors requested case-study data from eight Fortune 200 companies and obtained data for 57 large-volume eDiscovery productions (from both traditional lawsuits and regulatory investigations) as well as information from extensive interviews with key legal personnel from the participating companies.  Here are some of the key findings from the research:

  • Review Makes Up the Largest Percentage of eDiscovery Production Costs: By a whopping margin, the major cost component in their cases was the review of documents for relevance, responsiveness, and privilege (typically about 73 percent). Collection, on the other hand, constituted only about 8 percent of expenditures for the cases in the study, while processing costs constituted about 19 percent.  It costs about $14,000 to review each gigabyte and about $20,000 in total production costs for each gigabyte (click here for a previous study on per gigabyte costs).  Review costs would have to be reduced by about 75% to make them comparable to processing, the next highest component (a rough per-gigabyte breakdown of these figures appears in the sketch after this list).
  • Outside Counsel Makes Up the Largest Percentage of eDiscovery Expenditures: Again, by a whopping amount, the major cost component was expenditures for outside counsel services, which constituted about 70 percent of total eDiscovery production costs.  Vendor expenditures were around 26 percent.  Internal expenditures, even with adjustments made for underreporting, were generally around 4 percent of the total.  So, almost all eDiscovery expenditures are outsourced in one way or another.
  • If Conducted in the Traditional Manner, Review Costs Are Difficult to Reduce Significantly: Rates currently paid to “project attorneys during large-scale reviews in the US may well have bottomed out” and foreign review teams are often not a viable option due to “issues related to information security, oversight, maintaining attorney-client privilege, and logistics”.  Increasing the rate of review is also limited as, “[g]iven the trade-off between reading speed and comprehension…it is unrealistic to expect much room for improvement in the rates of unassisted human review”.  The study also notes that techniques for grouping documents, such as near-duplicate detection and clustering, while helpful, are “not the answer”.
  • Computer-Categorized Document Review Techniques May Be a Solution: Techniques such as predictive coding have the potential to reduce review hours by about 75% with about the same level of consistency, resulting in review costs of less than $2,000 and total production costs of less than $7,000.  However, a “lack of clear signals from the bench” that the techniques are defensible, and a lack of confidence by litigants that the techniques can reliably identify the majority of responsive and privileged documents, are barriers to wide-scale adoption.
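As a rough sanity check on how the per-gigabyte figures relate to the cost-component percentages reported above, here’s a back-of-the-envelope sketch.  It uses only the numbers quoted from the report; the arithmetic is mine, not RAND’s.

```python
# Approximate per-gigabyte cost breakdown implied by the figures above:
# roughly $20,000 total production cost per gigabyte, split ~73% review,
# ~19% processing, ~8% collection.  Back-of-the-envelope arithmetic only,
# not a calculation taken from the report itself.

TOTAL_COST_PER_GB = 20_000
component_share = {"review": 0.73, "processing": 0.19, "collection": 0.08}

for component, share in component_share.items():
    print(f"{component:>10}: ${TOTAL_COST_PER_GB * share:>8,.0f} per GB")

# Review at ~$14,600/GB lines up with the ~$14,000 review figure above, and
# cutting review hours by ~75% would bring the review component down toward
# the processing component -- which is the report's point about
# computer-categorized review techniques.
```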

Not surprisingly, the recommendations included taking “the bold step of using, publicly and transparently, computer-categorized document review techniques” for large-scale eDiscovery efforts.

So, what do you think?  Are you surprised by the cost numbers?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Best Practices: Test Your Searches Before the Meet and Confer

 

One of the very first posts ever on this blog discussed the danger of using wildcards.  For those who haven’t been following the blog from the beginning, here’s a recap.

A couple of years ago, I provided search strategy assistance to a client that had already agreed upon several searches with opposing counsel.  One search related to mining activities, so the attorney decided to use a wildcard of “min*” to retrieve variations like “mine”, “mines” and “mining”.

That one search retrieved over 300,000 files with hits.

Why?  Because there are 269 words in the English language that begin with the letters “min”.  Words like “mink”, “mind”, “mint” and “minion” were all being retrieved in this search for files related to “mining”.  We ultimately had to go back to opposing counsel and attempt to negotiate a revised search that was more appropriate.

What made that process difficult was the negotiation with opposing counsel.  My client had already agreed on over 200 terms with opposing counsel and had proposed many of those terms, including this one.  The attorneys had prepared these terms without assistance from a technology consultant (I was brought into the project after the terms were negotiated and agreed upon) and without testing any of the terms.

Since they had been agreed upon, opposing counsel was understandably resistant to modifying the terms.  The fact that my client faced having to review all of these files was not their problem.  We were ultimately able to provide a clear indication that many of the words retrieved by this search were non-responsive and were able to get opposing counsel to agree to a modified list of variations of “mine” that included “minable”, “mine”, “mineable”, “mined”, “minefield”, “minefields”, “miner”, “miners”, “mines”, “mining” and “minings”.  We were able to sort through the “minutia” and “minimize” the result set to less than 12,000 files with hits, saving our client a “mint”, which they certainly didn’t “mind”.  OK, I’ll stop now.

However, there were several other inefficient terms that opposing counsel refused to renegotiate and my client was forced to review thousands of additional files that they shouldn’t have had to review, which was a real “mindblower” (sorry, I couldn’t resist).  Had the client included a technical member on the team and had they tested each of these searches before negotiating terms with opposing counsel, they would have been able to figure out which terms were overbroad and would have been better prepared to negotiate favorable search terms for retrieving potentially responsive data.

When litigation is anticipated, it’s never too early to begin collecting potentially responsive data and assessing it by performing searches and testing the results.  However, if you wait until after the meet and confer with opposing counsel, it can be too late.
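One simple way to vet a wildcard before it goes on an agreed-upon term list is to expand it against a word list and see how much unrelated vocabulary it sweeps in.  Here’s a minimal sketch; the dictionary path is an assumption (a word list commonly ships at /usr/share/dict/words on Unix-like systems) and the “intended” variations mirror the “mine” example above.

```python
# Sketch: expand a proposed wildcard against a word list to see how many
# unrelated words it would match before agreeing to it with opposing counsel.
# The dictionary path is an assumption (common on Unix-like systems); the
# intended variations mirror the "mine" example discussed above.

import re

WORDLIST = "/usr/share/dict/words"                 # assumed location; adjust as needed
wildcard = re.compile(r"^min\w*$", re.IGNORECASE)  # the "min*" wildcard

intended = {
    "minable", "mine", "mineable", "mined", "minefield", "minefields",
    "miner", "miners", "mines", "mining", "minings",
}

with open(WORDLIST) as f:
    matches = {w.strip().lower() for w in f if wildcard.match(w.strip())}

unintended = matches - intended
print(f"Words matched by 'min*': {len(matches)}")
print(f"Unintended matches:      {len(unintended)}")
print("Examples:", sorted(unintended)[:10])  # e.g., mind, mink, mint, minion
```

Running a check like this against your own collected data (not just a dictionary) gives an even better picture, since hit counts on real documents are what drive review costs.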

So, what do you think?  What steps do you take to assess your data before negotiating search terms?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: For Da Silva Moore Addicts

 

I am getting prepared to head for sunny Los Angeles for LegalTech West Coast shortly, so today I’m getting by with a little help from my friends.  Tomorrow and Wednesday, I’ll be covering the show.  It wouldn’t be a week in eDiscovery without some tidbits about the Da Silva Moore case, so here are some other sources of information and perspectives about the eDiscovery case of the year (so far).  But, first, let’s recap.

Several weeks ago, in Da Silva Moore v. Publicis Groupe & MSL Group, No. 11 Civ. 1279 (ALC) (AJP) (S.D.N.Y. Feb. 24, 2012), Magistrate Judge Andrew J. Peck of the U.S. District Court for the Southern District of New York issued an opinion approving the use of computer-assisted review of electronically stored information (“ESI”), likely making this the first case to accept its use.  However, on March 13, District Court Judge Andrew L. Carter, Jr. granted plaintiffs’ request to submit additional briefing on their February 22 objections to the ruling.  In that briefing (filed on March 26), the plaintiffs claimed that the protocol approved for predictive coding “risks failing to capture a staggering 65% of the relevant documents in this case” and questioned Judge Peck’s relationship with defense counsel and with the selected vendor for the case, Recommind.

Then, on April 5, Judge Peck issued an order in response to Plaintiffs’ letter requesting his recusal, directing plaintiffs to indicate whether they would file a formal motion for recusal or ask the Court to consider the letter as the motion.  On April 13, (Friday the 13th, that is), the plaintiffs did just that, by formally requesting the recusal of Judge Peck (the defendants issued a response in opposition on April 30).  But, on April 25, Judge Carter issued an opinion and order in the case, upholding Judge Peck’s opinion approving computer-assisted review.

Not done, the plaintiffs filed an objection on May 9 to Judge Peck's rejection of their request to stay discovery pending the resolution of outstanding motions and objections (including the recusal motion, which has yet to be ruled on).  Then, last Monday, Judge Peck issued a stay, stopping defendant MSLGroup's production of electronically stored information.

More News

And, there’s even more news.  As Sean Doherty of Law Technology News reports, last Monday, Judge Peck denied an amicus curiae (i.e., friend-of-the-court) brief filed in support of the plaintiffs' motion for recusal.  For more on the filing and Judge Peck’s denial of the motion, click here.

Summary of Filings

Rob Robinson of ComplexD has provided a thorough summary of filings in a single PDF file.  He provides a listing of the filings, a Scribd plug-in viewer of the file – all 1,320 pages(!), so be patient as the page takes a little time to load – and a link to download the PDF file.  The ability to search through the entire case of filings for key issues and terms is well worth it.  Thanks, Rob!

Da Silva Moore and the Role of ACEDS

Also, Sharon Nelson of the Ride The Lightning blog (and a previous thought leader interviewee on this blog) has provided a very detailed blog post regarding the in-depth investigation that the Association of Certified E-Discovery Specialists® (ACEDS™) has conducted on the case, including requesting financial disclosures for Judge Peck for 2008, 2009, 2010 and 2011 (for items including “honoraria” and “teaching fees”).  She wonders why “a certification body would want to be so heavily involved in an investigation of a judge in a very controversial case” and offers some possible thoughts as to why.  A very interesting read!

So, what do you think?  Are you “maxed out” on Da Silva Moore coverage yet?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.