Analysis

eDiscovery Daily is Two Years Old Today!

 

It’s hard to believe that it was two years ago today that we launched the eDiscoveryDaily blog.  Now that we’ve hit the “terrible twos”, is the blog going to start going off on rants about various eDiscovery topics, like Will McAvoy in The Newsroom?  Maybe.  Or maybe not.  Wouldn’t that be fun?

As we noted when recently acknowledging our 500th post, we have seen traffic on our site (from our first three months of existence to our most recent three months) grow an amazing 442%!  Our subscriber base has nearly doubled in the last year alone!  We now have nearly seven times the visitors to the site as we did when we first started.  We continue to appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful eDiscovery news and analysis.  That’s what this blog is all about.  And, in each post, we like to ask you to “please share any comments you might have or if you’d like to know more about a particular topic”, so we encourage you to do so to make this blog even more useful.

We also want to thank the blogs and publications that have linked to our posts and raised our public awareness, including Pinhawk, The Electronic Discovery Reading Room, Unfiltered Orange, Litigation Support Blog.com, Litigation Support Technology & News, Ride the Lightning, InfoGovernance Engagement Area, Learn About E-Discovery, Alltop, Law.com, Justia Blawg Search, Atkinson-Baker (depo.com), ABA Journal, Complex Discovery, Next Generation eDiscovery Law & Tech Blog and any other publication that has picked up at least one of our posts for reference (sorry if I missed any!).  We really appreciate it!

We like to take a look back every six months at some of the important stories and topics during that time.  So, here are some posts over the last six months you may have missed.  Enjoy!

We talked about best practices for issuing litigation holds and how issuing the litigation hold is just the beginning.

By the way, did you know that if you deleted a photo on Facebook three years ago, it may still be online?

We discussed states (Delaware, Pennsylvania and Florida) that have implemented new rules for eDiscovery in the past few months.

We talked about how to achieve success as a non-attorney in a law firm, providing quality eDiscovery services to your internal “clients” and how to be an eDiscovery consultant, and not just an order taker, for your clients.

We warned you that stop words can stop your searches from being effective, talked about how important it is to test your searches before the meet and confer and discussed the importance of the first 7 to 10 days once litigation hits in addressing eDiscovery issues.

We told you that, sometimes, you may need to collect from custodians that aren’t there, differentiated between quality assurance and quality control and discussed the importance of making sure that file counts add up to what was collected (with an example, no less).

By the way, did you know the number of pages in a gigabyte can vary widely and the same exact content in different file formats can vary by as much as 16 to 20 times in size?

We provided a book review on Zubulake’s e-Discovery and then interviewed the author, Laura Zubulake, as well.

BTW, eDiscovery Daily has had 150 posts related to eDiscovery Case Law since the blog began.  Fifty of them have been in the last six months.

P.S. – We still haven’t missed a single business day without a post.  Yes, we are crazy.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Best Practices: Quality Control, Making Sure the Numbers Add Up

 

Yesterday, we wrote about tracking file counts from collection to production, the concept of expanded file counts, and the categorization of files during processing.  Today, let’s walk through a scenario to show how the files collected are accounted for during the discovery process.

Tracking the Counts after Processing

We discussed the typical categories of excluded files after processing – obviously, what’s not excluded is available for searching and review.  Even if your approach includes a technology assisted review (TAR) methodology such as predictive coding, it’s still likely that you will want to do some culling out of files that are clearly non-responsive.

Documents during review may be classified in a number of ways, but the most common classification is whether they are responsive, non-responsive, or privileged.  Privileged documents are also typically classified as responsive or non-responsive, so that only the responsive documents that are privileged need be identified on a privilege log.  Responsive documents that are not privileged are then produced to opposing counsel.

Example of File Count Tracking

So, now that we’ve discussed the various categories for tracking files from collection to production, let’s walk through a fairly simple eMail-based example.  Assume we conduct a targeted collection of a PST file from each of seven custodians in a given case.  The relevant time period for the case is January 1, 2010 through December 31, 2011.  Other than date range, we plan to do no other filtering of files during processing.  Duplicates will not be reviewed or produced.  We’re going to provide an exception log to opposing counsel for any file that cannot be processed and a privilege log for any responsive files that are privileged.  Here’s what this collection might look like:

  • Collected Files: 101,852 – After expansion, 7 PST files expand to 101,852 eMails and attachments.
  • Filtered Files: 23,564 – Filtering eMails outside of the relevant date range eliminates 23,564 files.
  • Remaining Files after Filtering: 78,288 – After filtering, there are 78,288 files to be processed.
  • NIST/System Files: 0 – eMail collections typically don’t have NIST or system files, so we’ll assume zero files here.  Collections with loose electronic documents from hard drives typically contain some NIST and system files.
  • Exception Files: 912 – Let’s assume that a little over 1% of the collection (912 files) consists of exception files, like password-protected, corrupted or empty files.
  • Duplicate Files: 24,215 – It’s fairly common for approximately 30% of the collection to include duplicates, so we’ll assume 24,215 files here.
  • Remaining Files after Processing: 53,161 – We have 53,161 files left after subtracting NIST/System, Exception and Duplicate files from the total files after filtering.
  • Files Culled During Searching: 35,618 – If we assume that we are able to cull out 67% (approximately 2/3 of the collection) as clearly non-responsive, we are able to cull out 35,618 files.
  • Remaining Files for Review: 17,543 – After culling, we have 17,543 files that will actually require review (whether manual or via a TAR approach).
  • Files Tagged as Non-Responsive: 7,017 – If approximately 40% of the document collection is tagged as non-responsive, that would be 7,017 files tagged as such.
  • Remaining Files Tagged as Responsive: 10,526 – After QC to ensure that all documents are either tagged as responsive or non-responsive, this leaves 10,526 documents as responsive.
  • Responsive Files Tagged as Privileged: 842 – If roughly 8% of the responsive documents are privileged, that would be 842 privileged documents.
  • Produced Files: 9,684 – After subtracting the privileged files, we’re left with 9,684 responsive, non-privileged files to be produced to opposing counsel.

The percentages I used for estimating the counts at each stage are just examples, so don’t get too hung up on them.  The key is the set of final disposition counts – Filtered, NIST/System, Exception, Duplicate, Culled, Non-Responsive, Privileged and Produced.  Excluding the interim subtotals, those counts represent the different categories for the file collection – each file should wind up in exactly one of these totals.  What happens if you add them together?  You should get 101,852 – the number of collected files after expanding the PST files.  As a result, every one of the collected files is accounted for and none “slips through the cracks” during discovery.  That’s the way it should be.  If not, investigation is required to determine where files were missed.
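To make that reconciliation concrete, here is a minimal sketch (in Python, using only the illustrative numbers from the example above) that totals the disposition categories and checks them against the expanded collection count:

```python
# Minimal sketch: reconciling the example counts above back to the collected total.
# The figures are the illustrative numbers from this post, not real case data.

collected_total = 101852  # expanded count of eMails and attachments from 7 PSTs

# Final disposition categories -- every collected file should land in exactly one.
dispositions = {
    "filtered (outside date range)": 23564,
    "NIST/system files": 0,
    "exception files": 912,
    "duplicate files": 24215,
    "culled as clearly non-responsive": 35618,
    "reviewed, tagged non-responsive": 7017,
    "responsive but privileged": 842,
    "produced": 9684,
}

accounted_for = sum(dispositions.values())
print(f"Accounted for: {accounted_for:,} of {collected_total:,} collected files")

if accounted_for != collected_total:
    # A mismatch means some files "slipped through the cracks" and need investigation.
    print(f"Discrepancy: {collected_total - accounted_for:,} files -- investigate!")
else:
    print("Every collected file is accounted for.")
```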

So, what do you think?  Do you have a plan for accounting for all collected files during discovery?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Best Practices: Quality Control, It’s a Numbers Game

 

Previously, we wrote about Quality Assurance (QA) and Quality Control (QC) in the eDiscovery process.  Both are important in improving the quality of work product and making the eDiscovery process more defensible overall.  For example, in attorney review, QA mechanisms include validation rules to ensure that entries are recorded correctly while QC mechanisms include a second review (usually by a review supervisor or senior attorney) to ensure that documents are being categorized correctly.  Another overall QC mechanism is tracking of document counts through the discovery process, especially from collection to production, to identify how every collected file was handled and why each non-produced document was not produced.

Expanded File Counts

The raw count of collected files is not the same as the expanded file count.  Certain container file types, like Outlook PST files and ZIP archives, exist essentially to store a collection of other files.  So, the count that is important to track is the “expanded” file count after processing, which includes all of the files contained within the container files.  In a simple scenario where you collect Outlook PST files from seven custodians, the actual number of documents (emails and attachments) within those PST files could be in the tens of thousands.  That’s the starting count that matters if your goal is to account for every document in the discovery process.
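As a rough illustration of the difference between collected and expanded counts, here is a sketch that counts loose files versus the files inside ZIP containers using Python's standard zipfile module.  PST expansion requires specialized processing tools, so it is out of scope here, and the folder path is hypothetical.

```python
import os
import zipfile

def expanded_count(root):
    """Count files under 'root', expanding ZIP containers into their contents.

    A rough illustration only: real processing tools also expand PSTs, nested
    archives, embedded attachments, etc., which this sketch does not attempt.
    """
    collected, expanded = 0, 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            collected += 1
            if zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as zf:
                    # Count the documents inside the container, not the container itself.
                    expanded += sum(1 for info in zf.infolist() if not info.is_dir())
            else:
                expanded += 1
    return collected, expanded

collected, expanded = expanded_count("./collection")
print(f"Collected files: {collected:,}  Expanded files: {expanded:,}")
```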

Categorization of Files During Processing

Of course, not every document gets reviewed or even included in the search process.  During processing, files are categorized, and some categories are typically set aside and excluded from review.  Here are some typical categories of excluded files in most collections:

  • Filtered Files: Some files may be collected, and then filtered during processing.  A common filter for the file collection is the relevant date range of the case.  If you’re collecting custodians’ source PST files, those may include messages outside the relevant date range; if so, those messages may need to be filtered out of the review set.  Files may also be filtered based on type of file or other reasons for exclusion.
  • NIST and System Files: Many file collections also contain system files, like executable (EXE) or dynamic link library (DLL) files, that are part of the software on a computer and do not contain client data, so those are typically excluded from the review set.  NIST files are those included on the National Institute of Standards and Technology list of files known to have no evidentiary value, so any files in the collection matching those on the list are “De-NISTed”.
  • Exception Files: These are files that cannot be processed or indexed, for whatever reason.  For example, they may be password-protected or corrupted.  Just because these files cannot be processed doesn’t mean they can be ignored; depending on your agreement with opposing counsel, you may need to at least provide a list of them on an exception log to prove they were addressed, if not attempt to repair them or make them accessible (BTW, it’s good to establish that agreement for disposition of exception files up front).
  • Duplicate Files: During processing, files that are exact duplicates may be put aside to avoid redundant review (and potential inconsistencies).  Exact duplicates are typically identified based on the HASH value, a digital fingerprint generated from the content and format of the file – if two files have the same HASH value, they have exactly the same content and format.  Emails (and their attachments) are typically identified as duplicates based on key metadata fields instead, so that an attachment isn’t “de-duped” out of the collection just because a standalone copy of the same file exists.  (A minimal hashing sketch follows this list.)
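Here is a bare-bones sketch of hash-based duplicate identification for loose files, using Python's standard hashlib module.  The folder path is hypothetical, and email-family de-duplication by metadata fields is intentionally not attempted here.

```python
import hashlib
import os

def file_hash(path, algorithm="md5"):
    """Return a hex digest of the file's content -- its 'digital fingerprint'."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Group loose files by content hash; any group larger than one is a duplicate set."""
    seen = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            seen.setdefault(file_hash(path), []).append(path)
    return {digest: paths for digest, paths in seen.items() if len(paths) > 1}

for digest, paths in find_duplicates("./collection").items():
    print(digest, "->", len(paths), "copies")
```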

All of these categories of excluded files can reduce the set of files to actually be searched and reviewed.  Tomorrow, we’ll walk through an example of a file set from collection to production to illustrate how each file is accounted for during the discovery process.

So, what do you think?  Do you have a plan for accounting for all collected files during discovery?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Milestones: Our 500th Post!

One thing about being a daily blog is that the posts accumulate quickly.  As a result, I’m happy to announce that today is our 500th post on eDiscoveryDaily – in less than two years of existence!

When we launched on September 20, 2010, our goal was to be a daily resource for eDiscovery news and analysis, and we have done our best to deliver on that goal.  During that time, we have published 144 posts on eDiscovery Case Law and have identified numerous cases related to Spoliation Claims and Sanctions.  We’ve also covered every phase of the EDRM life cycle.

We’ve discussed key industry trends in Social Media Technology and Cloud Computing.  We’ve published a number of posts on eDiscovery best practices on topics ranging from Project Management to coordinating eDiscovery within Law Firm Departments to Searching and Outsourcing.  And, a lot more.  Every post we have published is still available on the site for your reference.

Comparing our first three months of existence with our most recent three months, we have seen traffic on our site grow an amazing 442%!  Our subscriber base has nearly doubled in the last year alone!

And, we have you to thank for that!  Thanks for making the eDiscoveryDaily blog a regular resource for your eDiscovery news and analysis!  We really appreciate the support!

I also want to extend a special thanks to Jane Gennarelli, who has provided some wonderful best practice post series on a variety of topics, ranging from project management to coordinating review teams to learning how to be a true eDiscovery consultant instead of an order taker.  Her contributions are always well received and appreciated by the readers – and also especially by me, since I get a day off!

We always end each post with a request: “Please share any comments you might have or if you’d like to know more about a particular topic.”  And, we mean it.  We want to cover the topics you want to hear about, so please let us know.

Tomorrow, we’ll be back with a new, original post.  In the meantime, feel free to click on any of the links above and peruse some of our 499 previous posts.  Maybe you missed some?  😉


eDiscovery Case Law: No Kleen Sweep for Technology Assisted Review

 

For much of the year, proponents of predictive coding and other technology assisted review (TAR) concepts have been pointing to three significant cases where technology based approaches have either been approved or are seriously being considered.  Da Silva Moore v. Publicis Groupe and Global Aerospace v. Landow Aviation are two of those cases; the third is Kleen Products v. Packaging Corp. of America.  However, in the Kleen case, the parties have now reached an agreement to drop the TAR-based approach, at least for the first request for production.

Background and Debate Regarding Search Approach

On February 21, the plaintiffs asked Magistrate Judge Nan Nolan to require the producing parties to employ a technology assisted review approach (referred to as "content-based advanced analytics," or CBAA) in their production of documents for discovery purposes.

In their filing, the plaintiffs claimed that “[t]he large disparity between the effectiveness of [the computer-assisted coding] methodology and Boolean keyword search methodology demonstrates that Defendants cannot establish that their proposed [keyword] search methodology is reasonable and adequate as they are required.”  Citing studies conducted between 1994 and 2011 that they claimed demonstrate the superiority of computer-assisted review over keyword approaches, the plaintiffs asserted that computer-assisted coding retrieved for production “70 percent (worst case) of responsive documents rather than no more than 24 percent (best case) for Defendants’ Boolean, keyword search.”

In their response, the defendants contended that the plaintiffs "provided no legitimate reason that this Court should deviate here from reliable, recognized, and established discovery practices" in favor of their "unproven" CBAA methods. The defendants also emphasized that they have "tested, independently validated, and implemented a search term methodology that is wholly consistent with the case law around the nation and that more than satisfies the ESI production guidelines endorsed by the Seventh Circuit and the Sedona Conference." Having (according to their briefing) already produced more than one million pages of documents using their search methods, the defendants conveyed outrage that the plaintiffs would ask the court to "establish a new and radically different ESI standard for cases in this District."

Stipulation and Order

After “a substantial number of written submissions and oral presentations to the Court” regarding the search technology issue, “in order to narrow the issues, the parties have reached an agreement that will obviate the need for additional evidentiary hearings on the issue of the technology to be used to search for documents responsive to the First Requests.”  That agreement was memorialized this week in the Stipulation and Order Relating to ESI Search (link to stipulation courtesy of Law.com).  As part of that agreement, the plaintiffs have withdrawn their demand that the defendants apply CBAA to the first production request (referred to in the stipulation as the “First Request Corpus”). 

As for productions beyond the First Request Corpus, the plaintiffs also agreed not to “argue or contend” that the defendants should be required to use CBAA or “predictive coding” with respect to any requests for production served on any defendant prior to October 1, 2013.  For requests for production served after October 1, 2013, the parties agreed to “meet and confer regarding the appropriate search methodology to be used for such newly collected documents”, with either party able to file a motion if they can’t agree.  So, there will be no TAR-based approach in the Kleen case, at least until next October.

So, what do you think?  Does this signal a difficulty in obtaining approval for TAR-based approaches?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Best Practices: For Successful Predictive Coding, Start Randomly

 

Predictive coding is the hot eDiscovery topic of 2012, with three significant cases (Da Silva Moore v. Publicis Groupe, Global Aerospace v. Landow Aviation and Kleen Products v. Packaging Corp. of America) either approving or considering the use of predictive coding for eDiscovery.  So, how should your organization begin when preparing a collection for predictive coding discovery?  For best results, start randomly.

If that statement seems odd, let me explain. 

Predictive coding is the use of machine learning technologies to categorize an entire collection of documents as responsive or non-responsive, based on human review of only a subset of the document collection.  That subset of the collection is often referred to as the “seed” set of documents.  How the seed set of documents is derived is important to the success of the predictive coding effort.

Random Sampling, It’s Not Just for Searching

In our series of posts on best practices for random sampling to test search results (available here, here and here), we noted that searching is not the only eDiscovery activity where sampling a set of documents is a good practice.  It’s also a vitally important step for deriving the seed set of documents upon which the predictive coding software’s learning decisions will be based.  As with any random sampling methodology, you have to begin by determining the appropriate sample size to represent the collection, based on your desired confidence level and an acceptable margin of error (as noted here).  To ensure that the sample properly represents the collection, it must be drawn from the entire collection to be predictively coded.
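Here is a rough sketch of the sample-size arithmetic, using the textbook formula with a finite population correction.  The 95% confidence level, 5% margin of error and document IDs below are illustrative assumptions, not recommendations.

```python
import math
import random

def sample_size(population, z=1.96, margin_of_error=0.05, p=0.5):
    """Simple random sample size for a given confidence level and margin of error.

    Uses n0 = z^2 * p * (1 - p) / e^2 with a finite population correction.
    z = 1.96 corresponds to 95% confidence; p = 0.5 is the most conservative
    assumption about responsiveness prevalence.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))

# Hypothetical collection of document IDs to be predictively coded.
collection = [f"DOC-{i:06d}" for i in range(1, 53162)]  # e.g., 53,161 documents

n = sample_size(len(collection))
seed_set = random.sample(collection, n)  # drawn from the ENTIRE collection

print(f"Collection: {len(collection):,}  Seed sample size: {n}")
```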

Given the debate in the above cases regarding the acceptability of the proposed predictive coding approaches (especially Da Silva Moore), it’s important to be prepared to defend your predictive coding approach, and conducting a random sample to generate the seed documents is a key step toward the defensibility of that approach.

Then, once the sample is generated, the next key to success is the use of a subject matter expert (SME) to make responsiveness determinations.  And, it’s important to conduct a sample (there’s that word again!) of the result set after the predictive coding process to determine whether the process achieved a sufficient quality in automatically coding the remainder of the collection.
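A bare-bones sketch of that post-coding quality check might look like the following; the sample size, agreement numbers and document IDs are purely hypothetical.

```python
import random

def draw_validation_sample(machine_coded_docs, sample_n, seed=42):
    """Draw a random QC sample from the documents the software coded automatically."""
    return random.Random(seed).sample(machine_coded_docs, sample_n)

def agreement_rate(qc_results):
    """Fraction of sampled documents where the SME agreed with the machine's call.

    qc_results is a list of (machine_call, sme_call) pairs recorded during the
    post-coding quality check.
    """
    agreements = sum(1 for machine, sme in qc_results if machine == sme)
    return agreements / len(qc_results)

# Hypothetical: 384 machine-coded documents sampled for SME re-review.
remainder = [f"DOC-{i:06d}" for i in range(1, 35001)]
qc_sample = draw_validation_sample(remainder, 384)

# Pretend the SME disagreed with the machine on 34 of the sampled documents.
qc_results = ([("responsive", "responsive")] * (len(qc_sample) - 34)
              + [("responsive", "non-responsive")] * 34)
print(f"Observed agreement: {agreement_rate(qc_results):.1%} on {len(qc_results)} sampled documents")
```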

So, what do you think?  Do you start your predictive coding efforts “randomly”?  You should.  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Trends: Use of Internet-Based Tools, Predictive Coding, Up in 2012, Says ABA

According to a recently released report from the American Bar Association (ABA), use of Internet-based electronic discovery tools and predictive coding has risen in 2012.  The 2012 ABA Legal Technology Survey Report: Litigation and Courtroom Technology (Volume III) discusses the use of technology related to litigation, ranging from hardware used in the courtroom to technology related to eDiscovery and e-filing. It includes a trend report summarizing this year’s notable results and highlighting changes from previous years.

Statistical Highlights

Here are some of the notable stats from the ABA study:

Use of Internet-based eDiscovery and Litigation Support

  • 44% of attorneys whose firm had handled an eDiscovery case said they had used Internet-based eDiscovery tools (up from 31% in 2011 – a 42% rise in usage);
  • In sole practitioner firms, 33% of attorneys said they had used Internet-based eDiscovery tools whereas nearly 67% of attorneys in large firms (500 or more attorneys) indicated they had used those tools;
  • 35% of attorneys said they had used Internet-based litigation support software (up from 24% in 2011 – a 46% rise in usage).

Use of Desktop-based eDiscovery and Litigation Support

  • Use of Desktop-based eDiscovery rose from 46% to 48% (just a 4% rise in usage) and use of Desktop-based Litigation Support remained the same at 46%.

Use of Predictive Coding Technology

  • 23% of those attorneys said they had used predictive coding technology to process or review ESI (up from 15% in 2011 – a 53% rise in usage);
  • Of the firms that have handled an eDiscovery case, only 5% of sole practitioners and only 6% of firms with fewer than 10 attorneys indicated they had used predictive coding technology, whereas nearly 44% of attorneys in large firms said they used predictive coding.

Outsourcing

  • 44% of attorneys surveyed indicated that they outsourced work to eDiscovery consultants and companies (slightly down from 45% in 2011 – a 2% drop);
  • Outsourcing to computer forensics specialists remained unchanged at 42%, according to the survey;
  • On the other hand, 25% of respondents indicated that they outsource to attorneys in other firms (up from 16% in 2011 – a 56% rise!).  Hmmm…

All percentages rounded.
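For clarity, the “rise in usage” figures above are relative changes computed from the survey percentages (e.g., going from 31% to 44% of attorneys is a rise of roughly 42%, not 13 points).  A quick check:

```python
def relative_rise(old_pct, new_pct):
    """Relative change in usage, e.g. 31% -> 44% is about a 42% rise."""
    return (new_pct - old_pct) / old_pct

for label, old, new in [
    ("Internet-based eDiscovery tools", 31, 44),
    ("Internet-based litigation support", 24, 35),
    ("Predictive coding", 15, 23),
    ("Outsourcing to attorneys in other firms", 16, 25),
]:
    print(f"{label}: {relative_rise(old, new):+.0%}")
```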

The 2012 ABA Legal Technology Survey Report is comprised of six volumes, with eDiscovery results discussed in Volume III (link above), which can be purchased from the ABA for $350 (or $300 if you’re an ABA member).  If you’re just interested in the trend report, the cost for that is $55 ($45 for ABA members).

So, what do you think?  Any surprises?  Do those numbers reflect your own usage of the technologies and outsourcing patterns?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Best Practices: Assessing Your Data Before Meet and Confer Shouldn’t Be Expensive

 

So, you’re facing litigation and you need help from an outside provider to “get your ducks in a row” to understand how much data you have, how many documents have hits on key terms and estimate the costs to process, review and produce the data so that you’re in the best position to negotiate appropriate terms at the Rule 26(f) conference (aka, meet and confer).  But, how much does it cost to do all that?  It shouldn’t be expensive.  In fact, it could even be free.

Metadata Inventory

Once you’ve collected data from your custodians, it’s important to understand how much data you have for each custodian and how much data is stored on each piece of media collected.  You should also be able to break the collection down by file type and by date range.  A provider should be able to process the data and provide a metadata inventory of the collected electronically stored information (ESI) that can be queried by:

  • Data source (hard drive, folder, or custodian)
  • Folder names and sizes
  • File names and sizes
  • Volume by file type
  • Date created and last date modified

When this is done prior to the Rule 26(f) conference, it enables your legal team to negotiate intelligently at the conference by understanding the potential volume (and therefore potential cost) of including or excluding certain custodians, document types, or date ranges in the discovery order.
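For illustration, here is a bare-bones sketch of how such an inventory might be assembled from a collected folder using Python's standard library.  The path and custodian name are hypothetical, and a real processing platform captures much richer metadata.

```python
import os
from collections import defaultdict
from datetime import datetime, timezone

def metadata_inventory(root, custodian):
    """Build a simple per-file metadata inventory for one collected source.

    Note: os.path.getctime is creation time on Windows but metadata-change
    time on most Unix systems, so treat "created" as approximate.
    """
    rows = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rows.append({
                "custodian": custodian,
                "folder": dirpath,
                "file_name": name,
                "extension": os.path.splitext(name)[1].lower() or "(none)",
                "size_bytes": os.path.getsize(path),
                "created_approx": datetime.fromtimestamp(os.path.getctime(path), tz=timezone.utc),
                "modified": datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc),
            })
    return rows

rows = metadata_inventory("./collection/custodian_smith", custodian="Smith")

# Example query: volume by file type for this custodian.
volume_by_type = defaultdict(int)
for row in rows:
    volume_by_type[row["extension"]] += row["size_bytes"]
for ext, size in sorted(volume_by_type.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ext:10s} {size / (1024 ** 2):10.1f} MB")
```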

Word Index of the Collection

Want to get a sense of how many documents mention each of the key players in the case?  Or how many mention the key issues?  After a simple index of the data, a provider should be able to provide at least a consolidated report of all the words (not including stop words, of course) from all sources, with the number of occurrences of each word in the collected ESI (at least for files that contain embedded text).  This initial index won’t catch everything – image-only files and exception (e.g., corrupted or password protected) files won’t be included – but it will enable your legal team to negotiate intelligently at the meet and confer by understanding the potential volume (and therefore potential cost) of including or excluding certain key words in the discovery order.
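A minimal sketch of such a word index over already-extracted text follows; the stop word list and sample text are just placeholders.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "and", "of", "to", "a", "in", "for", "on", "that", "is", "with"}  # abbreviated list

def word_index(texts):
    """Count word occurrences across extracted document text, skipping stop words.

    Assumes text has already been extracted from the files; image-only files and
    exception files (corrupted, password-protected) won't contribute any text.
    """
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Hypothetical extracted text from a couple of documents.
extracted = [
    "Meeting with Smith regarding the widget contract pricing",
    "Smith asked for revised pricing on the widget order",
]
for word, count in word_index(extracted).most_common(10):
    print(f"{word:15s} {count}")
```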

eDiscovery Budget Worksheet

Loading the metadata inventory into an eDiscovery budget worksheet that includes standard performance data (such as document review production statistics) and projected billing rates and costs can provide a working eDiscovery project budget projection for the case.  This projection can enable your legal team to advise their client of projected costs of the case, negotiate cost sharing or cost burden arguments in the meet and confer, and create a better discovery production strategy.
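Here is a toy version of that kind of projection; every throughput and rate figure below is a placeholder to be replaced with your own historical performance data and negotiated billing rates.

```python
def review_budget(docs_for_review, docs_per_hour=50, reviewer_rate=65.0,
                  processing_gb=25.0, processing_rate_per_gb=100.0):
    """Rough review and processing budget projection from inventory counts.

    All rates and throughput figures are placeholders -- substitute your own
    performance data and billing rates.
    """
    review_hours = docs_for_review / docs_per_hour
    review_cost = review_hours * reviewer_rate
    processing_cost = processing_gb * processing_rate_per_gb
    return {
        "review_hours": round(review_hours, 1),
        "review_cost": round(review_cost, 2),
        "processing_cost": round(processing_cost, 2),
        "total": round(review_cost + processing_cost, 2),
    }

# e.g., 17,543 documents left for review after filtering and culling
print(review_budget(17543))
```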

It shouldn’t be expensive to prepare these items to develop an initial assessment of the case for the Rule 26(f) conference.  In fact, the company that I work for, CloudNine Discovery, provides these services for free.  But, regardless of who you use, it’s important to assess your data before the meet and confer to enable your legal team to understand the potential costs and risks associated with the case and negotiate the best possible approach for your client.

So, what do you think?  What analysis and data assessment do you perform prior to the meet and confer?  Please share any comments you might have or if you’d like to know more about a particular topic.

P.S.: No ducks were harmed in the making of this blog post.


eDiscovery Trends: The Growth of eDiscovery is Transparent

 

With data in the world doubling every two years or so and the variety of issues that organizations need to address to manage that data from an eDiscovery standpoint, it would probably surprise none of you that the eDiscovery market is growing.  But, do you know how quickly the market is growing?

According to a new market report published by Transparency Market Research (and reported by BetaNews), the global eDiscovery market is expected to grow to roughly 2.75 times its 2010 size by 2017.  Their report eDiscovery (Software and Service) Market – Global Scenario, Trends, Industry Analysis, Size, Share and Forecast, 2010 – 2017 indicates that the global eDiscovery market was worth $3.6 billion in 2010 and is expected to reach $9.9 billion by 2017, growing at a Compound Annual Growth Rate (CAGR) of 15.4% during that time.  Here are some other noteworthy stats that they report and forecast (a quick CAGR check follows the list):

  • The U.S. portion of the eDiscovery market was valued at $3.0 billion in 2010, and is estimated to grow at a CAGR of 13.3% from 2010 to 2017 to reach $7.2 billion by 2017 (2.4 times its 2010 size);
  • The eDiscovery market in the rest of the world was valued at $600 million in 2010, and is estimated to grow at a CAGR of 23.2% from 2010 to 2017 to reach $2.7 billion by 2017 (4.5 times its 2010 size – wow!);
  • Not surprisingly, the U.S. is expected to continue to be the leader in terms of revenue with 73% of global eDiscovery market share in 2017;
  • The report also breaks the market into software based eDiscovery and services based eDiscovery, with the global software based eDiscovery market valued at $1.1 billion in 2010 and expected to grow at a CAGR of 11.5% to reach $2.5 billion by 2017 (roughly 2.3 times its 2010 size) and the global services based eDiscovery market valued at $2.5 billion in 2010 and expected to grow at a CAGR of 17.0% to reach $7.4 billion by 2017 (nearly 3 times its 2010 size).
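As a sanity check, the reported CAGRs can be reproduced (to within rounding of the published dollar amounts) from the start and end values:

```python
def cagr(start_value, end_value, years):
    """Compound Annual Growth Rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures from the Transparency Market Research forecast above ($ billions, 2010 -> 2017).
segments = {
    "Global":        (3.6, 9.9),
    "United States": (3.0, 7.2),
    "Rest of world": (0.6, 2.7),
    "Software":      (1.1, 2.5),
    "Services":      (2.5, 7.4),
}
for name, (start, end) in segments.items():
    print(f"{name:14s} CAGR {cagr(start, end, 7):5.1%}  ({end / start:.2f}x overall)")
```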

According to the report, key factors driving the global eDiscovery market include “increasing adoption of predictive coding, growing risk mitigation activities in organizations, increase in criminal prosecutions and civil litigation and growth of record management across various industries”.  They predict that “[i]n the next five years, the e-discovery industry growth will get further support from increasing automatic enterprise information archiving applications, growth in multi-media search for sound and visual data, next generation technology growth for cloud computing i.e. virtualization and increasing involvement of organizations in the social media space.”

The report also discusses topics such as pricing trends, competitor analysis, growth drivers, opportunities and inhibitors, and provides company profiles of several big players in the industry.  The 96-page report is available in a single-user license for $4,395 up to a corporate license for $10,395.

So, what do you think?  Do those growth numbers surprise you?  Please share any comments you might have or if you’d like to know more about a particular topic.


eDiscovery Trends: Interview with Laura Zubulake of Zubulake’s e-Discovery, Part 2

 

Last week, we discussed the new book by Laura A. Zubulake, the plaintiff in probably the most famous eDiscovery case ever (Zubulake v. UBS Warburg), entitled Zubulake's e-Discovery: The Untold Story of my Quest for Justice.  I also conducted an interview with Laura last week to get her perspective on the book, including her reasons for writing the book seven years after the case ended and what she expects readers to learn from her story.

The book is the story of the Zubulake case – which resulted in one of the largest jury awards in the US for a single plaintiff in an employment discrimination case – as told by the author, in her words.  As Zubulake notes in the Preface, the book “is written from the plaintiff’s perspective – my perspective. I am a businessperson, not an attorney. The version of events and opinions expressed are portrayed by me from facts and circumstances as I perceived them.”  It’s a “classic David versus Goliath story” describing her multi-year struggle against her former employer – a multi-national financial giant.  The book is available at Amazon and also at CreateSpace.

Our interview with Laura had so much good information in it, we couldn’t fit it all into a single post.  Yesterday was part 1.  Here is the second and final part!

What advice would you have for plaintiffs who face a similar situation to the one you faced?

I don’t give advice, and I’ll tell you why.  It’s because every case is different.  And, it’s not just the facts of the case but it’s also the personal lives of the plaintiffs.  So, it’s very difficult for me to do that.  Unless you’re in someone else’s shoes, you really can’t appreciate what they’re going through, so I don’t give advice.

What do you think about the state of eDiscovery today and where do you think that more attention could be paid to the discovery process?

While I don’t work in the industry day-to-day, I read a lot and keep up with the trends and it’s pretty incredible to me how it has changed over the past eight to nine years.  The first opinions in my case were in 2003 and 2004.  Back then, we had so little available with regard to technology and legal guidance.  When I attend a conference like LegalTech, I’m always amazed at the number of vendors and all the technology that’s now offered.  From that standpoint, how it has matured as an industry is a good thing.  However, I do believe that there are still important issues with regard to eDiscovery to be addressed.  When you read surveys and you see how many corporations still have yet to adopt certain aspects of the eDiscovery process, the fact that’s the case raises concern.  Some firms have not implemented litigation holds or document retention policies or an information governance structure to manage their information and you would think by now that a majority of corporations would have adopted something along those lines. 

I guess organizations still think discovery issues and sanctions won’t happen to them.  And, while I recognize the difficulty in a large organization with lots of employees to control everything and everybody, I’m surprised at the number of cases where sanctions occur.  I do read some of the case law and I do “scratch my head” from time to time.  So, I think there are still issues.

Obviously, the hot topic now is predictive coding.  My concern is that people perceive that as the “end all” and the ultimate answer to questions.  I think that processes like predictive coding will certainly help, but I think there’s still something to be said for the “human touch” when it comes to reviewing documents. I think that we’re making progress, but I think there is still more yet to go.

I read in an article that you were considering opening up an eDiscovery consulting practice.  Is that the case and, if so, what will be unique about your practice?

It’s something that I’m considering.  I’ve been working on the book, but I’d like to get back into more of a routine and perhaps focus on education for employees.  When people address eDiscovery issues, they look to implement technology and look to establish retention policies and procedures to implement holds, and that’s all good.  But, at the same time, I think there should be more efforts to educate the employees because they’re the ones who create the electronic documents.  Educate them as to the risks involved and procedures to follow to minimize those risks, such as litigation holds.  I think if you have an educated workforce and they understand that “less is more” when writing electronic documents, that they don’t always have to copy someone or forward something, that they can be more selective in their writing to reduce costs.

I think because of my background and my personal experiences and because I’m not an attorney, I can relate more to the typical worker.  I was on the trading desk and I know the day-to-day stresses of trying to manage email, trying to do the right thing, but also trying to be productive.  I think I can also relate to senior management and advise them that, although they may not recognize the risk, the risk is there.  And, that’s because I’ve been a worker, I’ve been on the trading desk, I’ve been through litigation, I’ve actually reviewed documents and I’ve gone to trial.  So, if you think that not implementing information governance or other eDiscovery policies is a good idea, that’s not the case.  Corporations should see this as an opportunity to manage information and use those management structures for the benefit of their company.

Thanks, Laura, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!
