eDiscovery Trends: Needing “Technology Assisted Review” to Write a Blog Post


Late on a Thursday night, with a variety of tasks and projects on my plate, it seems harder than usual to find a unique and suitable topic for today’s blog post.

One thing I often do when looking for ideas is to hit the web and turn to the many resources that I read regularly to stay abreast of developments in the industry.  Usually when I do that, I find one article or blog post that “speaks to me” as a topic to talk about on this blog.  However, when doing so last night, I found several topics worth discussing and had difficulty selecting just one.  So, here are some of the notable articles and posts that I’ve been reviewing:

There are plenty more articles out there.  I’ve barely scratched the surface.  When we launched eDiscovery Daily about 16 months ago, some wondered whether there would be enough eDiscovery news and information to talk about on a daily basis.  The problem we have found instead is that there is SO much to talk about, it’s difficult to choose.  Today, I was unable to choose just one topic, so, as the picture notes, “I have nothing to say”.  Therefore, I’ve had to use “technology assisted review” to provide a post to you, thanks to the many excellent articles and blogs out there.  Enjoy!

So, what do you think?  Are there any specific topics that you find are being discussed a lot on the web?  Are there any topics that you’d like to see discussed more?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Our 2012 Predictions


Yesterday, we evaluated what others are saying and noted popular eDiscovery prediction trends for the coming year.  It’s interesting to identify the common trends among the prognosticators as well as the unique predictions.

But we promised our own predictions for today, so here they are.  One of the nice things about writing and editing a daily eDiscovery blog is that it forces you to stay abreast of what’s going on in the industry.  Based on the numerous stories we’ve read (many of which we’ve also written about), and in David Letterman “Top 10” fashion, here are our eDiscovery predictions for 2012:

  • Still More ESI in the Cloud: Frankly, this is like predicting “the Sun will be hot in 2012”.  Given the predictions in cloud growth by Forrester and Gartner, it seems inevitable that organizations will continue to migrate more data and applications to “the cloud”.  Even if some organizations continue to resist the cloud movement, those organizations still have to address the continued growth in usage of social media sites in business (which, last I checked, are based in the cloud).  It’s inevitable.
  • More eDiscovery Technology in the Cloud As Well: We will continue to see more cloud offerings for eDiscovery technology, ranging from information governance to preservation and collection to review and production.  With the need for corporations to share potentially responsive ESI with one or more outside counsel firms, experts and even opposing counsel, cloud based Software-as-a-Service (SaaS) applications are a logical choice for sharing that information effortlessly without having to buy software and hardware and provide infrastructure to do so.  Every year at LegalTech, there seem to be a few more eDiscovery cloud providers and this year should be no different.
  • Self-Service in the Cloud: So, organizations are seeing the benefits of the cloud not only for storing ESI, but also managing it during Discovery.  It’s the cost effective alternative.  But, organizations are demanding the control of a desktop application within their eDiscovery applications.  The ability to load your own data, add your own users and maintain their rights, and create your own data fields are just a few of the capabilities that organizations expect to be able to handle themselves.  And, more providers are responding to those needs.  That trend will continue this year.
  • Technology Assisted Review: This was the most popular prediction among the pundits we reviewed.  The amount of data in the world continues to explode, as there were 988 exabytes in the whole world as of 2010 and Cisco predicts that IP traffic over data networks will reach 4.8 zettabytes (each zettabyte is 1,000 exabytes) by 2015.  That’s nearly five times the data in five years.  Even in the smaller cases, there’s simply too much data to not use technology to get through it all.  Whether it’s predictive coding, conceptual clustering or some other technology, it’s required to enable attorneys to manage the review more effectively and efficiently.
  • Greater Adoption of eDiscovery Technology for Smaller Cases: As each gigabyte of data is between 50,000 and 100,000 pages, a “small” case of 4 GB (or two max size PST files in Outlook® 2003) can still be 300,000 pages or more.  As “small” cases are no longer that small, attorneys are forced to embrace eDiscovery technology for the smaller cases as well.  And, eDiscovery providers are taking note.
  • Continued Focus on International eDiscovery:  So, cases are larger and there’s more data in the cloud, which leads to more cases where Discovery of ESI internationally becomes an issue.  The Sedona Conference® just issued in December the Public Comment Version of The Sedona Conference® International Principles on Discovery, Disclosure & Data Protection: Best Practices, Recommendations & Principles for Addressing the Preservation & Discovery of Protected Data in U.S. Litigation, illustrating how important an issue this is becoming for eDiscovery.
  • Prevailing Parties Awarded eDiscovery Costs: Shifting to the courtroom, we have started to see more cases where the prevailing party is awarded their eDiscovery costs as part of their award.  As organizations have pushed for more proportionality in the Discovery process, courts have taken it upon themselves to impose that proportionality through taxing the “losers” for reimbursement of costs, causing prevailing defendants to say: “Sue me and lose?  Pay my costs!”.
  • Continued Efforts and Progress on Rules Changes: Speaking of proportionality, there will be continued efforts and progress on changes to the Federal Rules of Civil Procedure as organizations push for clarity on preservation and other obligations to attempt to bring spiraling eDiscovery costs under control.  It will take time, but progress will be made toward that goal this year.
  • Greater Price/Cost Control Pressure on eDiscovery Services: In the meantime, while waiting for legislative relief, organizations will expect the cost for eDiscovery services to be more affordable and predictable.  In order to accommodate larger amounts of data, eDiscovery providers will need to offer simplified and attractive pricing alternatives.
  • Big Player Consolidation Continues, But Plenty of Smaller Players Available: In 2011, we saw HP acquire Autonomy and Symantec acquire Clearwell, continuing a trend of acquisitions of the “big players” in the industry.  This trend will continue, but there is still plenty of room for the “little guy” as smaller providers have been pooling resources to compete, creating an interesting dichotomy in the industry of few big and many small providers in eDiscovery.
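To put the “small case” arithmetic from the smaller-cases prediction above in concrete terms, here is a minimal sketch that turns a collection size into a page range.  It uses only the 50,000 to 100,000 pages-per-gigabyte rule of thumb quoted in that prediction; the function name and constants are just illustrative, and real page counts will vary with the mix of file types.

```python
from __future__ import annotations

# Rule-of-thumb range quoted in the post: 50,000 to 100,000 pages per GB.
PAGES_PER_GB_LOW = 50_000
PAGES_PER_GB_HIGH = 100_000


def estimate_pages(gigabytes: float) -> tuple[int, int]:
    """Return a (low, high) page estimate for a collection of the given size."""
    return (
        round(gigabytes * PAGES_PER_GB_LOW),
        round(gigabytes * PAGES_PER_GB_HIGH),
    )


low, high = estimate_pages(4)
print(f"4 GB is roughly {low:,} to {high:,} pages")
```

For a 4 GB case, that works out to a range of 200,000 to 400,000 pages, which brackets the “300,000 pages or more” figure above.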

So, what do you think?  Care to offer your own predictions?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: 2012 Predictions – By The Numbers

With a nod to Nick Bakay, “It’s all so simple when you break things down scientifically.”

The late December/early January time frame is always when various people in eDiscovery make their annual predictions as to what trends to expect in the coming year.  I know what you’re thinking – “oh no, not another set of eDiscovery predictions!”  However, at eDiscovery Daily, we do things a little bit differently.  We like to take a look at other predictions and see if we can spot some common trends among those before offering some of our own (consider it the ultimate “cheat sheet”).  So, as I did last year, I went “googling” for 2012 eDiscovery predictions, and organized the predictions into common themes.  I found eDiscovery predictions here, here, here, here, here, here and Applied Discovery.  Oh, and also here, here and here.  Ten sets of predictions in all!  Whew!

A couple of quick comments: 1) Not all of these are from the original sources, but the links above attribute the original sources when they are re-prints.  If I have failed to accurately attribute the original source for a set of predictions, please feel free to comment.  2) This is probably not an exhaustive list of predictions (I have other duties in my “day job”, so I couldn’t search forever), so I apologize if I’ve left anybody’s published predictions out.  Again, feel free to comment if you’re aware of other predictions.

Here are some of the common themes:

  • Technology Assisted Review: Nine out of ten “prognosticators” (up from 2 out of 7 last year) predicted a greater emphasis/adoption of technological approaches.  While some equate technology assisted review with predictive coding, other technology approaches such as conceptual clustering are also increasing in popularity.  Clearly, as the amount of data associated with the typical litigation rises dramatically, technology is playing a greater role to enable attorneys to manage the review more effectively and efficiently.
  • eDiscovery Best Practices Combining People and Technology: Seven out of ten “augurs” also had predictions related to various themes associated with eDiscovery best practices, especially processes that combine people and technology.  Some have categorized it as a “maturation” of the eDiscovery process, with corporations becoming smarter about eDiscovery and integrating it into core business practices.  We’ve had numerous posts regarding eDiscovery best practices in the past year; click here for a selection of them.
  • Social Media Discovery: Six “pundits” forecasted a continued growth in sources and issues related to social media discovery.  Bet you didn’t see that one coming!  For a look back at cases from 2011 dealing with social media issues, click here.
  • Information Governance: Five “soothsayers” presaged various themes related to the promotion of information governance practices and programs, ranging from a simple “no more data hoarding” to an “emergence of Information Management platforms”.  For our posts related to Information Governance and management issues, click here.
  • Cloud Computing: Five “mediums” (but are they happy mediums?) predict that ESI and eDiscovery will continue to move to the cloud.  Frankly, given the predictions in cloud growth by Forrester and Gartner, I’m surprised that there were only five predictions.  Perhaps predicting growth of the cloud has become “old hat”.
  • Focus on eDiscovery Rules / Court Guidance: Four “prophets” (yes, I still have my thesaurus!) expect courts to provide greater guidance on eDiscovery best practices in the coming year via a combination of case law and pilot programs/model orders to establish expectations up front.
  • Complex Data Collection: Four “psychics” also predicted that data collection will continue to become more complex as data sources abound, the custodian-based collection model comes under stress and self-collection gives way to more automated techniques.

The “others receiving votes” category (three predicting each of these) included cost shifting and increased awards of eDiscovery costs to the prevailing party in litigation, flexible eDiscovery pricing and predictable or reduced costs, continued focus on international discovery and continued debate on potential new eDiscovery rules.  Two each predicted continued consolidation of eDiscovery providers, de-emphasis on use of backup tapes, de-emphasis on use of eMail, multi-matter eDiscovery management (to leverage knowledge gained in previous cases), risk assessment /statistical analysis and more single platform solutions.  And, one predicted more action on eDiscovery certifications.

Some interesting predictions.  Tune in tomorrow for ours!

So, what do you think?  Care to offer your own “hunches” from your crystal ball?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Budgeting, Part 3: Understanding the Elements Contributing to Cost


We've spent some time in Part 1 and Part 2 of this series discussing the factors and assumptions that go into eDiscovery budgeting, but what about the concrete eDiscovery process itself? In addition to understanding the factors that go into budgeting, it's important to recognize the elements that contribute to eDiscovery costs.

There are five primary factors that contribute to the costs of eDiscovery in progress:

  • Collection: Collection of ESI can be simple and effortless, conducted by the client itself, or it may require the assistance of a hired third party to gain access to the ESI. The cost of collection can go up depending on the level of travel required. Forensic investigation and custodian interviews are not always necessary, but also increase the cost in cases requiring them.
  • Volume: The raw volume of ESI is one factor in the cost of eDiscovery, but not necessarily the one that counts. What's most important is the volume that must be reviewed by human eyes—and that can mean all of it, or only a fraction of the total ESI retrieved. It's possible to filter eDiscovery data by removing unwanted file types, limiting a search to a particular date range, or searching for relevant key words and phrases in documents. In order to moderate cost, it's usually wise to start with a more limited eDiscovery scope and expand it to cover a larger volume if necessary.  Many eDiscovery service providers offer free early cost assessment services to help attorneys estimate the volume of potentially responsive data that needs to be processed and reviewed. 
  • Number of Custodians: The number of sources involved in the collection of data can increase exponentially the amount of time and effort involved in eDiscovery, thereby increasing the cost accordingly.
  • Human Review: This is the most expensive factor in eDiscovery, requiring as much as 80% of the total eDiscovery budget.  It requires not only human beings working on an hourly wage, but time spent on training and the learning curve as they become more adept at recognizing and refining the key elements and terms required to be produced in a particular case. The more people and time involved in data review, the greater the probable expense.
  • Case Complexity: While a simple case may require a limited scope and review process, complex court cases can involve searching the same documents for multiple types of information for discovery. As a result, complex cases require more time spent on a document review strategy, as well as on a more elaborate review process.
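As a rough illustration of the filtering described under Volume above (removing unwanted file types, limiting to a date range, and searching for key words), here is a minimal sketch.  The `Doc` record, the `cull` function and the sample documents are all hypothetical; a real review platform would work from extracted metadata and a proper search index rather than simple substring matching.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Set


@dataclass
class Doc:
    """Hypothetical record for one collected file."""
    name: str
    modified: date
    text: str


def cull(docs: Iterable[Doc], allowed_exts: Set[str],
         start: date, end: date, keywords: List[str]) -> List[Doc]:
    """Keep only documents that pass all three filters."""
    kept = []
    for doc in docs:
        # 1. File type filter: drop anything not on the allowed list.
        ext = doc.name.rsplit(".", 1)[-1].lower() if "." in doc.name else ""
        if ext not in allowed_exts:
            continue
        # 2. Date range filter (inclusive on both ends).
        if not (start <= doc.modified <= end):
            continue
        # 3. Keyword filter: case-insensitive substring match.
        body = doc.text.lower()
        if not any(k.lower() in body for k in keywords):
            continue
        kept.append(doc)
    return kept


docs = [
    Doc("contract.docx", date(2011, 3, 1), "Draft merger agreement"),
    Doc("setup.exe", date(2011, 3, 2), ""),
    Doc("old_memo.doc", date(2005, 1, 1), "merger rumors"),
    Doc("lunch.msg", date(2011, 4, 1), "Pizza on Friday?"),
]
kept = cull(docs, {"doc", "docx", "msg", "pdf"},
            date(2010, 1, 1), date(2011, 12, 31), ["merger"])
print([d.name for d in kept])
```

Only the documents that survive all three filters would go on to human review, which is where most of the budget is spent.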

So, what do you think? Are there any other major factors in eDiscovery budgeting or expense? Please share any comments you might have or if you'd like to know more about a particular topic.

eDiscovery Budgeting, Part 2: Key Assumptions and Choices That Affect eDiscovery Budgeting


Friday, we talked about assumptions and elements that contribute to cost that need to be considered when budgeting for eDiscovery activities.

Now that you know a bit about the factors surrounding the cost of eDiscovery, let's take a look at budgeting and the estimates that attorneys provide to a client before beginning eDiscovery work. The first step in budgeting is to prepare an estimate based on your and your client’s best guesses and assumptions. What are some of these assumptions?

  • Volume: Volume is almost always the largest driver of cost, as it will affect not only the quantity of data to be collected and processed, but also the amount of time human beings must spend reviewing discovery documents for relevance and privilege. Volume is also one of the more ambiguous factors. The most accurate estimate of volume is in megabytes (MB), gigabytes (GB) or terabytes (TB), but you won't always have access to these kinds of size descriptions. Instead, a client may tell you that there are "50,000 or so pages" of data, or "about 10,000 emails". The size of a page can vary widely depending on whether it is in an email, a PDF, or a Word document, so it can be very difficult to estimate volume with any degree of accuracy.
  • Scope: It's wise to start with the smallest possible scope and expand if necessary, but that can be an inefficient way to review documents for eDiscovery, as it may mean going over the same files twice for different aspects of your eventual scope.
  • Efficiency: Whenever possible, it's important to plan an eDiscovery strategy in advance that will allow for a more efficient review of documents and data. The ability to maintain an efficient process of eDiscovery is largely dependent on timing and the ability to plan.
  • Timing: More time for eDiscovery activities means that the scope and search details can be refined, optimizing efficiency and minimizing costs. If the eDiscovery must be done in a hurry, efficiency suffers and costs rise.
  • Risk: Risk tolerance is a factor in cost, determining how much attention must be paid to refining every aspect of document review and data access. Mitigating risk up front through agreement and cooperation with opposing counsel can clearly define the risk so that you know where you stand.
  • Location: Where the data is located can affect costs and so can the jurisdiction of the case.  For example, different courts have provided different rulings on spoliation claims, so it’s important to consider location as part of the budgeting process.

So, what do you think? Have you found any of these assumptions to be especially problematic in your own eDiscovery budgeting estimates? Please share any comments you might have or if you'd like to know more about a particular topic.

eDiscovery Budgeting, Part 1: Assumptions and Elements that Contribute to Cost


While attorneys may struggle with the regional and international regulations surrounding eDiscovery, your client is likely to be less concerned with the practical legal details of your discovery request, and more concerned with the financial cost.

Whether you're working with the plaintiff or the defense, one of the most important considerations in preparing for eDiscovery is presenting the expense accurately and completely to the client – and that means understanding for yourself the factors that go into budgeting for eDiscovery. There are two main sets of elements to consider: those that affect budgeting and estimates, and those that will have a direct impact on the ultimate cost of eDiscovery.

Understanding Assumptions in eDiscovery

Because so much of the eDiscovery process cannot be predicted without accurate information, it's important to confirm any estimates from a client or from opposing counsel before proceeding with a budget.

Does your client really know the volume of data that is likely to be contained in certain files or backups, or are they providing generalized figures that may not be accurate? Do you know for certain the precise scope of the information you need to examine for discovery? Attorneys need to verify as many estimates as possible, noting any and all assumptions in their estimates so that the client can prepare for potential changes in eDiscovery costs if those early assumptions prove to be inaccurate.

eDiscovery budgeting is predicated on guesswork and assumptions that may include:

  • Volume
  • Scope
  • Efficiency
  • Risk
  • Timing

Each of these factors will be discussed in an upcoming blog post next week detailing the assumptions that go into estimating a budget for eDiscovery.

Breaking Down the Cost of eDiscovery

Once the estimate is complete and you’re ready to tackle the real work of eDiscovery, some elements contribute heavily to the cost, while others have a more minimal impact.

Some of the major elements comprising the cost of eDiscovery include:

  • Collection: including factors such as travel, retrieval, custodian interviews, and forensic collection (if necessary)
  • Volume of data
  • Number of custodians
  • Human review: the most expensive factor in eDiscovery costs
  • Case complexity

I'll discuss more on each of these factors in an upcoming blog post, as well.

The cost of eDiscovery can also be affected by the degree of open communication with opposing counsel. A cooperative relationship with the opposition can streamline discovery, while a contentious relationship makes it likely that discovery-related motions and court appearances will increase the total cost of this process.

So, what do you think? How much up front effort goes into your eDiscovery budgeting process? How do you monitor progress against the budget?  Please share any comments you might have or if you'd like to know more about a particular topic.

eDiscovery Best Practices: 4 Steps to Effective eDiscovery With Software Analytics


I read an interesting article from Texas Lawyer entitled “4 Steps to Effective E-Discovery With Software Analytics” that has some interesting takes on project management principles related to eDiscovery, and I’ve interjected some of my thoughts into the analysis below.  A copy of the full article is located here.  The steps are as follows:

1. With the vendor, negotiate clear terms that serve the project's key objectives.  The article notes the importance of tying each collection and review milestone (e.g., collecting and imaging data; filtering data by file type; removing duplicates; processing data for review in a specific review platform; processing data to allow for optical character recognition (OCR) searching; and converting data into a tag image file format (TIFF) for final production to opposing counsel) to contract terms with the vendor.

The specific milestones will vary – for example, conversion to TIFF may not be necessary if the parties agree to a native production – so it’s important to know the size and complexity of the project, and choose only an experienced eDiscovery vendor who can handle the variations.

2. Collect and process data.  Forensically sound data collection and culling of obviously unresponsive files (such as system files) to drastically decrease the overall review costs are key services that a vendor provides in this area.  As we’ve noted many times on this blog, effective culling can save considerable review costs – each gigabyte (GB) culled can save $16-$18K in attorney review costs.

The article notes that a hidden cost is the OCR process of translating extracted text into a searchable form and that it’s an optimal negotiation point with the vendor.  This may have been true when most collections were paper based, but as most collections today are electronic based, the percentage of documents requiring OCR is considerably less than it used to be.  However, it is important to be prepared that there are some native files which will be “image only”, such as TIFFs and scanned PDFs – those will require OCR to be effectively searched.
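To make the culling savings mentioned in step 2 concrete, here is a minimal sketch of the arithmetic.  The $16,000 to $18,000 per gigabyte range is the figure quoted above; the 10 GB of culled data is purely a hypothetical input.

```python
# Review-cost savings per culled gigabyte, as quoted in the post ($16K-$18K).
SAVINGS_PER_GB_LOW = 16_000
SAVINGS_PER_GB_HIGH = 18_000


def review_savings(gb_culled: float) -> tuple:
    """Return the (low, high) estimated attorney review savings in dollars."""
    return gb_culled * SAVINGS_PER_GB_LOW, gb_culled * SAVINGS_PER_GB_HIGH


# Hypothetical example: culling 10 GB of obviously unresponsive files.
low, high = review_savings(10)
print(f"Estimated savings: ${low:,.0f} to ${high:,.0f}")
```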

3. Select a data and document review platform.  Factors such as ease of use, robustness, and reliability of analytic tools, support staff accessibility to fix software bugs quickly, monthly user and hosting fees, and software training and support fees should be considered when selecting a document review platform.

The article notes that a hidden cost is selecting a platform with which the firm’s litigation support staff has no experience as follow-up consultation with the vendor could be costly.  This can be true, though a good vendor training program and an intuitive interface can minimize or even eliminate this component.

The article also notes that to take advantage of the vendor’s more modern technology “[a] viable option is to use a vendor's review platform that fits the needs of the current data set and then transfer the data to the in-house system”.  I’m not sure why the need exists to transfer the data back – there are a number of vendors that provide a cost-effective solution appropriate for the duration of the case.

4. Designate clear areas of responsibility.  By doing so, you minimize or eliminate inefficiencies in the project and the article mentions the RACI matrix to determine who is responsible (individuals responsible for performing each task, such as review or litigation support), accountable (the attorney in charge of discovery), consulted (the lead attorney on the case), and informed (the client).

Managing these areas of responsibility effectively is probably the biggest key to project success and the article does a nice job of providing a handy reference model (the RACI matrix) for defining responsibility within the project.
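For readers who like to see it written down, here is one way a RACI matrix might be sketched as a simple data structure, with a check for tasks that are missing a role.  The task names and the `missing_roles` helper are hypothetical; the role assignments follow the article's description (staff responsible, the discovery attorney accountable, the lead attorney consulted, the client informed).

```python
# The four RACI roles: Responsible, Accountable, Consulted, Informed.
RACI_ROLES = {"R", "A", "C", "I"}

# Hypothetical tasks; each maps a party to its RACI role for that task.
matrix = {
    "first-pass review": {
        "review team": "R",
        "discovery attorney": "A",
        "lead attorney": "C",
        "client": "I",
    },
    "production to opposing counsel": {
        "litigation support": "R",
        "discovery attorney": "A",
        "client": "I",
    },
}


def missing_roles(matrix: dict) -> dict:
    """Return, per task, any RACI roles that nobody has been assigned."""
    gaps = {}
    for task, assignments in matrix.items():
        missing = RACI_ROLES - set(assignments.values())
        if missing:
            gaps[task] = sorted(missing)
    return gaps


print(missing_roles(matrix))
```

In this made-up example, the production task has nobody in the consulted role, which is exactly the kind of gap the matrix is meant to surface before the project starts.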

So, what do you think?  Do you have any specific thoughts about this article?   Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Your ESI Collection May Be Larger Than You Think


Here’s a sample scenario: You identify custodians relevant to the case and collect files from each.  Roughly 100 gigabytes (GB) of Microsoft Outlook email PST files and loose “efiles” is collected in total from the custodians.  You identify a vendor to process the files to load into a review tool, so that you can perform first pass review and, eventually, linear review and produce the files to opposing counsel.  After processing, the vendor sends you a bill – and they’ve charged you to process over 200 GB!!  What happened?!?

Did the vendor accidentally “double-bill” you?  That would be great – but no.  There’s a much more logical explanation and, unfortunately, you may wind up paying a lot more to process these files than you expected.

Many of the files in most ESI collections are stored in what are known as “archive” or “container” files.  For example, as noted above, Outlook emails are typically saved for each custodian in a personal storage (.PST) file format, which is an expanding container file. For most custodians, all of their email (and the corresponding attachments, if present) resides in a few PST files.  The scanned size for the PST file is the size of the file on disk.

Did you ever see one of those vacuum bags that you store clothes in and then suck all the air out so that the clothes won’t take as much space?  The PST file is like one of those vacuum bags – it typically stores the emails and attachments in a compressed format to save space.  When the emails and attachments are processed into a review tool, they are expanded into their normal size.  This expanded size can be 1.5 to 2 times larger than the scanned size (or more).  And, that’s what many vendors will bill on – the expanded size.

There are other types of archive container files that compress the contents – .zip and .rar files are two examples of compressed container files.  These files are often used not only to compress files for storage on hard drives, but also to compact or group a set of files when transmitting them, usually in – you guessed it – email.  With email comprising a majority of most ESI collections and the popularity of other archive container files for compressing file collections, the expanded size of your collection may be considerably larger than it appears when stored on disk.  It’s important to be prepared for that and know your options when processing that data, so you can effectively anticipate those processing costs.
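Here is a minimal sketch of both ideas: a rough estimate of expanded size using the 1.5 to 2 times multiplier mentioned above, and a way to read the actual compressed and expanded sizes of a .zip file using Python's standard `zipfile` module.  The function names and the sample archive are hypothetical, and this only covers .zip files; PST and .rar files would need other tools.

```python
import io
import zipfile


def estimate_expanded_gb(scanned_gb: float, low: float = 1.5, high: float = 2.0):
    """Rough (low, high) expanded size using the 1.5x-2x range from the post."""
    return scanned_gb * low, scanned_gb * high


def zip_sizes(zip_bytes: bytes):
    """Return (compressed_bytes, expanded_bytes) summed over a zip's entries."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        infos = zf.infolist()
        compressed = sum(info.compress_size for info in infos)
        expanded = sum(info.file_size for info in infos)
    return compressed, expanded


# The 100 GB scenario from the post: expect roughly 150 to 200 GB expanded.
print(estimate_expanded_gb(100))

# Build a small in-memory archive (made-up content) to show the difference.
sample_text = "privileged and confidential\n" * 5000
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("memo.txt", sample_text)

compressed, expanded = zip_sizes(buf.getvalue())
print(f"compressed: {compressed:,} bytes, expanded: {expanded:,} bytes")
```

Running a check like this on container files before sending them out for processing gives you a better idea of what the billable size may be.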

So, what do you think?  Have you ever been surprised by processing costs of your ESI?   Please share any comments you might have or if you’d like to know more about a particular topic.

Working Successfully with eDiscovery and Litigation Support Service Providers: Evaluating Price


When you are looking for help with handling discovery materials, there are hundreds of service providers to choose from.  It’s important that you choose one that can meet your schedule, has fair pricing and does high-quality work.  But there are other things you should look at as well. 

In the next few blogs in this series, we’re going to discuss what you should be looking at when you evaluate a service provider.  Note that these points are not covered in order of importance.  The importance of any single evaluation point will vary from case to case and will depend on things like the type of service you are looking for, the duration of the project, the complexity of the project, and the size of the project.

Let’s start with Price.  Obviously, costs are significant and the first thing most people look at when doing an evaluation.  Unfortunately, many people don’t look at anything else.  Don’t fall into that trap.  If a service provider offers prices much lower than everyone else’s, that should sound some alarms.  There’s a chance the service provider doesn’t understand the task or is cutting corners somewhere.  Do a lot of digging and take a close look at the organization’s procedures and technology before selecting a service provider that is comparatively very low-priced. 

There’s another very important consideration when you are comparing service provider pricing:  not all pricing models are the same.  Make sure you understand every component of a service provider’s price, what’s included, what’s not, what exactly you are paying for, and how it affects the bottom line.  Let me give you an example.  Some service providers charge per GB for “input” gigs for electronic discovery processing, while others charge per GB for “output” gigs.  Of course, the ones that charge for “input” gigs charge a lower per gig price, but they are charging for more gigabytes. 

Understand how a service provider’s pricing is structured and what it means when you are evaluating prices.  It’s always a good idea to ask a service provider to estimate total costs for a project to verify your understanding.
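To illustrate why the per-gig price alone can be misleading, here is a minimal sketch comparing two quotes.  All of the numbers are made up for illustration, and it assumes (as described above) that the “input” gig count is larger than the “output” gig count.

```python
def total_cost(billable_gb: float, rate_per_gb: float) -> float:
    """Total processing cost for a given billable volume and per-GB rate."""
    return billable_gb * rate_per_gb


# Hypothetical project: 100 GB goes in, 40 GB comes out for review.
input_gb = 100.0
output_gb = 40.0

# Hypothetical quotes: provider A bills on input gigs at a lower rate,
# provider B bills on output gigs at a higher rate.
quote_a = total_cost(input_gb, 150.0)
quote_b = total_cost(output_gb, 300.0)

print(f"Provider A (input gigs):  ${quote_a:,.0f}")
print(f"Provider B (output gigs): ${quote_b:,.0f}")
```

With these made-up figures, the provider with half the per-gig price ends up costing more in total, which is why it helps to ask each provider for a total project estimate.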

In the next blogs in this series, we’ll look at other things you should be looking at when selecting a vendor.

What has been your experience with service provider work?  Do you have good or bad experiences you can tell us about?  Please share any comments you might have and let us know if you’d like to know more about an eDiscovery topic.

eDiscovery Trends: Craig Ball of Craig D. Ball, P.C.


This is the ninth (and final) of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 600 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, including in American Bar Association, ATLA and American Lawyer Media print and online publications.  He also writes a monthly column on computer forensics and e-discovery for Law Technology News called "Ball in your Court," the 2007 and 2008 Gold Medal honoree for “Best Regular Column” as awarded by Trade Association Business Publications International.  It’s also the 2009 Gold and 2007 Silver Medalist honoree of the American Society of Business Publication Editors as “Best Contributed Column” and their 2006 Silver Medalist honoree as “Best Feature Series” and “Best Contributed Column.”  The presentation, "PowerPersuasion: Craig Ball on PowerPoint," is consistently among the top rated continuing legal educational programs from coast-to-coast.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

Price compression is a major trend.  Consumers are very slowly waking up to the fact that they have been the “drunken sailors on leave” in terms of how they have approached eDiscovery and there have been many “vendors of the night” ready to roll them for their paychecks.  eDiscovery has been more like a third world market where vendors have said “let’s ask for some crazy number” and perhaps they’ll be foolish enough to pay it.  And, if they don’t pay that one, let’s hit them with a little lower number, mention sanctions, give them a copy of something from Judge Scheindlin or Judge Grimm and then try again.  Until finally, they are so dissolved in a pool of their own urine that they’re willing to pay an outrageous price.  Those days are coming to an end and smart vendors are going to be prepared to demonstrate the value and complexity behind their offerings.

I am seeing people recognizing that the “gravy train” is over except for the most egregiously challenging eDiscovery situations where numbers really have little meaning.  When you’re talking about tens of thousands of employees and petabytes of data, the numbers can get astronomical.  But, for the usual case, with a more manageable number of custodians and issues, people are waking up to the fact that we can’t keep reinventing this wheel of great expense, so clients are pushing for more rational approaches and a few forward thinking vendors are starting to put forward some products that will allow you to quantify what your exposure is going to be in eDiscovery.  We’re just not going to see per GB processing prices that are going to be measured in the double and triple digits – that just can’t go, at least when you’re talking about the raw data on the input side.  So, I’m seeing some behind the firewall products, even desktop products, that are going to be able to allow lawyers and people with relatively little technical expertise to handle small and medium sized cases.  Some of the hosting services are putting together pricing that, though I haven’t really tested it in real world situations, is starting to sound rational and less frightening.

I’m continuing to see more fragmentation in the market and I would like to see more integrated products, but it’s still like packaging a rather motley crew of different pieces that don’t always fit together well at all.  You’ve got relatively new review tools, some strong players like Clearwell and stronger than they used to be players like Relativity.  You’ve got people “from down under” that are really changing the game like Nuix.  And, you’ve got some upstarts – products that we’ve really not yet heard of at all.  I’m seeing at this conference that any one of them has the potential of becoming an industry standard.  I’m seeing some real innovation, some real new code bases coming out and that is impressive to me because it just hadn’t been happening before, it’s been “old wine in new bottles” for several years.

I also see some new ideas in collection.  I think people are starting to embrace what George Socha would like for me to aptly call the left side of the EDRM.  A lot of people have turned their heads away from the ugly business of selecting data to process and the collection of it and forensic and chain of custody issues and would gather it up any way they liked and process it.  But, I think there are some new and very viable ways that companies are offering for self-collection, for tracking of collection, for desk side interviews, and for generation and management of legal holds.  We’re seeing a lot of things emerging on that front.  Most of what I see in the legal hold management space is just awful.  That doesn’t mean it’s all awful, but most of it is awful.  It’s a lot of marketing speak, a lot of industry jargon, wrapped around a very uncreative, somewhat impractical, set of tools.  The question really is, are these things really much better than a well designed spreadsheet?  Certainly, they’re more scalable, but some have a “rushed to market” feel to me and I think it’s going to take them some time to mature.  Everyone is jumping on this Pension Committee bandwagon that Judge Scheindlin created for us, and not everyone has brought their Sunday best.

As for social media, it is a big deal because, if you’re paying attention to what’s happening with the generation about to explode on the scene, they simply have marginalized email.  Just as we are starting to get our arms around email, it’s starting to move off center stage.  And, I think the most important contribution to eDiscovery in 2010 has occurred silently and with little fanfare and I’d like to make sure you mention it.  In November, Facebook, the most important social networking site on the planet, very quietly provided the ability for you to package and collect, for personal storage, the entire contents of your Facebook life, including your Wall, your messaging, and your Facemail.  For all of the pieces of your Facebook existence, you can simply click and receive it back in a Zip file.  The ability to preserve and, ultimately, reopen and process that data is the most forward thinking thing that has emerged from the social networking world since there has been a social networking world.  How wonderful that Facebook had the foresight to say “you know, it would be nice if we could give people their entire Facebook stuff in a neat package in a moment in time”.

None of the others have done that yet, but I think that Facebook is so important that it’s going to make that a standard.  It’s going to need to be in Google Apps, it’s going to need to be in Gmail.  If you’re going to live your life “in the cloud”, then you’re going to have to have a way to grab your life from the cloud and move it somewhere else.  Maybe their portability was a way to head off antitrust, for all I know.  Whatever their motivation, I don’t think that most lawyers know that there is essentially this one-click preservation of Facebook.  If a vendor did it, you would hear about it in the elevators here at the show.  Facebook did it for free, and without any fanfare, and it’s an important thing for you to get out there.  The vendor that comes out with a tool that processes these packages that emerge, especially if they announce it when the Oscars come out {laugh}, is well positioned.

So, yes, social networking is important because it means that a lot of things change, forensics change.  You’re just not going to be able to do media forensics anymore on cloud content.  The cloud is going to make eDiscovery simpler, and that’s the one thing I haven’t heard anybody say, because you’ll have less you’ll need to delete and it’s much more likely to be gone – really gone – when you delete it (no forensics needed).  Collection and review can be easier.  What would you rather search, Gmail or Outlook?  Not only can Outlook emails be in several places, but the quality of a Google-based search is better, even though it’s not built for eDiscovery.  If I’m going to stand up in court and say that “I searched all these keywords and I saw all of the communications related to these keywords”, I’d rather do it with the force of Google than with the historically “snake bitten” engine for search that’s been in Outlook.  We always say in eDiscovery that you don’t use Outlook as a review and search tool because we know it isn’t good.  So, we take the container files, PSTs and OSTs and we parse them in better tools.  I think we’ll be able to do it both ways. 

I foresee a day not long off when Google will allow either the repatriation of those collections for use in more powerful tools or will allow different types of searches to be run on the Gmail collections other than just Gmail search.  You may be able to do searches and collect from your own Gmail, to place a hold on that Gmail.  Right now, you’d have to collect it, tag it, move it to a folder – you have to do some gyrations.  I think it will mature and they may open their API, so that there can be add-on tools from the lab or from elsewhere that will allow people to hook into Gmail.  To a degree, you can do that right now, by paying an upgrade fee for Postini, where they can download a PST with your Gmail content.  The problem with that is that Gmail is structured data, you really need to see the threading that Gmail provides to really appreciate the conversation that is Gmail.  Whereas, if you pull it down to PST (except in the latest version of Outlook, which I think 2010 does a pretty good job of threading), I don’t know if that is replicated in the Postini PST.  I’ll have to test that.

Office 2010 is a trend, as well.  Outlook 2010 is the first Microsoft tool that is eDiscovery friendly, by design.  I think Exchange 2010 is going to make our lives easier in eDiscovery.  We’re going to have a lot more “deleted” information hang around in the Windows 7 environment and in the Outlook 2010 and Exchange 2010 environment.  Data is not going away until you jump through some serious hoops to make it go away.

I think the iPad is also going to have quite an impact.  At first, it will be smoke and mirrors, but before 2011 bids us goodbye, I think the iPad is going to find its way into some really practical, gestural interfaces for working with data in eDiscovery.  I’ve yet to see anything but a half-assed version of an app.  Everyone rushed out and you wanted some way to interface with your product, but they didn’t build a purpose-built app for the iPad to really take advantage of its strengths, to be able to gesturally move between screens.  I foresee a day where you’ll have a ring of designations around the screen and you’ll flip a document, like a privileged document, into the appropriate designation and it will light up or something so that you know it went into the correct bin – as if you were at a desk and you were moving paper to different parts of the desk.  Sometimes, I wonder why somebody hasn’t thought of this before.  I’ve done no metrics, I’ve done no ergonomic studies to know that the paper metaphor serves the task well.  But, my gut tells me that we need to teach lawyers to walk before they can run, to help them interact with data in a metaphor that they understand in a graphical user interface.  Point and click, drag and drop, pinch and stretch, which are three dimensional concepts translated into a two dimensional interface.  The interface of the iPad is so intuitive that a three year old could figure it out.  Just like Windows Explorer impacted the design of so many applications (“it’s an Explorer-like interface”), the iPad will do the same.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the second afternoon of LTNY}  I think that the show felt well attended, upbeat, fresher than it has in two years.  I give the credit to the vendors showing up with some genuinely new products, instead of renamed, remarketed new products, although there’s still plenty of that.  There were so many announcements of new products before the show that you really wonder how new is this product?  But, there were some that really look like they were built from the ground up and that’s impressive.  There’s some money being spent on development again, and that’s positive.  The traffic was better, I’m glad we finally eliminated the loft area of the exhibit hall that would get so hot and uncomfortable.  I thought the traffic flow was very difficult in a positive way, which is to say that there were a lot of warm bodies out there, walking and talking and looking.

Henry Dicker and his team should be congratulated and I wouldn’t be surprised if they set a record over the past several years at this show.  The budgets were showing, money was freed up and that’s a positive for everyone in this industry.  Also, the quality of the questions being put forward in the educational tracks is head and shoulders better, more incisive and insightful and more advanced.  We’re starting to see the results of people working at the “201 level”, but we still don’t have enough technologists here, it’s still way too lawyer heavy.  This is the New York market, everybody is chasing after the Fortune 500, but everything has to be downward scalable too.  A good show.

What are you working on that you’d like our readers to know about?

The first week of June, I’m going to be teaching a technology for lawyers and litigation support professionals academy with an ultra all star cast of a very small, but dedicated faculty, including Michael Arkfeld, Judge Paul Grimm, Judge John Facciola, and others.  It’s called the eDiscovery Training Academy and will be held at the Georgetown Law School. It’s going to be rigorous, challenging, extremely technical and the hope is that the people emerge from that week genuinely equipped to talk the talk and walk the walk of productive 26(f) conferences and real interaction with IT personnel and records managers.  We’re going to start down at the surface of the magnetic media and we’re going to keep climbing until we can climb no further.

Thanks, Craig, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!