
eDiscovery Trends: Ralph Losey of Jackson Lewis, LLP

 

This is the sixth of the 2012 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year.

Today’s thought leader is Ralph Losey. Ralph is an attorney in private practice with the law firm of Jackson Lewis, LLP, where he is a Partner and the firm's National e-Discovery Counsel. Ralph is also an Adjunct Professor at the University of Florida College of Law, teaching eDiscovery and advanced eDiscovery. Ralph is also a prolific author of eDiscovery books and articles, the principal author and publisher of the popular e-Discovery Team® Blog and founder and owner of an intensive online training program, e-Discovery Team Training, with attorney and technical students all over the world.

Our interview with Ralph had so much good information in it, we couldn’t fit it all into a single post.  So, today is part one.  Part two will be published in the blog tomorrow!

Many people are saying that 2012 is the year of technology assisted review.  What do you think needs to happen for that to come true?

Well, many things.  First of all, we need to have better training for lawyers so that they'll know how to use the technology.  If you bring an advanced computer to anyone, they're going to need some kind of instruction on how to use it.  You have to have people trained to use the tools.  That's very important, and I spend a lot of time focusing on training in my firm and around the country with other attorneys and bar groups.  The tool alone really can't do much or help you unless you fit its use into a larger legal methodology.

In other words, just bringing in technology in itself doesn't answer any questions.  It may answer some, but it doesn't give you the answers you need in order to use it in your practice.

I'm a legal practitioner.  I've been practicing law for, I guess, about 32 years now.  So, that's how I look at technology – as tools to practice law and represent clients.  And, the truth is most people don't know how to use predictive coding yet, so we're going to have a training and learning curve, like you do with any new technology.

Vendors also need to start bringing the prices down so that it's more affordable and accessible to a large number of attorneys, rather than just the few attorneys that can afford it in large cases.  I've been complaining about this to vendors for a while now.  The good news is I think that they're listening.  I'm beginning to see prices come down and I think this trend will continue.  It's in their own best interest to do that because, in the long run, they are going to be more successful in bringing this technology to attorneys and making money for their companies if they pursue a larger-scale, higher-volume, lower-margin model as opposed to making larger profits on fewer projects.

I think most of the vendors are receptive to that.  The reason they probably don't jump on it right away is that the demand isn’t there yet.  Build it and they will come.  But, they're only coming in small numbers.  And when customers only come in small numbers, vendors have to charge a lot to cover their costs.

So, it's a circle.  It comes back again to training.  An educated consumer will want this.  I want this.  I like it, and I want it affordable.

Do you think that it's just merely a matter of bringing prices down?  Or is it being creative in how you price differently?

Well, it's both.  The bottom line is always the bottom line, but it’s important to get there in a way that's win-win for both the consumer (law firms and corporate law departments) and for the provider.  So, there needs to be creative solutions.  As a result, I think people are now “putting on their thinking caps” and coming up with new ways to price solutions because there are different needs.  I have my own ideas on how I want to use it, and so I want people to price accordingly.  I don't want there to be a “one-size-fits-all” type of solution.  I think the vendors are hearing that, too.

You had a recent blog post about bottom line proportional review and you noted that the larger cases have a lot at stake, so the budget is much higher.  How does it work for smaller cases?

It's going to take a legal method, and I think that the method I described (bottom line proportional review) is the way to make it happen.  In order to make bottom line driven review (where you're basically setting a budget up front) acceptable to the requesting party, they're going to want to make sure that this isn't just another way to “hide the ball”.  They're going to want to make sure that they can find the relevant evidence they need to evaluate their case: to either see that they've got a winning case (so they can move for summary judgment, establish a strong settlement position, or go to trial) or see that they have a weak case and value it accordingly.

We all want to find out as quickly as possible how good a case it is.  We really don't want to spend all of our time and money just doing discovery.  The whole point of discovery is to discover how good your case is and then resolve it.

I'm very oriented to resolving cases.  That's really most of my life.  I wasn't an eDiscovery lawyer most of my career.  I was a trial lawyer, and I think that perspective is lacking from some of the vendors and some of the analysts and some of the other people in eDiscovery.  People seem to think discovery is an end in itself.  It's not.  It's just a way to prepare for trial.

So, there is no reason to get all of the relevant evidence.  That's an archaic notion of the past.  There's too much relevant evidence.  All that counts is the important relevant evidence.  The smoking guns are what counts.  The highly relevant or hot documents are what counts.

You do have to wade through some relevant documents to get there, but the point is to get there.  It gets back to my “seven plus or minus two” rule.  It's not my rule.  It's an old rule of persuasion.  That's never going to change.  People are never going to remember more than seven documents at a trial.  They just can't.  The juror's mind is not capable of it.

Lawyers can handle probably several hundred exhibits, and they can keep it in their head.  But, they don't make the decisions.  And, the several hundred exhibits are merely predicates or evidentiary foundations in order to get the key exhibits out there that you then use in your closing argument.

The point of discovery and litigation is to identify and locate these key documents.  When you understand that, then you'll accept and understand the fact that you don't need all relevant information, all relevant documents.  You just need the most highly relevant documents so that you can feel pretty confident you've got the handful of documents you need to try the case.

The thing that’s exciting about predictive coding is its ranking abilities.  You don't have to look at the junk that's not really that relevant.  You only look at the most relevant documents, whether it’s the most relevant 5,000, 50,000 or 100,000 – whatever is appropriate to the size of your case.  You're not going to look at 100,000 documents in a $250,000 discrimination case.  It makes no sense.

That's where you get back to proportionality.  It's a somewhat long answer to your question, but people need to understand that this isn't a way to hide the truth.  It's really a way to get the truth out there in an efficient, economic manner.

So, based on the five dollar per document review cost example in your post, if you have $25,000 to spend, you can review the top 5,000 documents, right?

That's right.  And the five dollars is just a working number.  Some document collections can be even more expensive and difficult.  For example, a collection with a lot of 20-page spreadsheets (where you actually have to determine what's confidential and what's not in each sheet) can drive that number up.  Banking cases are a nightmare.  You've got all this financial information, where some of it's relevant and some of it's not.  For other cases, it can be a lot cheaper.  But, you also have to take some vendor claims with a big grain of salt.  “Oh, I'll do your whole thing for you for a buck a document.”  Will you?  Really?  What does that include?
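
To put rough numbers on that, here's a minimal sketch of the budget arithmetic behind bottom line driven review, using the figures from the question above (the function is ours, purely for illustration):

    # Python sketch: how many top-ranked documents a review budget covers
    def docs_reviewable(budget_dollars: float, cost_per_doc: float) -> int:
        return int(budget_dollars // cost_per_doc)

    # The example from the question: $25,000 at $5 per document
    print(docs_reviewable(25_000, 5.00))   # 5000

With predictive coding's ranking, those 5,000 would be the 5,000 documents the engine scores as most likely relevant, which is what makes the budget cap proportional rather than arbitrary.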

Thanks, Ralph, for participating in the interview!

And to the readers, just a reminder that part two of our interview with Ralph Losey will be published tomorrow.  Don't miss it!  And, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: Brian Schrader of Business Intelligence Associates (BIA)

 

This is the fifth of the 2012 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?
  2. Which trend(s), if any, haven’t emerged to this point like you thought they would?
  3. What are your general observations about LTNY this year and how it fits into emerging trends?
  4. What are you working on that you’d like our readers to know about?

Today’s thought leader is Brian Schrader. Brian is Co-Founder and President of Business Intelligence Associates, Inc. (BIA).  Brian is an expert and frequent writer and speaker on eDiscovery and computer forensics topics, particularly those addressing the collection, preservation and processing functions of the eDiscovery process.

What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?

Well, I think you don't have to walk around the floor very much to see that this year everybody is talking about predictive coding.  I think you're going to see that shake out a lot over the next year.  We've been doing predictive coding for about a year and a half now, and we have our own algorithms for that.  We have our review teams, and they've been using our algorithms to do predictive coding.  We like to call it “suggestive coding”.

What I expect you’ll find this year is a shakeout among providers, because everybody talks about predictive coding.  The question is: how does everybody approach it?  It's very much a black-box solution.  Most people don't know what goes on inside that process and how the process works.  So, I think that's going to be a hot topic for a while.  We're doing a lot of predictive coding, and BIA is going to be announcing some cool things later this year on our predictive coding offerings.

Every provider that you talk to seems to have a predictive coding solution.  I'm really looking forward to seeing how things develop, because we have a lot of input on it and a lot of experience.  We have a review team that is reviewing millions and millions of documents per year, so we can compare various predictive coding engines against real results.  That gives us the ability to evaluate the technology.  We look forward to being part of that conversation, and I hope to see a little bit more clarity from the players and some real standards set around that process.

The courts have now also started to look at these algorithmic methods, Judge Peck in particular.  Everybody agrees that keyword searching is inadequate.  But, people are still tentative about it – they say, “it sounds good, but how does it work?  How are we going to approach it?”

Which trend(s), if any, haven’t emerged to this point like you thought they would?

Frankly, I thought we'd see a lot more competition for us in data collection.  A huge pain point for companies is how to gather all their data from all over the world.  It's something we've always focused on.  I started to see some providers focus on that, but now it looks like everybody, even some of the classic data collection providers, are focusing more on review tools.  That surprises me a bit, though I'm happy to be left with a wide-open field to have more exposure there.

When we first came out with TotalDiscovery.com last year, we thought we'd see all sorts of similar solutions pop up out there, but we just haven't.  Even the traditional collection companies haven't really offered a similar solution.  Perhaps it’s because everybody has a “laser focus” on predictive coding, since document review is so much more expensive.  I think that has really overpowered the focus of a lot of providers as they've focused only on that.  We have tried to focus on both collection and review.

I think data processing has become a commodity.  In talking to customers, they don't really ask about it anymore.  They all expect that everybody has the same base-level capabilities.  Everybody knows that McDonald's secret sauce is basically Thousand Island dressing, so it’s no longer unique – the “jig is up”.  So, it's all about the ends: the collection and the review.

What are your general observations about LTNY this year and how it fits into emerging trends?

Well, predictive coding again.  I think there's an awful lot of talk but not enough detail.  What you're seeing is a lot of providers who are saying “we’ll have predictive coding in six months”.  You're going to see a huge number of players in that field this year.  Everybody's going to throw a hat in the ring, and it's going to be interesting to see how that all works out.  Because how do you set the standards?  Who gets up there and really cooperates? 

I think it's really up to the individual companies to get together and cooperate on this. This particular field is so critical to the legal process that I don't think you can have everybody having individual standards and processes.  The most successful companies are going to be the ones that step up and work together to set those standards.  And, I don't know for sure, but I wouldn't be surprised if The Sedona Conference already has a subcommittee on this topic.

What are you working on that you’d like our readers to know about?

Our biggest announcement is around data collection – we've vastly expanded it.  Our motto is to collect “any data, anytime, anywhere”.  We've been providing data collection services for over a decade, and our collection guys like to say they've never met a piece of data they didn't like.

Now, we've brought that data collection capability directly to TotalDiscovery.com.  The latest upgrade, which we’re previewing at the show to be released in March, will offer the ability to collect data from social media sites like Facebook and Twitter, as well as collections from Webmail and Apple systems.  So, you can collect pretty much anything through TotalDiscovery.com that we have historically offered in our services division.  It gives you a single place to manage data collection and bring it all together, and then deliver it out to the review platform you want.

We’re on a three-week development cycle, which doesn’t always mean new features every three weeks, but it does mean we’re regularly adding new features.  Mid-year in 2011, we added legal hold capabilities and we’ve also recently added other components to simplify search and data delivery.  Now, we’ve added expanded collection for social media sites, Webmail and Apple.  Later this year, we expect to release our predictive coding capabilities to enable clients to perform predictive coding right after collection instead of waiting until the data is in the review tool.

Thanks, Brian, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: Tom Gelbmann of Gelbmann & Associates, LLC

 

This is the fourth of the 2012 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?
  2. Which trend(s), if any, haven’t emerged to this point like you thought they would?
  3. What are your general observations about LTNY this year and how it fits into emerging trends?
  4. What are you working on that you’d like our readers to know about?

Today’s thought leader is Tom Gelbmann. Tom is Principal of Gelbmann & Associates, LLC.  Since 1993, Gelbmann & Associates, LLC has advised law firms and corporate law departments on realizing the full benefit of their investments in Information Technology.  Tom has also been co-author of the leading survey on the electronic discovery market, The Socha-Gelbmann Electronic Discovery Survey; last year he and George Socha converted the Survey into Apersee, an online system for selecting eDiscovery providers and their offerings.  In 2005, he and George Socha launched the Electronic Discovery Reference Model project to establish standards within the eDiscovery industry – today, the EDRM model has become a standard in the industry for the eDiscovery life cycle and there are nine active projects with over 300 members from 81 participating organizations.

What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?  And which trend(s), if any, haven’t emerged to this point like you thought they would?

I’m seeing an interesting trend regarding offerings from traditional top-tier eDiscovery providers. Organizations that have invested in eDiscovery-related technologies are beginning to realize these same technologies can be applied to information governance and compliance, enabling an organization to get a much greater grasp on its total content.  Greater understanding of the location and profile of content not only helps with eDiscovery and compliance, but also with business intelligence and, finally, destruction – something few organizations are willing to address.

We have often heard that storage is cheap. The full sentence should be: storage is cheap, but management is expensive.  I think that a lot of the tools that have been applied to collection, culling, search and analysis enable organizations to look at large quantities of information that is needlessly retained. They also allow organizations to take a look at that information and get some insights on their processes – how that information is either helping or, more importantly, hindering those processes – and I think you're going to see that help sell these tools upstream rather than downstream.

As far as items that haven't quite taken off, I think that technology assisted coding – I prefer that term over “predictive coding” – is coming, but it's not there yet.  It’s going to take a little bit more, not necessarily waiting for the judiciary to help, but just for organizations to have good experiences that they could talk about that demonstrate the value.  You're not going to remove the human from the process.  But, it's giving the human a better tool.  It’s like John Henry, with the ax versus the steam engine.  You can cut a lot more wood with the steam engine, but you still need the human.

What are your general observations about LTNY this year and how it fits into emerging trends?

Based on the sessions that I've attended, I think there's much more education.  There's just really more practical information for people to take away on how to manage eDiscovery and deal with eDiscovery related products or problems, whether it's cross-border issues, how to deal with the volumes, how to bring processes in house or work effectively with vendors.  There's a lot more practical “how-tos” than I've seen in the past.

What are you working on that you’d like our readers to know about?

Well, I think one of the things I'm very proud of with EDRM is that just before LegalTech, we put out a press release of what's happening with the projects, and I'm very pleased that five of the nine EDRM projects had significant announcements.  You can go to EDRM.net for that press release that details those accomplishments, but it shows that EDRM is very vibrant, and the teams are actually making good progress. 

Secondly, George Socha and I are very proud of the progress of Apersee, which was announced last year at LegalTech.  We've learned a lot, and we've listened to our clientele in the market – consumers and providers.  We listened, and then our customers changed their minds.  But, as a result, it's on a stronger track and we're very proud to announce that we have two gold sponsors, AccessData and Nuix.  We’re also talking to additional potential sponsors, and I think we'll have those announcements very shortly.

Thanks, Tom, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Case Law: Predictive Coding Considered by Judge in New York Case

In Da Silva Moore v. Publicis Groupe, No. 11 Civ. 1279 (ALC) (S.D.N.Y. Feb. 8, 2012), Magistrate Judge Andrew J. Peck of the U.S. District Court for the Southern District of New York instructed the parties to submit proposals to adopt a protocol for e-discovery that includes the use of predictive coding, perhaps the first known case where a technology assisted review approach was considered by the court.

In this case, the plaintiff, Monique Da Silva Moore, filed a Title VII gender discrimination action against advertising conglomerate Publicis Groupe, on her own behalf and on behalf of other women alleged to have suffered discriminatory job reassignments, demotions and terminations.  Discovery proceeded to address whether Publicis Groupe:

  • Compensated female employees less than comparably situated males through salary, bonuses, or perks;
  • Precluded or delayed selection and promotion of females into higher level jobs held by male employees; and
  • Disproportionately terminated or reassigned female employees when the company was reorganized in 2008.

Consultants provided guidance to the plaintiffs and the court to develop a protocol to use iterative sample sets of 2,399 documents from a collection of 3 million documents to yield a 95 percent confidence level and a 2 percent margin of error (see our previous posts here, here and here on how to determine an appropriate sample size, randomly select files and conduct an iterative approach). In all, the parties expect to review between 15,000 to 20,000 files to create the “seed set” to be used to predictively code the remainder of the collection.
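
For readers curious where the 2,399 figure comes from: it falls out of the standard normal-approximation formula for sample size with a finite population correction, the same math used by the free calculators referenced in those earlier posts.  Here's a quick sketch (calculators can differ by a document or two depending on rounding conventions):

    # Python sketch of the standard sample size calculation
    def sample_size(population: int, z: float = 1.96, margin: float = 0.02,
                    p: float = 0.5) -> int:
        n0 = (z ** 2) * p * (1 - p) / margin ** 2    # infinite-population size (~2401)
        n = n0 / (1 + (n0 - 1) / population)         # finite population correction
        return round(n)

    # 95% confidence (z = 1.96), 2% margin of error, 3 million documents
    print(sample_size(3_000_000))   # 2399

The p = 0.5 assumption is the most conservative choice: when the prevalence of responsive documents is unknown, it maximizes the required sample size.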

The parties were instructed to submit their draft protocols by February 16th, which is today(!).  The February 8th hearing was attended by counsel and their respective ESI experts.  It will be interesting to see the draft protocols submitted and the opinion from Judge Peck that results.

So, what do you think?  Should courts order the use of technology such as predictive coding in litigation?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: George Socha of Socha Consulting

 

This is the first of the 2012 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?
  2. Which trend(s), if any, haven’t emerged to this point like you thought they would?
  3. What are your general observations about LTNY this year and how it fits into emerging trends?
  4. What are you working on that you’d like our readers to know about?

Today’s thought leader is George Socha.  A litigator for 16 years, George is President of Socha Consulting LLC, offering services as an electronic discovery expert witness, special master and advisor to corporations, law firms and their clients, and legal vertical market software and service providers in the areas of electronic discovery and automated litigation support. George has also been co-author of the leading survey on the electronic discovery market, The Socha-Gelbmann Electronic Discovery Survey; last year he and Tom Gelbmann converted the Survey into Apersee, an online system for selecting eDiscovery providers and their offerings.  In 2005, he and Tom Gelbmann launched the Electronic Discovery Reference Model project to establish standards within the eDiscovery industry – today, the EDRM model has become a standard in the industry for the eDiscovery life cycle and there are nine active projects with over 300 members from 81 participating organizations.  George has a J.D. from Cornell Law School and a B.A. from the University of Wisconsin – Madison.

What do you consider to be the emerging trends in eDiscovery that will have the greatest impact in 2012?

I may have said this last year too, but it holds true even more this year – if there's an emerging trend, it's the trend of people talking about emerging trends.  It started last year, and this year every person in the industry seems to be delivering an “emerging trends” message.  Not to be too crass about it, but often the message is, "Buy our stuff", a message that is not especially helpful.

Regarding actual emerging trends, each year we all try to sum up LegalTech in two or three words.  The two words for this year would be “predictive coding.”  Use whatever name you want, but that's what everyone seems to be hawking and talking about at LegalTech this year.  This does not necessarily mean they really can deliver.  It doesn't mean they know what “predictive coding” is.  And it doesn't mean they've figured out what to do with “predictive coding.”  Having said that, expanding the use of machine assisted review capabilities as part of the e-discovery process is an important step forward.  It also has been a long time coming.  The earliest I can remember working with a client doing what's now being called predictive coding was in 2003.  A key difference is that at that time they had to create their own tools.  There wasn't really anything they could buy to help them with the process.

Which trend(s), if any, haven’t emerged to this point like you thought they would?

One thing I don't yet hear is discussion about using predictive coding capabilities as a tool to assist with determining what data to preserve in the first place.  Right now the focus is almost exclusively on what do you do once you’ve “teed up” data for review, and then how to use predictive coding to try to help with the review process.

Think about taking the predictive coding capabilities and using them early on to make defensible decisions about what to and what not to preserve and collect.  Then consider continuing to use those capabilities throughout the e-discovery process.  Finally, look into using those capabilities to more effectively analyze the data you're seeing, not just to determine relevance or privilege, but also to help you figure out how to handle the matter and what to do on a substantive level.

What are your general observations about LTNY this year and how it fits into emerging trends?

Well, LegalTech continues to be dominated by electronic discovery.  As a result, we tend to overlook whole worlds of technologies that can be used to support and enhance the practice of law.  It is unfortunate that in our hyper-focus on e-discovery, we risk losing track of those other capabilities.

What are you working on that you’d like our readers to know about?

With regard to EDRM, we recently announced that we have hit key milestones in five projects.  Our EDRM Enron Email Data Set has now officially become an Amazon public dataset, which I think will mean wider use of the materials.

We announced the publication of our Model Code of Conduct, which was five years in the making.  We have four signatories so far, and are looking forward to seeing more organizations sign on.

We announced the publication of version 2.0 of our EDRM XML schema.  It's a tightened-up schema, reorganized so that it should be a bit easier to use and more efficient in operation.

With the Metrics project, we are beginning to add information to a database that we've developed to gather metrics, the objective being to make available metrics with an empirical basis, rather than the types of numbers bandied about today, where no one seems to know how they were arrived at. Also, last year the Uniform Task-Based Management System (UTBMS) code set for litigation was updated.  The codes used for tracking e-discovery activities were expanded from a single code (which covered not just e-discovery but other activities as well) to a number of codes based on the EDRM Metrics code set.

On the Information Governance Reference Model (IGRM) side, we recently published a joint white paper with ARMA.  The paper cross-maps EDRM's Information Governance Reference Model (IGRM) with ARMA's Generally Accepted Recordkeeping Principles (GARP).  We look forward to more collaborative materials coming out of the two organizations.

As for Apersee, we continue to allow consumers to search the data on the site for free, but we are no longer charging providers a fee for their information to be available.  Instead, we now have two sponsors and some advertising on the site.  This means that any provider can put information in, and everyone can search that information.  The more data that goes in, the more useful the searching process becomes.  All this fits our goal of creating a better way to match consumers with the providers who have the services, software, skills and expertise that the consumers actually need.

And on the consulting and testifying side, I continue to work with a broad array of law firms; corporate and governmental consumers of e-discovery services and software; and providers offering those capabilities.

Thanks, George, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: “Assisted” is the Key Word for Technology Assisted Review

 

As noted in our blog post entitled 2012 Predictions – By The Numbers, almost all of the sets of eDiscovery predictions we reviewed (9 out of 10) predicted a greater emphasis on Technology Assisted Review (TAR) in the coming year.  It was one of our predictions, as well.  And, during all three days at LegalTech New York (LTNY) a couple of weeks ago, sessions were conducted that addressed technology assisted review concepts and best practices.

While some equate technology assisted review with predictive coding, other technology approaches, such as conceptual clustering, are also increasing in popularity and qualify as TAR approaches as well.  However, for purposes of this blog post, we will focus on predictive coding.

Over a year ago, I attended a Virtual LegalTech session entitled Frontiers of E-Discovery: What Lawyers Need to Know About “Predictive Coding” and wrote a blog post from that entitled What the Heck is “Predictive Coding”?  The speakers for the session were Jason R. Baron, Maura Grossman and Bennett Borden (Jason and Bennett are previous thought leader interviewees on this blog).  The panel gave the best descriptive definition that I’ve seen yet for predictive coding, as follows:

“The use of machine learning technologies to categorize an entire collection of documents as responsive or non-responsive, based on human review of only a subset of the document collection. These technologies typically rank the documents from most to least likely to be responsive to a specific information request. This ranking can then be used to “cut” or partition the documents into one or more categories, such as potentially responsive or not, in need of further review or not, etc.”

It’s very cool technology and capable of efficient and accurate review of the document collection, saving costs without sacrificing quality of review (in some cases, it yields even better results than traditional manual review).  However, there is one key phrase in the definition above that can make or break the success of the predictive coding process: “based on human review of only a subset of the document collection”. 

Key to the success of any review effort, whether linear or technology assisted, is knowledge of the subject matter.  For linear review, knowledge of the subject matter usually results in preparation of high quality review instructions that (assuming the reviewers competently follow those instructions) result in a high quality review.  In the case of predictive coding, use of subject matter experts (SMEs) to review a core subset of documents (typically known as a “seed set”) and make determinations regarding that subset is what enables the technology in predictive coding to “predict” the responsiveness and importance of the remaining documents in the collection.  The more knowledgeable the SMEs are in creating the “seed set”, the more accurate the “predictions” will be.
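
To make the seed set idea concrete, here's a minimal sketch of the general technique using off-the-shelf tools (scikit-learn here; this illustrates the concept only and is not how any particular predictive coding engine is implemented – the document text is invented for the example):

    # Python sketch: SMEs label a seed set; a model ranks the rest
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    seed_docs = ["re: 2008 reorganization criteria ...",   # hypothetical text
                 "cafeteria menu for the week ..."]
    seed_labels = [1, 0]    # SME determinations: 1 = responsive, 0 = not

    vectorizer = TfidfVectorizer()
    model = LogisticRegression()
    model.fit(vectorizer.fit_transform(seed_docs), seed_labels)

    # Score the unreviewed documents and rank from most to least likely responsive
    rest_docs = ["fwd: salary adjustments by region ...", "parking notice ..."]
    scores = model.predict_proba(vectorizer.transform(rest_docs))[:, 1]
    for score, doc in sorted(zip(scores, rest_docs), reverse=True):
        print(f"{score:.2f}  {doc[:45]}")

In a real matter the seed set would run to thousands of documents, and the ranking would be re-run iteratively as the SMEs correct the model's predictions.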

And, as is the case with other processes such as document searching, sampling the results (by determining the appropriate sample size of responsive and non-responsive items, randomly selecting those samples and reviewing both groups – responsive and non-responsive – to test the results) will enable you to determine how effective the process was in predictively coding the document set.  If sampling shows that the process yielded inadequate results, take what you’ve learned from the sample set review and apply it to create a more accurate “seed set” for re-categorizing the document collection.  Sampling will enable you to defend the accuracy of the predictive coding process, while saving considerable review costs.
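
As a rough sketch of that testing step (the counts and the review outcome below are hypothetical), here's one way to draw a repeatable random sample from the documents the process coded non-responsive and estimate what was missed:

    import random

    def draw_sample(doc_ids, n, seed=42):
        # A fixed seed makes the draw repeatable, and therefore easier to defend
        return random.Random(seed).sample(doc_ids, n)

    non_responsive_ids = list(range(100_000))   # machine-coded non-responsive
    to_review = draw_sample(non_responsive_ids, 2_399)

    # After humans review the sample, the miss rate estimates how many
    # responsive documents the process left behind
    misses_found = 12    # hypothetical result of the human review
    print(f"estimated miss rate: {misses_found / len(to_review):.2%}")

The same draw would be repeated on the machine-coded responsive side, per the recommendation above to sample both groups.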

So, what do you think?  Have you utilized predictive coding in any of your reviews?  How did it work for you?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Needing “Technology Assisted Review” to Write a Blog Post

 

Late on a Thursday night, with a variety of tasks and projects on my plate at the moment, it seems more difficult than usual to find a unique and suitable topic for today’s blog post.

One thing I often do when looking for ideas is to hit the web and turn to the many resources that I read regularly to stay abreast of developments in the industry.  Usually when I do that, I find one article or blog post that “speaks to me” as a topic to talk about on this blog.  However, when doing so last night, I found several topics worth discussing and had difficulty selecting just one.  So, here are some of the notable articles and posts that I’ve been reviewing:

There are plenty more articles out there.  I’ve barely scratched the surface.  When we launched eDiscovery Daily about 16 months ago, some wondered whether there would be enough eDiscovery news and information to talk about on a daily basis.  The problem we have found instead is that there is SO much to talk about, it’s difficult to choose.  Today, I was unable to choose just one topic, so, as the picture notes, “I have nothing to say”.  Therefore, I’ve had to use “technology assisted review” to provide a post to you, thanks to the many excellent articles and blogs out there.  Enjoy!

So, what do you think?  Are there any specific topics that you find are being discussed a lot on the web?  Are there any topics that you’d like to see discussed more?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Sampling within eDiscovery Software

Those of you who have been following this blog since early last year may remember that we published a three-part series regarding testing your eDiscovery searches using sampling (as part of the “STARR” approach discussed on this blog about a year ago).  We discussed how to determine the appropriate sample size to test your search, using a sample size calculator (freely available on the web).  We also discussed how to make sure the sample set is randomly selected (again referencing a site freely available on the web for generating the random set).  We even walked through an example of how you can test and refine a search using sampling, saving tens of thousands of dollars in review costs with defensible results.

Instead of having to go to all of these external sites to manually size and generate your random sample set, it’s even better when the eDiscovery ECA or review software you’re using handles that process for you.  The latest version of FirstPass®, powered by Venio FPR™, does exactly that.  Version 3.5.1.2 of FirstPass has introduced a sampling module that provides a wizard that walks you through the process of creating a sample set to review to test your searches.  What could be easier?

The wizard begins by providing a dialog to enable the user to select the sampling population.  You can choose from tagged documents from one or more tags, documents in saved search results, documents from one or more selected custodians or all documents in the database.  When choosing tags, you can choose ANY of the selected tags, ALL of the selected tags, or even choose documents NOT in the selected tags (for example, enabling you to test the documents not tagged as responsive to confirm that responsive documents weren’t missed in your search).

You can then specify your confidence level (e.g., 95% confidence level) and confidence interval (a.k.a., margin of error – e.g., 4%) using slider bars.  As you slide the bars to the desired level, the application shows you how that will affect the size of the sample to be retrieved.  You can then name the sample and describe its purpose, then identify whether you want to view the sample set immediately, tag it or place it into a folder.  Once you’ve identified the preferred option for handling your sample set, the wizard gives you a summary form for displaying your choices.  Once you click the Finish button, it creates the sample and gives you a form to show you what it did.  Then, if you chose to view the sample set immediately, it will display the sample set (if not, you can then retrieve the tag or folder containing your sample set).
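
For a sense of what a wizard like this does under the hood, here's a small sketch of the population selection (the ANY / ALL / NOT tag logic) and the random draw.  The document schema and tag names are ours, purely for illustration – they are not FirstPass internals:

    import random

    def select_population(docs, tags, mode="ANY"):
        tags = set(tags)
        if mode == "ANY":    # documents in any of the selected tags
            return [d for d in docs if tags & d["tags"]]
        if mode == "ALL":    # documents in all of the selected tags
            return [d for d in docs if tags <= d["tags"]]
        if mode == "NOT":    # documents in none of the selected tags
            return [d for d in docs if not tags & d["tags"]]
        raise ValueError(f"unknown mode: {mode}")

    docs = [{"id": 1, "tags": {"responsive"}},
            {"id": 2, "tags": {"responsive", "privileged"}},
            {"id": 3, "tags": set()}]

    # e.g., test the documents NOT tagged responsive, then draw the sample
    population = select_population(docs, {"responsive"}, mode="NOT")
    sample = random.sample(population, 1)
    print([d["id"] for d in sample])   # [3]

In the real product, the sample size would come from the confidence level and confidence interval sliders rather than a hard-coded count.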

By managing this process within the software, you save the considerable time otherwise spent outside the application identifying the sample size and creating a randomly selected set of IDs, then going back into the application to retrieve and tag those items as belonging to the sample set (which is how I used to do it).  The end result is simplified and streamlined.

So, what do you think?  Is sample set generation within the ECA or review tool a useful feature?  Please share any comments you might have or if you’d like to know more about a particular topic.

Full disclosure: I work for CloudNine Discovery, which provides SaaS-based eDiscovery review applications FirstPass® (for first pass review) and OnDemand® (for linear review and production).

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Our 2012 Predictions

 

Yesterday, we evaluated what others are saying and noted popular eDiscovery prediction trends for the coming year.  It’s interesting to identify common trends among the prognosticators and also the unique predictions as well.

But we promised our own predictions for today, so here they are.  One of the nice things about writing and editing a daily eDiscovery blog is that it forces you to stay abreast of what’s going on in the industry.  Based on the numerous stories we’ve read (many of which we’ve also written about), and in David Letterman “Top 10” fashion, here are our eDiscovery predictions for 2012:

  • Still More ESI in the Cloud: Frankly, this is like predicting “the Sun will be hot in 2012”.  Given the predictions in cloud growth by Forrester and Gartner, it seems inevitable that organizations will continue to migrate more data and applications to “the cloud”.  Even if some organizations continue to resist the cloud movement, those organizations still have to address the continued growth in usage of social media sites in business (which, last I checked, are based in the cloud).  It’s inevitable.
  • More eDiscovery Technology in the Cloud As Well: We will continue to see more cloud offerings for eDiscovery technology, ranging from information governance to preservation and collection to review and production.  With the need for corporations to share potentially responsive ESI with one or more outside counsel firms, experts and even opposing counsel, cloud based Software-as-a-Service (SaaS) applications are a logical choice for sharing that information effortlessly without having to buy software, hardware and provide infrastructure to do so.  Every year at LegalTech, there seems to be a few more eDiscovery cloud providers and this year should be no different.
  • Self-Service in the Cloud: So, organizations are seeing the benefits of the cloud not only for storing ESI, but also for managing it during discovery.  It’s the cost-effective alternative.  But, organizations are demanding the control of a desktop application within their eDiscovery applications.  The ability to load your own data, add your own users and maintain their rights, and create your own data fields are just a few of the capabilities that organizations expect to handle themselves.  And, more providers are responding to those needs.  That trend will continue this year.
  • Technology Assisted Review: This was the most popular prediction among the pundits we reviewed.  The amount of data in the world continues to explode: there were 988 exabytes in the whole world as of 2010, and Cisco predicts that IP traffic over data networks will reach 4.8 zettabytes (each zettabyte is 1,000 exabytes) by 2015.  More than five times the data in five years.  Even in smaller cases, there’s simply too much data not to use technology to get through it all.  Whether it’s predictive coding, conceptual clustering or some other technology, it’s required to enable attorneys to manage the review more effectively and efficiently.
  • Greater Adoption of eDiscovery Technology for Smaller Cases: As each gigabyte of data contains between 50,000 and 100,000 pages, a “small” case of 4 GB (or two max-size PST files in Outlook® 2003) can still be 300,000 pages or more.  As “small” cases are no longer that small, attorneys are forced to embrace eDiscovery technology for smaller cases as well.  And, eDiscovery providers are taking note.
  • Continued Focus on International eDiscovery:  So, cases are larger and there’s more data in the cloud, which leads to more cases where Discovery of ESI internationally becomes an issue.  The Sedona Conference® just issued in December the Public Comment Version of The Sedona Conference® International Principles on Discovery, Disclosure & Data Protection: Best Practices, Recommendations & Principles for Addressing the Preservation & Discovery of Protected Data in U.S. Litigation, illustrating how important an issue this is becoming for eDiscovery.
  • Prevailing Parties Awarded eDiscovery Costs: Shifting to the courtroom, we have started to see more cases where the prevailing party is awarded their eDiscovery costs as part of their award.  As organizations have pushed for more proportionality in the Discovery process, courts have taken it upon themselves to impose that proportionality through taxing the “losers” for reimbursement of costs, causing prevailing defendants to say: “Sue me and lose?  Pay my costs!”.
  • Continued Efforts and Progress on Rules Changes: Speaking of proportionality, there will be continued efforts and progress on changes to the Federal Rules of Civil Procedure as organizations push for clarity on preservation and other obligations to attempt to bring spiraling eDiscovery costs under control.  It will take time, but progress will be made toward that goal this year.
  • Greater Price/Cost Control Pressure on eDiscovery Services: In the meantime, while waiting for legislative relief, organizations will expect the cost for eDiscovery services to be more affordable and predictable.  In order to accommodate larger amounts of data, eDiscovery providers will need to offer simplified and attractive pricing alternatives.
  • Big Player Consolidation Continues, But Plenty of Smaller Players Available: In 2011, we saw HP acquire Autonomy and Symantec acquire Clearwell, continuing a trend of acquisitions of the “big players” in the industry.  This trend will continue, but there is still plenty of room for the “little guy” as smaller providers have been pooling resources to compete, creating an interesting dichotomy in the industry of few big and many small providers in eDiscovery.

So, what do you think?  Care to offer your own predictions?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: 2012 Predictions – By The Numbers

With a nod to Nick Bakay, “It’s all so simple when you break things down scientifically.”

The late December/early January time frame is always when various people in eDiscovery make their annual predictions as to what trends to expect in the coming year.  I know what you’re thinking – “oh no, not another set of eDiscovery predictions!”  However, at eDiscovery Daily, we do things a little bit differently.  We like to take a look at other predictions and see if we can spot some common trends among them before offering some of our own (consider it the ultimate “cheat sheet”).  So, as I did last year, I went “googling” for 2012 eDiscovery predictions, and organized the predictions into common themes.  I found eDiscovery predictions here, here, here, here, here, here and at Applied Discovery.  Oh, and also here, here and here.  Ten sets of predictions in all!  Whew!

A couple of quick comments: 1) Not all of these are from the original sources, but the links above attribute the original sources when they are re-prints.  If I have failed to accurately attribute the original source for a set of predictions, please feel free to comment.  2) This is probably not an exhaustive list of predictions (I have other duties in my “day job”, so I couldn’t search forever), so I apologize if I’ve left anybody’s published predictions out.  Again, feel free to comment if you’re aware of other predictions.

Here are some of the common themes:

  • Technology Assisted Review: Nine out of ten “prognosticators” (up from 2 out of 7 last year) predicted greater emphasis on, and adoption of, technological approaches.  While some equate technology assisted review with predictive coding, other technology approaches such as conceptual clustering are also increasing in popularity.  Clearly, as the amount of data associated with the typical litigation rises dramatically, technology is playing a greater role in enabling attorneys to manage the review more effectively and efficiently.
  • eDiscovery Best Practices Combining People and Technology: Seven out of ten “augurs” also had predictions related to various themes associated with eDiscovery best practices, especially processes that combine people and technology.  Some have categorized it as a “maturation” of the eDiscovery process, with corporations becoming smarter about eDiscovery and integrating it into core business practices.  We’ve had numerous posts regarding eDiscovery best practices in the past year; click here for a selection of them.
  • Social Media Discovery: Six “pundits” forecasted a continued growth in sources and issues related to social media discovery.  Bet you didn’t see that one coming!  For a look back at cases from 2011 dealing with social media issues, click here.
  • Information Governance: Five “soothsayers” presaged various themes related to the promotion of information governance practices and programs, ranging from a simple “no more data hoarding” to an “emergence of Information Management platforms”.  For our posts related to Information Governance and management issues, click here.
  • Cloud Computing: Five “mediums” (but are they happy mediums?) predict that ESI and eDiscovery will continue to move to the cloud.  Frankly, given the predictions in cloud growth by Forrester and Gartner, I’m surprised that there were only five predictions.  Perhaps predicting growth of the cloud has become “old hat”.
  • Focus on eDiscovery Rules / Court Guidance: Four “prophets” (yes, I still have my thesaurus!) expect courts to provide greater guidance on eDiscovery best practices in the coming year via a combination of case law and pilot programs/model orders to establish expectations up front.
  • Complex Data Collection: Four “psychics” also predicted that data collection will continue to become more complex as data sources abound, the custodian-based collection model comes under stress and self-collection gives way to more automated techniques.

The “others receiving votes” category (three predicting each of these) included cost shifting and increased awards of eDiscovery costs to the prevailing party in litigation, flexible eDiscovery pricing and predictable or reduced costs, continued focus on international discovery and continued debate on potential new eDiscovery rules.  Two each predicted continued consolidation of eDiscovery providers, de-emphasis on use of backup tapes, de-emphasis on use of eMail, multi-matter eDiscovery management (to leverage knowledge gained in previous cases), risk assessment/statistical analysis and more single-platform solutions.  And, one predicted more action on eDiscovery certifications.

Some interesting predictions.  Tune in tomorrow for ours!

So, what do you think?  Care to offer your own “hunches” from your crystal ball?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.