Industry Trends

“Not Me”, The Fallibility of Human Review – eDiscovery Best Practices

When I talk with attorneys about using technology to assist with review (whether via techniques such as predictive coding or merely advanced searching and culling mechanisms), most of them still seem to question whether these techniques can measure up to good, old-fashioned human attorney review.  Despite several studies that question the accuracy of human review, many attorneys still feel that their review capability is as good as or better than technical approaches.  Here is perhaps the best explanation I’ve seen yet of why that may not be the case.

In Craig Ball’s latest blog post on his Ball in Your Court blog (The ‘Not Me’ Factor), Craig provides a terrific explanation as to why predictive coding is “every bit as good (and actually much, much better) at dealing with the overwhelming majority of documents that don’t require careful judgment—the very ones where keyword search and human reviewers fail miserably.”

“It turns out that well-designed and –trained software also has little difficulty distinguishing the obviously relevant from the obviously irrelevant.  And, again, there are many, many more of these clear cut cases in a collection than ones requiring judgment calls.

So, for the vast majority of documents in a collection, the machines are every bit as capable as human reviewers.  A tie.  But giving the extra point to humans as better at the judgment call documents, HUMANS WIN!  Yeah!  GO HUMANS!   Except….

Except, the machines work much faster and much cheaper than humans, and it turns out that there really is something humans do much, much better than machines:  they screw up.

The biggest problem with human reviewers isn’t that they can’t tell the difference between relevant and irrelevant documents; it’s that they often don’t.  Human reviewers make inexplicable choices and transient, unwarranted assumptions.  Their minds wander.  Brains go on autopilot.  They lose their place.  They check the wrong box.  There are many ways for human reviewers to err and just one way to perform correctly.

The incidence of error and inconsistent assessments among human reviewers is mind boggling.  It’s unbelievable.  And therein lays the problem: it’s unbelievable.    People I talk to about reviewer error might accept that some nameless, faceless contract reviewer blows the call with regularity, but they can’t accept that potential in themselves.  ‘Not me,’ they think, ‘If I were doing the review, I’d be as good as or better than the machines.’  It’s the ‘Not Me’ Factor.”

While Craig acknowledges that “there is some cause to believe that the best trained reviewers on the best managed review teams get very close to the performance of technology-assisted review”, he notes that they “can only achieve the same result by reviewing all of the documents in the collection, instead of the 2%-5% of the collection needed to be reviewed using predictive coding”.  He asks “[i]f human review isn’t better (and it appears to generally be far worse) and predictive coding costs much less and takes less time, where’s the rational argument for human review?”
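
To put that 2%-5% figure in perspective, here is a quick back-of-the-envelope sketch.  The percentages are the ones Craig quotes; the collection size of one million documents is purely a hypothetical number chosen for illustration, and the function name is my own.

```python
def documents_to_review(collection_size, low_pct=2, high_pct=5):
    """Return the (low, high) number of documents reviewed at the quoted 2%-5% range.

    Integer arithmetic is used so the counts are exact.
    """
    low = collection_size * low_pct // 100
    high = collection_size * high_pct // 100
    return low, high


# Hypothetical collection size, for illustration only.
collection_size = 1_000_000
low, high = documents_to_review(collection_size)
print(f"Full human review: {collection_size:,} documents")
print(f"Predictive coding review set: {low:,} to {high:,} documents")
```

The sketch only counts documents; it says nothing about hourly rates, software fees or quality control, all of which would factor into a real cost comparison.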

Good question.  Having worked with some large review teams with experienced and proficient document reviewers at an eDiscovery provider that employed a follow-up QC check of reviewed documents, I can still recall how often those well-trained reviewers were surprised at some of the classification mistakes they made.  And, I worked on one project with over a hundred reviewers working several months, so you can imagine how expensive that was.

BTW, Craig is no stranger to this blog – in addition to several of his articles we’ve referenced, we’ve also conducted thought leader interviews with him at LegalTech New York the past three years.  Here’s a link if you want to check those out.

So, what do you think?  Do you think human review is better than technology assisted review?  If so, why?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

200,000 Visits on eDiscovery Daily! – eDiscovery Milestones

While we may be “just a bit behind” Google in popularity (900 million visits per month), we’re proud to announce that yesterday eDiscoveryDaily reached the 200,000 visit milestone!  It took us a little over 21 months to reach 100,000 visits and just over 11 months to get to 200,000 (don’t tell my boss, he’ll expect 300,000 in 5 1/2 months).  When we reach key milestones, we like to take a look back at some of the recent stories we’ve covered, so here are some recent eDiscovery items of interest.

EDRM Data Set “Controversy”: Including last Friday, we have covered the discussion related to the presence of personally-identifiable information (PII) data (including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers) within the Electronic Discovery Reference Model (EDRM) Enron Data Set and the “controversy” regarding the effort to clean it up (additional posts here and here).

Minnesota Implements Changes to eDiscovery Rules: States continue to be busy with changes to eDiscovery rules. One such state is Minnesota, which has amended its rules to emphasize proportionality, collaboration, and informality in the discovery process.

Changes to Federal eDiscovery Rules Could Be Coming Within a Year: Another major set of amendments to the discovery provisions of the Federal Rules of Civil Procedure is getting closer and could be adopted within the year.  The United States Courts’ Advisory Committee on Civil Rules voted in April to send a slate of proposed amendments up the rulemaking chain, to its Standing Committee on Rules of Practice and Procedure, with a recommendation that the proposals be approved for publication and public comment later this year.

I Tell Ya, Information Governance Gets No Respect: A new report from 451 Research has indicated that “although lawyers are bullish about the prospects of information governance to reduce litigation risks, executives, and staff of small and midsize businesses, are bearish and ‘may not be placing a high priority’ on the legal and regulatory needs for litigation or government investigation.”

Is it Time to Ditch the Per Hour Model for Document Review?: Some of the recent stories involving alleged overbilling by law firms for legal work – much of it for document review – raise the question of whether it’s time to ditch the per hour model for document review in favor of a per document rate.

Fulbright’s Litigation Trends Survey Shows Increased Litigation, Mobile Device Collection: According to Fulbright’s 9th Annual Litigation Trends Survey released last month, companies in the United States and United Kingdom continue to deal with, and spend more on, litigation.  From an eDiscovery standpoint, the survey showed an increase in requirements to preserve and collect data from employee mobile devices, a high reliance on self-preservation to fulfill preservation obligations and a decent percentage of organizations using technology assisted review.

We also covered Craig Ball’s Eight Tips to Quash the Cost of E-Discovery (here and here) and interviewed Adam Losey, the editor of IT-Lex.org (here and here).

Jane Gennarelli has continued her terrific series on Litigation 101 for eDiscovery Tech Professionals – 32 posts so far; here is the latest.

We’ve also had 15 posts about case law, just in the last 2 months (and 214 overall!).  Here is a link to our case law posts.

On behalf of everyone at CloudNine Discovery who has worked on the blog over the last 32+ months, thanks to all of you who read the blog every day!  In addition, thanks to the other publications that have picked up and either linked to or republished our posts!  We really appreciate the support!  Now, on to 300,000!

And, as always, please share any comments you might have or if you’d like to know more about a particular topic.

Some Additional Perspective on the EDRM Enron Data Set “Controversy” – eDiscovery Trends

Sharon Nelson wrote a terrific post about the “controversy” regarding the Electronic Discovery Reference Model (EDRM) Enron Data Set in her Ride the Lightning blog (Is the Enron E-Mail Data Set Worth All the Mudslinging?).  I wanted to repeat some of her key points here and offer some of my own perspective directly from sitting in on the Data Set team during the EDRM Annual Meeting earlier this month.

But, First a Recap

To recap, the EDRM Enron Data Set, sourced from the FERC Enron Investigation release made available by Lockheed Martin Corporation, has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011).  Initially, the data was made available for download on the EDRM site, then subsequently moved to Amazon Web Services (AWS).  However, after much recent discussion about personally-identifiable information (PII) data (including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers) available within FERC (and consequently the EDRM Data Set), the EDRM Data Set was taken down from the AWS site.

Then, a couple of weeks ago, EDRM, along with Nuix, announced that they have republished version 1 of the EDRM Enron PST Data Set (which contains over 1.3 million items) after cleansing it of private, health and personal financial information. Nuix and EDRM have also published the methodology Nuix’s staff used to identify and remove more than 10,000 high-risk items, including credit card numbers (60 items), Social Security or other national identity numbers (572), individuals’ dates of birth (292) and other personal data.  All personal data gone, right?

Not so fast.

As noted in this Law Technology News article by Sean Doherty (Enron Sandbox Stirs Up Private Data, Again), “Index Engines (IE) obtained a copy of the Nuix-cleansed Enron data for review and claims to have found many ‘social security numbers, legal documents, and other information that should not be made public.’ IE evidenced its ‘find’ by republishing a redacted version of a document with PII” (actually, a handful of them).  IE and others were quite critical of the effort by Nuix/EDRM and the extent of the PII data still remaining.

As he does so well, Rob Robinson has compiled a list of articles, comments and posts related to the PII issue; here is the link.

Collaboration, not criticism

Sharon’s post had several observations regarding the data set “controversy”, some of which are repeated here:

  • “Is the legal status of the data pretty clear? Yes, when a court refused to block it from being made public apparently accepting the greater good of its release, the status is pretty clear.”
  • “Should Nuix be taken to task for failure to wholly cleanse the data? I don’t think so. I am not inclined to let perfect be the enemy of the good. A lot was cleansed and it may be fair to say that Nuix was surprised by how much PII remained.”
  • “The terms governing the download of the data set made clear that there was no guarantee that all the PII was removed.” (more on that below in my observations)
  • “While one can argue that EDRM should have done something about the PII earlier, at least it is doing something now. It may be actively helpful to Nuix to point out PII that was not cleansed so it can figure out why.”
  • “Our expectations here should be that we are in the midst of a cleansing process, not looking at the data set in a black or white manner of cleansed or uncleansed.”
  • “My suggestion? Collaboration, not criticism. I believe Nuix is anxious to provide the cleanest version of the data possible – to the extent that others can help, it would be a public service.”

My Perspective from the Data Set Meeting

I sat in on part of the Data Set meeting earlier this month and there were a couple of points discussed during the meeting that I thought were worth relaying:

1.     We understood that there was no guarantee that all of the PII data was removed.

As with any process, we understood that there was no effective way to ensure that all PII data was removed after the process was complete and discussed needing a mechanism for people to continue to report PII data that they find.  On the download page for the data set, there was a link to the legal disclaimer page, which states in section 1.8:

“While the Company endeavours to ensure that the information in the Data Set is correct and all PII is removed, the Company does not warrant the accuracy and/or completeness of the Data Set, nor that all PII has been removed from the Data Set. The Company may make changes to the Data Set at any time without notice.”

With regard to a mechanism for reporting persistent PII data, there is this statement on the Data Set page on the EDRM site:

“PII: These files may contain personally identifiable information, in spite of efforts to remove that information. If you find PII that you think should be removed, please notify us at mail@edrm.net.”

2.     We agreed that any documents with PII data should be removed, not redacted.

Because the original data set, with all of the original PII data, is available via FERC, we agreed that any documents containing sensitive personal information should be removed from the data set – NOT redacted.  In essence, redacting those documents is putting a beacon on them to make it easier to find them in the FERC set or downloaded copies of the original EDRM set, so the published redacted examples of missed PII only serve to facilitate finding those documents in the original sets.

Conclusion

Regardless of how effective the “cleansing” of the data set was perceived to be by some, it did result in removing over 10,000 items with personal data.  Yet, some PII data evidently remains.  While some people think (and they may have a point) that the data set should not have been published until after an independent audit for remaining PII data, it seems impractical (to me, at least) to wait until it is “perfect” before publishing the set.  So, when is it good enough to publish?  That appears to be open to interpretation.

Like Sharon, my hope is that we can move forward to continue to improve the Data Set through collaboration and that those who continue to find PII data in the set will notify EDRM, so that they can remove those items and continue to make the set better.  I’d love to see the Data Set page on the EDRM site reflect a history of each data set update, with the revision date, the number of additional PII items found and removed and who identified them (to give credit to those finding the data).  As Canned Heat would say, “Let’s Work Together”.

And, we haven’t even gotten to version 2 of the Data Set yet – more fun ahead!  🙂

So, what do you think?  Have you used the EDRM Enron Data Set?  If so, do you plan to download the new version?  Please share any comments you might have or if you’d like to know more about a particular topic.

Just a Reminder to Think Before You Hit Send – eDiscovery Best Practices

With Anthony Weiner’s announcement that he is attempting a political comeback by running for mayor of New York City, it’s worth remembering the “Twittergate” story that ultimately cost him his congressional seat in the first place – not to bash him, but to remind all of us how important it is to think before you hit send (even if he did start his campaign by using a picture of Pittsburgh’s skyline instead of NYC’s – oops!).  Here is another reminder of that fact.

Chili’s Waitress Fired Over Facebook Post Insulting ‘Stupid Cops’

As noted on jobs.aol.com, a waitress at an Oklahoma City Chili’s posted a photo of three Oklahoma County Sheriff’s deputies on her Facebook page along with the comment: “Stupid Cops better hope I’m not their server FDP.” (A handy abbreviation for F*** Da Police.)

The woman, Ashley Warden, might have had reason to hold a grudge against her local police force. Last year she made national news when her potty-training toddler pulled down his pants in his grandmother’s front yard, and a passing officer handed Warden a public urination ticket for $2,500. (The police chief later apologized and dropped the charges, while the ticketing officer was fired.)

Nonetheless, Warden’s Facebook post quickly went viral on law enforcement sites and Chili’s was barraged with calls demanding that she be fired. Chili’s agreed. “With the changing world of digital and social media, Chili’s has Social Media Guidelines in place, asking our team members to always be respectful of our guests and to use proper judgement when discussing actions in the work place …,” the restaurant chain said in a statement. “After looking into the matter, we have taken action to prevent this from happening again.”

Best Practices and Social Media Guidelines

Another post on jobs.aol.com discusses some additional examples of people losing their jobs for Facebook posts, along with six tips for making posts that should keep you from getting fired, by making sure the posts would be protected by the National Labor Relations Board (NLRB), which is the federal agency tasked with protecting employees’ rights to association and union representation.

Perhaps so, though, as the article notes, the NLRB “has struggled to define how these rights apply to the virtual realm”.  It’s worth noting that, in their statement, Chili’s referred to violation of their social media guidelines as a reason for the termination.  As we discussed on this blog some time ago, having a social governance policy in place to govern use of outside email, chat and social media is a good idea, and that policy should cover what employees should and should not do (the post identified several factors that such a policy should address).

Thinking before you hit send in these days of pervasive social media means, among other things, being familiar with your organization’s social media policies and ensuring compliance with those policies.  If you’re going to post anything related to your job, that’s important to keep in mind.  To think before you hit send also involves educating yourself as to what you should and should not do when posting to social media sites.

Of course it’s also important to remember that social media factors into discovery more than ever these days, as these four cases (just from the first few months of this year) illustrate.

So, what do you think?  Does your organization have social media guidelines?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Daily will return after the Memorial Day Holiday.

Welcome to LegalTech West Coast 2013! – eDiscovery Trends

Today is the start of LegalTech® West Coast 2013 (LTWC) and eDiscoveryDaily is here to report about the latest eDiscovery trends being discussed at the show.  Today, we will provide a description of some of the sessions related to eDiscovery to give you a sense of the topics being covered.  If you’re in the Los Angeles area, come check out the show – there are a number of sessions available and 62 exhibitors providing information on their products and services, including (shameless plug warning!) my company, CloudNine Discovery, which just announced yesterday that we will be previewing a brand new, browser-independent version of our linear review application, OnDemand®.

OnDemand’s completely new interface also includes several new analytics and filtering capabilities, and we will be exhibiting at booth #111 along with our partners, First Digital Solutions.  Come by and say hi!  End of shameless plug!  🙂

Perform a “find” on today’s LTWC conference schedule for “discovery” and you’ll get 24 hits.  So, there is plenty to talk about!  Sessions in the main conference tracks include:

10:30 AM – 12:00 PM:

International Privacy and its Impact on e-Discovery

As technology makes the world smaller and smaller, the US market is running into significant challenges when it comes to e-discovery. Other countries, specifically those in Asia, have much more lax rules on privacy and this can prove an ED nightmare. Simply, what are the rules for international privacy and how will they impact your e-Discovery work? This panel of industry experts will examine the privacy implications you must consider when conducting e-discovery globally. With a special focus on Asia and those in the Southern California market, panelists will provide guidance and substantive information to address privacy and security concerns in relation to international e-Discovery.

Speakers are: Aaron Crews, Shareholder, Littler Mendelson; Therese Miller, Of Counsel, Shook, Hardy & Bacon and Cameron R. Krieger, eDiscovery Attorney, Latham & Watkins LLP.

A Panel of Experts: A Candid Conversation

A panel of expert judges and lawyers will discuss cutting edge ediscovery challenges. Bring your questions for prestigious members of the bench and bar.

Speakers are: Honorable Suzanne H. Segal, United States Chief Magistrate Judge, Central District of California; Honorable Jay C. Gandhi, United States Magistrate Judge, Central District of California; Jeffrey Fowler, Partner, O’Melveny & Myers LLP.  Moderator: David D. Lewis, Ph.D., IR Consultant.

2:00 – 3:15 PM:

The E-Discovery Debate

Is there a silver bullet when it comes to Technology Assisted Review (TAR)? What about Predictive Coding? There are those in the market who feel there is a one stop solution for all organizations; others believe it is a case by case basis. This session will allow you to be the judge. Hear both sides of the equation to better apply the lessons learned to your own e-Discovery. The debaters will also cover the various points of view when it comes to the cloud, social media and security policies with regards to e-Discovery.

Speaker is: Jack Halprin, Head of Ediscovery, Enterprise, Google Inc.  Moderator: Hunter W. McMahon, JD, Senior Consultant, Driven, Inc.

Creative Ediscovery Problem Solving

There is no right answer. There is no wrong answer. There is only the BEST answer. This session will help you find the best possible solution to your ediscovery problems. In this brainstorm power session, you will:

  • Tackle the latest ediscovery problems
  • Develop action plans
  • Discuss meaningful ways to implement solutions

Speakers are: Linda Baynes, Associate Operations Director, Orrick, Herrington & Sutcliffe and Adam Sand, Associate General Counsel, Ancestry.com.  Moderator: John Reikes, Account Executive, Kroll Ontrack.

3:45 – 5:00 PM:

Judges’ Panel: The Current State of the ED Market

Join us for the always informative judges’ panel at LegalTech West Coast. We’ve assembled a panel of judges from the west coast to discuss their views of the ED market today and give their insight to where they see the market going. Make sure you are up on the current issues in the market and prepare your team for future complications and concerns by hearing what it is the bench is considering today.

5 Daunting Ediscovery Challenges: A Live Deliberation

This exercise will involve audience participation through a panel-led discussion. Be prepared to deliberate some of the most complex challenges facing ediscovery including:

  • Ever-changing technology rules
  • Unpredictable costs
  • Underutilization of Technology-assisted Review (TAR)
  • Primitive case analytics
  • Transactional ediscovery

Panelists are: Jack Halprin, Head of Ediscovery, Enterprise, Google Inc. and Adam Sand, Associate General Counsel, Ancestry.com.  Moderator: Chris Castaldini, Account Executive, Kroll Ontrack.

In addition to these, there are other eDiscovery-related sessions today.  For a complete description for all sessions today, click here.

So, what do you think?  Are you planning to attend LTWC this year?  Please share any comments you might have or if you’d like to know more about a particular topic.

Version 1 of the EDRM Enron Data Set NOW AVAILABLE – eDiscovery Trends

Last week, we reported from the Annual Meeting for the Electronic Discovery Reference Model (EDRM) group and discussed some significant efforts and accomplishments by each of the project teams within EDRM.  That included an update from the EDRM Data Set project, where an effort was underway to identify and remove personally-identifiable information (“PII”) data from the EDRM Data Set.  Now, version 1 of the Data Set is completed and available for download.

To recap, the EDRM Enron Data Set, sourced from the FERC Enron Investigation release made available by Lockheed Martin Corporation, has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011).  Initially, the data was made available for download on the EDRM site, then subsequently moved to Amazon Web Services (AWS).  However, after much recent discussion about PII data (including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers) available within FERC (and consequently the EDRM Data Set), the EDRM Data Set was taken down from the AWS site.

Yesterday, EDRM, along with Nuix, announced that they have republished version 1 of the EDRM Enron PST Data Set (which contains over 1.3 million items) after cleansing it of private, health and personal financial information. Nuix and EDRM have also published the methodology Nuix’s staff used to identify and remove more than 10,000 high-risk items.

As noted in the announcement, Nuix consultants Matthew Westwood-Hill and Ady Cassidy used a series of investigative workflows to identify the items, which included:

  • 60 items containing credit card numbers, including departmental contact lists that each contained hundreds of individual credit cards;
  • 572 items containing Social Security or other national identity numbers—thousands of individuals’ identity numbers in total;
  • 292 items containing individuals’ dates of birth;
  • 532 items containing information of a highly personal nature such as medical or legal matters.
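
For readers curious what flagging items like these can look like in practice, here is a minimal sketch.  To be clear, this is NOT the Nuix methodology (that is described in their white paper); it is a generic illustration that assumes Social Security numbers appear in the common 123-45-6789 format and that uses the standard Luhn checksum to weed out digit strings that cannot be credit card numbers.  The pattern names and function names are my own, made up for this example.

```python
import re

# Hypothetical patterns for illustration only; not the Nuix methodology.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def flag_pii(text):
    """Return candidate SSNs and Luhn-valid card numbers found in `text`."""
    ssns = SSN_RE.findall(text)
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]
    return {"ssn": ssns, "card": cards}


def remove_flagged(items):
    """Keep only the items with no flagged candidates (removal, not redaction)."""
    return [text for text in items if not any(flag_pii(text).values())]


# Made-up sample items; 4111 1111 1111 1111 is a widely used test card number.
items = [
    "Lunch at noon?",
    "SSN 123-45-6789 on file",
    "Card 4111 1111 1111 1111 exp 01/05",
    "Ref 4111 1111 1111 1112",
]
print(remove_flagged(items))
```

Simple patterns like these will miss PII in other formats (and can miss a card number that sits right next to another run of digits), and they will also flag some harmless numbers, which is part of why a process like this is unlikely to catch everything on the first pass.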

While the personal data was (and still is) available via FERC long before the EDRM version was created, completion of this process will mean that many in the eDiscovery industry who rely on this highly useful data set for testing and software demonstration can now use a version which should be free from sensitive personal information!

For more information regarding the announcement, click here. The republished version 1 of the Data Set, as well as the white paper discussing the methodology, is available at nuix.com/enron.  Nuix is currently applying the same methodology to the EDRM Enron Data Set v2 (which contains nearly 2.3 million items) and will publish to the same site when complete.

So, what do you think?  Have you used the EDRM Enron Data Set?  If so, do you plan to download the new version?  Please share any comments you might have or if you’d like to know more about a particular topic.

More Updates from the EDRM Annual Meeting – eDiscovery Trends

Yesterday, we discussed some general observations from the Annual Meeting for the Electronic Discovery Reference Model (EDRM) group and discussed some significant efforts and accomplishments by the (suddenly heavily talked about) EDRM Data Set project.  Here are some updates from other projects within EDRM.

It should be noted these are summary updates and that most of the focus on these updates is on accomplishments for the past year and deliverables that are imminent.  Over the next few weeks, eDiscovery Daily will cover each project in more depth with more details regarding planned activities for the coming year.

Model Code of Conduct (MCoC)

The MCoC was introduced in 2011 and became available for organizations to subscribe last year.  To learn more about the MCoC, you can read the code online here, or download it as a 22-page PDF file here.  Subscribing is easy!  To voluntarily subscribe to the MCoC, you can register on the EDRM website here.  Identify your organization, provide information for an authorized representative and answer four verification questions (truthfully, of course) to affirm your organization’s commitment to the spirit of the MCoC, and your organization is in!  You can also provide a logo for EDRM to include when adding you to the list of subscribing organizations.  Pending a survey of EDRM members to determine if any changes are needed, this project has been completed.  Team leaders include Eric Mandel of Zelle Hofmann, Kevin Esposito of Rivulex and Nancy Wallrich.

Information Governance Reference Model (IGRM)

The IGRM team has continued to make strides and improvements on an already terrific model.  Last October, they unveiled the release of version 3.0 of the IGRM.  As their press release noted, “The updated model now includes privacy and security as primary functions and stakeholders in the effective governance of information.”  IGRM continues to be one of the most active and well participated EDRM projects.  This year, the early focus – as quoted from Judge Andrew Peck’s keynote speech at Legal Tech this past year – is “getting rid of the junk”.  Project leaders are Aliye Ergulen from IBM, Reed Irvin from Viewpointe and Marcus Ledergerber from Morgan Lewis.

Search

One of the best examples of the new, more agile process for creating deliverables within EDRM comes from the Search team, which released its new draft Computer Assisted Review Reference Model (CARRM), which depicts the flow for a successful Computer Assisted Review project. The entire model was created in only a matter of weeks.  Early focus for the Search project for the coming year includes adjustments to CARRM (based on feedback at the annual meeting).  You can also still send your comments regarding the model to mail@edrm.net or post them on the EDRM site here.  A webinar regarding CARRM is also planned for late July.  Kudos to the Search team, including project leaders Dominic Brown of Autonomy and also Jay Lieb of kCura, who got unmerciful ribbing for insisting (jokingly, I think) that TIFF files, unlike Generalissimo Francisco Franco, are still alive.  🙂

Jobs

In late January, the Jobs Project announced the release of the EDRM Talent Task Matrix diagram and spreadsheet, which is available in XLSX or PDF format. As noted in their press release, the Matrix is a tool designed to help hiring managers better understand the responsibilities associated with common eDiscovery roles. The Matrix maps responsibilities to the EDRM framework, so that the associated eDiscovery duties can be assigned to the appropriate parties.  Project leader Keith Tom noted that next steps include surveying EDRM members regarding the Matrix, requesting and co-authoring case studies and white papers, and creating a short video on how to use the Matrix.

Metrics

In today’s session, the Metrics project team unveiled the first draft of the new Metrics model to EDRM participants!  Feedback was provided during the session and the team will make the model available for additional comments from EDRM members over the next week or so, with a goal of publishing for public comments in the next two to three weeks.  The team is also working to create a page to collect Metrics measurement tools from eDiscovery professionals that can benefit the eDiscovery community as a whole.  Project leaders Dera Nevin of TD Bank and Kevin Clark noted that June is “budget calculator month”.

Other Initiatives

As noted yesterday, there is a new project to address standards for working with native files in the different EDRM phases led by Eric Mandel from Zelle Hofmann and also a new initiative to establish collection guidelines, spearheaded by Julie Brown from Vorys.  There is also an effort underway to refocus the XML project, as it works to complete the 2.0 version of the EDRM XML model.  In addition, there was quite a spirited discussion as to where EDRM is heading as it approaches ten years of existence and it will be interesting to see how the EDRM group continues to evolve over the next year or so.  As you can see, a lot is happening within the EDRM group – there’s a lot more to it than just the base Electronic Discovery Reference Model.

So, what do you think?  Are you a member of EDRM?  If not, why not?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Reporting from the EDRM Annual Meeting and a Data Set Update – eDiscovery Trends

The Electronic Discovery Reference Model (EDRM) Project was created in May 2005 by George Socha of Socha Consulting LLC and Tom Gelbmann of Gelbmann & Associates to address the lack of standards and guidelines in the electronic discovery market.  Now, beginning its ninth year of operation with its annual meeting in St. Paul, MN, EDRM is accomplishing more than ever to address those needs.  Here are some highlights from the meeting, and an update regarding the (suddenly heavily talked about) EDRM Data Set project.

Annual Meeting

Twice a year, in May and October, eDiscovery professionals who are EDRM members meet to continue the process of working together on various standards projects.  This will be my eighth year participating in EDRM at some level and, oddly enough, I’m assisting with PR and promotion (how am I doing so far?).  eDiscovery Daily has referenced EDRM and its phases many times in the 2 1/2 years plus history of the blog – this is our 144th post that relates to EDRM!

Some notable observations about today’s meeting:

  • New Participants: More than half the attendees at this year’s annual meeting are attending for the first time.  EDRM is not just a core group of “die-hards”; it continues to find appeal with eDiscovery professionals throughout the industry.
  • Agile Approach: EDRM has adopted an Agile approach to shorten the time to complete and publish deliverables, a change in philosophy that facilitated several notable accomplishments from working groups over the past year including the Model Code of Conduct (MCoC), Information Governance Reference Model (IGRM), Search and Jobs (among others).  More on that tomorrow.
  • Educational Alliances: For the first time, EDRM has formed some interesting and unique educational alliances.  In April, EDRM teamed with the University of Florida Levin College of Law to present a day and a half conference entitled E-Discovery for the Small and Medium Case.  And, this June, EDRM will team with Bryan University to provide an in-depth, four-week E-Discovery Software & Applied Skills Summer Immersion Program for Law School Students.
  • New Working Group: A new working group to be led by Eric Mandel of Zelle Hofmann was formed to address standards for working with native files in the different EDRM phases.

Tomorrow, we’ll discuss the highlights for most of the individual working groups.  Given the recent amount of discussion about the EDRM Data Set group, we’ll start with that one today!

Data Set

The EDRM Enron Data Set has been around for several years and has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011).  The data in the EDRM Enron PST Data Set files is sourced from the FERC Enron Investigation release made available by Lockheed Martin Corporation.  It was reconstituted as PST files with attachments for the EDRM Data Set Project.  So, in essence, EDRM took data that was already publicly available and made it much more usable.  Initially, the data was made available for download on the EDRM site, then subsequently moved to Amazon Web Services (AWS).

In the past several days, there has been much discussion about the personally-identifiable information (“PII”) available within the FERC release (and consequently the EDRM Data Set), including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers.  Consequently, the EDRM Data Set has been taken down from the AWS site.

The Data Set team led by Michael Lappin of Nuix and Eric Robi of Elluma Discovery has been working on a process (using predictive coding technology) to identify and remove the PII data from the EDRM Data Set.  Discussions about this process began months ago, prior to the recent discussions about the PII data contained within the set.  The team has completed this iterative process for V1 of the data set (which contains 1,317,158 items), identifying and removing 10,568 items with PII, HIPAA and other sensitive information.  This version of the data set will be made available within the EDRM community shortly for peer review testing.  The data set team will then repeat the process for the larger V2 version of the data set (2,287,984 items).  A timetable for republishing both sets should be available soon and the efforts of the Data Set team on this project should pay dividends in developing and standardizing processes for identifying and eliminating sensitive data that eDiscovery professionals can use in their own data sets.

The team has also implemented a Forensic Files Testing Project site where users can upload their own “modern”, non-copyrighted file samples that are typically encountered during electronic discovery processing to provide a more diverse set of data than is currently available within the Enron data set.

So, what do you think?  How has EDRM impacted how you manage eDiscovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

Appeals Court Upholds Decision Not to Recuse Judge Peck in Da Silva Moore – eDiscovery Case Law

As reported by IT-Lex, the Second Circuit of the US Court of Appeals rejected the Plaintiff’s request for a writ of mandamus recusing Magistrate Judge Andrew J. Peck from Da Silva Moore v. Publicis Groupe SA.

The entire opinion is stated as follows:

“Petitioners, through counsel, petition this Court for a writ of mandamus compelling the recusal of Magistrate Judge Andrew J. Peck. Upon due consideration, it is hereby ORDERED that the mandamus petition is DENIED because Petitioners have not ‘clearly and indisputably demonstrate[d] that [Magistrate Judge Peck] abused [his] discretion’ in denying their district court recusal motion, In re Basciano, 542 F. 3d 950, 956 (2d Cir. 2008) (internal quotation marks omitted) (quoting In re Drexel Burnham Lambert Inc., 861 F.2d 1307, 1312-13 (2d Cir. 1988)), or that the district court erred in overruling their objection to that decision.”

Now, the plaintiffs have been denied in their recusal efforts in three courts.

Since it has been a while, let’s recap the case for those who may have not been following it and may be new to the blog.

Last year, back in February, Judge Peck issued an opinion making this likely the first case to accept the use of computer-assisted review of electronically stored information (“ESI”).  However, on March 13, District Court Judge Andrew L. Carter, Jr. granted the plaintiffs’ request to submit additional briefing on their February 22 objections to the ruling.  In that briefing (filed on March 26), the plaintiffs claimed that the protocol approved for predictive coding “risks failing to capture a staggering 65% of the relevant documents in this case” and questioned Judge Peck’s relationship with defense counsel and with the selected vendor for the case, Recommind.

Then, on April 5, Judge Peck issued an order in response to Plaintiffs’ letter requesting his recusal, directing plaintiffs to indicate whether they would file a formal motion for recusal or ask the Court to consider the letter as the motion.  On April 13, (Friday the 13th, that is), the plaintiffs did just that, by formally requesting the recusal of Judge Peck (the defendants issued a response in opposition on April 30).  But, on April 25, Judge Carter issued an opinion and order in the case, upholding Judge Peck’s opinion approving computer-assisted review.

Not done, the plaintiffs filed an objection on May 9 to Judge Peck’s rejection of their request to stay discovery pending the resolution of outstanding motions and objections (including the recusal motion, which had yet to be ruled on).  Then, on May 14, Judge Peck issued a stay, stopping defendant MSLGroup’s production of electronically stored information.  On June 15, in a 56 page opinion and order, Judge Peck denied the plaintiffs’ motion for recusal.  Judge Carter ruled on the plaintiffs’ recusal request on November 7, denying the request and stating that “Judge Peck’s decision accepting computer-assisted review … was not influenced by bias, nor did it create any appearance of bias”.

So, what do you think?  Will this finally end the recusal question in this case?  Please share any comments you might have or if you’d like to know more about a particular topic.

Is it Time to Ditch the Per Hour Model for Document Review? – eDiscovery Trends

Some of the recent stories involving alleged overbilling by law firms for legal work – much of it for document review – raise the question: is it time to ditch the per hour model for document review in favor of a per document rate for review?

As discussed by D. Casey Flaherty in Law Technology News (DLA Piper Is Not Alone: Why Law Firms Overbill), DLA Piper has been sued by its client – to the tune of over $22 million – for overbilling.  When DLA Piper produced some 250,000 documents in response to its client’s eDiscovery requests, some embarrassing internal emails were included in that production.  For example:

  • “I hear we are already 200K over our estimate – that’s Team DLA Piper!”
  • “DLA seems to love to low ball the bills and with the number of bodies being thrown at this thing, it’s going to stay stupidly high and with the absurd litigation POA has been in for years, it does have lots of wrinkles.”
  •  “It’s a Thomson project, he goes full time on whatever debtor case he has running. Full time, 2 days a week.”
  • “[N]ow Vince has random people working full time on random research projects in standard ‘churn that bill, baby!’ mode. That bill shall show no limits.”
  • “Didn’t you use three associates to prepare for a first day hearing where you filed three documents?”

In his article, Flaherty provides two other examples of (at least) perceived overbilling:

  • In the Madoff case, the government “used 6,000 hours of attorney time to procure a $140 million settlement offer (more than $23,000 delivered per hour spent)”.  Your federal tax dollars hard at work!  However, the plaintiffs’ law firms “expended 118,000 additional attorney hours on the same matter to deliver the final version of that settlement at $219 million” and seek $40 million for delivering $39 million in incremental value (once you subtract their proposed $40 million in fees).  “It appears that most of the 110 lawyers are contract attorneys performing basic document review; the plaintiffs firms are just marking them up at many, many multiples of their actual cost.”
  • In the Citigroup derivatives class action settlement, plaintiffs firms “reached a $590 million settlement from which they now seek almost $100 million in fees for 87,000 hours of billable time (average, $1,150 per hour). The bulk of the hours were spent on low-level document review work” where contract attorneys were paid $40 to $60 per hour and “the plaintiffs firms are seeking $550 to $1,000 plus per hour for those services”.
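The per-hour figures quoted in the two examples above can be checked with simple arithmetic.  The short Python sketch below uses only the dollar and hour amounts quoted from Flaherty’s article; the variable names are mine, and “almost $100 million” is rounded up to $100 million, so the Citigroup result is an approximation rather than an exact figure.

```python
# Figures as quoted in the article (dollars and attorney hours).
MADOFF_GOV_OFFER = 140_000_000   # settlement offer procured by the government
MADOFF_GOV_HOURS = 6_000         # government attorney hours
MADOFF_FINAL = 219_000_000       # final settlement
MADOFF_FEES = 40_000_000         # fees sought by the plaintiffs' firms
CITI_FEES = 100_000_000          # "almost $100 million", rounded up (assumption)
CITI_HOURS = 87_000              # billable hours claimed


def per_hour(dollars, hours):
    """Return dollars per attorney hour."""
    return dollars / hours


# Value delivered per government attorney hour in the Madoff case.
gov_value_per_hour = per_hour(MADOFF_GOV_OFFER, MADOFF_GOV_HOURS)

# Incremental value of the final settlement, before and after fees.
incremental_gross = MADOFF_FINAL - MADOFF_GOV_OFFER
incremental_net = incremental_gross - MADOFF_FEES

# Average fee per billable hour sought in the Citigroup settlement.
citi_fee_per_hour = per_hour(CITI_FEES, CITI_HOURS)

print(f"Madoff, value per government hour: ${gov_value_per_hour:,.0f}")
print(f"Madoff, incremental value before fees: ${incremental_gross:,}")
print(f"Madoff, incremental value after fees: ${incremental_net:,}")
print(f"Citigroup, average fee per hour: ${citi_fee_per_hour:,.0f}")
```

The results line up with the quoted numbers: a little over $23,000 per government hour, $39 million in incremental value once the $40 million in fees is subtracted from the $79 million increase, and an average of roughly $1,150 per hour in the Citigroup matter.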

While the DLA Piper example isn’t specifically about document review overbilling, it does reflect how cavalier some firms (or at least some attorneys at those firms) can be about the subject of overbilling.  For the other two examples above, document review overbilling appears to be at the core of those disputes.  There are admittedly different levels of document review, depending on whether the attorneys are performing a straightforward responsiveness review, a privilege review, or a more detailed subject matter/issue coding review.  Nonetheless, the number of documents in the collection is finite and the cost for review should be somewhat predictable, regardless of the level of review being conducted.

Why don’t more firms offer a per document rate for document review?  Or, perhaps a better question would be why don’t more organizations insist on a per document rate?  That seems like a better way to make document review costs more predictable and more consistent.  I’m not sure why it hasn’t become more prevalent, other than “that’s the way we’ve always done it”.  Knowing the per document rate and the number of documents to be reviewed up front would seem to eliminate overbilling disputes for document review, at least.
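To illustrate the predictability argument, here is a minimal Python sketch comparing the two billing models.  Every number in it (the collection size, the hourly rate, the per document rate and the review speeds) is a hypothetical value chosen purely for illustration, not a figure from any of the cases above.

```python
def hourly_cost(num_docs, hourly_rate, docs_per_hour):
    """Cost under per hour billing: depends on how fast reviewers work."""
    return (num_docs / docs_per_hour) * hourly_rate


def per_document_cost(num_docs, rate_per_doc):
    """Cost under per document billing: known once the count is known."""
    return num_docs * rate_per_doc


# Hypothetical inputs, for illustration only.
NUM_DOCS = 100_000
HOURLY_RATE = 50.0       # dollars per reviewer hour
RATE_PER_DOC = 1.00      # dollars per document

# Under hourly billing, the total swings with review speed.
for speed in (40, 50, 60):   # documents reviewed per hour
    total = hourly_cost(NUM_DOCS, HOURLY_RATE, speed)
    print(f"Per hour model at {speed} docs/hour: ${total:,.0f}")

# Under per document billing, the total is fixed up front.
print(f"Per document model: ${per_document_cost(NUM_DOCS, RATE_PER_DOC):,.0f}")
```

With the per hour model, the total depends on a review speed that the client cannot see or control; with the per document model, the only variable is the document count, which is finite and known in advance.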

So, what do you think?  Is it time to ditch the per hour model for document review?  Please share any comments you might have or if you’d like to know more about a particular topic.
