Presentation Archives

For a Successful Outcome to Your Discovery Project, Work Backwards: eDiscovery Best Practices

May 22, 2015

Based on a recent experience with a client, it seemed appropriate to revisit this topic. Plus, it’s always fun to play with the EDRM model. Notice anything different? 🙂

While the Electronic Discovery Reference Model from EDRM has become the standard model for the workflow of the process for handling electronically stored information (ESI) in discovery, it might be helpful to think about the EDRM model and work backwards, whether you’re the producing party or the receiving party.

Why work backwards?

You can’t have a successful outcome without envisioning the successful outcome that you want to achieve. The end of the discovery process includes the production and presentation stages, so it’s important to determine what you want to get out of those stages. Let’s look at them.

Presentation

Whether you’re a receiving party or a producing party, it’s important to think about what types of evidence you need to support your case when presenting at depositions and at trial – this is the type of information that needs to be included in your production requests at the beginning of the case as well as the type of information that you’ll need to preserve as a producing party.

Production

The format of the ESI produced is important to both sides in the case. For the receiving party, it’s important to get as much useful information included in the production as possible. This includes metadata and searchable text for the produced documents, typically with an index or load file to facilitate loading into a review application. The most useful form of production is native format files with all metadata preserved as used in the normal course of business.
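To make the load file idea concrete, here is a minimal sketch of building one as simple comma-delimited text. The field names (DocID, Custodian, DateSent, NativePath, TextPath), the sample values and the build_load_file helper are all assumptions for illustration; actual load file layouts and delimiters depend on the review application and on what the parties agree to.

```python
import csv
import io

# Hypothetical field list; a real specification is negotiated between the parties.
FIELDS = ["DocID", "Custodian", "DateSent", "NativePath", "TextPath"]


def build_load_file(records):
    """Return delimited load-file text with one row per produced document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Refuse to write a row that lacks metadata or a path to the native/text file.
        missing = [name for name in FIELDS if not record.get(name)]
        if missing:
            raise ValueError(f"{record.get('DocID', '?')} is missing {missing}")
        writer.writerow(record)
    return buffer.getvalue()


# Made-up sample document, for illustration only.
sample = [
    {
        "DocID": "ABC0000001",
        "Custodian": "Smith, J.",
        "DateSent": "2015-05-01",
        "NativePath": "NATIVES/ABC0000001.msg",
        "TextPath": "TEXT/ABC0000001.txt",
    },
]
load_file_text = build_load_file(sample)
print(load_file_text)
```

Checking for missing fields before writing each row is one way a receiving party could spot an incomplete production early, rather than after loading it into a review tool.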

For the producing party, efficiency and cost control are the priorities, so it’s important to agree to a production format that minimizes production costs. Converting files to an image-based format (such as TIFF) adds costs, so producing in native format can be cost effective for the producing party as well. It’s also important to determine how to handle issues such as privilege logs and redaction of privileged or confidential information.

Addressing production format issues up front will maximize cost savings and enable each party to get what they want out of the production of ESI. If you don’t, you could be arguing in court like our case participants from yesterday’s post.

Processing-Review-Analysis

It also pays to make decisions early in the process that affect processing, review and analysis. How should exception files be handled? What do you do about files that are infected with malware? These are examples of issues that need to be decided up front to determine how processing will be handled.

As for review, the review tool being used may impact how quick and easy it is to get started, to load data and to use the tool, among other considerations. If it’s Friday at 5 and you have to review data over the weekend, is it easy to get started? As for analysis, surely you test search terms to determine their effectiveness before you agree on those terms with opposing counsel, right?
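One way to test search terms before agreeing to them is to run each term against a sample of documents that has already been reviewed and see how many of the hits were actually responsive. The sketch below is a rough illustration under those assumptions: the term_report function, the sample documents and the responsive set are hypothetical, and simple substring matching stands in for whatever search syntax your review tool supports.

```python
def term_report(documents, terms, responsive_ids):
    """Count hits per term and the share of those hits marked responsive."""
    report = {}
    for term in terms:
        needle = term.lower()
        # Case-insensitive substring match; real tools offer richer syntax.
        hits = {doc_id for doc_id, text in documents.items() if needle in text.lower()}
        responsive_hits = hits & responsive_ids
        precision = len(responsive_hits) / len(hits) if hits else 0.0
        report[term] = {
            "hits": len(hits),
            "responsive": len(responsive_hits),
            "precision": precision,
        }
    return report


# Made-up sample of reviewed documents, for illustration only.
sample_docs = {
    "D1": "Pricing agreement with the distributor",
    "D2": "Lunch agreement for Friday",
    "D3": "Quarterly pricing forecast",
    "D4": "Holiday party schedule",
}
reviewed_responsive = {"D1", "D3"}

report = term_report(sample_docs, ["pricing", "agreement"], reviewed_responsive)
for term, stats in report.items():
    print(term, stats)
```

In this toy sample, "pricing" retrieves only responsive documents while half of the hits for "agreement" are not responsive, which is the kind of signal that suggests a term needs narrowing before it goes into an agreement with opposing counsel.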

Preservation-Collection-Identification

Long before you have to conduct preservation and collection for a case, you need to establish procedures for implementing and monitoring litigation holds, as well as prepare a data map to identify where corporate information is stored for identification, preservation and collection purposes.

And, before a case even begins, you need an effective Information Governance program to minimize the amount of data that you might have to consider for responsiveness in the first place.

As you can see, at the beginning of a case (and even before), it’s important to think backwards within the EDRM model to ensure a successful discovery process. Decisions made at the beginning of the case affect the success of those latter stages, so working backwards can help ensure a successful outcome!

So, what do you think? What do you do at the beginning of a case to ensure success at the end? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

This Guy Says that Computers Could Eventually Replace Lawyers – In the Courtroom: eDiscovery Trends

April 29, 2015

Over four years ago, we covered an article in The New York Times that discussed how the use of artificial intelligence could lead to replacing “armies of expensive lawyers” during the eDiscovery process. Now, a new article in The Wall Street Journal online goes a step further, speculating that “computers will eventually pass the legal bar exam and defendants will be given the right to be represented by a computational attorney if they so wish”.

What Big Data Means for the Legal System, written by Robert Plant (not the Led Zeppelin singer, but a professor at the University of Miami, as well as an author and blogger for Harvard Business Review & WSJ Leadership Expert) discusses how artificial intelligence researchers have used the legal domain as an exploratory space to test theories for decades, but with limited success. The advent of big data has changed that, enabling us to analyze not only text but many other data types such as pictures, email, video and voice. As Plant notes, this capability “allows lawyers to look for patterns and correlations across vast data sets previously inaccessible.”

Plant uses analysis of judges’ behavior in cases as an example, suggesting the ability to obtain answers to questions like: “How does the Judge rule on certain types of cases can be studied by date and time? Does the judge dismiss cases for a consistent pattern of reasoning? How do holidays affect decisions? Do they sentence harder at different times of the day?”

Because of big data analytics, Plant predicts that “[m]any of the routine tasks now performed by entry-level lawyers or paralegals will increasingly be undertaken by analytics; case and trial strategies will be developed by legal informatics as will increasingly jury-selection strategies.” As a result, Plant takes the concept to a somewhat controversial conclusion, as follows:

“It is clear that with advances in machine learning, computers will eventually pass the legal bar exam and defendants will be given the right to be represented by a computational attorney if they so wish and thus court rooms could see a truly new form of human computer interaction in which the computer answers the question ‘does the client have a case?’”

Must he “ramble on”? Computers replace lawyers?!? In the courtroom?!? He sure isn’t showing the legal profession a “whole lotta love”, is he? (sorry, I couldn’t resist)

Clearly, we’ve seen the application of artificial intelligence result in significant benefits during the eDiscovery process, with several cases over the past few years endorsing technology assisted review (including this latest case just last month) as well as initiatives to apply technology to information governance (such as the Information Governance Initiative launched last year). Is it that far of a stretch to apply technology to decision making in the courtroom too? Or is the author simply “dazed and confused”? (ok, I really will stop now)

So, what do you think? Will clients someday be represented by computers in the courtroom? Please share any comments you might have or if you’d like to know more about a particular topic.

Clipart from Clipartheaven.com


Our 1,000th Post! – eDiscovery Milestones

September 3, 2014

When we launched nearly four years ago on September 20, 2010, our goal was to be a daily resource for eDiscovery news and analysis. Now, after doing so each business day (except for one), I’m happy to announce that today is our 1,000th post on eDiscovery Daily!

We’ve covered the gamut in eDiscovery, from case law to industry trends to best practices. Here are some of the categories that we’ve covered and the number of posts (to date) for each:

Case Law (326 posts), including those dealing with Sanctions (151)
Searching (238)
Proportionality (140)
Law Firm Departments (115)
Project Management (102)
Outsourcing (97)
Social Media (95)
Federal Discovery Rules (68)
SaaS Based Technologies (65)
State Discovery Rules (35)

We’ve also covered every phase of the EDRM (177) life cycle.

Every post we have published is still available on the site for your reference, which has made eDiscovery Daily into quite a knowledgebase! We’re quite proud of that.

Comparing our first three months of existence to now, we have seen traffic on our site grow an amazing 474%! Our subscriber base has more than tripled in the last three years! We want to take this time to thank you, our readers and subscribers, for making that happen. Thanks for making the eDiscoveryDaily blog a regular resource for your eDiscovery news and analysis! We really appreciate the support!

We also want to thank the blogs and publications that have linked to our posts and raised our public awareness, including Pinhawk, Ride the Lightning, Litigation Support Guru, Complex Discovery, Bryan University, The Electronic Discovery Reading Room, Litigation Support Today, Alltop, ABA Journal, Litigation Support Blog.com, InfoGovernance Engagement Area, EDD Blog Online, eDiscovery Journal, e-Discovery Team ® and any other publication that has picked up at least one of our posts for reference (sorry if I missed any!). We really appreciate it!

I also want to extend a special thanks to Jane Gennarelli, who has provided some serial topics, ranging from project management to coordinating review teams to what litigation support and discovery used to be like back in the 80’s (to which some of us “old timers” can relate). Her contributions are always well received and appreciated by the readers – and also especially by me, since I get a day off!

We always end each post with a request: “Please share any comments you might have or if you’d like to know more about a particular topic.” And, we mean it. We want to cover the topics you want to hear about, so please let us know.

Tomorrow, we’ll be back with a new, original post. In the meantime, feel free to click on any of the links above and peruse some of our 999 previous posts. Now is your chance to catch up! 😉

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Peruse, But Don’t Friend Potential Jurors on Social Media – eDiscovery Trends

April 25, 2014

Unless limited by law or court order, a lawyer may review a juror’s or potential juror’s Internet presence, which may include postings by the juror or potential juror in advance of and during a trial, but a lawyer may not communicate directly or through another with a juror or potential juror. So says a new formal opinion from the American Bar Association (ABA) Standing Committee on Ethics and Professional Responsibility.

Formal Opinion 466 is a nine page PDF document which is designed to cover the responsibilities for lawyers who are reviewing jurors’ Internet presence. For the purposes of this opinion, Internet-based social media sites that readily allow account-owner restrictions on access are referred to as “electronic social media” or “ESM” sites – of which the opinion gives current examples like Facebook, MySpace, LinkedIn, and Twitter.

Under Model Rule 3.5(b) of the ABA Model Rules of Professional Conduct, a lawyer may not communicate with a potential juror leading up to trial or any juror during trial unless authorized by law or court order. With that in mind, the opinion addresses three levels of lawyer review of juror Internet presence:

1. passive lawyer review of a juror’s website or ESM that is available without making an access request where the juror is unaware that a website or ESM has been reviewed;

2. active lawyer review where the lawyer requests access to the juror’s ESM; and

3. passive lawyer review where the juror becomes aware through a website or ESM feature of the identity of the viewer.

To illustrate whether each activity violates Rule 3.5(b), the opinion analogizes each of the activities to real world contact, as follows:

1. In the world outside of the Internet, a lawyer or another, acting on the lawyer’s behalf, would not be engaging in an improper ex parte contact with a prospective juror by driving down the street where the prospective juror lives to observe the environs in order to glean publicly available information that could inform the lawyer’s jury-selection decisions. So, passive review of a juror’s website or ESM, that is available without making an access request, and of which the juror is unaware, does not violate Rule 3.5(b).

2. This would be akin to driving down the juror’s street, stopping the car, getting out, and asking the juror for permission to look inside the juror’s house because the lawyer cannot see enough when just driving past. It would be the type of ex parte communication prohibited by Model Rule 3.5(b).

3. This is akin to a neighbor’s recognizing a lawyer’s car driving down the juror’s street and telling the juror that the lawyer had been seen driving down the street. A lawyer who uses a shared ESM platform to passively view juror ESM under these circumstances does not communicate with the juror. The lawyer is not communicating with the juror; the ESM service is communicating with the juror based on a technical feature of the ESM.

Also, under Model Rule 3.3(b), if a lawyer discovers criminal or fraudulent conduct by a juror related to the proceeding, the lawyer must take reasonable remedial measures including, if necessary, disclosure to the tribunal. However, the opinion hedged on a lawyer’s duty to notify the court when the conduct is merely “improper” but stops short of being criminal or fraudulent.

So, what do you think? Do any of the parameters of this opinion surprise you? Please share any comments you might have or if you’d like to know more about a particular topic.


For Successful Discovery, Think Backwards – eDiscovery Best Practices

October 8, 2013

The Electronic Discovery Reference Model (EDRM) has become the standard model for the workflow of the process for handling electronically stored information (ESI) in discovery. But, to succeed in discovery, regardless of whether you’re the producing party or the receiving party, it might be helpful to think about the EDRM model backwards.

Why think backwards?

You can’t have a successful outcome without envisioning the successful outcome that you want to achieve. The end of the discovery process includes the production and presentation stages, so it’s important to determine what you want to get out of those stages. Let’s look at them.

Presentation

As a receiving party, it’s important to think about what types of evidence you need to support your case when presenting at depositions and at trial – this is the type of information that needs to be included in your production requests at the beginning of the case.

Production

The format of the ESI produced is important to both sides in the case. For the receiving party, it’s important to get as much useful information included in the production as possible. This includes metadata and searchable text for the produced documents, typically with an index or load file to facilitate loading into a review application. The most useful form of production is native format files with all metadata preserved as used in the normal course of business.

For the producing party, saving costs is the priority, so it’s important to agree to a production format that minimizes production costs. Converting files to an image-based format (such as TIFF) adds costs, so producing in native format can be cost effective for the producing party as well. It’s also important to determine how to handle issues such as privilege logs and redaction of privileged or confidential information.

Addressing production format issues up front will maximize cost savings and enable each party to get what they want out of the production of ESI.

Processing-Review-Analysis

It also pays to make decisions early in the process that affect processing, review and analysis. How should exception files be handled? What do you do about files that are infected with malware? These are examples of issues that need to be decided up front to determine how processing will be handled.

As for review, the review tool being used may impact production specs in terms of how files are viewed and production of load files that are compatible with the review tool, among other considerations. As for analysis, surely you test search terms to determine their effectiveness before you agree on those terms with opposing counsel, right?

Preservation-Collection-Identification

Long before you have to conduct preservation and collection for a case, you need to establish procedures for implementing and monitoring litigation holds, as well as prepare a data map to identify where corporate information is stored for identification, preservation and collection purposes.

As you can see, at the beginning of a case (and even before), it’s important to think backwards within the EDRM model to ensure a successful discovery process. Decisions made at the beginning of the case affect the success of those latter stages, so don’t forget to think backwards!

So, what do you think? What do you do at the beginning of a case to ensure success at the end? Please share any comments you might have or if you’d like to know more about a particular topic.

P.S. — Notice anything different about the EDRM graphic?


eDiscovery Daily is Three Years Old!

September 20, 2013

We’ve always been free, now we are three!

It’s hard to believe that we launched the eDiscoveryDaily blog three years ago today. We’re past the “terrible twos” and heading towards pre-school. Before you know it, we’ll be ready to take our driver’s test!

We have seen traffic on our site (from our first three months of existence to our most recent three months) grow an amazing 575%! Our subscriber base has grown over 50% in the last year alone! Back in June, we hit over 200,000 visits on the site and now we have over 236,000!

We continue to appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful posts about eDiscovery trends, best practices and case law. That’s what this blog is all about. And, in each post, we like to ask for you to “please share any comments you might have or if you’d like to know more about a particular topic”, so we encourage you to do so to make this blog even more useful.

We also want to thank the blogs and publications that have linked to our posts and raised our public awareness, including Pinhawk, Ride the Lightning, Litigation Support Guru, Complex Discovery, Bryan College, The Electronic Discovery Reading Room, Litigation Support Today, Alltop, ABA Journal, Litigation Support Blog.com, Litigation Support Technology & News, InfoGovernance Engagement Area, EDD Blog Online, eDiscovery Journal, Learn About E-Discovery, e-Discovery Team ® and any other publication that has picked up at least one of our posts for reference (sorry if I missed any!). We really appreciate it!

As many of you know by now, we like to take a look back every six months at some of the important stories and topics during that time. So, here are some posts over the last six months you may have missed. Enjoy!

Rodney Dangerfield might put it this way – “I Tell Ya, Information Governance Gets No Respect”

Is it Time to Ditch the Per Hour Model for Document Review? Here’s some food for thought.

Is it Possible for a File to be Modified Before it is Created? Maybe, but there are some mechanisms for avoiding that scenario. Best of all, they’re free.

Did you know changes to the Federal eDiscovery Rules are coming? Here’s some more information.

Count Minnesota and Kansas among the states that are also making changes to support eDiscovery.

By the way, since the Electronic Discovery Reference Model (EDRM) annual meeting back in May, several EDRM projects (Metrics, Jobs, Data Set and the new Native Files project) have already announced new deliverables and/or requested feedback.

When it comes to electronically stored information (ESI), ensuring proper chain of custody tracking is an important part of handling that ESI through the eDiscovery process.

Do you self-collect? Don’t Forget to Check for Image Only Files!

The Files are Already Electronic, How Hard Can They Be to Load? A sound process makes it easier.

When you remove a virus from your collection, does it violate your discovery agreement?

Do you think that you’ve read everything there is to read on Technology Assisted Review? If you missed anything, it’s probably here.

Consider using a “SWOT” analysis or Decision Tree for better eDiscovery planning.

If you’re an eDiscovery professional, here is what you need to know about litigation.

BTW, eDiscovery Daily has had 242 posts related to eDiscovery Case Law since the blog began! Forty-four of them have been in the last six months.

Our battle cry for next September? “Four more years!” 🙂


More Updates from the EDRM Annual Meeting – eDiscovery Trends

May 10, 2013

Yesterday, we discussed some general observations from the Annual Meeting for the Electronic Discovery Reference Model (EDRM) group and discussed some significant efforts and accomplishments by the (suddenly heavily talked about) EDRM Data Set project. Here are some updates from other projects within EDRM.

It should be noted these are summary updates and that most of the focus on these updates is on accomplishments for the past year and deliverables that are imminent. Over the next few weeks, eDiscovery Daily will cover each project in more depth with more details regarding planned activities for the coming year.

Model Code of Conduct (MCoC)

The MCoC was introduced in 2011 and became available for organizations to subscribe to last year. To learn more about the MCoC, you can read the code online here, or download it as a 22 page PDF file here. Subscribing is easy! To voluntarily subscribe to the MCoC, you can register on the EDRM website here. Identify your organization, provide information for an authorized representative and answer four verification questions (truthfully, of course) to affirm your organization’s commitment to the spirit of the MCoC, and your organization is in! You can also provide a logo for EDRM to include when adding you to the list of subscribing organizations. Pending a survey of EDRM members to determine if any changes are needed, this project has been completed. Team leaders include Eric Mandel of Zelle Hofmann, Kevin Esposito of Rivulex and Nancy Wallrich.

Information Governance Reference Model (IGRM)

The IGRM team has continued to make strides and improvements on an already terrific model. Last October, they unveiled the release of version 3.0 of the IGRM. As their press release noted, “The updated model now includes privacy and security as primary functions and stakeholders in the effective governance of information.” IGRM continues to be one of the most active and well participated EDRM projects. This year, the early focus – as quoted from Judge Andrew Peck’s keynote speech at Legal Tech this past year – is “getting rid of the junk”. Project leaders are Aliye Ergulen from IBM, Reed Irvin from Viewpointe and Marcus Ledergerber from Morgan Lewis.

Search

One of the best examples of the new, more agile process for creating deliverables within EDRM comes from the Search team, which released its new draft Computer Assisted Review Reference Model (CARRM), which depicts the flow for a successful Computer Assisted Review project. The entire model was created in only a matter of weeks. Early focus for the Search project for the coming year includes adjustments to CARRM (based on feedback at the annual meeting). You can also still send your comments regarding the model to mail@edrm.net or post them on the EDRM site here. A webinar regarding CARRM is also planned for late July. Kudos to the Search team, including project leaders Dominic Brown of Autonomy and also Jay Lieb of kCura, who got unmerciful ribbing for insisting (jokingly, I think) that TIFF files, unlike Generalissimo Francisco Franco, are still alive. 🙂

Jobs

In late January, the Jobs Project announced the release of the EDRM Talent Task Matrix diagram and spreadsheet, which is available in XLSX or PDF format. As noted in their press release, the Matrix is a tool designed to help hiring managers better understand the responsibilities associated with common eDiscovery roles. The Matrix maps responsibilities to the EDRM framework, so that the associated eDiscovery duties can be assigned to the appropriate parties. Project leader Keith Tom noted that next steps include surveying EDRM members regarding the Matrix, requesting and co-authoring case studies and white papers, and creating a short video on how to use the Matrix.

Metrics

In today’s session, the Metrics project team unveiled the first draft of the new Metrics model to EDRM participants! Feedback was provided during the session and the team will make the model available for additional comments from EDRM members over the next week or so, with a goal of publishing for public comments in the next two to three weeks. The team is also working to create a page to collect Metrics measurement tools from eDiscovery professionals that can benefit the eDiscovery community as a whole. Project leaders Dera Nevin of TD Bank and Kevin Clark noted that June is “budget calculator month”.

Other Initiatives

As noted yesterday, there is a new project to address standards for working with native files in the different EDRM phases led by Eric Mandel from Zelle Hofmann and also a new initiative to establish collection guidelines, spearheaded by Julie Brown from Vorys. There is also an effort underway to refocus the XML project, as it works to complete the 2.0 version of the EDRM XML model. In addition, there was quite a spirited discussion as to where EDRM is heading as it approaches ten years of existence and it will be interesting to see how the EDRM group continues to evolve over the next year or so. As you can see, a lot is happening within the EDRM group – there’s a lot more to it than just the base Electronic Discovery Reference Model.

So, what do you think? Are you a member of EDRM? If not, why not? Please share any comments you might have or if you’d like to know more about a particular topic.


Reporting from the EDRM Annual Meeting and a Data Set Update – eDiscovery Trends

May 9, 2013

The Electronic Discovery Reference Model (EDRM) Project was created in May 2005 by George Socha of Socha Consulting LLC and Tom Gelbmann of Gelbmann & Associates to address the lack of standards and guidelines in the electronic discovery market. Now, beginning its ninth year of operation with its annual meeting in St. Paul, MN, EDRM is accomplishing more than ever to address those needs. Here are some highlights from the meeting, and an update regarding the (suddenly heavily talked about) EDRM Data Set project.

Annual Meeting

Twice a year, in May and October, eDiscovery professionals who are EDRM members meet to continue the process of working together on various standards projects. This will be my eighth year participating in EDRM at some level and, oddly enough, I’m assisting with PR and promotion (how am I doing so far?). eDiscovery Daily has referenced EDRM and its phases many times in the 2 1/2 years plus history of the blog – this is our 144th post that relates to EDRM!

Some notable observations about today’s meeting:

New Participants: More than half the attendees at this year’s annual meeting are attending for the first time. EDRM is not just a core group of “die-hards”, it continues to find appeal with eDiscovery professionals throughout the industry.
Agile Approach: EDRM has adopted an Agile approach to shorten the time to complete and publish deliverables, a change in philosophy that facilitated several notable accomplishments from working groups over the past year including the Model Code of Conduct (MCoC), Information Governance Reference Model (IGRM), Search and Jobs (among others). More on that tomorrow.
Educational Alliances: For the first time, EDRM has formed some interesting and unique educational alliances. In April, EDRM teamed with the University of Florida Levin College of Law to present a day and a half conference entitled E-Discovery for the Small and Medium Case. And, this June, EDRM will team with Bryan University to provide an in-depth, four-week E-Discovery Software & Applied Skills Summer Immersion Program for Law School Students.
New Working Group: A new working group, to be led by Eric Mandel of Zelle Hoffman, was formed to address standards for working with native files in the different EDRM phases.

Tomorrow, we’ll discuss the highlights for most of the individual working groups. Given the recent amount of discussion about the EDRM Data Set group, we’ll start with that one today!

Data Set

The EDRM Enron Data Set has been around for several years and has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011). The data in the EDRM Enron PST Data Set files is sourced from the FERC Enron Investigation release made available by Lockheed Martin Corporation. It was reconstituted as PST files with attachments for the EDRM Data Set Project. So, in essence, EDRM took data that was already in the public domain and made it much more usable. Initially, the data was made available for download on the EDRM site; it was subsequently moved to Amazon Web Services (AWS).

In the past several days, there has been much discussion about the personally identifiable information (“PII”) available within the FERC release (and consequently the EDRM Data Set), including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers. Consequently, the EDRM Data Set has been taken down from the AWS site.

The Data Set team, led by Michael Lappin of Nuix and Eric Robi of Elluma Discovery, has been working on a process (using predictive coding technology) to identify and remove the PII data from the EDRM Data Set. Discussions about this process began months ago, prior to the recent discussions about the PII data contained within the set. The team has completed this iterative process for V1 of the data set (which contains 1,317,158 items), identifying and removing 10,568 items with PII, HIPAA and other sensitive information. This version of the data set will be made available within the EDRM community shortly for peer review testing. The data set team will then repeat the process for the larger V2 version of the data set (2,287,984 items). A timetable for republishing both sets should be available soon. The efforts of the Data Set team on this project should also pay dividends in developing and standardizing processes that eDiscovery professionals can use to identify and eliminate sensitive data in their own data sets.

The team has also implemented a Forensic Files Testing Project site where users can upload their own “modern”, non-copyrighted file samples that are typically encountered during electronic discovery processing to provide a more diverse set of data than is currently available within the Enron data set.

So, what do you think? How has EDRM impacted how you manage eDiscovery? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Daily Is Thirty! (Months Old, That Is)

March 21, 2013

Thirty months ago yesterday, eDiscovery Daily was launched. It’s hard to believe that it has been 2 1/2 years since our first three posts that debuted on our first day. 635 posts later, a lot has happened in the industry that we’ve covered. And, yes we’re still crazy after all these years for committing to a daily post each business day, but we still haven’t missed a business day yet. Twice a year, we like to take a look back at some of the important stories and topics during that time. So, here are just a few of the posts over the last six months you may have missed. Enjoy!

Industry Consolidation Continues: If you think there have been a lot of acquisitions in the eDiscovery industry, you’re right.
Don’t Be “Duped”: Files with Different HASH Values Can Still Be the Same.
Want the Right Balance of Recall and Precision in Your Search? Try Proximity Searches.
Are You Requesting the Best Production Format for Your Case? Maybe not, according to Craig Ball.
In a recent case, Both Sides Were Instructed to Use Predictive Coding or Show Cause Why Not.
Did you know that Only One in Eight Records Managers Trusts Their ESI?
Plaintiff Hammered with Case Dismissal for “Egregious” Discovery Violations: Apparently, destroying your first computer with a sledgehammer and using Evidence Eliminator and CCleaner on your second computer are not considered to be best practices for preservation.
Even a “Rap Weasel” can be sanctioned for spoliation of data. It isn’t every day that we cite The Hollywood Reporter for a story.
Problems with Review? It’s Not the End of the World.
$2.9 Billion? Is the eDiscovery Software Market Going to Double by 2017?
Want to catch up on 2012 eDiscovery cases? Here is your chance.
Is 31,000 Missed Relevant Documents an Acceptable Outcome for Predictive Coding? It might be, if the alternative is 62,000 missed relevant documents.
What do various eDiscovery thought leaders think about the industry? For the third year in a row, we find out.
Must Losing Plaintiff Pay Defendant $2.8 Million for Predictive Coding of One Million Documents? Court Says Yes.
Do you have some misperceptions about predictive coding? Maybe so. Here are Five Common Myths About Predictive Coding.

In addition, Jane Gennarelli has been publishing an excellent series to introduce new eDiscovery professionals to the litigation process and litigation terminology. Here is the latest post, which includes links to the previous twenty-one posts.

Thanks for noticing us! We’ve nearly quadrupled our readership since the first six month period and almost septupled (that’s grown 7 times in size!) our subscriber base since those first six months! We appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful eDiscovery news and analysis. And, as always, please share any comments you might have or if you’d like to know more about a particular topic!

No Bates Numbers in a Native Production? Get Over It! – eDiscovery Best Practices

October 22, 2012

Last week, we discussed the benefits of requesting document productions in native format, including the ability to use Early Data Assessment/FirstPass Review applications to analyze your opponent’s produced data and metadata, using capabilities like email analytics and message thread analysis (where missing emails in threads can be identified), synonym searching, fuzzy searching and domain categorization. If you don’t understand the benefits of receiving the underlying metadata, try reviewing an image of an Excel spreadsheet and see if you can understand how the numbers were calculated without the underlying formulas. Not so easy, is it?

However, one objection that attorneys raise against producing documents in native format is that native files are not conducive to Bates labeling. Some native file types, such as Excel files, are not stored in a typical paginated, document-oriented format, so it is difficult or even impossible to determine the number of pages for each file. Other file types vary the number of pages and placement of text on pages based on the document styles applied. For example, Word uses document styles based on the fonts installed on the workstation to display the content of the Word document; however, if the chosen font is not available when the document is viewed on another workstation, Word will substitute another font and style, which can change the formatting and even the page on which content appears. Since attorneys are so used to having a Bates stamp on each page of a document, many are still known to produce (and request production) in an image format, adding costs unnecessarily. Would those same attorneys print out every email in their Inbox before reading them?

However, most courts today accept a file-level “Bates” or Unique Production Identifier (UPI) where each file is named with a prefix and a sequential number. These numbers look just like Bates numbers, except they’re not stamped in the file itself; instead, they are used as the file name. These productions are usually accompanied by a data file, containing metadata for loading into a review tool, which includes the original file name and path of each file being produced.
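
To make the file-level numbering concrete, here is a minimal Python sketch of how such identifiers might be assigned. It is only an illustration, not how any particular review tool does it: the function name, the "PROD" prefix, the six-digit width and the column names are all assumptions made for this example, and the sample paths are made up.

```python
from pathlib import PurePosixPath

def assign_production_ids(original_paths, prefix="PROD", width=6):
    """Give each file a file-level identifier (prefix + sequential number).

    Returns one row per file, suitable for writing out as a data (load) file.
    The prefix, width and column names are illustrative assumptions.
    """
    rows = []
    for seq, original in enumerate(original_paths, start=1):
        path = PurePosixPath(original)
        upi = f"{prefix}{seq:0{width}d}"
        rows.append({
            "ProductionID": upi,
            # The identifier is used as the file name; nothing is stamped in the file.
            "ProducedFileName": upi + path.suffix,
            # Keep the original name and path so the file can be traced back.
            "OriginalFileName": path.name,
            "OriginalPath": str(path.parent),
        })
    return rows

# Hypothetical sample paths, for illustration only.
for row in assign_production_ids(["finance/2001/budget.xlsx", "mail/memo.docx"]):
    print(row)
```

The point of the sketch is simply that the identifier replaces the file name, while the accompanying data file carries the original name and path.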

So, how do you get around the issue of referencing individual page numbers for presentation at deposition or trial? Those files can still be converted to image (or printed) and a number applied for presentation. It’s common to simply use the Bates number as the prefix, followed by a sequential number, so page 6 of the 45th file in the production could be stamped like this: PROD000045-00006. This enables you to tie back to the production, yet only convert to image those files that need to be presented.
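
The page-level scheme above is easy to sketch in Python. This is a hedged illustration only: the function names and the five-digit page width are assumptions chosen to match the PROD000045-00006 example, not a standard.

```python
def page_label(file_id, page, width=5):
    """Build a page-level label from a file-level identifier and a page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return f"{file_id}-{page:0{width}d}"

def page_labels(file_id, page_count, width=5):
    """Labels for every page of a file that has been converted to image."""
    return [page_label(file_id, page, width) for page in range(1, page_count + 1)]

# Page 6 of the 45th file in the production.
print(page_label("PROD000045", 6))  # PROD000045-00006
```

Because the file-level identifier is the prefix of every page label, any page presented at deposition or trial ties straight back to the produced file.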

So, what do you think? How do you handle production numbering in native productions? Please share any comments you might have or if you’d like to know more about a particular topic.
