EDRM Archives

Cost Calculator for Document Review – eDiscovery Best Practices

January 3, 2014

A couple of weeks ago, we discussed budget calculators available from the Metrics section of the Electronic Discovery Reference Model (EDRM) web site and, two days later, began a review of the budget calculators, beginning with the E-Discovery Cost Estimator for Processing and Review workbook provided by Julie Brown at Vorys law firm. Today, we will continue our review of the calculators with a look at the Doc Review Cost Calculator.

As described on the site, this budget calculator focuses on review, which is universally considered to be the most expensive phase of the eDiscovery process (by far). From assumptions entered by users, it calculates per-document and per-hour (a) low and high price estimates, (b) low and high costs on a per page basis, and (c) low and high costs on a per document basis.

To use it, enter assumptions in the white and yellow cells in columns B, C, and D. Calculations are shown in columns D through T.

Assumptions that you can provide include: pages per document, low and high page counts in the collection, low and high time to complete the review project (in weeks) and reviewer hours per week, proposed rates for review (hourly and per document), low and high pages per hour rates for review (from which documents per hour rates are computed), proposed rates for review management (hourly and per document) and percentage of the collection to QC.

From the entered assumptions, the model will provide calculations to illustrate the low and high cost estimates for the low and high page count estimates, for both a per-document and a per-hour review billing structure. It will also estimate a range of the number of reviewers needed to complete the project within the time frames specified, to help you plan on staffing necessary to meet proposed deadlines. The detailed calculations are stored in a hidden sheet called “Calculations” – you can unhide it if you want to see how the review calculation “sausage” is made.
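To make the arithmetic concrete, here is a minimal Python sketch of the kind of calculation described above. It is not the workbook's actual formula set (those live in the hidden "Calculations" sheet); the field names, the function name and the simplifications (no QC percentage, no review management rates) are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class ReviewAssumptions:
    """Hypothetical inputs loosely mirroring the calculator's assumption cells."""
    total_pages: float          # page count in the collection (low or high)
    pages_per_document: float   # average pages per document
    pages_per_hour: float       # reviewer throughput, in pages
    hourly_rate: float          # proposed hourly review rate
    per_document_rate: float    # proposed per-document review rate
    weeks: float                # time allowed to complete the review
    hours_per_week: float       # reviewer hours per week


def estimate_review(a: ReviewAssumptions) -> dict:
    """Estimate cost under both billing structures, plus reviewer headcount."""
    documents = a.total_pages / a.pages_per_document
    # Documents per hour is derived from the pages-per-hour rate.
    docs_per_hour = a.pages_per_hour / a.pages_per_document
    review_hours = documents / docs_per_hour
    hours_per_reviewer = a.weeks * a.hours_per_week
    return {
        "documents": documents,
        "review_hours": review_hours,
        "cost_hourly_billing": review_hours * a.hourly_rate,
        "cost_per_document_billing": documents * a.per_document_rate,
        # Round up: a fraction of a reviewer still means one more person.
        "reviewers_needed": math.ceil(review_hours / hours_per_reviewer),
    }


if __name__ == "__main__":
    low = ReviewAssumptions(100_000, 4, 200, 50.0, 1.0, 5, 40)
    print(estimate_review(low))
```

Running the function once with your low assumptions and once with your high assumptions gives a range comparable in spirit to the one the workbook reports. For a native file collection, setting pages_per_document to 1 turns the pages-per-hour rate into a documents-per-hour rate.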

This model uses an “old school” assessment of a document collection based on page counts, so to use it with native file collections (where page counts aren’t known), you have to set the pages per document to 1 – your review rate then becomes documents (files) per hour.

Suggestions for improvement:

Make all of the enterable assumption cells yellow: some are currently white (the same color as the computed cells), which makes the assumption fields harder to tell apart from the computed cells;
Protect the sheet and lock down the computed cells (at least in the main sheet) to avoid accidental overwriting of calculations (with the ability to unprotect the sheet if a formula requires tweaking);
Tie a line or bar graph to the numbers to represent the differences graphically;
Provide some notes to explain some of the cells (especially the assumption cells) in more detail.

Nonetheless, this workbook would certainly be useful for estimating review costs and number of reviewers needed to complete a large scale review, not only at the start, but also to provide updated estimates as review commences, so you can adjust cost estimates and staffing needs as you go. You can download this calculator individually or a zip file containing all four calculators here. In a few days, we will continue our review of the current EDRM budget calculators in more detail with the ESI Cost Budget calculator from Browning Marean of DLA Piper law firm.

So, what do you think? How do you estimate eDiscovery costs? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Vorys Project Ballpark Cost Estimator for ESI Processing and Review – eDiscovery Best Practices

December 19, 2013

On Tuesday, we discussed budget calculators available from the Metrics section of the Electronic Discovery Reference Model (EDRM) web site. Today, we will begin a more in-depth discussion of the budget calculators, beginning with the E-Discovery Cost Estimator for Processing and Review workbook provided by Julie Brown at Vorys law firm.

As described on the site, this budget calculator contains two worksheets. The Linear-search-analytics worksheet allows users to calculate ballpark cost estimates for processing and review under three “cases” and compare the results. The cases are:

Case 1: Full blown processing and linear review
Case 2: Search terms used to cull data during processing
Case 3: Use analytical culling tool

With each case, users are able to see the cost consequences that result from changing variables such as Data Volume, Volume after culling, and Pre-processing cost/GB. The cost differences are shown numerically, as well as via two graphs, a 3D horizontal bar graph that shows the cost differences between the three cases (see above graphic for an example) and a 2D horizontal bar graph that shows the cost differences, with a breakdown of processing and review costs for each.

The Linear-size examples worksheet allows users to compare four versions of Case 1. Users are able to see the cost consequences (in both numeric and 2D vertical bar graph form) that result from changing any combination of six variables: Data Volume, Processing Cost/GB, Pages per GB, Docs Reviewed by Hour, Hourly Rate, and FTEs.

Both spreadsheets provide useful information and are well controlled to differentiate the data entry cells (with no fill color in the cell) from the calculation only cells (with a fill color) and the sheets are protected to prohibit accidental overwriting of the calculated cells (the sheets aren’t locked with a password, so you can override it if you want to make adjustments). The sheet is designed to help you generate a ballpark cost for processing and review based on the variables provided, so it doesn’t include any fixed overhead costs such as software, hardware or facility costs. It also doesn’t include any management overhead, so it’s essentially a model for variable costs only, but it could be useful to help you determine at what volume an analytical culling tool might pay for itself.
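To illustrate the break-even idea, here is a rough Python sketch of a variable-cost comparison between full linear review and a culled review. The formulas are assumptions for illustration, not the workbook's own: processing is charged per GB on the full volume, an optional culling fee is charged per GB on the full volume, and review cost is driven by the documents that remain after culling. All parameter names are hypothetical.

```python
def case_cost(volume_gb, reviewed_gb, processing_cost_per_gb,
              docs_per_gb, docs_per_hour, hourly_rate,
              culling_cost_per_gb=0.0):
    """Return (processing_cost, review_cost, total) for one scenario.

    Assumes variable costs only: no software, hardware, facility
    or management overhead is included.
    """
    # Processing (and any culling fee) applies to the full data volume.
    processing = volume_gb * (processing_cost_per_gb + culling_cost_per_gb)
    # Review effort depends only on what is left after culling.
    review_hours = reviewed_gb * docs_per_gb / docs_per_hour
    review = review_hours * hourly_rate
    return processing, review, processing + review


if __name__ == "__main__":
    # Linear review: everything processed is reviewed.
    linear = case_cost(100, 100, 100.0, 5000, 50, 50.0)
    # Analytical culling: an extra per-GB fee, but only 30 GB reaches review.
    culled = case_cost(100, 30, 100.0, 5000, 50, 50.0,
                       culling_cost_per_gb=50.0)
    print("linear total:", linear[2])
    print("culled total:", culled[2])
    print("savings:", linear[2] - culled[2])
```

Varying volume_gb and reviewed_gb in a sketch like this is one way to see at what volume the culling fee is outweighed by the reduction in review hours.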

Suggestions for improvement:

Create a common section for data entry variables so you don’t have to re-enter them for each comparison case to save time and avoid data inconsistencies;
While you’re at it, add variables for pages per document and hours per week – right now, you have to unprotect the sheet and change the formulas if you want to change those variables (not all document sets or work weeks are the same);
Add sheets to compare versions of Case 2 and Case 3, like the sheet for Case 1.

Nonetheless, this workbook is quite useful if you want to obtain a ballpark estimate and comparison for processing and review and compare costs for alternatives. You can download this calculator individually or a zip file containing all four calculators here. After the first of the year, we will continue our review of the current EDRM budget calculators in more detail.

So, what do you think? How do you estimate eDiscovery costs? Please share any comments you might have or if you’d like to know more about a particular topic.


Want to Estimate your eDiscovery Budget? Use One of These Calculators – eDiscovery Best Practices

December 17, 2013

It has been a busy year for the Electronic Discovery Reference Model (EDRM). In addition to announcing a transition to nonprofit status by May 2014, since the May annual meeting, several EDRM projects (Metrics, Jobs, Data Set and the new Native Files project) have already announced new deliverables and/or requested feedback. Now, another resource is available via the EDRM site – Budget Calculators!

It can be difficult to estimate the total costs for eDiscovery at the outset of a case. There are a number of variables and options that could impact the budget by a wide margin and it may be difficult to compare costs for various options for processing and review. However, thanks to the EDRM Metrics team and contributing members, budget calculator Excel workbooks are available to enable you to at least “ballpark” the costs. The budget calculator spreadsheets are designed to help organizations estimate likely eDiscovery costs, based on assumptions that you provide, such as average hourly rates for contract reviewers or average number of pages per document.

There are four budget calculators that are currently available. They are:

UF LAW E-Discovery Project Ballpark Cost Estimator for ESI Processing and Review: This budget calculator contains two worksheets. The first worksheet allows users to calculate ballpark cost estimates for processing and review under three “cases” (Full blown processing and linear review, Search terms used to cull data during processing and Use analytical culling tool) and compare the results. The second worksheet allows users to compare four versions of Case 1. This workbook has been provided by University of Florida Levin College of Law and Vorys law firm.
Doc Review Cost Calculator: This budget calculator focuses on review. From assumptions entered by users, it calculates per-document and per-hour (a) low and high price estimates, (b) low and high costs on a per page basis, and (c) low and high costs on a per document basis.
ESI Cost Budget: This budget calculator estimates costs by project phase. The phases are: ESI Collection, ESI Processing, Paper Collection and Processing, Document Review, Early Data Assessment, Phase 1 Review, Phase 2 Review, Production, Privilege Review, Review of Opposition’s Production and Hosting Costs. This workbook has been provided by Browning Marean, DLA Piper law firm.
EDRM UTBMS eDiscovery Code Set Calculator: This budget calculator uses the UTBMS e-discovery codes as a starting point for calculating estimated e-discovery expenses. Users enter anticipated average hourly rates for: Partners, Associates, Paralegals, Contract reviewers, In-house resources and Vendors, along with total estimated hours for each relevant group and total estimated associated disbursements for each relevant L600-series UTBMS code. The spreadsheet then displays: a summary of the estimated costs, details of the estimated costs for each combination, totals by type of person and totals by individual and higher-level UTBMS codes. This workbook has been provided by Browning Marean, DLA Piper law firm; and George Socha, Socha Consulting.
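As a rough illustration of the roll-up that the code set calculator describes (rates times hours per group, plus disbursements per code), here is a small Python sketch. It is an assumption about how such totals could be computed, not the spreadsheet's actual logic; the function name, the data layout and the sample code labels are placeholders.

```python
from collections import defaultdict


def summarize_budget(rates, hours, disbursements):
    """Total estimated costs by type of person and by task code.

    rates:         {role: hourly rate}
    hours:         {(code, role): estimated hours}
    disbursements: {code: estimated disbursements}
    """
    by_role = defaultdict(float)
    by_code = defaultdict(float)
    for (code, role), estimated_hours in hours.items():
        cost = estimated_hours * rates[role]
        by_role[role] += cost
        by_code[code] += cost
    # Disbursements belong to a code, not to a type of person.
    for code, amount in disbursements.items():
        by_code[code] += amount
    return dict(by_role), dict(by_code), sum(by_code.values())


if __name__ == "__main__":
    rates = {"Associate": 300.0, "Contract reviewer": 50.0}
    hours = {
        ("L650", "Associate"): 10,
        ("L650", "Contract reviewer"): 200,
        ("L620", "Associate"): 5,
    }
    disbursements = {"L620": 1000.0}
    by_role, by_code, total = summarize_budget(rates, hours, disbursements)
    print(by_role)
    print(by_code)
    print(total)
```

Keying the hours by (code, role) pairs keeps the per-combination detail available while still allowing totals in either direction.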

You can download each calculator individually or a zip file containing all four calculators. If you have your own budget calculator, you can also submit yours to EDRM to share with others. The calculators are available here. On Thursday, we will begin reviewing the current budget calculators in more detail.

So, what do you think? How do you estimate eDiscovery costs? Please share any comments you might have or if you’d like to know more about a particular topic.


Another New Deliverable from EDRM – eDiscovery Trends

November 14, 2013

Do you know what container files are? How about the L600 Code Series? Do you know common methods for culling data? What about the difference between a targeted and non-targeted collection strategy?

If you don’t know the answer to these and many other questions related to eDiscovery, you should check out the latest deliverable from the Electronic Discovery Reference Model (EDRM) Metrics team, the EDRM Metrics Glossary.

As noted in their press release announcement, the glossary contains definitions for 90 terms used in connection with the updated EDRM Metrics Model published in June 2013 (which was covered by the blog here). The EDRM Metrics Model provides a framework for planning, preparation, execution and follow-up of eDiscovery matters and projects by depicting the relationship between the eDiscovery process and how information, activities and outcomes may be measured.

The new glossary was developed by the EDRM Metrics team, led by Kevin Clark and Dera Nevin with special assistance from team members Erin Corken, Eric Derk, Matthew Knouff, Carla Pagan, David Robertson, Bob Rohlf, Jim Taylor, Vicki Towne and Sonia Waiters.

The entire EDRM Metrics Glossary can be found here.

It has been a busy year for EDRM. In addition to announcing a transition to nonprofit status by May 2014, since the May annual meeting, several EDRM projects (Metrics, Jobs, Data Set and the new Native Files project) have already announced new deliverables and/or requested feedback. And, just a couple of weeks ago, EDRM published new Collection Standards for collecting electronically stored information (ESI). And, there is still almost half a year to go before next year’s annual meeting. Wow.

So, what do you think? Will you use the new EDRM Metrics Glossary? Please share any comments you might have or if you’d like to know more about a particular topic.


EDRM Publishes Collection Standards – eDiscovery Trends

November 1, 2013

On the heels of announcing a transition to nonprofit status by May 2014, the Electronic Discovery Reference Model (EDRM) has now introduced Collection Standards for electronically stored information (ESI) for public comment.

In their press release to announce the new standards, EDRM noted that a group of attendees at this past May’s annual meeting “decided that ‘collection’ of ESI had evolved to the point that it made sense to document collection best practices and considerations for developing a collection strategy. The team, including Julie Brown, Teri Christensen, Kevin Clark, Sean d’Albertis, Kevin Esposito, Faisal Habib, Valerie Lloyd, Rick Nalle, Andrea Donovan Napp and John Wilson, has collaborated over the last several months to develop these standards which are now available for public comment.”

The collection standards page, which is available here, defines best practices to identify what processes are repeatable and the understandable risks and rewards that can be used to evaluate a strategy in various cases. It focuses on different approaches for collection, including:

Forensic Image (Physical or Logical Target)
Custom Content/Targeted Image
Non-Forensic Copy
Exports – Harvesting Email
Exports – Non-Email
Exceptions (technologies that the standards don’t yet address, including mobile devices, instant messaging, Macs, International Protocols, and social media/other types of cloud storage).

Each approach includes definitions, pros and cons of that approach and a glossary of terms. Defined terms are hyperlinked with pop-up definitions, making it easy to define any terms that need it.

Want to know the different types of email formats that are typically exported for discovery purposes? This document has it. Want to know when you should consider creating a forensic image of the data in question? It’s there too. The standards provide clear best practices in easy-to-understand terms that should be a useful reference for anybody who will need to tackle ESI collection for their cases. Good move to publish the standards they have now instead of waiting to address the exception technologies, which are much more complex.

According to the press release, the public comment period extends through November 15, 2013, which is only 17 days after the standards were officially published. That time period seems a bit short to me; hopefully, EDRM will consider extending it.

It’s shaping up to be a banner year for EDRM, as, since the May annual meeting, several EDRM projects (Metrics, Jobs, Data Set and the new Native Files project) have already announced new deliverables and/or requested feedback.

So, what do you think? Will these new Collection Standards be a useful best practices guide? Please share any comments you might have or if you’d like to know more about a particular topic.


Transitional Times for Two Big Names in eDiscovery – eDiscovery Trends

October 29, 2013

The more things change, the more they stay the same. Even for popular entities such as EDRM and the eDJ Group.

As reported in Law Technology News (EDRM Transitions to Nonprofit Status) by none other than George Socha, co-founder (along with Tom Gelbmann) of the Electronic Discovery Reference Model (EDRM), by May 2014, EDRM will become a nonprofit organization.

As Socha notes in the article:

“When we launched EDRM, we figured it would have a one-year lifespan — focused on addressing two fundamental sets of questions:

1. What is electronic discovery?

2. What might we all do about it at a practical level?”

Now, they’re in their ninth year, growing from 35 participants at that first meeting in May 2005 to over 260 organizations that have participated in EDRM. On a personal note, I’ve participated since the second year and eDiscovery Daily has published 159 blog posts to date about EDRM and its phases.

For EDRM to be an ongoing entity, it has to be about more than the founders. As Socha stated in the article, “for EDRM to grow and remain relevant and viable over the long term it cannot continue to be viewed as ‘the George-and-Tom show.’ We heartily agree.”

Transition is also afoot for another organization that has been a terrific resource for eDiscovery information: The eDJ Group. If you’re not familiar with the name, you probably recognize their web site – eDiscovery Journal. As Sean Doherty reports in Law Technology News (eDJ Group Puts a New Face on a New Website), eDiscoveryjournal.com has been retired and replaced by the new website (http://edjgroupinc.com).

As Doherty notes in his article, “The big news: The new eDJ website, unlike the eDiscovery Journal, is not supported by vendors. The eDJ Group now offers Platinum, Gold and free (with registration) subscriptions to content comprising research reports, surveys, analyst notes, blogs and the eDJ Matrix”, which is “a SQL database of e-discovery technology, applications and services”.

Doherty also reports that “Paid subscriptions to eDJ content start at $500 for Gold membership, which provides access to executive summaries, short reports, analysts’ notes and the eDJ Matrix. A platinum subscription provides full access to all content and a free subscription with registration includes access to blogs, free reports and the Matrix. Paid subscriptions are sans advertisement.”

It will be interesting to see how the changes impact both organizations.

So, what do you think? Where do you get your information about eDiscovery? Besides eDiscovery Daily, of course! Please share any comments you might have or if you’d like to know more about a particular topic.


For Successful Discovery, Think Backwards – eDiscovery Best Practices

October 8, 2013

The Electronic Discovery Reference Model (EDRM) has become the standard model for the workflow for handling electronically stored information (ESI) in discovery. But, to succeed in discovery, regardless of whether you’re the producing party or the receiving party, it might be helpful to think about the EDRM model backwards.

Why think backwards?

You can’t have a successful outcome without envisioning the successful outcome that you want to achieve. The end of the discovery process includes the production and presentation stages, so it’s important to determine what you want to get out of those stages. Let’s look at them.

Presentation

As a receiving party, it’s important to think about what types of evidence you need to support your case when presenting at depositions and at trial – this is the type of information that needs to be included in your production requests at the beginning of the case.

Production

The format of the ESI produced is important to both sides in the case. For the receiving party, it’s important to get as much useful information included in the production as possible. This includes metadata and searchable text for the produced documents, typically with an index or load file to facilitate loading into a review application. The most useful form of production is native format files with all metadata preserved as used in the normal course of business.

For the producing party, controlling costs is a priority, so it’s important to agree to a production format that minimizes production costs. Converting files to an image based format (such as TIFF) adds costs, so producing in native format can be cost effective for the producing party as well. It’s also important to determine how to handle issues such as privilege logs and redaction of privileged or confidential information.

Addressing production format issues up front will maximize cost savings and enable each party to get what they want out of the production of ESI.

Processing-Review-Analysis

It also pays to make decisions early in the process about issues that affect processing, review and analysis. How should exception files be handled? What do you do about files that are infected with malware? These are examples of issues that need to be decided up front to determine how processing will be handled.

As for review, the review tool being used may impact production specs in terms of how files are viewed and production of load files that are compatible with the review tool, among other considerations. As for analysis, surely you test search terms to determine their effectiveness before you agree on those terms with opposing counsel, right?

Preservation-Collection-Identification

Long before you have to conduct preservation and collection for a case, you need to establish procedures for implementing and monitoring litigation holds, as well as prepare a data map to identify where corporate information is stored for identification, preservation and collection purposes.

As you can see, at the beginning of a case (and even before), it’s important to think backwards within the EDRM model to ensure a successful discovery process. Decisions made at the beginning of the case affect the success of those latter stages, so don’t forget to think backwards!

So, what do you think? What do you do at the beginning of a case to ensure success at the end? Please share any comments you might have or if you’d like to know more about a particular topic.

P.S. — Notice anything different about the EDRM graphic?


Five More Things to Know Before Moving eDiscovery to the Cloud – eDiscovery Best Practices

October 1, 2013

Yesterday, we covered the first five items in Joel Jacob’s article in Information Management.com (10 Things to Know Before Moving E-Discovery to the Cloud), which provides an interesting checklist for those considering a move to cloud computing. Here are the remaining five items, with some comments from me.

6. Assess potential – and realistic – risks associated with security, data privacy and data loss prevention. The author notes the importance of assessing security risks, and, of course, it’s important to understand how the cloud provider handles security and that there are clear-cut policies and objectives in place. It’s also important to compare the cloud provider’s security mechanisms to your own security mechanisms. Any cloud provider “worth their salt” should have a comprehensive security plan that meets or exceeds that of most organizations.

7. Develop an implementation plan, including an internal communication strategy. The author advocates getting legal and IT on the same page, testing and conducting a proof of concept on work procedures and identifying quantifiable metrics for evaluating the system/service. All solid ideas.

8. Leverage the success or adoption of other SaaS solutions in the organization to lessen resistance. The author notes that “process of moving to the cloud and/or moving e-discovery to the cloud will need to be driven through cultural change management”. However, people in your organization likely already use several SaaS based solutions. Here are some of the most popular ones: Amazon, Facebook, Twitter, eBay and YouTube. Oh, and possibly Google Docs and SalesForce.com as well. That should address resistance concerns.

9. Run a pilot on a small project before moving to larger, mission-critical matters. The author advocates finding a test data set or dormant case that has known outcomes, and running it in the new cloud solution. The cloud provider should enable you to do so via a no risk trial (shameless plug warning, here’s ours), so that you can truly try it before you buy it, with your own data.

10. Understand you are still the ultimate custodian of all electronically stored information. As the author notes, “The data belongs to you, and the burden of controlling it falls on you. The Federal Rules of Civil Procedure state that no matter where the data is hosted, the company that owns it is ultimately responsible for it.” That’s why it’s critical to address questions about where the data is stored and mechanisms for securing your company’s data. If you can’t answer those questions to your satisfaction with the cloud provider you’re evaluating, perhaps they’re not the provider for you.

So, what do you think? Have you implemented a SaaS based solution for eDiscovery? Please share any comments you might have or if you’d like to know more about a particular topic.


10 Things to Know Before Moving eDiscovery to the Cloud – eDiscovery Best Practices

September 30, 2013

Software as a Service (SaaS) accounted for 49 percent of all eDiscovery software revenues tracked in 2011, according to Gartner’s report, Market Trends: Automated, Analytical Approaches Drive the Enterprise E-Discovery Software Market. Joel Jacob’s article in Information Management.com (10 Things to Know Before Moving E-Discovery to the Cloud) provides an interesting checklist for those considering a move to cloud computing. Here are the first five items, with some comments from me.

1. Actively involve all stakeholders across multiple departments. The article promotes involving “as many stakeholders and members of management as possible, typically from legal, IT, compliance, security and any other department that may be impacted by a new model”. Legal should also include outside counsel when appropriate – they will often be the heaviest users of the application, so it should be easy for them to learn and use.

2. Document and define areas of potential cost savings. Jacob advocates considering the eDiscovery process as defined by the Electronic Discovery Reference Model (EDRM). It’s easy to forget some of the cost savings and benefits that cloud computing can offer – not only reduction or elimination of hardware and software costs, but also reduction or elimination of personnel to support in-house systems, as well.

3. Evaluate the e-discovery platform first and the cloud options second. Clearly, the eDiscovery platform must meet the needs of the organization and the users or it doesn’t matter where it’s located. However, it seems counter-productive to spend time evaluating platforms that could be ruled out because of the cloud options. At the very least, identify any cloud “deal breakers” and eliminate any platforms that don’t fit with the required cloud model.

4. Benchmark your existing e-discovery processes including data upload, processing, review and export. This, of course, assumes you have an existing solution that you are considering replacing. You will compare those benchmarks to those of the potential cloud solution when you perform a small pilot project (as we will discuss in an upcoming step). The eDiscovery platform that you choose should ideally give you the option to load and export your own data, as well as providing good or better turnaround by the vendor (when compared to your internal staff) for performing those same functions when needed.

5. Learn the differences between public and private clouds. As the article notes, “[c]ompanies need to understand where there [sic] data will go, how it is protected, and if it is secured according to any industry specific regulations that apply (e.g., HIPPA, Sarbanes-Oxley, etc.).” It’s especially important to know where your data will go – if it’s stored internationally, access to it may be subject to different rules. As for how it is protected, here is some more information regarding how data can be protected in a cloud environment.

Tomorrow, we will cover items 6 through 10 of the checklist. Oh, the anticipation!

So, what do you think? Have you implemented a SaaS-based solution for eDiscovery? Please share any comments you might have or let us know if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

A Model for Reducing Private Data – eDiscovery Best Practices

September 24, 2013

Since the Electronic Discovery Reference Model (EDRM) annual meeting just four short months ago in May, several EDRM projects (Metrics, Jobs, Data Set and the new Native Files project) have already announced new deliverables and/or requested feedback. Now, the Data Set project has announced another new deliverable – a new Privacy Risk Reduction Model.

Announced in yesterday’s press release, the new model “is a process for reducing the volume of private, protected and risky data by using a series of steps applied in sequence as part of the information management, identification, preservation and collection phases” of the EDRM. It “is used prior to producing or exporting data containing risky information such as privileged or proprietary information.”

The model uses a series of six steps applied in sequence with the middle four steps being performed as an iterative process until the amount of private information is reduced to a desirable level. Here are the steps as described on the EDRM site:

Define Risk: Risk is initially identified within an organization by stakeholders who can quantify the specific risks a particular class or type of data may pose. For example, risky data may include personally identifiable information (PII) such as credit card numbers, attorney-client privileged communications or trade secrets.
Identify Available Data: Locations and types of risky data should be identified. Possible locations may include email repositories, backups, email and data archives, file shares, individual workstations and laptops, and portable storage devices. The quantity and type should also be specified.
Create Filters: Search methods and filters are created to ‘catch’ risky data. They may include keyword, date range, file type, subject line, etc.
Run Filters: The filters are executed and the results evaluated for accuracy.
Verify Output: The data identified or captured by the filters is compared against the anticipated output. If the filters did not catch all the expected risky data, additional filters can be created or existing filters can be refined and the process run again. Additionally, the output from the filters may identify additional risky data or data sources, in which case this new data should be subjected to the risk reduction process.
Quarantine: After an acceptable amount of risky data has been identified through the process, it should be quarantined from the original data sets. This may be done through migration of non-risky data, or through extraction or deletion of the risky data from the original data set.
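
To make the iterative middle steps a bit more concrete, here is a minimal Python sketch of the create, run, verify and quarantine loop. It is only an illustration under stated assumptions, not anything published by EDRM: the names (Doc, keyword_filter, pattern_filter, run_filters, reduce_risk) are hypothetical, the sample documents are made up, and the card-number pattern is a rough stand-in rather than a production-grade PII detector.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a collected document."""
    doc_id: str
    subject: str
    body: str


# Rough illustration only: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def keyword_filter(keywords):
    """Create a filter that flags a document containing any of the keywords."""
    lowered = [k.lower() for k in keywords]

    def check(doc):
        text = (doc.subject + " " + doc.body).lower()
        return any(k in text for k in lowered)

    return check


def pattern_filter(pattern):
    """Create a filter that flags a document whose body matches a regex."""
    return lambda doc: bool(pattern.search(doc.body))


def run_filters(docs, filters):
    """Run Filters: return the ids of documents caught by at least one filter."""
    return {d.doc_id for d in docs if any(f(d) for f in filters)}


def reduce_risk(docs, filters, refinements, expected_risky, max_rounds=5):
    """Repeat run/verify, adding one refinement per round, then quarantine.

    expected_risky is the set of ids the stakeholders anticipate catching
    (the "anticipated output" used in the Verify Output step).
    """
    active = list(filters)
    pending = list(refinements)
    caught = run_filters(docs, active)
    for _ in range(max_rounds):
        missed = expected_risky - caught          # Verify Output
        if not missed or not pending:
            break
        active.append(pending.pop(0))             # Create (refine) Filters
        caught = run_filters(docs, active)        # Run Filters again
    missed = expected_risky - caught
    quarantined = [d for d in docs if d.doc_id in caught]   # Quarantine
    clean = [d for d in docs if d.doc_id not in caught]
    return clean, quarantined, missed


SAMPLE_DOCS = [
    Doc("D1", "Lunch", "See you at noon."),
    Doc("D2", "Payment", "Card 4111 1111 1111 1111 expires soon."),
    Doc("D3", "Privileged", "Attorney-client privileged: draft advice."),
]

if __name__ == "__main__":
    clean, quarantined, missed = reduce_risk(
        SAMPLE_DOCS,
        filters=[keyword_filter(["privileged"])],
        refinements=[pattern_filter(CARD_PATTERN)],
        expected_risky={"D2", "D3"},
    )
    print("clean:", [d.doc_id for d in clean])
    print("quarantined:", [d.doc_id for d in quarantined])
    print("still missed:", sorted(missed))
```

In this made-up example, the keyword filter alone catches only the privileged message, so the verify step adds the card-number pattern and runs the filters again before quarantining. A real implementation would also need the Define Risk and Identify Available Data steps, which depend on the organization and are not modeled here.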

No EDRM model would be complete without a handy graphic to illustrate the process so, as you can see above, this model includes one that illustrates the steps as well as the risk-time continuum (not to be confused with the space-time continuum, relatively speaking)… 😉

Looks like a sound process; it will be interesting to see it in use. Hopefully, it will enable the Data Set team to avoid some of the “controversy” experienced during the process of removing private data from the Enron data set. Kudos to the Data Set team, including project co-leaders Michael Lappin, director of archiving strategy at Nuix, and Eric Robi, president of Elluma Discovery!

So, what do you think of the process? Please share any comments you might have or let us know if you’d like to know more about a particular topic.
