Information Governance

eDiscovery Trends: ARMA International and EDRM Jointly Release Information Governance White Paper

 

A few months ago, the Electronic Discovery Reference Model (EDRM) and ARMA International announced that they would be collaborating on information governance guidelines for eDiscovery.  It only took them a little over three months to release their first work product.

On December 20 of last year, ARMA and EDRM announced the publication of a jointly developed white paper entitled, How the Information Governance Reference Model (IGRM) Complements ARMA International’s Generally Accepted Recordkeeping Principles (GARP).  The press release announcing the release of the white paper can be found on the EDRM site here.  The web version of the paper is located here and the PDF version can be downloaded here.

The core of the paper relates the EDRM Information Governance Reference Model (IGRM) to ARMA’s GARP® principles.  There are eight GARP principles, as follows:

  1. Accountability
  2. Transparency
  3. Integrity
  4. Protection
  5. Compliance
  6. Availability
  7. Retention
  8. Disposition

The white paper provides a chart for assigning ownership to each business unit for each GARP principle and describes the Maturity Model with five levels of effective Information Governance, ranging from Level 1 (Sub-standard) to Level 5 (Transformational).  Transformational describes “an organization that has integrated information governance into its overall corporate infrastructure and business processes to such an extent that compliance with the program requirements is routine”.  Based on the CGOC Information Governance Benchmark Report from a little over a year ago, most organizations have quite a bit of maturing still to do.

The white paper then proceeds to describe each of the eight principles “According to GARP” at Level 5 Transformational Maturity.  Where’s Robin Williams when you need him?  The white paper finishes with several conclusions noting that “the IGRM complements the metrics defined by ARMA International’s Information Governance Maturity Model”.

This white paper provides a great overview of both the IGRM and ARMA GARP principles and is well worth reading to develop an understanding of both models.  It will be interesting to see how the EDRM and ARMA joint effort proceeds from here to help organizations achieve a higher level of “maturity” when it comes to information governance.

So, what do you think?  Have you read the white paper yet?  Do you think the EDRM/ARMA collaboration will lead to greater information governance within organizations?  As always, please share any comments you might have or if you’d like to know more about a particular topic!

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Jason R. Baron

 

This is the first of the Holiday Thought Leader Interview series.  I interviewed several thought leaders to get their perspectives on various eDiscovery topics.

Today’s thought leader is Jason R. Baron. Jason has served as the National Archives’ Director of Litigation since May 2000 and has been involved in high-profile cases for the federal government. His background in eDiscovery dates to the Reagan Administration, when, as the Justice Department’s lead counsel, he helped retain backup tapes containing Iran-Contra records from the National Security Council. Later, as Director of Litigation for the U.S. National Archives and Records Administration, Jason was assigned to handle a request to review documents pertaining to tobacco litigation in U.S. v. Philip Morris.

He currently serves as Co-Chair of The Sedona Conference Working Group on Electronic Document Retention and Production. Baron is also one of the founding coordinators of the TREC Legal Track, a search project organized through the National Institute of Standards and Technology to evaluate search protocols used in eDiscovery. This year, Jason was awarded the Emmett Leahy Award for Outstanding Contributions and Accomplishments in the Records and Information Management Profession.

You were recently awarded the prestigious Emmett Leahy Award for excellence in records management. Is it unusual that a lawyer wins such an award? Or is the job of the litigator and records manager becoming inextricably linked?

Yes, it was unusual: I am the first federal lawyer to win the Emmett Leahy award, and only the second lawyer to have done so in the 40-odd years that the award has been given out. But my career path in the federal government has been a bit unusual as well: I spent seven years working as lead counsel on the original White House PROFS email case (Armstrong v. EOP), followed by more than a decade worrying about records-related matters for the government as Director of Litigation at NARA. So with respect to records and information management, I long ago passed at least the Malcolm Gladwell test in "Outliers" where he says one needs to spend 10,000 hours working on anything to develop a level of "expertise."  As to the second part of your question, I absolutely believe that to be a good litigation attorney these days one needs to know something about information management and eDiscovery — since all evidence is "born digital" and lots of it needs to be searched for electronically. As you know, I also have been a longtime advocate of a greater linking between the fields of information retrieval and eDiscovery.

In your acceptance speech you spoke about the dangers of information overload and the possibility that it will make it difficult for people to find important information. How optimistic are you that we can avoid this dystopian future? How can the legal profession help the world avoid this fate?

What I said was that in a world of greater and greater retention of electronically stored information, we need to leverage artificial intelligence and specifically better search algorithms to keep up in this particular information arms race. Although Ralph Losey teased me in a recent blog post that I was being unduly negative about future information dystopias, I actually am very optimistic about the future of search technology assisting in triaging the important from the ephemeral in vast collections of archives. We can achieve this through greater use of auto-categorization and search filtering methods, as well as having a better ability in the future to conduct meaningful searches across the enterprise (whether in the cloud or not). Lawyers can certainly advise their clients how to practice good information governance to accomplish these aims.

You were one of the founders of the TREC Legal Track research project. What do you consider that project’s achievement at this point?

The initial idea for the TREC Legal Track was to get a better handle on evaluating various types of alternative search methods and technologies, to compare them against a "baseline" of how effective lawyers were in relying on more basic forms of keyword searching. The initial results were a wake-up call, in showing lawyers that sole reliance on simple keywords and Boolean strings sometimes results in a large quantity of relevant evidence going missing. But during the half-decade of research that now has gone into the track, something else of perhaps even greater importance has emerged from the results, namely: we have a much better understanding now of what a good search process looks like, which includes a human in the loop (known in the Legal Track as a topic authority) evaluating on an ongoing, iterative basis what automated search software kicks out by way of initial results. The biggest achievement however may simply be the continued existence of the TREC Legal Track itself, still going in its 6th year in 2011, and still producing important research results, on an open, non-proprietary platform, that are fully reproducible and that benefit both the legal profession as well as the information retrieval academic world. While I stepped away after 4 years from further active involvement in the Legal Track as a coordinator, I continue to be highly impressed with the work of the current track coordinators, led by Professor Doug Oard at the University of Maryland, who has remained at the helm since the very beginning.

To what extent has TREC’s research proven the reliability of computer-assisted review in litigation? Is there a danger that the profession assumes the reliability of computer-assisted review is a settled matter?

The TREC Legal Track results I am most familiar with through calendar year 2010 have shown computer-assisted review methods finding in some cases on the order of 85% of relevant documents (a .85 recall rate) per topic while only producing 10% false positives (a .90 precision rate). Not all search methods have had these results, and there has been in fact a wide variance in success achieved, but these returns are very promising when compared with historically lower rates of recall and precision across many information retrieval studies. So the success demonstrated to date is highly encouraging. Coupled with these results has been additional research reported by Maura Grossman & Gordon Cormack, in their much-cited paper Technology-Assisted Review in EDiscovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, which makes the case for the greater accuracy and efficiency of computer-assisted review methods.
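
For readers who don’t live and breathe these metrics: recall is the share of all relevant documents that a search actually finds, and precision is the share of retrieved documents that turn out to be relevant.  Here’s a quick back-of-the-envelope sketch in Python (the counts are invented purely to illustrate the rates Jason cites, not taken from the TREC results):

  # Hypothetical counts chosen only to illustrate ~.85 recall / ~.90 precision;
  # they are not actual TREC Legal Track figures.
  relevant_in_collection = 1000   # all relevant documents for the topic
  retrieved = 944                 # documents the automated method returned
  relevant_retrieved = 850        # returned documents that were actually relevant

  recall = relevant_retrieved / relevant_in_collection   # 850 / 1000 = 0.85
  precision = relevant_retrieved / retrieved             # 850 / 944  ≈ 0.90
  print(f"recall={recall:.2f}, precision={precision:.2f}")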

Other research conducted outside of TREC, most notably by Herbert Roitblat, Patrick Oot and Anne Kershaw, also points in a similar direction (as reported in their article Mandating Reasonableness in a Reasonable Inquiry). All of these research efforts buttress the defensibility of technology-assisted review methods in actual litigation, in the event of future challenges. Having said this, I do agree that we are still in the early days of using many of the newer predictive types of automated search methods, and I would be concerned about courts simply taking on faith the results of past research as being applicable in all legal settings. There is no question however that the use of predictive analytics, clustering algorithms, and seed sets as part of technology-assisted review methods is saving law firms money and time in performing early case assessment and for multiple other purposes, as reported in a range of eDiscovery conferences and venues — and I of course support all of these good efforts.

You have discussed the need for industry standards in eDiscovery. What benefit would standards provide?

Ever since I served as Co-Editor in Chief on The Sedona Conference Commentary on Achieving Quality in eDiscovery (2009), I have been thinking about what constitutes a good process for conducting eDiscovery. That paper focused on project management, sampling, and imposing various forms of quality controls on collection, review, and production. The question is, is a good eDiscovery process capable of being fit into a maturity model of sorts, and might it be useful to consider whether vendors and law firms would benefit from having their in-house eDiscovery processes audited and certified as meeting some common baseline of quality? To this end, the DESI IV workshop ("Discovery of ESI") held in Pittsburgh last June, as part of the Thirteenth International AI and Law Conference (ICAIL 2011), had as its theme exploring what types of model standards could be imposed on the eDiscovery discipline, so that we all would be able to work from some common set of benchmarks. Some 75 people attended and 20-odd papers were presented. I believe the consensus in the room was that we should be pursuing further discussions as to what an ISO 9001-type quality standard would look like as applied to the specific eDiscovery sector, much as other industry verticals have their own ISO standards for quality. Since June, I have been in touch with some eDiscovery vendors that have actually undergone an audit process to achieve ISO 9001 certification. This is an area where no consensus has yet emerged as to the path forward — but I will be pursuing further discussions with DESI workshop attendees in the coming months and promise to report back in this space as to what comes of these efforts.

What sort of standards would benefit the industry? Do we need standards for pieces of the eDiscovery process, like a defensible search standard, or are you talking about a broad quality assurance process?

DESI IV started by concentrating on what would constitute a defensible search standard; however, it became clear at the workshop and over the course of the past few months that we need to think bigger, looking across the eDiscovery life cycle at what constitutes best practices through automation and other means. We need to remember however that eDiscovery is a very young discipline, as we're only five years out from the 2006 Rules Amendments. I don't have all the answers, by any means, on what would constitute an acceptable set of standards, but I like to ask questions and believe in a process of continuous, lifelong learning. As I said, I promise I'll let you know about what success has been achieved in this space.

Thanks, Jason, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Best Practices: Production is the “Ringo” of the eDiscovery Phases

 

Since eDiscovery Daily debuted over 14 months ago, we’ve covered a lot of case law decisions related to eDiscovery.  65 posts related to case law to date, in fact.  We’ve covered cases associated with sanctions related to failure to preserve data, issues associated with incomplete collections, inadequate searching methodologies, and inadvertent disclosures of privileged documents, among other things.  We’ve noted that 80% of the costs associated with eDiscovery are in the Review phase and that volume of data and sources from which to retrieve it (including social media and “cloud” repositories) are growing exponentially.  Most of the “press” associated with eDiscovery ranges from the “left side of the EDRM model” (i.e., Information Management, Identification, Preservation, Collection) through the stages to prepare materials for production (i.e., Processing, Review and Analysis).

All of those phases lead to one inevitable stage in eDiscovery: Production.  Yet, few people talk about the actual production step.  If Preservation, Collection and Review are the “John”, “Paul” and “George” of the eDiscovery process, Production is “Ringo”.

It’s the final crucial step in the process, and if it’s not handled correctly, all of the due diligence performed in the earlier phases could mean nothing.  So, it’s important to plan for production up front and to apply a number of quality control (QC) checks to the actual production set to ensure that the production process goes as smoothly as possible.

Planning for Production Up Front

When discussing the production requirements with opposing counsel, it’s important to ensure that those requirements make sense, not only from a legal standpoint, but from a technical standpoint as well.  Involve support and IT personnel in the process of deciding those parameters, as they will be the people who have to meet them.  Issues to be addressed include, but are not limited to:

  • Format of production (e.g., paper, images or native files);
  • Organization of files (e.g., organized by custodian, legal issue, etc.);
  • Numbering scheme (e.g., Bates labels for images, sequential file names for native files);
  • Handling of confidential and privileged documents, including log requirements and stamps to be applied;
  • Handling of redactions;
  • Format and content of production log;
  • Production media (e.g., CD, DVD, portable hard drive, FTP, etc.).

I was involved in a case recently where opposing counsel was requesting an unusual production format in which the names of the files would be the subject lines of the emails being produced (for example, “Re: Completed Contract, dated 12/01/2011”).  Two issues with that approach: 1) the proposed format only addressed emails, and 2) Windows file names don’t support certain characters, such as colons (:) or slashes (/).  I provided that feedback to the attorneys so that they could address it with opposing counsel and hopefully agree on a revised format that made more sense.  So, let the tech folks confirm the feasibility of the production parameters.
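
To make the file name issue concrete, here is a minimal sketch of the kind of check the tech folks might run (my own illustration, not anything used in that case; the function name, the .msg extension and the 150-character cap are all assumptions):

  import re

  # Characters Windows file names cannot contain (plus ASCII control characters)
  INVALID = r'[<>:"/\\|?*\x00-\x1f]'

  def subject_to_filename(subject, sent_date, max_length=150):
      # Build the requested "subject, dated <date>" name, then replace the
      # illegal characters (the colon in "Re:", the slashes in the date, etc.)
      raw = f"{subject}, dated {sent_date}"
      safe = re.sub(INVALID, "_", raw).strip().rstrip(". ")
      return safe[:max_length] + ".msg"

  print(subject_to_filename("Re: Completed Contract", "12/01/2011"))
  # Re_ Completed Contract, dated 12_01_2011.msg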

The workflow throughout the eDiscovery process should also keep in mind the end goal of meeting the agreed upon production requirements.  For example, if you’re producing native files with metadata, you may need to take appropriate steps to keep the metadata intact during the collection and review process so that the metadata is not inadvertently changed. For some file types, metadata is changed merely by opening the file, so it may be necessary to collect the files in a forensically sound manner and conduct review using copies of the files to keep the originals intact.
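
One common way to keep the originals provably intact is to record a hash of each file at collection time and have reviewers work only with copies.  Here’s a minimal sketch of that idea (an illustration only, not a substitute for proper forensic collection tools):

  import hashlib
  import shutil
  from pathlib import Path

  def sha256(path, chunk=1024 * 1024):
      h = hashlib.sha256()
      with open(path, "rb") as f:
          while block := f.read(chunk):
              h.update(block)
      return h.hexdigest()

  def collect_for_review(original: Path, working_dir: Path):
      # Record the original's hash, then hand reviewers a copy; the stored
      # hash can later show that the original was never altered.
      fingerprint = sha256(original)
      review_copy = working_dir / original.name
      shutil.copy2(original, review_copy)   # copy2 also preserves timestamps
      return {"file": original.name, "sha256": fingerprint,
              "review_copy": str(review_copy)}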

Tomorrow, we will talk about preparing the production set and performing QC checks to ensure that the ESI being produced to the requesting party is complete and accurate.

So, what do you think?  Have you had issues with production planning in your cases?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Potential ESI Sources Abound in Penn State Case

 

Whether you’re a college football fan or not, chances are you’ve heard about the scandal associated with the allegations of serial child abuse by former Penn State football coach Jerry Sandusky.  There seem to be new developments almost daily, and the scandal has already cost the jobs of the university president, vice president, athletic director and the head football coach, Joe Paterno, who had been head coach since 1965 and on the coaching staff since 1950 (most of us weren’t even born yet!).  Numerous lawsuits seem highly likely to arise as a result of the alleged abuse, against a variety of defendants including the university, individuals alleged to be involved in the abuse and cover-up, and the Second Mile Foundation founded by Sandusky.

Seth Row, an attorney with Parsons Farnell & Grein LLP in Portland (OR), has written an article published on the Association of Certified eDiscovery Specialists (ACEDS) web site detailing potential sources of ESI that may be relevant in the case.  The article illustrates the wide variety of sources that might be responsive to the litigation.  Here are some of the sources cited by Row:

  • Videotape of entry and exit from the athletic facilities at Penn State, to which Paterno gave Sandusky access after the latter resigned in 1999;
  • Entry/exit logs, which are likely housed in a database if keycards were used, for the Lasch Football Building, where abuse was allegedly witnessed;
  • Phone records of incoming and outgoing calls;
  • Electronic rosters of football players, coaches, staff, student interns, and volunteers affiliated with the Penn State football program over time;
  • The personal records of these individuals, including telephone logs, internet search histories, email accounts, medical and financial records, and related information created over time;
  • University listservs;
  • Internet forums – a New York Times article reported last week that a critical break in the investigation came via a posting on the Internet, mentioning that a Penn State football coach might have seen something ugly, but kept silent;
  • Maintenance logs maintained by the two custodial employees who allegedly witnessed abuse;
  • Identities of all media beat reporters who covered the Penn State football team;
  • Passenger and crew manifests for all chartered flights of the Penn State football team in which Sandusky was a passenger;
  • Sandusky's credit card records to document meals and outings where he may have been accompanied by victims, and records of gifts he purchased for them;
  • All records of the Second Mile Foundation identifying boys who participated in its programs, as well as the names of donors and officers, directors and staff;
  • Paper record equivalents of this ESI that were produced in the 1990s before electronic recordkeeping became prevalent;
  • All electronic storage and computing devices owned or maintained by Sandusky, Paterno and other central figures in the scandal, including cell phones, personal computers, tablet computers, flash drives, and related hardware.

With such a wide variety of potential custodians and time frames, it will be difficult to quickly narrow down the potential ESI sources.  As the author points out, it seems likely that Penn State has already locked down its records retention policies throughout the university.  They certainly would seem to have a reasonable expectation of litigation.  Investigators and attorneys will likely be racing against time to identify as many other parties as possible with potentially responsive ESI.

So, what do you think?  Have you been involved in litigation with such a wide distribution of potentially responsive ESI?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Data Mapping Doesn’t Have to be Complicated

 

Some time ago, we talked about the importance of preparing a data map of your organization’s data to be ready when litigation strikes.

Back then, we talked about four steps to create and maintain an effective data map, including:

  • Obtain early “buy-in” from various departments throughout the organization;
  • Document and educate to develop logical and comprehensive practices for managing data;
  • Communicate regularly so that new data stores (or changes to existing ones) can be addressed as they occur;
  • Update periodically to keep up with changes in technology that create new data sources.

The data map itself doesn’t have to be complicated.  It can be as simple as a spreadsheet (or series of spreadsheets, one for each department or custodian, depending on what level of information is likely to be requested).  Here are examples of types of information that you might see in a typical data map spreadsheet:

  • Type of Data: Prepare a list and continue to add to it to ensure all of the types of data are considered.  These can include email, work product documents, voice mail, databases, web sites, social media content, hard copy documents, and any other type of data in use within your organization.
  • Department/Custodian: A data map is no good unless you identify the department or custodian responsible for the data.  Some of these may be kept by IT (e.g., Exchange servers for the entire organization) while others could be down to the individual level (e.g., Access databases kept on an individual’s laptop).
  • Storage Classification: The method(s) by which the data is stored by the department or custodian is important to track.  You’ll typically have Online, Nearline, Offline and Inaccessible Data.  A type of data can apply to multiple or even all storage classifications.  For example, email can be stored Online in Exchange servers, Nearline in an email archiving system, Offline in backup tapes and Inaccessible in a legacy format.  Therefore, you’ll need a column in your spreadsheet for each storage classification.
  • Retention Policy: Track the normal retention policy for each type of data stored by each department or custodian (e.g., retain email for 5 years).  While a spreadsheet won’t automatically identify when specific data is “expired”, a regular process of looking for data older than the retention time period will enable your organization to purge “expired” data.
  • Litigation Hold Applied: Unless, of course, that data is subject to an active litigation hold.  If so, you’ll want to identify the case(s) for which the hold is applied and be prepared to update the data map to remove those cases from the list once the hold obligation is released.  If all holds are released on normally “expired” data and no additional hold obligations are expected, that may be the opportunity to purge that data.
  • Last Update Date: It’s always a good idea to keep track of when the information in the data map was last updated.  If it’s been a while since that last update, it might be time to coordinate with that department or custodian to bring their portion of the data map current.

As you can see, a fairly simple 9 or 10 column spreadsheet might be all you need to start gathering information about the data stores in your organization.
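
If it helps to see it in code, here is a bare-bones sketch of such a spreadsheet written out as a CSV, along with a simple staleness check against the Last Update Date (the column names mirror the list above; the file name, sample row and 180-day threshold are purely illustrative):

  import csv
  from datetime import date

  COLUMNS = ["Type of Data", "Department/Custodian", "Online", "Nearline",
             "Offline", "Inaccessible", "Retention Policy (years)",
             "Litigation Hold Applied", "Last Update Date"]

  rows = [
      {"Type of Data": "Email", "Department/Custodian": "IT (Exchange)",
       "Online": "Exchange servers", "Nearline": "Email archive",
       "Offline": "Backup tapes", "Inaccessible": "Legacy format",
       "Retention Policy (years)": 5, "Litigation Hold Applied": "",
       "Last Update Date": "2011-11-15"},
  ]

  # Write the data map out as a simple spreadsheet
  with open("data_map.csv", "w", newline="") as f:
      writer = csv.DictWriter(f, fieldnames=COLUMNS)
      writer.writeheader()
      writer.writerows(rows)

  # Flag entries that haven't been revisited with their department or custodian lately
  for row in rows:
      last_update = date.fromisoformat(row["Last Update Date"])
      if (date.today() - last_update).days > 180:
          print("Time to revisit:", row["Type of Data"], "/", row["Department/Custodian"])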

So, what do you think?  Has your organization implemented a data mapping program?  If not, why not? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Is Email Still the Most Common Form of Requested ESI?

 

Email has historically been the most common form of requested electronically stored information (ESI), but that has changed, according to a survey performed by Symantec and reported in Law Technology News.

According to the article, Symantec’s survey, conducted this past June and July, included lawyers and technologists at 2,000 enterprises worldwide.  However, the article doesn’t indicate the total number of respondents or whether that’s the number of organizations receiving the survey or the number actually responding.

Regarding how frequently various types of ESI are requested during legal and regulatory processes (as a percentage of situations in which each type is requested), the survey yielded some surprising answers:

  • Files and Documents: 67 percent
  • Application and Database Records: 61 percent
  • Email: 58 percent
  • Microsoft SharePoint records: 51 percent
  • Messaging Formats (e.g., instant messaging, texts, and BlackBerry PIN messages): 44 percent
  • Social Media Data: 41 percent

Email requested in legal and regulatory processes just over half the time?  That’s more than surprising, that’s shocking!

Symantec’s survey also asked about implementation of a formal data retention policy, with 30 percent of responding companies indicating that they have discussed but have not implemented a policy and 14 percent indicating that they have no plans to implement a policy (44 percent total that have not implemented a policy).  Reasons for not doing so were as follows (respondents were allowed to pick multiple reasons):

  • No Need Identified: 41 percent
  • Cost: 38 percent
  • No Designated Employee (to implement the policy): 27 percent
  • Too Time Consuming: 26 percent
  • Lack of Expertise: 21 percent

Many of these companies may not feel compelled to implement a policy because they are not frequently in litigation nor are they in regulated industries.

So, what do you think?  Do the percentages above reflect your experience as to how frequently the different types of ESI are requested?  Does the email percentage seem significantly low?  In my experience, it does.  Please share any comments you might have or if you’d like to know more about a particular topic.

A Marriage Made for eDiscovery: EDRM and ARMA

 

EDRM has been busy lately, with a new Model Code of Conduct drafted recently and now this announcement.

As discussed in our recent two-part series on eDiscovery standards, there is a growing movement to develop industry standards, frameworks, or reference models to help manage eDiscovery. This week, there was perhaps a major move in that direction as the Electronic Discovery Reference Model (EDRM) and ARMA International announced that they would be collaborating on information governance guidelines for eDiscovery.

According to EDRM, the partnership began at LegalTech in New York back in February when ARMA reached out to suggest working together. The plan is still vague, but together these two groups hope to provide a framework for records management in the eDiscovery context. “I don’t know where this partnership will take us, but it’s just silly that two groups with similar goals and ideals would work in isolation,” says George Socha, an eDiscovery consultant and one of the co-founders and co-managers of EDRM.

Two years ago, EDRM started its Information Governance Reference Model, providing a conceptual framework for information governance. Today, the Information Governance Reference Model is primarily a rough guide for developing information management programs. But EDRM, which is a relatively small volunteer effort, hopes that the weight of ARMA will help flesh out the framework.

By contrast, the Association for Information Management Professionals (ARMA) International is an established and relatively large and influential group, claiming 11,000 members in 30 countries. ARMA International has developed its Generally Accepted Recordkeeping Principles (GARP) framework to provide best practices for information management. The framework is designed primarily for records management, but has been built to account for the demands of eDiscovery. Though ARMA’s core constituency is records managers, the demands of litigation have been driving many of the group’s recent initiatives.

Interestingly, as we’ve noted previously, ARMA has previously described the EDRM effort as falling “short of describing standards or best practices that can be applied to the complex issues surrounding the creation, management, and governance of electronic information.” However, the organization clearly believes EDRM’s network of experienced litigators and IT professionals will help it address the demands of eDiscovery.

If broad industry standards are going to be developed, it will take more efforts like this one that cut across industries and bring expertise from different areas into alignment. Socha believes that though the EDRM and ARMA have traditionally served different groups, they have both realized that they are concerned with many of the same problems.  “A lot of the root causes of eDiscovery issues come from a failure to have your electronic house in order,” says Socha. “What the Information Governance Reference Model and GARP are about is addressing that issue.”

So, what do you think? Does the EDRM need ARMA? Or vice versa? Please share any comments you might have or if you'd like to know more about a particular topic.

eDiscovery Standards: Does the Industry Need Them?

 

eDiscovery Daily recently ran a three-part series analyzing eDiscovery cost budgeting. Cost has long been a driving force in eDiscovery decision-making, but it is just one dimension in choosing EDD services. Other industries have well-established standards for quality – think of the automotive or software industries, which have standard measures for defects or bugs. This year there has been a rising call for developing industry standards in eDiscovery to provide quality measures.

There is a belief that eDiscovery is becoming more routine and predictable, which means standards of service can be established. But is eDiscovery really like manufacturing? Can you assess the level of service in EDD in terms of number of defects? Quality is certainly a worthy aim – government agencies have shifted away from cost being the single biggest justification for contract award, more heavily weighting quality of service in such decisions.  The question is how to measure quality in EDD.

Quality standards that offer some type of objective measure could theoretically provide another basis for decision-making in addition to cost. Various attempts have been made at creating industry standards over the years, but very little has yet been standardized. The recent DESI (Discovery of Electronically Stored Information) IV workshop at the International Conference on Artificial Intelligence and Law in June investigated possible standards. In the background materials for the workshop, organizers bemoaned that “there is no widely agreed-upon set of standards or best practices for how to conduct a reasonable eDiscovery search for relevant evidence.”

Detractors say standards are just hoops for vendors to jump through or a checkbox to check, and that they don’t do much to differentiate one company from another. However, proponents believe industry standards could address issues like document defensibility, output formats, and reasonable ways to go about finding responsive documents, issues that can explode if not managed properly.

The Sedona Conference, Electronic Discovery Reference Model (EDRM), and Text Retrieval Conference (TREC) Legal Track all have efforts of one kind or another to establish standards for eDiscovery. EDRM provides a model for eDiscovery and standards of production. It has also led an effort to create a standard, generally accepted XML model to allow vendors and systems to more easily share electronically stored information (ESI). However, that applies to software vendors, and really doesn’t help the actual work of eDiscovery.

The Sedona Commentary on Achieving Quality in eDiscovery calls for development of standards and best practices in processing electronic evidence. Among the models being considered as the basis for broad industry standards are the ISO 9000 family, which provides industry-specific frameworks for certifying organizations, and the Capability Maturity Model Integration (CMMI), which centers on improving processes.

The Association for Information Management Professionals (ARMA) is pushing its Generally Accepted Recordkeeping Principles (GARP) framework to provide best practices for information management in the eDiscovery context. This article from ARMA is dismissive of information governance efforts such as the EDRM, which it says provides a framework for eDiscovery projects but “falls short of describing standards or best practices that can be applied to the complex issues surrounding the creation, management, and governance of electronic information.”

Meanwhile, there are efforts underway to standardize pieces of the eDiscovery process. Law.com says that billing code standards are in the works to help clients understand what they are buying when they sign a contract for eDiscovery services.

Perhaps the most interesting and important effort is the TREC Legal Track, which began as a government research project into improving search results. The project garnered a fair amount of attention when it discovered that keyword searching was as effective as or better than many advanced concept searches and other technology that was becoming popular in the industry. Since that time, researchers have been trying to develop objective criteria for comparing methods for searching large collections of documents in civil litigation.

As of today, these efforts are largely unrelated, disjointed, or even dismissive of competing efforts. In my next post, I’ll dig into specific efforts to see if any make sense for the industry. So, what do you think? Are standards needed, or is it just a lot of wheel spinning? Please share any comments you might have or if you'd like to know more about a particular topic.

Editor's Note: Welcome Jason Krause as a guest author to eDiscovery Daily blog!  Jason is a freelance writer in Madison, Wisconsin. He has written about technology and the law for more than a dozen years, and has been writing about EDD issues since the first Zubulake decisions. Jason began his career in Silicon Valley, writing about technology for The Industry Standard, and later served as the technology reporter for the ABA Journal. He can be reached at jasonkrause@hotmail.com.

eDiscovery Trends: Cloud Covered by Ball

 

What is the cloud, why is it becoming so popular and why is it important to eDiscovery? These are the questions being addressed—and very ably answered—in the recent article Cloud Cover (via Law Technology News) by computer forensics and eDiscovery expert Craig Ball, a previous thought leader interviewee on this blog.

Ball believes that the fears about cloud data security are easily dismissed when considering that “neither local storage nor on-premises data centers have proved immune to failure and breach”. And as far as the cloud's importance to the law and to eDiscovery, he says, "the cloud is re-inventing electronic data discovery in marvelous new ways while most lawyers are still grappling with the old."

What kinds of marvelous new ways, and what do they mean for the future of eDiscovery?

What is the Cloud?

First we have to understand just what the cloud is.  The cloud is more than just the Internet, although it's that, too. In fact, what we call "the cloud" is made up of three on-demand services:

  • Software as a Service (SaaS) covers web-based software that performs tasks you once carried out on your computer's own hard drive, without requiring you to perform your own backups or updates. If you check your email virtually on Hotmail or Gmail or run a Google calendar, you're using SaaS.
  • Platform as a Service (PaaS) happens when companies or individuals rent virtual machines (VMs) to test software applications or to run processes that take up too much hard drive space to run on real machines.
  • Infrastructure as a Service (IaaS) encompasses the use and configuration of virtual machines or hard drive space in whatever manner you need to store, sort, or operate your electronic information.

These three models combine to make up the cloud, a virtual space where electronic storage and processing is faster, easier and more affordable.

How the Cloud Will Change eDiscovery

One reason that processing is faster is through distributed processing, which Ball calls “going wide”.  Here’s his analogy:

“Remember that scene in The Matrix where Neo and Trinity arm themselves from gun racks that appear out of nowhere? That's what it's like to go wide in the cloud. Cloud computing makes it possible to conjure up hundreds of virtual machines and make short work of complex computing tasks. Need a supercomputer-like array of VMs for a day? No problem. When the grunt work's done, those VMs pop like soap bubbles, and usage fees cease. There's no capital expenditure, no amortization, no idle capacity. Want to try the latest concept search tool? There's nothing to buy! Just throw the tool up on a VM and point it at the data.”

Because the cloud is entirely virtual, operating on servers whose locations are unknown and mostly irrelevant, it throws the rules for eDiscovery right out the metaphorical window.

Ball also believes that everything changes once discoverable information goes into the cloud. "Bringing ESI beneath one big tent narrows the gap between retention policy and practice and fosters compatible forms of ESI across web-enabled applications".

"Moving ESI to the cloud," Ball adds, "also spells an end to computer forensics." Where there are no hard drives, there can be no artifacts of deleted information—so, deleted really means deleted.

What's more, “[c]loud computing makes collection unnecessary”. Where discovery requires that information be collected to guarantee its preservation, putting a hold on ESI located in the cloud will safely keep any users from destroying it. And because cloud computing allows for faster processing than can be accomplished on a regular hard drive, the search for discovery documents will move to where they're located, in the cloud. Not only will this approach be easier, it will also save money.

Ball concludes his analysis with the statement, "That e-discovery will live primarily in the cloud isn't a question of whether but when."

So, what do you think? Is cloud computing the future of eDiscovery? Is that future already here? Please share any comments you might have or if you'd like to know more about a particular topic.

eDiscovery Trends: An Insufficient Password Will Thwart Even The Most Secure Site

 

Several months ago, we talked about how most litigators have come to accept that Software-as-a-Service (SaaS) systems are secure.  For example, at Trial Solutions, the servers hosting data for our OnDemand® and FirstPass® (powered by Venio FPR™) platforms are housed in a Tier 4 data center in Houston (which is where our headquarters is).  The security at this data center is military grade: 24 x 7 x 365 onsite security guards, video surveillance, biometric and card key security required just to get into the building.  Not to mention a building that features concrete bollards, steel lined walls, bulletproof glass, and barbed wire fencing.

Pretty secure, huh?  Hacking into a system like this would be very difficult, wouldn’t you think?  I’ll bet that the CIA, PBS and Sony had secure systems as well; however, they were recently “hacked” by the hacker group LulzSec.  According to a recent study by the Ponemon Institute (linked to here via the Ride the Lightning blog), the chance of any business being hacked in the next 12 months is a “statistical certainty”.

No matter how secure a system is, whether it’s local to your office or stored in the “cloud”, an insufficient password that can be easily guessed can allow hackers to get in and steal your data.  Some dos and don’ts:

Dos:

  • If you need to write passwords down, write them down without the corresponding user IDs and keep them with your passport, social security card and other important documents you’re unlikely to lose.  Or, better yet, use a password management application that encrypts and stores all of your passwords.
  • Mnemonics make great passwords.  For example, “I work for Trial Solutions in Houston, Texas” could become a password like “iw4tsiht” (see the short sketch after this list).  (By the way, that’s not a password for any of my accounts, so don’t even try.)  😉
  • Change passwords every few months.  Some systems require this anyway.
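
For the curious, the mnemonic tip above boils down to a few lines of code.  This is a toy sketch only (my own illustration; the substitution table is arbitrary, and a real password manager is still the better tool):

  # Take the first letter of each word and swap in a few look-alike digits,
  # e.g. "for" becomes "4".  A toy illustration of the mnemonic approach above.
  SUBS = {"for": "4", "to": "2", "one": "1"}

  def mnemonic(phrase):
      words = phrase.replace(",", "").split()
      return "".join(SUBS.get(w.lower(), w[0].lower()) for w in words)

  print(mnemonic("I work for Trial Solutions in Houston, Texas"))   # iw4tsiht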

Don’ts:

  • Don’t use the same password for multiple accounts, especially if they have sensitive data such as bank account or credit card information.
  • Don’t email passwords to yourself – if someone is able to hack into your email, then they have access to those accounts as well.
  • Personal information may be easy to remember, but it can also be easily guessed, so avoid using things like your kids’ names, your birthday or other information that can be guessed by someone who knows you.
  • Avoid logging into sensitive accounts when using public Wi-Fi as it is much easier for hackers to tap into what you’re doing in those environments.  If you’re thinking of checking your bank balance while having a latte at Starbucks, don’t.

So, what do you think?  Are you guilty of any of the “don’ts” listed above?  Please share any comments you might have or if you’d like to know more about a particular topic.

Full disclosure: I work for Trial Solutions, which provides SaaS-based eDiscovery review applications FirstPass® (for first pass review) and OnDemand® (for linear review and production).  Our clients’ data is hosted in a secured, SAS 70 Type II certified Tier 4 Data Center in Houston, Texas.