Information Governance

eDiscovery Trends: George Socha of Socha Consulting

 

This is the seventh of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is George Socha.  A litigator for 16 years, George is President of Socha Consulting LLC, offering services as an electronic discovery expert witness, special master and advisor to corporations, law firms and their clients, and legal vertical market software and service providers in the areas of electronic discovery and automated litigation support. George has also been co-author of the leading survey on the electronic discovery market, The Socha-Gelbmann Electronic Discovery Survey.  In 2005, he and Tom Gelbmann launched the Electronic Discovery Reference Model project to establish standards within the eDiscovery industry – today, the EDRM model has become a standard in the industry for the eDiscovery life cycle and there are eight active projects with over 300 members from 81 participating organizations. George has a J.D. from Cornell Law School and a B.A. from the University of Wisconsin – Madison.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

On the very “flip” side, the number one trend to date in 2011 is predictions about trends in 2011.  They are part of a consistent and long-term pattern, which is that many of these trend predictions are not trend predictions at all – they are marketing material and the prediction is “you will buy my product or service in the coming year”.

That said, there are a couple of things of note.  Since I understand you talked to Tom about Apersee, it’s worth noting that corporations are struggling with working through a list of providers to find out who provides what services.  You would figure that there is somewhere in the range of 500 or so total providers.  But, my ever-growing list, which includes both external and law firm providers, is at more than 1,200.  Of course, some of those are probably not around anymore, but I am confident that there are at least 200-300 that I do not yet have on the list.  My guess when the list shakes out is that there are roughly 1,100 active providers out there today.  If you look at information from the National Center for State Courts and the Federal Judicial Center, you’ll see that there are about 11 million new lawsuits filed every year.  I saw an article in the Cornell Law Forum a week or two ago which indicated that there are roughly 1.1 million lawyers in the country.  So, there are 11 million lawsuits, 1.1 million lawyers and 1,100 providers.  Most of those lawyers have no experience with eDiscovery and most of those lawsuits have no provider involved, which means eDiscovery is still very much an emerging market, not even close to being a mature market.  As fast as providers disappear, through attrition or acquisition, new providers enter the market to take their place.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the second afternoon of LTNY}  Maybe this is overly optimistic, but part of what I’m seeing in the lead-up to the conference, on various web sites and at the conference itself, is that a series of incremental changes taking place over a long period are finally leading to some radical differences.  One of those differences is that we finally are reaching a point where a number of providers can make the claim to being “end-to-end providers” with some legitimacy.  For as long as we’ve had the EDRM model, we’ve had providers that have professed to cover the full EDRM landscape, by which they generally have meant Identification through Production.  A growing number of providers not only cover that portion of the EDRM spectrum but have some ability to address Information Management, Presentation, or both.  By and large, those providers are getting there by building their software and services based on experience and learning over the past 8 to 10 to 12 years, introducing new offerings at the show that reflect that learned experience.

A couple of days ago, I only half-jokingly issued “the Dyson challenge” (as in the Dyson vacuum cleaner).  Every year, come January, our living room carpet is strewn with pine tree needles and none of the vacuum cleaners that we have ever had has done a good job of picking up those needles.  The Dyson vacuum cleaner claims its cyclones capture more dirt than anything, but I was convinced that could not include those needles.  Nonetheless I tried, and to my surprise it worked like a charm!  I want to see the providers offering products able to perform at that high level, not just meeting but exceeding expectations.

I also see a feeling of excitement and optimism that wasn’t apparent at last year’s show.

What are you working on that you’d like our readers to know about?

As I mentioned, we have launched the Apersee web site, designed to allow consumers to find providers and products that fit their specific needs.  The site is in beta and the link is live.  It’s in beta because we’re still working on features to make it as useful as possible to customers and providers.  We’re hoping it’s a question of weeks, not months, before those features are implemented.  Once we go fully live, we will go two months with the system “wide open” – where every consumer can see all the provider and product information that any provider has put in the system.  After that, consumers will be able to see full provider and product profiles for providers who have purchased blocks of views.  Even if a provider does not purchase views, all selection criteria it enters are searchable, but search results will display only the provider’s name and website name.  Providers will be able to get stats on queries and how many times their information is viewed, but not detailed information as to which customers are connecting and performing the queries.

As for EDRM, we continue to make progress with an array of projects and a growing number of collaborative efforts, such as the work the Data Set group has done with TREC Legal and the work the Metrics group has done with the LEDES Committee. We not only want to see membership continue to grow, but we also want to continue to push for more active participation to continue to make progress in the various working groups.  We’ve just met at the show here regarding the EDRM Testing pilot project to address testing standards.  There are very few guidelines for testing of electronic discovery software and services, so the Testing project will become a full EDRM project as of the EDRM annual meeting this May to begin to address the need for those guidelines.

Thanks, George, for participating in the interview!

And to the readers, as always, please share any comments you might have, or let us know if you’d like to know more about a particular topic!

eDiscovery Trends: Deidre Paknad of PSS Systems

 

This is the sixth of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Deidre Paknad.  Deidre is President & CEO of PSS Systems, an IBM Company.  Deidre is widely credited with having conceived of and launched the first commercial applications for legal holds, collections and retention management in 2004. A well-known thought leader in the legal and information governance domain, Deidre founded the Compliance, Governance and Oversight Council (CGOC), a professional community on retention and preservation that analyst firm IDC labeled a "think tank." She has been a member of several Sedona working groups since 2005 and leads the EDRM Information Management Reference Model (IMRM) working group.  Deidre is a seasoned entrepreneur and executive with 20 years' experience applying technology to poorly functioning business processes to reduce cost and risk. Prior to PSS, she helped Certus launch its Sarbanes-Oxley software solution. Deidre previously founded and was CEO of CoVia Technologies from 1996 to 2000, where she was inducted into the Smithsonian Institution for innovation in 1999 and again in 2000.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

Well, certainly the social media explosion is one of the most talked about current trends.  Social media has brought about a huge change in the way we communicate, both personally and within organizations.  It’s one of the factors that is causing organizations to revisit where information comes from, where “messages” come from.  And, now there are more communications via social media than email.  In 2010, there were an estimated 1 trillion emails sent worldwide, but 89% of all emails sent were spam, so the number of “true emails” is far less, only about 110 billion.  Conversely, there were nearly 400 billion Facebook communications last year, over 700 billion views on YouTube and over 200 billion Twitter messages.  Organizations will have to face forward in addressing new sources of data and how to handle them as there will continue to be more social media communications (many viewed via mobile devices) with customers, employees, etc.  While most corporate social media tools today aren’t “discovery ready”, social and mobile media may level the information playing field between small and large litigants.

Another trend on which organizations are finally focusing more, and one that has been a significant focus of mine for some time, is information governance.  Since the Federal evidence rules were extended to electronic data in 2006, preservation sanctions have reached an all-time high, despite the fact that organizations have adopted a mindset of “save everything”, which has led to unrestrained growth in data within organizations.  So, saving more data did not translate to less risk for organizations, but it did translate to more cost.  As noted in the 2009 Fulbright & Jaworski Litigation Report, the average cost to collect, cull and review information per case for large organizations has risen to $3 million, but only 30% of that reviewed data needed to be retained; the other 70% was wasteful legal effort.  Even worse, organizations are spending 3.5% of revenues on information management – for the Fortune 50, that’s several billion dollars and a good chunk of it goes to managing unnecessary information and infrastructure.

Last year, the CGOC conducted a survey of legal, records management (RIM) and IT practitioners in Global 1000 companies and published the findings in an October report titled Information Governance Benchmark Report in Global 1000 Companies (You can request a copy of the report here and read eDiscovery Daily’s blog post about it here.).  75% of respondents identified the inability to defensibly dispose of data as their greatest challenge, and 70% of respondents indicated that they depend on “liaisons and people glue” to link discovery and regulatory obligations to information.  It’s an enterprise issue where Legal understands the obligations for data, business teams know the information value of the data and IT has the data, but no visibility to its obligations or business value.  So, there’s a big disconnect.

I think you’ll see that information governance and eDiscovery in general will become more connected to the overall business strategy.  When asked what they believe are the essential elements of information governance, 77% of respondents agreed that retention schedules reflecting both regulatory and business needs were essential, and 85% agreed that consistent collaboration and systematic linkage across legal, records and IT were essential.  I think the Information Governance Benchmark Report has opened some eyes as to the importance of associating the legal obligations for and value of information to the assets IT is managing and the benefits of connecting legal, records and IT stakeholders and processes as an essential corporate strategy.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the second afternoon of LTNY}  I think there’s some “retreading” of topics at this year’s show, for example, the Legal vs. IT keynote speech.  That’s really more of an issue from 2 or 3 years ago.  Legal and IT do collaborate narrowly on discovery responsiveness.  But the issues of the day are more at an overall company level – high costs and high risk associated with the unrestrained growth in data are caused by practices across the company, not just in the legal department.  Responding to discovery simply deals with the symptoms, but doesn’t treat the disease.

I think discussion about FRCP reform aimed at easing the burden of discovery is more timely.  Survey data from the CGOC community, published in the legal holds and information governance benchmark reports, provided evidence in the FRCP Preservation Comment of November 10, 2010 of the need to reshape the rules to reflect current needs.

What are you working on that you’d like our readers to know about?

Well, in addition to the significant reception that the information governance benchmark report has received, CGOC just conducted its 2011 Summit last month, with participation from a number of large corporations including Exxon Mobil, Travelers, Bank of America and Novartis.  The Summit included a number of presentations, and a mock discovery hearing conducted by Judge {Andrew J.} Peck {Magistrate Judge, SDNY} on how prevailing practices break down in cases like Harkabi where everyone took the right steps but still got the wrong results.  It also included breakout sessions for Legal, RIM and IT to discuss prevailing practices for discovery, retention and data disposal, improving processes within each of these departments to support the enterprise as well as starting and advancing the cross-functional dialogue between the departments.

I’m also very excited about the IMRM project within EDRM, a group I co-chair.  It aims to offer guidance and a responsibility framework for Legal, IT, Records Management, line-of-business leaders and other business stakeholders within organizations.  It’s an entirely new reference model that is a separate counterpart to EDRM, and it links duty and value to information assets to achieve efficient and effective management of information.

There is nothing I’m more excited about, however, than working with my new colleagues at IBM on solutions that help our customers to do rigorous, efficient eDiscovery, value-based retention, smarter archiving and defensible disposal. 

Thanks, Deidre, for participating in the interview!

And to the readers, as always, please share any comments you might have, or let us know if you’d like to know more about a particular topic!

eDiscovery Trends: Jack Halprin of Autonomy

 

This is the fifth of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Jack Halprin.  As Vice President, eDiscovery and Compliance with Autonomy, Jack serves as internal and external legal subject matter expert for best practices and defensible processes around litigation, electronic discovery, legal hold, and compliance issues. He speaks frequently on enterprise legal risk management, compliance, and eDiscovery at industry events and seminars, and has authored numerous articles on eDiscovery, legal hold, social media, and knowledge management. He is actively involved in The Sedona Conference, ACC, and Electronic Discovery Reference Model (EDRM). With a BA in Chemistry from Yale University, a JD from the University of California-Los Angeles, and certifications from the California, Connecticut, Virginia and Patent Bars, Mr. Halprin has varied expertise that lends itself well to both the legal and technical aspects of electronic discovery collection and preservation.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

If I look at the overall trends, social media and the cloud are probably the two hottest topics from a technology perspective and also a data management perspective.  From the legal perspective, you’re looking at preservation issues and sanctions as well as the idea of proportionality.  You also see a greater need for technology that can meet the needs of attorneys and understand the meaning of information.  More and more, everyone is realizing that keyword searches are lacking – they aren’t really as effective as everyone thinks they are.

We’re also starting to see two other technology related trends.  The industry is consolidating and customers are really starting to look for a single platform.  The current process of importing/exporting of data from storage to legal hold collection, to early case assessment, to review, to production and creating several extra copies of the documents in the process is not manageable going forward.  Customers want to be able to preserve in place, to analyze in place, and they don’t want to have to collect and duplicate the data again and again.  If you look at the left side of EDRM, the more proactive side, they don’t want to put data or documents in a special repository unless it’s a true record that no one needs to access on a regular basis.  They want to work with active data where it lives.

You’ll see a reduction in the number of vendors in the next year or two, and the technology will not only be able to handle the current data sources, but the increased data volumes and new types of data we’re seeing.  Everyone is looking at social media and saying “how are we going to handle this”, when it’s really just another data source that has to be addressed.  Yes, it’s challenging because there is so much of it and it is even more conversational than email, taking it to a whole new level, but it’s really no different from other data sources.  A keyword search on a social media site is not going to net you the results you’re looking for, but conceptual search to understand the context of what people mean will help you identify the relevant information.  Growth rates are predicted at more than 60 percent for unstructured information, but social media is growing at a much faster clip.  A lot of people are looking at social media and moving to the cloud to manage this data, reducing some of the infrastructure costs, taking strain off the network and reducing their IT footprint.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the first afternoon of LTNY}  I’ll take it first from the Autonomy perspective.  We have social media solutions, which we’ve had for our marketing business (Interwoven) for some time.  We’ve also had social media governance technology for quite some time as well, and we announced today new capabilities for identifying, preserving and collecting social media for eDiscovery, which is part of and builds on our end-to-end solution.  I haven’t spent much time on the floor yet, but based on everything I’ve seen in the eDiscovery space, a lot of people are talking about social media, but no one really understands how to address it.  You’ve got people scraping {social media} pages, but if you scrape the page without the active link or without capturing the context behind it, you’re missing the wealth of the information.  We’re taking a different approach: we take the entire page, including the context and active links.

There’s also a wide disparity in terms of the cloud.  Is it public?  Is it private?  How much control do you have over your data when it’s in the cloud?  You’ve got a lot of vendors out there that aren’t transparent about their data centers.  You’ve got vendors that say they’re SAS 70 Type II certified, but it’s their data center, not the vendor itself, that is certified.  So, who’s got the experience?  Every year at LegalTech, there are probably forty new vendors out there and the next year, half or more of them are gone.

As for the tone of the show, I think it’s certainly more upbeat than last year when attendance was down, and it’s a bit more “bouncy” this year.  With that in mind, you’ll continue to see acquisitions and you’ll have the issue of companies merged through acquisition using different technologies and different search engines, meaning they’re not on a single platform and not really a single solution.  So, that gets back to the idea that customers are really looking for a single platform with a single engine underneath it.  That’s how we approach it, and I think others are trying to get to that point, but I don’t think there are many vendors there yet.  That’s where the trend is heading.

What are you working on that you’d like our readers to know about?

In addition to the new social media eDiscovery capabilities described above, we’ve announced the Autonomy Chaining Console, which is a dashboard to provide corporate legal departments with greater visibility and defensibility across the entire process and to eliminate those risky data import/export handoffs through each step.  Many of the larger corporations have hundreds of cases, dozens of outside law firms, and terabytes of data to manage.  The process today is very “silo” oriented – data is sent to processing vendors, it is sent to law firms, etc.  So, you get these “weak links in the chain” where data can get lost and risks of spoliation and costs increase.  Autonomy announced the whole idea of chaining last year, promoting the idea that we can seamlessly connect law firms and their corporate clients in a secure manner, so that the law firm can log in to a secure portal and can manage the data that they’re allowed to access.  The Chaining Console strengthens that capability, and it adds Autonomy IDOL’s ability to understand meaning and allows corporate and outside counsel to look at the same data on the same solution.  It uses IDOL to determine potential custodians, understand fact patterns and identify other companies that may be involved by really analyzing the data and providing an understanding of what’s there.  It can also monitor and track risk, so you can set up certain policies around key issues; for example, insider trading, securities fraud, FCPA, etc.  Using those policies, it can alert you to the risks that are there and possibly identify the custodians that are engaging in risky behavior.  And, of course, it tracks the data from start to finish, giving corporate counsel, legal IT, IT, litigation support, litigation counsel as well as outside counsel a single view of the data on a single dashboard.  It strengthens our message and takes us to the next step in really providing the end-to-end platform for our clients.

We’ve also announced iManage in the cloud for legal information management.  The cloud-based Information Management platform combines WorkSite, Records Manager, Universal Search, Process Automation and ConflictsManager to help attorneys manage the content throughout the matter lifecycle from inception to disposition.  It uses IDOL’s ability to group concepts, so if you have a conflict with Apple, it knows that you’re searching for terms related to Apple computer such as Mac, iPhone, Steve Jobs, Steve Wozniak, Jonathan Ive and understands that these are related terms and individuals.  And, we’ve just announced the cloud-based version of that.  We’re already managing information governance in the cloud for a lot of our clients and the platform leverages our private cloud, which is the world’s largest private cloud with over 17 petabytes of data.

And, then we have a market leadership announcement with additional major law firms that are using our solutions, such as Brownstein Hyatt Farber Schreck LLP, Brown Rudnick LLP, Fennemore Craig, etc.  So, we have four press releases with new developments at Autonomy that we’ve announced here at the show.

Thanks, Jack, for participating in the interview!

And to the readers, as always, please share any comments you might have, or let us know if you’d like to know more about a particular topic!

eDiscovery Trends: Christine Musil of Informative Graphics Corporation (IGC)

 

This is the fourth of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Christine Musil.  Christine has a diverse career in engineering and marketing spanning 15 years. Christine has been with IGC since March 1996, when she started as a technical writer and a quality assurance engineer. After moving to marketing in 2001, she has applied her in-depth knowledge of IGC's products and benefits to marketing initiatives, including branding, overall messaging, and public relations. She has also been a contributing author to a number of publications on archiving formats, redaction, and viewing technology in the enterprise.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

For us, the biggest trend is elevation of the importance of eDiscovery, from what happens the minute you find out you have a lawsuit until the end of the case.  There’s a lot more discussion about how you can prevent it, how you can be better prepared, and I think that’s where the new buzzword, information governance, comes in.  We partner with OpenText and we partner with EMC on their content management side and we definitely see them pushing into the eDiscovery market to provide an end-to-end solution and stop trying to treat eDiscovery as an isolated issue. I think that the elevation of eDiscovery and inclusion of eDiscovery more into the regular business workflow of an organization is a pretty significant trend to watch.

Another trend that I see is an elevation of the use of search and how people can get more out of their searches to save time and cost.  This may be somewhat skewed based on our perspective in the market, but we’ve had a lot of requests for our redaction software to pick up the search that has already been done. Providers work so hard to come up with amazingly complicated algorithms to find data.  Why reinvent the wheel?  The companies all ask why all the other vendors can’t just take those search results and use them.

Since you’ve written a white paper about native review and redaction, where do you see that heading?

Well, I hope that people will stop printing things out, scanning them back in to TIFF, then OCRing them and handing everybody back a disk of flat images and a separate disk with OCR text.  I sort of understand why they do it, but I think a less paranoid or adversarial approach through more effective “meet and confer” agreements on how you are going to present things is going to make it so much easier for everybody.  I hope in three to five years people say “I’m not afraid to hand you my native files because I know how to check them and know what metadata they contain and whether there are any tracked changes or other potential issues”.  So, the paranoia and fear that people have about the unknown that they can’t see in their documents and whether there is a smoking gun in there should die down.  I think people are getting smarter – now that they’re not producing paper – as to what electronic files contain.  Hopefully, they will understand that native format is OK and when they need to redact, it’s OK to use PDF format to do so.  You tell the other side what you’re doing and what they’re going to get and it becomes a more open and well understood process.

I’m also on the EDRM XML committee and hope a standard load file format that transmits data seamlessly from one side to the other and contains all the information about what has been redacted, among other things, will make things easier on everybody, getting information through the process more seamlessly.  We’re writing white papers about the data set to educate the vendors on how to use it and I have high hopes for what we will be able to accomplish there.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the first morning of LTNY}  Well, that’s hard since LegalTech just started [smiles].  I can tell you that in discussions with some of our partners, we’re seeing more support for mobile devices, support for the iPad, etc., to help lawyers work wherever they are and be more efficient wherever they are.  And, I think that literally goes all the way to the courtroom.  So, you’re seeing support for more devices and smaller screens, wherever attorneys get information.

What are you working on that you’d like our readers to know about?

I’m moderating a panel discussion {at LegalTech} titled, The Debate on Native Format Production and Redaction, which includes Craig Ball, George Socha, Tom O’Connor and Browning Marean.  I wrote a white paper last year entitled The Reality of Native Format Production and Redaction, which has inspired this panel discussion here at LegalTech.  So, that should be informative and interesting.  We’ve noticed that there’s just an awful lot of confusion in terms of what’s really required and what’s acceptable, and the white paper and panel discussion really speak to that.  We’re trying to educate our customers and help our partners educate their clients.

The other thing we’re announcing here is the release of our integration with OpenText eDOCS.  We’ve been partners with OpenText for content management since 2002 and are very excited to extend our partnership to include this new area. eDOCS has a great presence in the legal space and we look forward to working with them.

Thanks, Christine, for participating in the interview!

And to the readers, as always, please share any comments you might have, or let us know if you’d like to know more about a particular topic!

eDiscovery Trends: Jim McGann of Index Engines

 

This is the third of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Jim McGann.  Jim is Vice President of Information Discovery at Index Engines.  Jim has extensive experience with eDiscovery and Information Management in the Fortune 2000 sector. He has worked for leading software firms, including Information Builders and the French-based engineering software provider Dassault Systemes.  In recent years he has worked for technology-based start-ups that provided financial services and information management solutions.

What do you consider to be the current significant trends in eDiscovery on which people in the industry are, or should be, focused?

What we’re seeing is that companies are becoming a bit more proactive.  Over the past few years we’ve seen companies that have simply been reacting to litigation and it’s been a very painful process because ESI collection has been a “fire drill” – a very last minute operation.  Not because lawyers have waited and waited, but because the data collection process has been slow, complex and overly expensive.  But things are changing. Companies are seeing that eDiscovery is here to stay, ESI collection is not going away and the argument of saying that it’s too complex or expensive for us to collect is not holding water. So, companies are starting to take a proactive stance on ESI collection and understanding their data assets proactively.  We’re talking to companies that are not specifically responding to litigation; instead, they’re building a defensible policy that they can apply to their data sources and make data available on demand as needed.    

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the first morning of LTNY}  Well, in walking the floor as people were setting up, you saw a lot of early case assessment last year; this year you’re seeing a lot of information governance.  That’s showing that eDiscovery is really rolling into the records management/information governance area.  On the CIO and General Counsel level, information governance is getting a lot of exposure and there’s a lot of technology that can solve the problems.  Litigation support’s role will be to help the executives understand the available technology and how it applies to information governance and records management initiatives.  You’ll see more information governance messaging, which is really a higher level records management message.

As for other trends, one that I’ll tie Index Engines into is ESI collection and pricing.  Per GB pricing is going down as the volume of data is going up.  Years ago, prices were a thousand dollars per GB, then hundreds of dollars per GB, etc.  Now the cost is close to tens of dollars per GB. To really manage large volumes of data more cost-effectively, the collection price had to become more affordable.  Because Index Engines can make data on backup tapes searchable very cost-effectively, for as little as $50 per tape, data on tape has become as easy to access and search as online data. Perhaps even easier because it’s not on a live network.  Backup tapes have a bad reputation because people think of them as complex or expensive, but if you take away the complexity and expense (which is what Index Engines has done), then they really become “full point-in-time” snapshots.  So, if you have litigation from a specific date range, you can request that data snapshot (which is a tape) and perform discovery on it.  Tape is really a natural litigation hold when you think about it, and there is no need to perform the hold retroactively.

So, what does the ease with which the information can be indexed from tape do to address the inaccessible argument for tape retrieval?  That argument has been eroding over the years, thanks to technology like ours.  And, you see decisions from judges like Judge Scheindlin saying “if you cannot find data in your primary network, go to your backup tapes”, indicating that they consider backup tapes the next source right after online networks.  You also see people like Craig Ball writing that backup tapes may be the most convenient and cost-effective way to get access to data.  If you had a choice between doing a “server crawl” in a corporate environment or just asking for a backup tape of that time frame, tape is the much more convenient and less disruptive option.  So, if your opponent goes to the judge and says it’s going to take millions of dollars to get the information off of twenty tapes, you must know enough to be in front of a judge and say “that’s not accurate”.  Those are old numbers.  There are court cases where parties have been instructed to use tapes as a cost-effective means of getting to the data.  Technology removes the inaccessible argument by making it easier, faster and cheaper to retrieve data from backup tapes.

The erosion of the accessibility burden is sparking the information governance initiatives. We’re seeing companies come to us for legacy data remediation or management projects, basically getting rid of old tapes. They are saying “if I’ve got ten years of backup tapes sitting in offsite storage, I need to manage that proactively and address any liability that’s there” (that they may not even be aware exists).  These projects reflect a proactive focus towards information governance by remediating those tapes and getting rid of data they don’t need.  Ninety-eight percent of the data on old tapes is not going to be relevant to any case.  The remaining two percent can be found and put into the company’s litigation hold system, and then they can get rid of the tapes.

How do incremental backups play into that?  Tapes are very incremental and repetitive.  If you’re backing up the same data over and over again, you may have 50+ copies of the same email.  Index Engines technology automatically gets rid of system files and applies a standard MD5 hash to dedupe.  Also, by using tape cataloguing, you can read the header and say “we have a Saturday full backup and five incrementals during the week, then another Saturday full backup”. You can ignore the incremental tapes and just go after the full backups.  That’s a significant percentage of the tapes you can ignore.
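
To make the dedupe and cataloguing ideas concrete, here is a minimal Python sketch.  It is not Index Engines’ implementation; the function name dedupe_by_md5, the sample messages and the tape catalog are all hypothetical, and the only real API used is the standard library’s hashlib.md5.

```python
import hashlib

def dedupe_by_md5(messages):
    """Keep the first copy of each distinct byte string, keyed by its MD5 digest."""
    seen = set()
    unique = []
    for raw in messages:
        digest = hashlib.md5(raw).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(raw)
    return unique

# Hypothetical data: the same email captured by several successive backups.
backups = [
    b"From: a@example.com\n\nQ3 forecast",
    b"From: a@example.com\n\nQ3 forecast",
    b"From: b@example.com\n\nLunch?",
    b"From: a@example.com\n\nQ3 forecast",
]
unique_messages = dedupe_by_md5(backups)
print(len(backups), "messages on tape,", len(unique_messages), "after dedupe")

# Hypothetical tape catalog (label, backup type) as read from tape headers.
catalog = [
    ("Sat-01", "full"),
    ("Mon-03", "incremental"),
    ("Tue-04", "incremental"),
    ("Sat-08", "full"),
]
full_backups = [label for label, kind in catalog if kind == "full"]
print("Tapes to process:", full_backups)
```

Hashing the raw bytes treats only byte-identical copies as duplicates; a real tool would have to decide which parts of a message feed the hash.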

What are you working on that you’d like our readers to know about?

Index Engines just announced today a partnership with LeClairRyan. This partnership combines legal expertise for data retention with the technology that makes applying the policy to legacy data possible.  For companies that want to build policy for the retention of legacy data and implement the tape remediation process, we have advisors like LeClairRyan that can provide legacy data consultation and oversight.  By proactively managing the potential liability of legacy data, you are also saving the IT costs to explore that data.

Index Engines also just announced a new cloud-based tape load service that will provide full identification, search and access to tape data for eDiscovery. The Look & Learn service, starting at $50 per tape, will provide clients with full access to the index of their tape data without the need to install any hardware or software. Customers will be able to search the index and gather knowledge about content, custodians, email and metadata all via cloud access to the Index Engines interface, making discovery of data from tapes even more convenient and affordable.

Thanks, Jim, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: Tom Gelbmann of Gelbmann & Associates, LLC

 

This is the first of the LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and asked each of them the same three questions:

  1. What do you consider to be the current significant trends in eDiscovery that people in the industry are, or should be, focused on?
  2. Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Tom Gelbmann. Tom is Principal of Gelbmann & Associates, LLC, co-author of the Socha-Gelbmann Electronic Discovery Survey and co-founder of the Electronic Discovery Reference Model (EDRM).  Since 1993, Gelbmann & Associates, LLC has helped law firms and Corporate Law Departments realize the full benefit of their investments in Information Technology.  As today is Valentine’s Day, consider this interview with Tom as eDiscoveryDaily’s Valentine’s Day present to you!

What do you consider to be the current significant trends in eDiscovery that people in the industry are, or should be, focused on?

The first thing that comes to mind is the whole social media thing, which is something you’re probably getting quite a bit of (in your interviews), but with the explosion of the use of social media, personally and within organizations, we’re seeing a huge explosion (in eDiscovery).  One of the issues is that there is very little in terms of policy and management around that, and I look at it in a very similar vein to the late ’80s and early ’90s when electronic mail came about and there were no real defining guidelines.  It wasn’t until we got to a precipitating event where “all of a sudden, organizations get religion” and say “oh my god, we better have a policy for this”.  So, I think the whole social media thing is one issue.

On top of that, another area that is somewhat of an umbrella to all this is information management and EDRM with the Information Management Reference Model (IMRM) is certainly part of that. What is important in this context is that corporations are beginning to realize the more they get their “electronic house in order”, the better off they’re going to be in many ways.  Less cost, less embarrassment and so forth.

The third thing, and this is something that I’ve been tracking for a while, is the growth in tools and solutions available for small organizations and small cases.  For a long time, everything was about millions of documents and gigabytes of data – that’s what got the headlines and that’s what the service bureaus and providers were focusing on.  The real “gold” in my mind is the small cases, the hundreds of thousands of small cases that are out there.  The providers that can effectively reach that market in a cost-effective way will be positioned very well and I think we’re starting to see that happen.  And, I think the whole “cloud” concept of technology is helping that.

Which of those trends are evident here at LTNY, which are not being talked about enough, and/or what are your general observations about LTNY this year?

{Interviewed on the first afternoon of the show} Well, so far it’s been a blur [laughs].  But, I think we’re definitely seeing social media as a big issue at this LegalTech and I also think we’re seeing more solutions toward the smaller cases and smaller organizations here at this year’s show.

What are you working on that you’d like our readers to know about?

From an EDRM standpoint, I just came from a meeting for the EDRM Testing pilot project.  Last fall, at the mid-year meeting, there was a groundswell to address testing, and the basic issue is applying some principles of testing to software products associated with electronic discovery to answer the question of “how do you know?” when the court asks if the results are true and what sort of testing process you went through.  There is very little as far as a testing regimen or even guidelines on a testing regimen for electronic discovery software and so the EDRM testing group is looking to establish some guidelines, starting very basically by looking at bands of rigor associated with bands of risk.  So, you will see at this year’s EDRM annual meeting in May that EDRM Testing will become a full-fledged project.

And the other thing that I’m happy to announce is that George Socha and I have launched a web site called Apersee, which is the next step in the evolution of the (Socha-Gelbmann) rankings.  We killed the rankings two years ago because they were being misused.  Consumers wanted to know who do I send the RFP to, who do I engage and they would almost mindlessly send to the Socha-Gelbmann Top Ten.  But, now the consumers can specify what they’re looking for, starting with areas of the model, whether it’s Collection, Preservation, Review, etc., and provide other information such as geography and types of ESI and what will be returned on those searches is a list of providers with those services or products.  We have right now about 800 providers in the database and many of those have very basic listings at this point.  As this is currently in beta, we have detailed information that we pre-populated for about 200 providers and are expanding rapidly.  Over the next couple of months, we’re working hard with providers to populate their sites with whatever content is appropriate to describe their products and services in terms of what they do, where they do it, etc., that can feed the search engine.  And, we have been getting very good feedback from both the consumer side and the provider side as being a very valuable service.

Thanks, Tom, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Best Practices: Database Discovery Pop Quiz ANSWERS

 

So, how did you do?  Did you know all the answers from Friday’s post – without “googling” them?  😉

Here are the answers – enjoy!

What is a “Primary Key”? The primary key of a relational table uniquely identifies each record in the table. It can be a normal attribute that you expect to be unique (e.g., Social Security Number); however, it’s usually best for it to be a sequential ID generated by the Database Management System (DBMS).
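
For the curious, here is a small sketch of a DBMS-generated key using Python’s built-in sqlite3 module.  The employees table and its data are made up for illustration; in SQLite, a column declared INTEGER PRIMARY KEY is assigned automatically when you don’t supply a value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# No employee_id supplied: the database generates one for each record.
conn.execute("INSERT INTO employees (name) VALUES ('Ann')")
conn.execute("INSERT INTO employees (name) VALUES ('Bob')")
ids = [row[0] for row in conn.execute(
    "SELECT employee_id FROM employees ORDER BY employee_id")]
print(ids)

# Reusing an existing key is rejected, which is what "uniquely identifies" means.
try:
    conn.execute("INSERT INTO employees (employee_id, name) VALUES (1, 'Cy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("Duplicate key rejected:", duplicate_rejected)
```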

What is an “Inner Join” and how does it differ from an “Outer Join”?  An inner join is the most common join operation used in applications, creating a new result table by combining column values of two tables wherever the join condition is met, so only records with a match in both tables appear.  An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record in one of the tables – even if no other matching record exists.  Sometimes, there is a reason to keep all of the records in one table in your result, such as a list of all employees, whether or not they participate in the company’s benefits program.
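
Here is the employees-and-benefits example as a short sqlite3 sketch.  The table names, columns and rows are hypothetical; the point is only to show which records each kind of join returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE benefits (employee_id INTEGER, plan TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO benefits VALUES (1, 'Dental'), (3, 'Vision');
""")

# Inner join: only employees who have a matching benefits record.
inner_rows = conn.execute("""
    SELECT e.name, b.plan
    FROM employees e
    INNER JOIN benefits b ON b.employee_id = e.employee_id
    ORDER BY e.employee_id
""").fetchall()

# Left outer join: every employee, with None where no benefits record exists.
outer_rows = conn.execute("""
    SELECT e.name, b.plan
    FROM employees e
    LEFT OUTER JOIN benefits b ON b.employee_id = e.employee_id
    ORDER BY e.employee_id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

Bob has no benefits record, so he drops out of the inner join but stays in the outer join with an empty plan.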

What is “Normalization”?  Normalization is the process of organizing data to minimize redundancy of that data. Normalization involves organizing a database into multiple tables and defining relationships between the tables.

How does a “View” differ from a “Table”?  A view is a virtual table that consists of columns from one or more tables. Though it is similar to a table, it is a query stored as an object.
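
A quick sqlite3 sketch of the “query stored as an object” idea, using a made-up employees table and a hypothetical view named legal_staff.  Because the view holds no data of its own, it picks up new records in the underlying table automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
INSERT INTO employees VALUES (1, 'Ann', 'Legal'), (2, 'Bob', 'IT');
CREATE VIEW legal_staff AS
    SELECT employee_id, name FROM employees WHERE dept = 'Legal';
""")

before = conn.execute(
    "SELECT name FROM legal_staff ORDER BY employee_id").fetchall()

# Add a record to the table; the view is re-evaluated on the next query.
conn.execute("INSERT INTO employees VALUES (3, 'Cy', 'Legal')")
after = conn.execute(
    "SELECT name FROM legal_staff ORDER BY employee_id").fetchall()

print(before, after)
```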

What does “BLOB” stand for?  A Binary Large OBject (BLOB) is a collection of binary data stored as a single entity in a database management system. BLOBs are typically images or other multimedia objects, though sometimes binary executable code is stored as a BLOB.  So, if you’re not including databases in your discovery collection process, you could also be missing documents stored as BLOBs.  BTW, if you didn’t click on the link next to the BLOB question in Friday’s blog, it takes you to the amusing trailer for the 1958 movie, The Blob, starring a young Steve McQueen (so early in his career, he was billed as “Steven McQueen”).
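
To illustrate how a document can hide inside a database, here is a minimal sqlite3 sketch.  The attachments table, the file name and the bytes are placeholders invented for the example, not a real document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachments ("
    "attachment_id INTEGER PRIMARY KEY, filename TEXT, content BLOB)"
)

# Placeholder bytes standing in for a real file's contents.
document_bytes = b"%PDF-1.4 hypothetical contract bytes"
conn.execute(
    "INSERT INTO attachments (filename, content) VALUES (?, ?)",
    ("contract.pdf", document_bytes),
)

# The document comes back byte-for-byte, but only if someone queries the database.
filename, content = conn.execute(
    "SELECT filename, content FROM attachments").fetchone()
print(filename, len(content), "bytes")
```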

What is the difference between a “flat file” and a “relational” database?  A flat file database is a database designed around a single table, like a spreadsheet. The flat file design puts all database information in one table, or list, with fields to represent all parameters. A flat file is prone to considerable duplicate data, as each value is repeated for each item.  A relational database, on the other hand, incorporates multiple tables with methods (such as normalization and inner and outer joins, defined above) to store data efficiently and minimize duplication.
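
Here is a rough sketch of what normalizing a flat file looks like, again with sqlite3.  The flat_rows list, the customers and orders tables and their contents are all hypothetical.  In the flat version the customer’s city is repeated on every row; after normalization it is stored once.

```python
import sqlite3

# Flat file: one row per order, with customer details repeated each time.
flat_rows = [
    ("Acme", "Houston", "Widget"),
    ("Acme", "Houston", "Gadget"),
    ("Birch", "Austin", "Widget"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    item TEXT);
""")

for name, city, item in flat_rows:
    # Store each customer once; later rows for the same name are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO customers (name, city) VALUES (?, ?)", (name, city))
    customer_id = conn.execute(
        "SELECT customer_id FROM customers WHERE name = ?", (name,)).fetchone()[0]
    conn.execute(
        "INSERT INTO orders (customer_id, item) VALUES (?, ?)", (customer_id, item))

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
houston_copies = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city = 'Houston'").fetchone()[0]
print(customer_count, order_count, houston_copies)
```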

What is a “Trigger”?  A trigger is a procedure which is automatically executed in response to certain events in a database and is typically used for keeping the integrity of the information in the database. For example, when a new record (for a new employee) is added to the employees table, a trigger might create new records in the taxes, vacations, and salaries tables.
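
The new-employee example can be sketched in sqlite3 as follows.  The trigger name add_vacation_row, the vacations table and the starting balance of 10 days are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vacations (employee_id INTEGER, days_remaining INTEGER);

-- Runs automatically after each new employee record is added.
CREATE TRIGGER add_vacation_row AFTER INSERT ON employees
BEGIN
    INSERT INTO vacations (employee_id, days_remaining)
    VALUES (NEW.employee_id, 10);
END;
""")

# The application only inserts the employee; the trigger does the rest.
conn.execute("INSERT INTO employees (name) VALUES ('Ann')")
vacation_rows = conn.execute(
    "SELECT employee_id, days_remaining FROM vacations").fetchall()
print(vacation_rows)
```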

What is “Rollback”?  A rollback is the undoing of partly completed database changes when a database transaction is determined to have failed, thus returning the database to its previous state before the transaction began.  Rollbacks help ensure database integrity by enabling the database to be restored to a clean copy after erroneous operations are performed or database server crashes occur.
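
Here is a minimal sqlite3 sketch of a failed transaction being rolled back.  The accounts table, the balances and the CHECK rule that a balance cannot go negative are hypothetical; the transfer fails halfway through, and the rollback undoes the half that had already succeeded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 0)")
conn.commit()

try:
    # Step 1 succeeds: credit account B.
    conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'B'")
    # Step 2 fails: debiting A would leave a negative balance.
    conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'A'")
    conn.commit()
except sqlite3.IntegrityError:
    # Undo step 1 so the database returns to its state before the transfer.
    conn.rollback()

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```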

What is “Referential Integrity”?  Referential integrity ensures that relationships between tables remain consistent. When one table has a foreign key to another table, referential integrity ensures that a record is not added to the table that contains the foreign key unless there is a corresponding record in the linked table. Many databases use cascading updates and cascading deletes to ensure that changes made to the linked table are automatically carried through to the table that contains the foreign key.
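
A short sqlite3 sketch of both behaviors, using hypothetical owners and vehicles tables.  One SQLite-specific detail: foreign keys are only enforced after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vehicles (
    vehicle_id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES owners(owner_id) ON DELETE CASCADE,
    plate TEXT);
INSERT INTO owners VALUES (1, 'Ann');
INSERT INTO vehicles VALUES (10, 1, 'ABC123');
""")

# A vehicle pointing at a non-existent owner is rejected.
try:
    conn.execute("INSERT INTO vehicles VALUES (11, 99, 'ZZZ999')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascading delete: removing the owner also removes that owner's vehicles.
conn.execute("DELETE FROM owners WHERE owner_id = 1")
remaining_vehicles = conn.execute("SELECT COUNT(*) FROM vehicles").fetchone()[0]
print(orphan_rejected, remaining_vehicles)
```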

Why is a “Cartesian Product” in SQL almost always a bad thing?  A Cartesian Product occurs in SQL when a join condition (via a WHERE or ON clause in a SQL statement) is omitted, causing all combinations of records from two or more tables to be displayed.  For example, when you go to the Department of Motor Vehicles (DMV) to pay your vehicle registration, they use a database with an Owners and a Vehicles table joined together to determine for which vehicle(s) you need to pay taxes.  Without that join condition, you would have a Cartesian Product and every vehicle in the state would show up as registered to you – that’s a lot of taxes to pay!
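
The DMV example can be sketched with sqlite3 and two tiny, made-up tables.  The only difference between the two queries is the join condition o.owner_id = v.owner_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vehicles (vehicle_id INTEGER PRIMARY KEY, owner_id INTEGER, plate TEXT);
INSERT INTO owners VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO vehicles VALUES (10, 1, 'ABC123'), (11, 2, 'DEF456'), (12, 2, 'GHI789');
""")

# Join condition omitted: Ann is paired with every vehicle in the table.
cartesian = conn.execute("""
    SELECT o.name, v.plate
    FROM owners o, vehicles v
    WHERE o.name = 'Ann'
    ORDER BY v.vehicle_id
""").fetchall()

# Join condition present: Ann is paired only with her own vehicle.
joined = conn.execute("""
    SELECT o.name, v.plate
    FROM owners o, vehicles v
    WHERE o.owner_id = v.owner_id AND o.name = 'Ann'
""").fetchall()

print(len(cartesian), "vehicles without the join condition;", joined)
```

With three vehicles the mistake is easy to spot; with millions of rows the same query can swamp the server before anyone notices.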

If you didn’t know the answers to most of these questions, you’re not alone.  But, to effectively provide the information within a database responsive to an eDiscovery request, knowledge of databases at this level is often necessary to collect and produce the appropriate information.    As Craig Ball noted in his Law.com article Ubiquitous Databases, “Get the geeks together, and get out of their way”.  Hey, I resemble that remark!

So, what do you think?  Did you learn anything?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Database Discovery Pop Quiz

 

Databases: You can’t live with them, you can’t live without them.

Or so it seems in eDiscovery.  On a regular basis, I’ve seen various articles and discussions related to discovery of databases and other structured data and I remain very surprised how few legal teams understand database discovery and know how to handle it.  A colleague of mine (who I’ve known over the years to be honest and reliable) even claimed to me a few months back while working for a nationally known eDiscovery provider that their collection procedures actually excluded database files.

Last month, Law.com had an article written by Craig Ball, called Ubiquitous Databases, which provided a lot of good information about database discovery. It included various examples of how databases touch our lives every day, while noting that eDiscovery is still ultra document-centric, even when those “documents” are generated from databases.  There is some really good information in that article about Database Management Systems (DBMS), Structured Query Language (SQL), Entity Relationship Diagrams (ERDs) and how they are used to manage, access and understand the information contained in databases.  It’s a really good article, especially for database novices who need to understand more about databases and how they “tick”.

But, maybe you already know all you need to know about databases?  Maybe you would already be ready to address eDiscovery on your databases today?

Having worked with databases for over 20 years (I stopped counting at 20), I know a few things about databases.  So, here is a brief “pop” quiz on database concepts.  Call them “Database 101” questions.  See how many you can answer!

  • What is a “Primary Key”? (hint: it is not what you start the car with)
  • What is an “Inner Join” and how does it differ from an “Outer Join”?
  • What is “Normalization”?
  • How does a “View” differ from a “Table”?
  • What does “BLOB” stand for? (hint: it’s not this)
  • What is the difference between a “flat file” and a “relational” database?
  • What is a “Trigger”?
  • What is “Rollback”? (hint: it has nothing to do with Wal-Mart prices)
  • What is “Referential Integrity”?
  • Why is a “Cartesian Product” in SQL almost always a bad thing?

So, what do you think?  Are you a database guru or a database novice?  Please share any comments you might have or if you’d like to know more about a particular topic.

Did you think I was going to provide the answers at the bottom?  No cheating!!  I’ll answer the questions on Monday.  Hope you can stand it!!

eDiscovery Trends: 2011 Predictions — By The Numbers

 

Comedian Nick Bakay always ends his Tale of the Tape skits where he compares everything from Married vs. Single to Divas vs. Hot Dogs with the phrase “It's all so simple when you break things down scientifically.”

The late December/early January time frame is always when various people in eDiscovery make their annual predictions as to what trends to expect in the coming year.  We’ll have some of our own in the next few days (hey, the longer we wait, the more likely we are to be right!).  However, before stating those predictions, I thought we would take a look at other predictions and see if we can spot some common trends among those.  So, I “googled” for 2011 eDiscovery predictions and organized the predictions into common themes.  I found serious predictions here, here, here, here and here.  Oh, also here and here.

A couple of quick comments: 1) I had NO IDEA how many times predictions are re-posted by other sites, so it took some work to isolate each unique set of predictions.  I even found two sets of predictions from ZL Technologies, one with twelve predictions and another with seven, so I had to pick one set and I chose the one with seven (sorry, eWEEK!). If I have failed to accurately attribute the original source for a set of predictions, please feel free to comment.  2) This is probably not an exhaustive list of predictions (I have other duties in my “day job”, so I couldn’t search forever), so I apologize if I’ve left anybody’s published predictions out.  Again, feel free to comment if you’re aware of other predictions.

Here are some of the common themes:

  • Cloud and SaaS Computing: Six out of seven “prognosticators” indicated that adoption of Software as a Service (SaaS) “cloud” solutions will continue to increase, which will become increasingly relevant in eDiscovery.  No surprise here, given last year’s IDC forecast for SaaS growth and many articles addressing the subject, including a few posts right here on this blog.
  • Collaboration/Integration: Six out of seven “augurs” also had predictions related to various themes associated with collaboration (more collaboration tools, greater legal/IT coordination, etc.) and integration (greater focus by software vendors on data exchange with other systems, etc.).  Two people specifically noted an expectation of greater eDiscovery integration within organization governance, risk management and compliance (GRC) processes.
  • In-House Discovery: Five “pundits” forecasted eDiscovery functions and software will continue to be brought in-house, especially on the “left-side of the EDRM model” (Information Management).
  • Diverse Data Sources: Three “soothsayers” presaged that sources of data will continue to be more diverse, which shouldn’t be a surprise to anyone, given the popularity of gadgets and the rise of social media.
  • Social Media: Speaking of social media, three “prophets” (yes, I’ve been consulting my thesaurus!) expect social media to continue to be a big area to be addressed for eDiscovery.
  • End to End Discovery: Three “psychics” also predicted that there will continue to be more single-source end-to-end eDiscovery offerings in the marketplace.

The “others receiving votes” category (two predicting each of these) included maturing and acceptance of automated review (including predictive coding), early case assessment moving toward the Information Management stage, consolidation within the eDiscovery industry, more focus on proportionality, maturing of global eDiscovery and predictive/disruptive pricing.

Predictive/disruptive pricing (via the respective blogs of Kriss Wilson of Superior Document Services and Charles Skamser of eDiscovery Solutions Group) is a particularly intriguing prediction to me because data volumes are continuing to grow at an astronomical rate, and greater volumes lead to greater costs.  Creativity will be key in how companies deal with the larger volumes effectively, and pressures will become greater for providers (even, dare I say, review attorneys) to price their services more creatively.

Another interesting prediction (via ZL Technologies) is that “Discovery of Databases and other Structured Data will Increase”, which is something I’ve expected to see for some time.  I hope this is finally the year for that.

Finally, I said that I found serious predictions and analyzed them; however, there are a couple of not-so-serious sets of predictions here and here.  My favorite prediction is from The Posse List, as follows: “LegalTech…renames itself “EDiscoveryTech” after Law.com survey reveals that of the 422 vendors present, 419 do e-discovery, and the other 3 are Hyundai HotWheels, Speedway Racers and Convert-A-Van who thought they were at the Javits Auto Show.”

So, what do you think?  Care to offer your own “hunches” from your crystal ball?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Trends: Social Media in Litigation

Yesterday, we introduced the Virtual LegalTech online educational session Facing the Legal Dangers of Social Media and discussed what factors a social media governance policy should address.  To get background information regarding the session, including information about the speakers (Harry Valetk, Daniel Goldman and Michael Lackey), click here.

The session also addressed social media in litigation, discussing several considerations about social media, including whether it’s discoverable, how it’s being used in litigation, how to request it, how to preserve it, and how to produce it.  Between wall postings, status updates, personal photos, etc., there’s a lot of content out there and it’s just as discoverable as any other source of ESI – depending on its relevance to the case and the burden to collect, review and produce.  The relevance of privacy settings may be a factor in the discoverability of this information as at least one case, Crispin v. Christian Audigier, Inc. (C.D. Cal. May 26, 2010), held that private email messaging on Facebook, MySpace and Media Temple was protected as private.

So, how is social media content being used in litigation?  Here are some examples:

  • Show Physical Health: A person claiming to be sick or injured at work who has photos on their Facebook profile showing them participating in strenuous recreation activities;
  • Discrimination and Harassment: Statements made online which can be considered discriminatory or harassing or if the person “likes” certain groups with “hate” agendas;
  • False Product Claims: Statements online about a product that are not true or verifiable;
  • Verify or Refute Alibis: Social media content (photos, location tracking, etc.) can verify or refute alibis provided by suspects in criminal cases;
  • Pre-Sentencing Reports: Social media content can support or refute claims of remorse – in one case, the convicted defendant was sentenced more harshly because of statements made online that refuted his statements of remorse in the courtroom;
  • Info Gathering: With so much information available online, you can gather information about opposing parties, witnesses, attorneys, judges, or even jurors.  In some cases, attorneys have paid firms to ensure that positive information will bubble to the top when jurors “Google” those attorneys.  And, in Ohio, at least, judges may not only have Facebook friends, but those friends can include attorneys appearing before them (interesting…).

If possible, request the social media content from your opponent as the third-party provider will probably fight having to provide the content, usually citing the Stored Communications Act.  As noted previously on this blog, Facebook and Twitter have guidelines for requesting data – through subpoena and law enforcement agencies.

Social media content is generally stored by third-party Software as a Service (SaaS) providers (Facebook and Twitter are examples of SaaS providers), so it’s important to be prepared to address several key eDiscovery issues so that you can preserve and produce the data for litigation purposes, just as you would with any SaaS provider.

So, what do you think?  Has your organization been involved in litigation where social media content was requested?  Please share any comments you might have or if you’d like to know more about a particular topic.