eDiscoveryDaily

Life is Short, But Can Seem Long if You’re a Cheater About to Be Exposed in the Ashley Madison Hack: eDiscovery Trends

One of the most discussed topics at LegalTech® New York 2015 (LTNY) earlier this year was cybersecurity.  We’ve started covering some of the trends related to security breaches with posts here, here and here and even my hometown baseball team, the Houston Astros, was recently hacked by a competitor.  The latest victims of cyber hacking – the purported 37 million subscribers of the online cheating site AshleyMadison.com – may find little sympathy in their plight.

According to Brian Krebs in Krebs on Security, an authoritative Web site that monitors hacking worldwide, large caches of data  have been stolen from the site and some has been posted online by an individual or group that claims to have completely compromised the company’s user databases, financial records and other proprietary information.  The breach was confirmed in a statement from Toronto-based Avid Life Media Inc. (ALM*), which owns AshleyMadison as well as related hookup sites Cougar Life and Established Men. ALM stated that “We apologize for this unprovoked and criminal intrusion into our customers’ information” and also claimed that “At this time, we have been able to secure our sites, and close the unauthorized access points.”

That’s probably little comfort to the subscribers who have had their personal information compromised.

The hacker or hackers identify themselves as The Impact Team and are threatening to expose all customer records (including “profiles with all the customers’ secret sexual fantasies, nude pictures, and conversations and matching credit card transactions, real names and addresses, and employee documents and emails”) unless ALM takes AshleyMadison and Established Men offline “permanently in all forms.”

As stated in the article in Krebs on Security, “In a long manifesto posted alongside the stolen ALM data, The Impact Team said it decided to publish the information in response to alleged lies ALM told its customers about a service that allows members to completely erase their profile information for a $19 fee.

According to the hackers, although the ‘full delete’ feature that Ashley Madison advertises promises ‘removal of site usage history and personally identifiable information from the site,’ users’ purchase details — including real name and address — aren’t actually scrubbed.”  On Monday, ALM said it would offer all users the ability to fully delete their personal information from the site and waive the fee (presumably fully).

Ashley Madison’s slogan is “Life is short.  Have an affair.®”  For those who have chosen to do so, life may start to seem very long, at least for a while.

So, what do you think?  Is there anything that can be done to stem the tide of data breaches throughout the world?  Please share any comments you might have or if you’d like to know more about a particular topic.

* Not to be confused with American Lawyer Media, which goes by the same acronym.  🙂

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Court Denies Plaintiff’s Request for Spoliation Sanctions, as Most Documents Destroyed Before Duty to Preserve: eDiscovery Case Law

In Giuliani v. Springfield Township, et al., Civil Action No. 10-7518 (E.D.Penn. June 9, 2015), Pennsylvania District Judge Thomas N. O’Neill, Jr. denied the plaintiffs’ motion for spoliation sanctions, finding that the duty to preserve began when the case was filed and finding that “plaintiffs have not shown that defendants had any ill motive or bad intent in failing to retain the documents which plaintiffs seek”.

Case Background

In this harassment and discrimination case, the plaintiff owned land within the defendant’s township and alleged that the defendant’s zoning decisions violated the plaintiff’s civil rights. In June 2009, the defendant withdrew its opposition to the plaintiffs’ application for use of the property and its Zoning Hearing Board granted the plaintiffs’ zoning appeal, ending the zoning dispute.  The plaintiff then filed this new complaint against the defendant in January 2011.

The plaintiffs contended that the defendants’ production had been deficient because defendants “provided a miniscule number [of emails] in response to Plaintiffs’ [discovery] request[s] – just 24 emails spanning a seventeen-year period of near-constant controversy.”  In response, the defendants noted that, during the time period relevant to this case, they did not generate large volumes of email and also cited their document retention policy, which stated that “e-mail messages and attachments that do not meet the definition of records and are not subject to litigation and other legal proceedings should be deleted immediately after they are read”.

The defendants also did not preserve data relating to the case until the case was filed in 2011, believing that all of the outstanding issues related to the plaintiffs’ land development applications had finally been resolved after the zoning dispute was resolved in 2009.  The plaintiffs disputed that interpretation of when the duty to preserve arose and also pointed out instances where the defendants failed to instruct key custodians to preserve data related to the case.

Judge’s Ruling

With regard to the beginning of the duty to preserve by the defendants, Judge O’Neill stated that “Plaintiffs’ arguments are not sufficient to meet their burden to show that defendants’ duty to preserve files related to other properties, emails or planning commission board minutes was triggered at any time prior to the commencement of this action. They have not set forth any reason why I should disbelieve ‘the Township’s assertion that it had absolutely no reason to anticipate litigation until it was served with the Complaint on January 7, 2011,’…and that in June 2009, ‘with the property being leased in its entirety to one tenant, the Township . . . believed that all disputes with the Giulianis had come to an end.’”

As for alleged preservation failures after the duty to preserve commenced, Judge O’Neill determined that “Plaintiffs have not met their burden to establish that defendants actually suppressed the evidence they seek. At most, defendants lost or deleted the evidence plaintiffs seek as the result of mere inadvertent negligence. Plaintiffs have not set forth any proof that defendants in fact failed to preserve emails, documents relating to other properties or Planning Commission Board Minutes at any time after January 7, 2011…Further plaintiffs have not shown that defendants had any ill motive or bad intent in failing to retain the documents which plaintiffs seek.”  As a result, Judge O’Neill denied the plaintiffs’ motion for spoliation sanctions.

So, what do you think?  Should the duty to preserve have been applied earlier?  Please share any comments you might have or if you’d like to know more about a particular topic.

Quality Control, Making Sure the Numbers Add Up: eDiscovery Best Practices

I touched on this topic a few years ago, and a recent client experience spurred me to revisit it.

Friday, we wrote about tracking file counts from collection to production, the concept of expanded file counts, and the categorization of files during processing.  Today, let’s walk through a scenario to show how the files collected are accounted for during the discovery process.

Tracking the Counts after Processing

We discussed the typical categories of excluded files after processing – obviously, what’s not excluded is available for searching and review.  Even if your approach includes technology assisted review (TAR) as part of your methodology, it’s still likely that you will want to do some culling out of files that are clearly non-responsive.

Documents during review may be classified in a number of ways, but the most common is to classify them as responsive, non-responsive, or privileged.  Privileged documents are also often classified as responsive or non-responsive, so that only the responsive documents that are privileged need be identified on a privilege log.  Responsive documents that are not privileged are then produced to opposing counsel.

Example of File Count Tracking

So, now that we’ve discussed the various categories for tracking files from collection to production, let’s walk through a fairly simple eMail based example.  We conduct a fairly targeted collection of a PST file from each of seven custodians in a given case.  The relevant time period for the case is January 1, 2013 through December 31, 2014.  Other than date range, we plan to do no other filtering of files during processing.  Identified duplicates will not be reviewed or produced.  We’re going to provide an exception log to opposing counsel for any file that cannot be processed and a privilege log for any responsive files that are privileged.  Here’s what this collection might look like:

  • Collected Files: After expansion and processing, 7 PST files expand to 101,852 eMails and attachments.
  • Filtered Files: Filtering eMails outside of the relevant date range eliminates 23,564 files.
  • Remaining Files after Filtering: After filtering, there are 78,288 files to be processed.
  • NIST/System Files: eMail collections typically don’t have NIST or system files, so we’ll assume zero (0) files here. Collections with loose electronic documents from hard drives typically contain some NIST and system files.
  • Exception Files: Let’s assume that a little less than 1% of the collection (912) is exception files like password protected, corrupted or empty files.
  • Duplicate Files: It’s fairly common for approximately 30% or more of the collection to include duplicates, so we’ll assume 24,215 files here.
  • Remaining Files after Processing: We have 53,161 files left after subtracting NIST/System, Exception and Duplicate files from the total files after filtering.
  • Files Culled During Searching: If we assume that we are able to cull out 67% (approximately 2/3 of the collection) as clearly non-responsive, we are able to cull out 35,618.
  • Remaining Files for Review: After culling, we have 17,543 files that will actually require review (whether manual or via a TAR approach).
  • Files Tagged as Non-Responsive: If approximately 40% of the document collection is tagged as non-responsive, that would be 7,017 files tagged as such.
  • Remaining Files Tagged as Responsive: After QC to ensure that all documents are either tagged as responsive or non-responsive, this leaves 10,526 documents as responsive.
  • Responsive Files Tagged as Privileged: If roughly 8% of the responsive documents are determined to be privileged during review, that would be 842 privileged documents.
  • Produced Files: After subtracting the privileged files, we’re left with 9,684 responsive, non-privileged files to be produced to opposing counsel.

The percentages I used for estimating the counts at each stage are just examples, so don’t get too hung up on them.  The key is to note the numbers in red above, which are the category counts rather than the interim “remaining” counts in black: Filtered (23,564), NIST/System (0), Exception (912), Duplicate (24,215), Culled During Searching (35,618), Non-Responsive (7,017), Privileged (842) and Produced (9,684).  These represent the different categories for the file collection – each file should wind up in exactly one of these totals.  What happens if you add those category counts together?  You should get 101,852 – the number of collected files after expanding the PST files.  As a result, every one of the collected files is accounted for and none “slips through the cracks” during discovery.  That’s the way it should be.  If the totals don’t match, investigation is required to determine where files were missed.
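If you track these counts in a spreadsheet or database, the same check is easy to script.  Here’s a minimal sketch in Python using the example numbers from the list above; the category labels and the reconcile function are just names I made up for illustration, not part of any particular eDiscovery platform.

```python
# Example counts from the scenario above. Each collected file should land
# in exactly one of these final categories (interim "remaining" counts are omitted).
COLLECTED = 101_852

final_categories = {
    "filtered (outside date range)": 23_564,
    "NIST/system": 0,
    "exception": 912,
    "duplicate": 24_215,
    "culled during searching": 35_618,
    "tagged non-responsive": 7_017,
    "responsive and privileged": 842,
    "produced": 9_684,
}


def reconcile(collected, categories):
    """Return collected minus the sum of all category counts.

    Zero means every collected file is accounted for.
    """
    return collected - sum(categories.values())


unaccounted = reconcile(COLLECTED, final_categories)
if unaccounted == 0:
    print("All collected files are accounted for.")
else:
    print(f"Investigate: {unaccounted} file(s) unaccounted for.")
```

A non-zero result doesn’t tell you which stage lost (or double-counted) files, only that it’s time to go looking.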

So, what do you think?  Do you have a plan for accounting for all collected files during discovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

Quality Control By The Numbers: eDiscovery Best Practices

I touched on this topic a few years ago, and a recent client experience spurred me to revisit it.

A while back, we wrote about Quality Assurance (QA) and Quality Control (QC) in the eDiscovery process.  Both are important in improving the quality of work product and making the eDiscovery process more defensible overall.  With regard to QC, an overall QC mechanism is tracking of document counts through the discovery process, especially from collection to production, to identify how every collected file was handled and why each non-produced document was not produced.

Expanded File Counts

Scanned counts of files collected are not the same as expanded file counts.  There are certain container file types, like Outlook PST files and ZIP archives that exist essentially to store a collection of other files.  So, the count that is important to track is the “expanded” file count after processing, which includes all of the files contained within the container files.  So, in a simple scenario where you collect Outlook PST files from seven custodians, the actual number of documents (emails and attachments) within those PST files could be in the tens of thousands.  That’s the starting count that matters if your goal is to account for every document or file in the discovery process.
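To make the idea of an “expanded” count concrete, here’s a rough sketch in Python that counts the files inside a ZIP archive, including any ZIPs nested within it.  It only handles ZIP containers (reading PST files requires specialized tools that aren’t shown here), and the function names are my own for illustration.  It also counts a nested ZIP by its contents rather than as a file itself, which is just one possible convention.

```python
import io
import zipfile


def expanded_count(data: bytes) -> int:
    """Count the files inside a ZIP container, recursing into nested ZIPs."""
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # folders are not files
            member = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(member)):
                total += expanded_count(member)  # expand the nested container
            else:
                total += 1
    return total


def make_zip(members: dict) -> bytes:
    """Build a small in-memory ZIP (helper for this demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# One collected file (outer ZIP) that holds a text file and a nested ZIP of two files.
inner = make_zip({"a.txt": b"one", "b.txt": b"two"})
outer = make_zip({"notes.txt": b"three", "attachments.zip": inner})
print(expanded_count(outer))
```

One caveat: modern Office documents (DOCX, XLSX and so on) are themselves ZIP packages, so a real processing tool needs rules for what to treat as a container versus a document.  This sketch doesn’t attempt that.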

Categorization of Files During Processing

Of course, not every document gets reviewed or even included in the search process.  During processing, files are usually categorized, with some categories of files usually being set aside and excluded from review.  Here are some typical categories of excluded files in most collections:

  • Filtered Files: Some files may be collected, and then filtered during processing. A common filter for the file collection is the relevant date range of the case.  If you’re collecting custodians’ source PST files, those may include messages outside the relevant date range; if so, those messages may need to be filtered out of the review set.  Files may also be filtered based on type of file or other reasons for exclusion.
  • NIST and System Files: Many file collections also contain system files, like executable files (EXEs) or Dynamic Link Libraries (DLLs), that are part of the software on a computer and do not contain client data, so those are typically excluded from the review set. NIST files are included on the National Institute of Standards and Technology list of files that are known to have no evidentiary value, so any files in the collection matching those on the list are “De-NISTed”.
  • Exception Files: These are files that cannot be processed or indexed, for whatever reason. For example, they may be password-protected or corrupted.  Just because these files cannot be processed doesn’t mean they can be ignored.  Depending on your agreement with opposing counsel, you may need to at least provide a list of them on an exception log to prove they were addressed, if not attempt to repair them or make them accessible (BTW, it’s good to establish that agreement for disposition of exception files up front).
  • Duplicate Files: During processing, files that are exact duplicates may be put aside to avoid redundant review (and potential inconsistencies). Exact duplicates are typically identified based on the HASH value, which is a digital fingerprint generated based on the content and format of the file – if two files have the same HASH value, they have the same exact content and format.  Emails (and their attachments) may be identified as duplicates based on key metadata fields, so an attachment cannot be “de-duped” out of the collection by a standalone copy of the same file.
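For the duplicate category, here’s a simple sketch of hash-based de-duplication in Python.  I’m assuming SHA-256 as the hash algorithm and made-up file names purely for illustration; processing tools vary in which algorithm they use.  It covers loose files only, not the metadata-based approach for emails described above.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Return a hex digest that serves as the file's digital fingerprint."""
    return hashlib.sha256(content).hexdigest()


def split_duplicates(files: dict):
    """Separate files into unique originals and exact duplicates.

    `files` maps a (hypothetical) file name to its raw bytes.
    Returns (unique_names, [(duplicate_name, original_name), ...]).
    """
    seen = {}  # hash value -> name of the first file seen with that hash
    unique, duplicates = [], []
    for name, content in files.items():
        digest = file_hash(content)
        if digest in seen:
            duplicates.append((name, seen[digest]))
        else:
            seen[digest] = name
            unique.append(name)
    return unique, duplicates


sample = {
    "memo_v1.docx": b"Quarterly results attached.",
    "memo_copy.docx": b"Quarterly results attached.",  # byte-for-byte copy
    "memo_v2.docx": b"Quarterly results attached!",   # one character differs
}
unique, duplicates = split_duplicates(sample)
print(unique)
print(duplicates)
```

Note that a single changed character produces a different hash, so only the byte-for-byte copy is set aside; the duplicate count from a step like this is what feeds the Duplicate Files category.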

All of these categories of excluded files can reduce the set of files to actually be searched and reviewed.  On Monday, we’ll walk through an example of a file set from collection to production to illustrate how each file is accounted for during the discovery process.

So, what do you think?  Do you have a plan for accounting for all collected files during discovery?  Please share any comments you might have or if you’d like to know more about a particular topic.

This Study Discusses the Benefits of Including Metadata in Machine Learning for TAR: eDiscovery Trends

A month ago, we discussed the Discovery of Electronically Stored Information (DESI) workshop and the papers describing research or practice presented at the workshop that was held earlier this month and we covered one of those papers a couple of weeks later.  Today, let’s cover another paper from the workshop.

The Role of Metadata in Machine Learning for Technology Assisted Review (by Amanda Jones, Marzieh Bazrafshan, Fernando Delgado, Tania Lihatsh and Tamara Schuyler) studies the role of metadata in machine learning for technology assisted review (TAR), particularly with respect to the algorithm development process.

Let’s face it, we all generally agree that metadata is a critical component of ESI for eDiscovery.  But, opinions are mixed as to its value in the TAR process.  For example, the Grossman-Cormack Glossary of Technology Assisted Review (which we covered here in 2012) includes metadata as one of the “typical” identified features of a document that are used as input to a machine learning algorithm.  However, a couple of eDiscovery software vendors have both produced documentation stating that “machine learning systems typically rely upon extracted text only and that experts engaged in providing document assessments for training should, therefore, avoid considering metadata values in making responsiveness calls”.

So, the authors decided to conduct a study to establish the potential benefit of incorporating metadata into TAR algorithm development processes, as well as to evaluate the benefits of using extended metadata and the field origins of that metadata.  Extended metadata fields included Primary Custodian, Record Type, Attachment Name, Bates Start, Company/Organization, Native File Size, Parent Date and Family Count, to name a few.  They evaluated three distinct data sets (one drawn from Topic 301 of the TREC 2010 Interactive Task, two other proprietary business data sets) and generated a random sample of 4,500 individual documents for each (split into a 3,000 document Control Set and a 1,500 document Training Set).

The metric they used throughout to compare model performance is Area Under the Receiver Operating Characteristic Curve (AUROC). Say what?  According to the report, the metric indicates the probability that a given model will assign a higher ranking to a randomly selected responsive document than a randomly selected non-responsive document.
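If that definition still seems abstract, here’s a small sketch in Python that computes the metric directly from it: compare every responsive document’s score against every non-responsive document’s score and see how often the responsive one ranks higher.  The scores and labels below are made up for illustration and have nothing to do with the study’s data; counting ties as half is a common convention that I’m assuming here.

```python
def auroc(scores, labels):
    """Probability that a random responsive doc outranks a random non-responsive one.

    scores: model scores (higher means more likely responsive)
    labels: 1 for responsive, 0 for non-responsive
    Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one responsive and one non-responsive document")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for five documents, two of which are responsive.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
print(auroc(scores, labels))
```

In this toy example, the responsive documents win five of the six possible pairings.  A value of 1.0 would mean every responsive document outranks every non-responsive one, while 0.5 is no better than a coin flip.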

As indicated by the graphic above, their findings were that incorporating metadata as an integral component of machine learning processes for TAR improved results (based on the AUROC metric).  Particularly, models incorporating Extended metadata significantly outperformed models based on body text alone in each condition for every data set.  While there’s still a lot to learn about the use of metadata in modeling for TAR, it’s an interesting study and start to the discussion.

A copy of the twelve page study (including Bibliography and Appendix) is available here.  There is also a link to the PowerPoint presentation file from the workshop, which is a condensed way to look at the study, if desired.

So, what do you think?  Do you agree with the report’s findings?  Please share any comments you might have or if you’d like to know more about a particular topic.

Similar Spoliation Case, Somewhat Different Outcome: eDiscovery Case Law

Remember the Malibu Media, LLC v. Tashiro case that we covered a couple of weeks ago, which involved spoliation sanctions against a couple accused of downloading its copyrighted adult movies via a BitTorrent client?  Here’s a similar case with the same plaintiff and similar spoliation claims, but with a somewhat different outcome (at least for now).

In Malibu Media, LLC v. Michael Harrison, Case No. 12-cv-1117 (S.D. Ind. June 8, 2015), Indiana District Judge William T. Lawrence denied the plaintiff’s motion for summary judgment, upholding the magistrate judge’s ruling which found an adverse inference instruction for destroying a hard drive with potentially responsive data on it to be not warranted, and ruled that “it will be for a jury to decide” if such a sanction is appropriate.

Case Background

The plaintiff alleged that the defendant installed a BitTorrent Client onto his computer and then went to a torrent site to upload and download its copyrighted Work, specifically, six adult films (or portions thereof).  As in the Tashiro case, the plaintiff used a German company to identify certain IP addresses that were being used to distribute the plaintiff’s copyrighted movies, and the defendant was eventually identified by Comcast as the subscriber assigned to this particular IP address.

After the lawsuit was filed, in January 2013, the defendant’s hard drive on his custom-built gaming computer crashed and he took it to an electronics recycling company, to have it “melted”. He then replaced the gaming computer’s hard drive. In addition to his gaming computer, the defendant also had another laptop. During discovery, that laptop and the new hard drive were examined by forensic experts; while the laptop revealed extensive BitTorrent use, it did not contain any of the plaintiff’s movies or files and the new hard drive did not reveal any evidence of BitTorrent use.  Nonetheless, because of the destroyed hard drive, the plaintiff filed a motion for sanctions for the Intentional Destruction of Material Evidence, as well as a motion for summary judgment.

In an evidentiary hearing in December 2014, the magistrate judge recommended that the motion for sanctions be denied, concluding that the defendant “did not destroy the hard drive in bad faith”, that “[h]ad [Harrison] truly wished to hid adverse information, the Court finds it unlikely that [Harrison] would have waited nearly five months to destroy such information” and noted that he found the defendant’s testimony to be credible.  The plaintiff filed an objection to that report and recommendation, arguing that “bad faith should be inferred from the undisputed evidence.”

Judge’s Ruling

Regarding both the summary judgment motion and the motion for sanctions, Judge Lawrence stated the following:

“The Court agrees with Magistrate Judge Dinsmore that default judgment was not warranted in this case. That said, Magistrate Judge Dinsmore found an adverse inference not to be warranted because he found Harrison’s testimony to be credible. While the Court does not necessarily disagree with Magistrate Judge Dinsmore—in that it is certainly possible a jury would find Harrison’s testimony to be credible—ultimately, the Court believes this is an issue best left for a jury to decide. Malibu Media has presented sufficient evidence to the contrary, and in light of the fact that Malibu Media’s motion for summary judgment was denied on the same grounds, the Court believes leaving the issue of spoliation to the jury to be the best approach. Accordingly, at trial the Court will instruct the jury that if it finds that Harrison destroyed the gaming computer’s hard drive in bad faith, it can assume that the evidence on the gaming computer’s hard drive would have been unfavorable to Harrison.”

So, what do you think?  Should this case have been handled the same way the Malibu Media, LLC v. Tashiro case was handled?  Please share any comments you might have or if you’d like to know more about a particular topic.

Tuesday LTWC 2015 Sessions: eDiscovery Trends

As noted yesterday, LegalTech West Coast 2015 (LTWC) is happening this week – nearly a month later than usual and in a new locale (San Francisco!) – and eDiscovery Daily is reporting about the latest eDiscovery trends being discussed at the show.  If you’re in the San Francisco area, today is the last day to come check out the show – there are a number of sessions (both paid and free) available and at least 58 exhibitors providing information on their products and services.

Perform a “find” on today’s LTWC conference schedule for “discovery” or “information governance” and you’ll get 64 hits.  Sessions in the main conference tracks include:

10:30 AM – 11:30 AM:

“Preserve or Perish” vs. “Destroy or Drown”: Managing Electronically Stored Information (ESI)

  • “Less is more” both day-to-day and in eDiscovery, in that the many risks of over-saving trump potential concerns about under-saving.
  • Cleaning one’s ESI “garage” in a routinized way hinges on a combination of people, process and platforms.
  • Learning what an organization has and where is one step toward a solid IG regime as well as synergy between different internal constituencies.

Speakers are: Lael Andara, Partner, Ropers, Majeski, Kohn & Bentley; Vicki Lee Clewes, Vice President, Global Records & Information Management, McKesson; John Isaza, Esq., FAI, Partner, Rimon, PC; James Schellhase, Business Leader, Information Lifecycle Governance, IBM, Founder and President, StoredIQ, an IBM Company.  Discussion Leader: Robert D. Brownstone, Technology & eDiscovery Counsel and Chair, Electronic Information Management (EIM) Group, Fenwick & West LLP.

Everyday E-Discovery: Bringing It In-House or Outsourcing It

It is not easy deciding whether to bring everyday eDiscovery in-house, outsource it, or change nothing. With every organization starting from a different point and with many possible outcomes, this decision-making process can seem overwhelmingly complicated.  Join our panelists as they discuss:

  • How to determine where your everyday eDiscovery stands today: Who does it, what they do, and how they do it;
  • How to define where you want to be at the end of the process: what people, what processes, what technology;
  • How to gather the information needed to make an informed decision; and
  • How to arrive at an actionable decision on whether to bring everyday eDiscovery in-house or to outsource it.

Speakers include: David R. Cohen, Partner and Practice Group Leader, Global Records & E-Discovery Group, Reed Smith; Amy DeCesare, Assistant Vice President, Litigation Management, Allied World; David Popham, eDiscovery and Litigation Management Specialist, LexisNexis.  Discussion Leader: George Socha, President, Socha Consulting.

E-Discovery Challenges in Government Investigations and Regulatory Actions

Stakes are high when organizations face government investigations or enforcement activity. And when dealing with the government, unique e-discovery challenges arise. Many government legal professionals lack deep e-discovery expertise, and have limited technical support available. Yet e-discovery technology plays a role in virtually every matter. Other issues that complicate discovery in government matters include:

  • The “cooperative” posture often associated with governmental investigations when no judge is available to resolve discovery disputes;
  • The increased transparency requested by the government;
  • The breadth and compressed timelines associated with many government requests; and
  • The government’s increasing tendency to request specific discovery protocols, including technology assisted review.

In this program, the panelists will explore how organizations can overcome these challenges and more effectively handle discovery in government matters. They will discuss how to identify the scope of the government’s request and appropriately tailor a discovery solution that is reasonable, cost efficient, and defensible and, if necessary, educate the agency about e-discovery along the way. The program will also explain the obligations to preserve information, and the ramifications (including criminal liability) for spoliation of evidence. Finally, the panelists will discuss using predictive coding and other advanced analytics in government discovery.

Speakers include: Scott Coonan, Senior Director of IP, Litigation & Strategy, Juniper Networks; Mira Edelman, Senior Discovery Counsel, Google; Dawson Horn, III, Esq., Associate General Counsel, Vice President & Deputy Director of eDiscovery, AIG; Sylvie Stulic, Manager of Legal Operations and Litigation, Electronic Arts, Inc.  Discussion Leader: Amy Hinzmann, Senior Vice President, Managed Review, DiscoverReady.

1:30 PM – 2:30 PM:

New World Cyber Threats: Having a Good IG Foundation Can Help Guard Against Internal and External Threats

  • High profile data breaches, such as at Anthem and Target, emphasize the need for all companies – not just retailers – to fine-tune proactive policies and practices for managing sensitive electronically stored information.
  • Mapping and categorizing data sets help identify the more sensitive types of information warranting stronger protection measures.
  • A sound infosec compliance regime should include: role-based access control (RBAC); encryption of data at rest and in transit; a robust password regime; employee training as to phishing schemes and other threats; and an incident-response plan.

Speakers to include: Cary Calderone, Esq., Founder, SandHill Law, Faculty, University of Phoenix; Sylvia Johnson, Senior Counsel, Wells Fargo; Tyler Newby, Partner, Litigation Group, Fenwick & West LLP; James Schellhase, Business Leader, Information Lifecycle Governance, IBM, Founder and President, StoredIQ, an IBM Company.  Discussion Leader: Robert D. Brownstone, Technology & eDiscovery Counsel and Chair, Electronic Information Management (EIM) Group, Fenwick & West LLP.

Practical Pointers for Bringing Everyday E-Discovery Into Your Organization

You’ve decided to bring your everyday eDiscovery in-house.  Now comes the hard part: execution.  Our panelists will frame the issues, of course, but they also will deliver a plethora of practical pointers on how to bring everyday eDiscovery in-house in ways that are affordable and achievable:

  • How to develop and implement processes that are well-defined, can be repeated and become routine, and can be tested for quality control and quality assurance;
  • How to find, develop and support the people who will run and manage the processes; and
  • How to choose and implement appropriate technologies those people can use to run those processes.

Speakers include: Meghan Brosnahan, Director of eDiscovery Services, Sutter Health; Alon Israely, Esq., CISSP, Strategic Partnerships, BIA; David Popham, eDiscovery and Litigation Management specialist, LexisNexis.  Discussion Leader: George Socha, President, Socha Consulting.

Leveraging Technology and Analytics to Control the Information Deluge

As the volumes of information generated and stored by organizations grow, corporate counsel battle ever-increasing amounts of documents flowing into discovery. Counsel must find ways to effectively understand and use that information in the litigation, and they must also bring volumes down to reduce costs. This program will address how corporate practitioners can creatively use available technology and analytics tools—both in-house and with trusted technology partners—to control the document deluge.  Specific topics will include:

  • Using technology to preserve and collect narrowly and strategically;
  • Creative, new ways to cull down document collections;
  • Minimizing the number of documents subject to human review;
  • Deploying statistical sampling and analysis to boost defensibility; and
  • Harnessing information learned in discovery through effective knowledge management.

Speakers include: Pallab Chakraborty, Director of eDiscovery, Oracle; Kelly Lack, Litigation Attorney, Pacific Gas and Electric Company (PG&E); Alex Ponce De Leon, Corporate Counsel, Discovery, Google; James A. Sherer, Counsel and Co-Chair, Information Governance Practice Team, BakerHostetler.  Discussion Leader: Patrick Oot, Partner, Shook, Hardy & Bacon LLP.

3:00 PM – 4:00 PM:

IG 2020: Impact of Emerging Technologies on Proactive IG and Reactive eDiscovery – Wearables, IoT and Social Media . . . Oh My!

  • Who is in “possession, custody or control” of data on BYOD/WYOD devices, in the “accidental”/shadow cloud and in social-networking sites?
  • Which policies and practices can help organizations adjust to the rapid pace of technological change?
  • What are the best ways to manage and collect data stored in these challenging environments?

Speakers to include: Laura D. Berger, Attorney, Div. of Privacy and Identity Protection, Federal Trade Commission; Patrick Heim, Head of Trust and Security, Dropbox; Heidi Maher, Executive Director, Compliance, Governance & Oversight Council (CGOC); Adam Sand, General Counsel, Shopkick.  Discussion Leader: Robert D. Brownstone, Technology & eDiscovery Counsel and Chair, Electronic Information Management (EIM) Group, Fenwick & West LLP.

Outsourcing Everyday E-Discovery: Managed Services Providers Versus Outside Counsel

You’ve decided you want to outsource at least a portion of your everyday eDiscovery to someone else, but now you need to figure out who that will be and make sure it all works well for a price you can afford.  Our panelists will enlighten you, discussing the key outsourcing options and exploring their pros and cons:

  • How you decide to whom your everyday eDiscovery work should go;
  • What factors to consider when establishing contractual relationships with outsourcers;
  • Ways to manage the outsourcing relationships and work; and
  • Practicing good information governance in an outsourcing structure.

Speakers include: Shimmy Messing, Chief Technology Officer, Advanced Discovery; Patrick Oot, Partner, Shook, Hardy & Bacon LLP; David Yerich, Director, eDiscovery, UnitedHealth Group.  Discussion Leader: George Socha, President, Socha Consulting.

Beyond the Corporate Walls: Managing Data Security and Privacy in Discovery

Data privacy and security score the top spot on many lists of corporate counsel concerns. It’s difficult enough for organizations to secure sensitive information within their own four walls—when information must leave the organization for litigation discovery, the challenge increases. In this CLE program, in-house counsel and e-discovery professionals will discuss how they meet this challenge and protect the company’s valuable assets. The panelists will address:

  • Data security expectations for their e-discovery providers and law firms;
  • Measures to protect information turned over to opposing parties and the court;
  • When and how to insist that certain information may not even leave the organization, and must be kept behind the corporate firewall; and
  • Effective ways to screen for uber-sensitive information like trade secrets, source code, unreleased products, and personally identifying information.

Speakers include: Scott Carlson, Partner and Chair, eDiscovery and Information Governance Group, Seyfarth Shaw; John Davis, Executive Director and Counsel Global eDiscovery, UBS; Amie Taal, Vice President of Digital Forensics/Investigations, Deutsche Bank; Patrick E. Zeller, Director and Senior Counsel for eDiscovery and Privacy, Gilead Sciences.  Discussion Leader: Maureen O’Neill, Senior Vice President, Discovery Strategy, DiscoverReady.

In addition to these, there are other sessions today that might be of interest.  For a complete description of all sessions today, click here.

So, what do you think?  Did you attend LTWC this year?  Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Welcome to LegalTech West Coast 2015!: eDiscovery Trends

Today is the start of LegalTech® West Coast 2015 (LTWC), nearly a month later than usual and in a new locale (San Francisco!), and eDiscovery Daily is reporting on the latest eDiscovery trends being discussed at the show.  Today and tomorrow, we will provide a description of some of the sessions related to eDiscovery to give you a sense of the topics being covered.  If you’re in the San Francisco area, come check out the show – there are a number of sessions (both paid and free) available and at least 58 exhibitors providing information on their products and services.

Perform a “find” on today’s LTWC conference schedule for “discovery” or “information governance” and you’ll get 23 hits.  Sessions in the main conference tracks include:

10:30 AM – 11:45 AM:

Laying the Foundation: An Information Governance Framework

Effective information governance involves multiple functions within an organization and requires a top-down, overarching structure that enables an organization to make decisions about information consistent with an organization’s mission, vision, and strategy. With such a structure, organizations can make proactive policy decisions about what information is important to the organization, how to keep and manage it, and how to defensibly dispose of it. This interactive panel discussion will offer practical steps to developing an information governance framework, including the strategic and tactical challenges that may arise during the process.

Speakers are: Jae Kim, Senior Vice President and General Counsel, Rambus Inc.; Jon M. Talotta, Partner, Hogan Lovells; Brett Tarr, Counsel, Litigation & E-Discovery, Caesars Entertainment.  Discussion Leader: Laurie Fischer, Managing Director, Huron Legal.

Analytics: The Revolution will be Visualized

Many generally understand the concept of analytics, but don’t know how to apply these technology advancements to the practice of law. Data mining technology, and the visual representation of mined data, offer a paradigm shift for how legal teams can uncover key facts. These technologies can quickly and effectively reveal the small subset of critical data in a universe of hundreds of millions of emails, effectively circumventing comprehensive review or greatly accelerating the review process.

Attendees will learn about common analytical and visualization technology and how to apply these tools to speed fact-finding and reduce e-discovery costs.

Speakers include: Amy DeCesare, Assistant Vice President, Litigation Management, Allied World; David Houlihan, Principal Analyst, Blue Hill Research; Caroline Sweeney, Global Director, E-Discovery & Client Technology, Dorsey.  Discussion Leader: Jason Ray, Managing Director, FTI Technology.

12:30 PM – 1:30 PM:

Taking TAR to the Next Level: Recent Research and the Promise of Continuous Active Learning

Three years ago, Judge Andrew J. Peck and Maura R. Grossman introduced Technology-Assisted Review (TAR) to a standing-room-only crowd at LegalTech. Since then, TAR—with its promise of substantial reductions in review costs—has entered the mainstream of high-volume discovery, both in the U.S. and abroad.

In 2015, the grand challenge is to make TAR even more accessible and effective, while addressing the real-world limitations of first-generation TAR products. Our panel, featuring TAR pioneers Maura R. Grossman and Gordon V. Cormack, will talk about their groundbreaking research on TAR protocols, including methods such as Continuous Active Learning (“CAL”), which have been shown to identify relevant documents more quickly while significantly reducing review costs.

Discussion topics include:

  • How does CAL work, and how does it differ from other TAR protocols?
  • Which seeds are more effective in TAR training, random or judgmental, and why?
  • Are subject-matter experts required for TAR training or can review teams do the job just as well?
  • What savings can you expect from Continuous Active Learning compared to traditional linear review?
  • What are the courts saying about TAR and CAL?

Join us for an informative hour on the future of TAR for 2015 and beyond. Be among the first to learn about the latest research comparing TAR protocols. Also, pick up a free copy of the new book, TAR for Smart People, How Technology Assisted Review Works and Why It Matters for Legal Professionals.

Speakers to include: John Tredennick, CEO and Founder, Catalyst Repository Systems, Inc.; Gordon V. Cormack, Professor, David R. Cheriton School of Computer Science, University of Waterloo; Maura R. Grossman, Of Counsel, Wachtell, Lipton, Rosen & Katz; Emi Ohira, Attorney-at-law (California), Patent attorney, Japan and President, DSA Legal Solutions, Professional Corporation.  Discussion Leader: Erin E. Harrison, Editor in Chief, Legaltech News.

2:00 PM – 3:15 PM:

Retention, Defensible Disposition, and How Analytics Can Help with Both

One of the challenges “big data” poses to an organization is the need to identify and retain the information of value that must be kept for legal or business needs and to defensibly dispose of that which is no longer required. Some organizations are using data analytics to help with these processes. The most promising use of analytics in information governance is its potential for automatic classification of data, which can aid in data clean-up, classification of existing information, and classification of information at its creation. This panel will discuss the principles of defensible disposition as well as the promise and difficulties involved in using analytics to aid in retention, disposition, and reducing downstream costs.

Speakers include: Keith M. Angle, Global Head of Records Management and Associate General Counsel, AIG; Pallab Chakraborty, Director of eDiscovery, Oracle; Keith Grochow, SR IT Technology Analyst – Records, Genentech.  Discussion Leader: Jon M. Talotta, Partner, Hogan Lovells.

The Seismic Effects of Mobile Device Data and BYOD Culture on E-Discovery

Data from mobile devices is either your current – or will be your next – biggest challenge. Whether you are collecting and reviewing for e-discovery or investigating for internal purposes, mobile device data remains tricky, hard to get and important. Complications range from increased encryption to legal and logistical issues with BYOD to keeping up with the newest operating systems and devices. With the mobilization of society and corporate culture showing no signs of abating, the effects of mobile data on legal disputes are becoming seismic. Join our experienced panel of legal practitioners and technical experts to learn strategies for dealing with the growing challenge of mobile device data in e-discovery. We’ll discuss:

  • Case law and regulatory drivers regarding mobile data
  • Planning and documenting mobile data policies
  • Coping with the logistical and privacy challenges of BYOD culture
  • Apps and the specific legal & technical challenges they present

Speakers include: Gareth Evans, Partner, Gibson, Dunn & Crutcher LLP; Veeral Gosalia, Senior Managing Director, FTI Technology; Anthony Knaapen, Manager Litigation Discovery, Chevron Corporation; Christopher Sitter, EnCE, eDiscovery & Digital Forensics Senior Manager, Juniper Networks.

3:45 PM – 5:00 PM:

Protecting Information Assets: Data Privacy and Security

Special attention needs to be paid to information if it contains personally identifiable information (PII), protected health information (PHI), or other sensitive data. There are legal requirements regarding the retention and disposition of much of this information, and there may be conflicting business needs to retain the information longer. At the same time, there are security concerns, especially for data housed in the cloud, concerns underscored by the abundance of recent breaches and cyber-attacks. This panel will discuss the development of a privacy policy and program as the first steps in developing preventive measures an organization can take to secure its most sensitive data. Additional topics will include data minimization and anonymization, data security programs, and breach response plans.

Speakers include: Andy Blair, Managing Associate, Dentons US LLP; Scott M. Giordano, Esq., Data Privacy Project Manager, Esterline Technologies Corporation; Jack Yang, Vice President, Visa Inc.  Discussion Leader: David Ray, Director, Huron Legal.

Disruption: Five Forces Shaping the Legal Landscape

From mobile and global work environments to alternative billing models to a perceived crisis in legal education, the legal industry is in the midst of a major transformation. Some changes are evolutionary, yet other developments may feel revolutionary for those unprepared for change. What are the five key trends that will disrupt the legal industry and impact how you do your job? What are the skills and mindset needed to adjust, innovate and thrive in this new legal landscape?

Attend this no-holds-barred, interactive discussion as leading legal minds and futurists outline the five key forces shaping the legal industry of tomorrow, and how you can remain ahead of the game.

Speakers include: David R. Cohen, Partner and Practice Group Leader, Global Records & E-Discovery Group, Reed Smith; Honorable John M. Facciola, United States Magistrate Judge, District of Columbia; Christopher Mooney, Corporate Counsel, Samsung Semiconductor, Inc.; Christopher Sitter, EnCE, eDiscovery & Digital Forensics Senior Manager, Juniper Networks.  Discussion Leader: Sophie Ross, Senior Managing Director, FTI Technology.

In addition to these, there are other sessions today that might be of interest.  For a complete description of all sessions today, click here.

So, what do you think?  Are you planning to attend LTWC this year?  Please share any comments you might have or if you’d like to know more about a particular topic.

Craig Ball Explains HASH Deduplication As Only He Can: eDiscovery Best Practices

Ever wonder why some documents are identified as duplicates and others are not, even though they appear to be identical?  Leave it to Craig Ball to explain it in plain terms.

In the latest post (Deduplication: Why Computers See Differences in Files that Look Alike) in his excellent Ball in your Court blog, Craig states that “Most people regard a Word document file, a PDF or TIFF image made from the document file, a printout of the file and a scan of the printout as being essentially ‘the same thing.’  Understandably, they focus on content and pay little heed to form.  But when it comes to electronically stored information, the form of the data – the structure, encoding and medium employed to store and deliver content – matters a great deal.”  The end result is that two documents may look the same, but may not be considered duplicates because of their format.

Craig also references a post from “exactly” three years ago (it’s four days off, Craig, just sayin’) that provides a “quick primer on deduplication” and outlines three approaches to deduplication, including the most common approach of using HASH values (MD5 or SHA-1).
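
As a rough illustration of the hash-based approach, the sketch below groups items by the SHA-1 digest of their bytes and treats any group with more than one member as a set of exact duplicates. The file names and contents are made-up stand-ins, and `find_exact_duplicates` is a hypothetical helper written for this example, not a function from any eDiscovery platform.

```python
import hashlib
from collections import defaultdict

def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group names whose byte content yields the same SHA-1 digest."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha1(data).hexdigest()].append(name)
    # Only groups with two or more members are duplicates of each other.
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical collection: two byte-identical copies and one edited copy.
collection = {
    "memo.txt": b"Quarterly results attached.",
    "copy of memo.txt": b"Quarterly results attached.",
    "memo_v2.txt": b"Quarterly results attached. ",  # one trailing space added
}
duplicates = find_exact_duplicates(collection)
print(duplicates)
```

Note that the copy with a single trailing space falls outside the duplicate group: a hash comparison only matches content that is identical byte for byte.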

My favorite example of how two seemingly duplicate documents can be different is the publication of documents to Adobe Portable Document Format (PDF).  As I noted in our post from (nowhere near exactly) three years ago, I “publish” marketing slicks created in Microsoft® Publisher, “publish” finalized client proposals created in Microsoft Word and “publish” presentations created in Microsoft PowerPoint to PDF format regularly (still do).  With a free PDF print driver, you can conceivably create a PDF file for just about anything that you can print.  Of course, scans of printed documents that were originally electronic are another way in which two seemingly duplicate documents can be different.

The best part of Craig’s post is the exercise that he describes at the end of it: creating a Word document of the text of the Gettysburg Address (saved as both .DOC and .DOCX), generating PDF files using both the Save As and Print As PDF methods, and scanning the printed document to both TIFF and PDF at different resolutions.  He shows the MD5 HASH value and the file size of each file.  Because the format of the file is different each time, the MD5 HASH value is different each time.  When that happens for the same content, you have what some of us call “near dupes”, which have to be analyzed based on the text content of the file.
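
To make the point concrete, here is a minimal sketch (not Craig’s actual exercise) that stores one sentence in three different byte forms and compares the MD5 values.  The encodings stand in for different file formats, and `normalized_text_hash` is a hypothetical helper that shows one simple way to compare text content instead of raw bytes; real near-dupe tools use more sophisticated text comparison.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of raw bytes as a hex string."""
    return hashlib.md5(data).hexdigest()

def normalized_text_hash(text: str) -> str:
    """Hash only the text content: collapse whitespace and ignore case."""
    canonical = " ".join(text.split()).lower()
    return md5_hex(canonical.encode("utf-8"))

# Hypothetical stand-ins for the "same" document stored in different forms.
text = "Four score and seven years ago our fathers brought forth a new nation."
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16")            # different encoding of the same text
with_crlf = (text + "\r\n").encode("utf-8")  # same text plus a line ending

# Hashing the stored bytes: every form looks like a different file.
file_hashes = {md5_hex(as_utf8), md5_hex(as_utf16), md5_hex(with_crlf)}

# Hashing the extracted, normalized text: all three forms match.
text_hashes = {
    normalized_text_hash(as_utf8.decode("utf-8")),
    normalized_text_hash(as_utf16.decode("utf-16")),
    normalized_text_hash(with_crlf.decode("utf-8")),
}
print(len(file_hashes), len(text_hashes))
```

The three byte forms produce three different file-level hashes, while the normalized text produces a single value, which is why near dupes have to be identified from extracted text instead of from the file hash.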

The file size is different in almost every case too.  We performed a similar test (still not exactly) three years ago (but much closer).  In our test, we took one of our one-page blog posts about the memorable Apple v. Samsung litigation and saved it to several different formats, including TXT, HTML, XLSX, DOCX, PDF and MSG – the sizes ranged from 10 KB all the way up to 221 KB.  So, as you can see, the same content can vary widely in both HASH value and file size, depending on the file format and how it was created.

As usual, I’ve tried not to steal all of Craig’s thunder from his post, so please check it out here.

So, what do you think?  What has been your most unusual deduplication challenge?  Please share any comments you might have or if you’d like to know more about a particular topic.

Plaintiff Once Again Sanctioned with an Adverse Inference Instruction, But Still No Complete Dismissal: eDiscovery Case Law

In Lynn M. Johnson v. BAE Systems, Inc. et al., Civil Action No. 11-cv-02172 (RLW) (D.D.C. May 27, 2015), District of Columbia District Judge Robert L. Wilkins granted the defendants’ motion for summary judgment with respect to the plaintiff’s claims for negligence, battery, and defamation, but chose to “impose lesser, but nonetheless severe, sanctions” in the form of an adverse inference instruction for her remaining claim for intentional infliction of emotional distress.

Case Background

The plaintiff, a U.S. government employee deployed in Iraq, sued the defendants for actions taken by their employee during a project that they worked on together, alleging “severe physical and emotional health problems”.  During discovery, the defendant requested medical records in preparation for an expert witness’s examination of the plaintiff.  She provided the defendant with falsified medical records, which she had edited in an effort to eliminate references to health issues that predated her deployment to Iraq.  The defendant filed a motion for sanctions seeking dismissal, and the Court granted the motion in part and denied it in part, sanctioning the plaintiff and her counsel with fees and an adverse inference instruction.

Then, on September 25, 2013, the defendant requested a forensic examination of the plaintiff’s computer.  That evening, the plaintiff contracted with a local computer technician who performed various maintenance functions, which included running a program called CCleaner that is capable of permanently deleting files.  Subsequent forensic analysis showed that several Microsoft Outlook .pst email storage files were placed into the recycle bin and deleted on September 27.  The technician testified that the plaintiff did not tell him she was in litigation, she did not ask him not to delete anything from her computer and he did not place the Outlook files in the recycle bin. The defendants also requested Facebook messages, and the court found evidence that the plaintiff had tampered with those messages, as well.

Judge’s Ruling

Regarding the latest activities by the plaintiff, Judge Wilkins stated that “The Court finds by clear and convincing evidence that Ms. Johnson destroyed, attempted to destroy, or caused to be destroyed files on her computer with potential relevance to this case”, noting that “under no circumstances should Ms. Johnson have contracted with a computer technician to ‘clean up’ a computer sought for forensic imaging, particularly without making a disk image or even informing the technician of ongoing litigation. That she chose to do so is very troubling.”  Judge Wilkins expressed similar concern about the plaintiff’s failure to produce Facebook messages from earlier than February 2013.

Summarizing the behavior by the plaintiff, Judge Wilkins stated “Over the course of this suit, Ms. Johnson has repeatedly obfuscated the truth. She has altered medical records, contradicted herself in depositions and testimony before the Court, and failed to preserve and produce relevant documents during discovery.”  Still, Judge Wilkins could not bring himself to dismiss the case, stating “Although it is an exceedingly close question, the Court concludes that Ms. Johnson’s conduct does not merit this most serious of remedies.”

As a result, Judge Wilkins awarded the defendant an adverse inference instruction against the plaintiff, awarded the defendant the fees for its forensic expert, and dismissed the plaintiff’s claims for negligence, battery, and defamation.

So, what do you think?  Should the repeated violations by the plaintiff have led to full dismissal?  Please share any comments you might have or if you’d like to know more about a particular topic.
