eDiscoveryDaily

eDiscovery Daily Is Thirty! (Months Old, That Is)

Thirty months ago yesterday, eDiscovery Daily was launched.  It’s hard to believe that it has been 2 1/2 years since our first three posts debuted on day one.  635 posts later, we’ve covered a lot of what has happened in the industry.  And, yes, we’re still crazy after all these years for committing to a post each business day, but we haven’t missed a business day yet.  Twice a year, we like to take a look back at some of the important stories and topics from that time.  So, here are just a few of the posts from the last six months you may have missed.  Enjoy!

In addition, Jane Gennarelli has been publishing an excellent series to introduce new eDiscovery professionals to the litigation process and litigation terminology.  Here is the latest post, which includes links to the previous twenty-one posts.

Thanks for noticing us!  We’ve nearly quadrupled our readership since the first six-month period and almost septupled (that’s grown seven times in size!) our subscriber base since those first six months!  We appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful eDiscovery news and analysis.  And, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Outlook Emails Can Take Many Forms – eDiscovery Best Practices

Most discovery requests include a request for emails of parties involved in the case.  Email data is often the best resource for establishing a timeline of communications in the case, and Microsoft® Outlook is the most common email program used in business today.  Outlook emails can be stored in several different forms, so it’s important to be able to account for each file format when collecting emails that may be responsive to the discovery request.

There are several different file types that contain Outlook emails, including:

EDB (Exchange Database): The server files for Microsoft Exchange, which is the server environment that manages Outlook emails in an organization.  In the EDB file, a user account is created for each person authorized at the company to use email (usually, but not always, employees). The EDB file stores all of the information related to email messages, calendar appointments, tasks, and contacts for all authorized email users at the company.  EDB files are the server-side collection of Outlook emails for an organization that uses Exchange, so they are a primary source of responsive emails for those organizations.  Not all organizations that use Outlook use Exchange, but larger organizations almost always do.

OST (Outlook Offline Storage Table): Outlook can be configured to keep a local copy of a user’s items on their computer in an Outlook data file known as an Offline Outlook Data File (OST). This allows the user to work offline when a connection to the Exchange server may not be possible or wanted. The OST file is synchronized with the Exchange server when a connection is available.  If the synchronization is not current for a particular user, their OST file could contain emails that are not in the EDB server file, so OST files may also need to be searched for responsive emails.

PST (Outlook Personal Storage Table): A PST file is another Outlook data file that stores a user’s messages and other items on their computer. It’s the most common file format for home users or small organizations that don’t use Exchange, but instead use an ISP to connect to the Internet (typically through POP3 and IMAP).  In addition, Exchange users may move or archive messages to a PST file (either manually or via auto-archiving) to move them out of the primary mailbox, typically to keep their mailbox size manageable.  PST files often contain emails not found in either the EDB or OST files (especially when Exchange is not used), so it’s important to search them for responsive emails as well.

MSG (Outlook MSG File): MSG is a file extension for a mail message file format used by Microsoft Outlook and Exchange.  Each MSG file is a self-contained unit for the message “family” (email and its attachments) and individual MSG files can be saved simply by dragging messages out of Outlook to a folder on the computer (which could then be stored on portable media, such as CDs or flash drives).  As these individual emails may no longer be contained in the other Outlook file types, it’s important to determine where they are located and search them for responsiveness.  MSG is also the most common format for native production of individual responsive Outlook emails.

Other Outlook file types that might contain responsive information are EML (Electronic Mail), which is the Outlook Express email format, and PAB (Personal Address Book), which, as the name implies, stores the user’s contact information.

Of course, Outlook emails are not just stored within EDB files on the server or these other file types on the local workstation or portable media; they can also be stored within an email archiving system or synchronized to phones and other portable devices.  Regardless, it’s important to account for the different file types when collecting potentially responsive Outlook emails for discovery.
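
For illustration only, here is a minimal sketch (not part of any particular collection tool, and no substitute for forensically sound collection practices) of a first-pass inventory that walks a custodian’s folder tree and tallies files by the Outlook-related extensions described above; the path used is hypothetical:

```python
import os
from collections import defaultdict

# Outlook-related extensions discussed above.  EDB files normally live on the
# Exchange server rather than a workstation, but are included for completeness.
OUTLOOK_EXTENSIONS = {".pst", ".ost", ".msg", ".eml", ".edb", ".pab"}

def inventory_outlook_files(root_dir):
    """Walk root_dir and group Outlook-related files by extension."""
    found = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in OUTLOOK_EXTENSIONS:
                found[ext].append(os.path.join(dirpath, name))
    return found

if __name__ == "__main__":
    # Hypothetical mount point for a custodian's collected data.
    results = inventory_outlook_files(r"D:\Collections\Custodian01")
    for ext in sorted(results):
        print(f"{ext}: {len(results[ext])} file(s)")
```

An actual collection would also capture metadata, hash the files, and reach into archives and email archiving systems, but even a simple inventory like this helps confirm that each of the file types above has been accounted for.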

So, what do you think?  Are you searching all of these file types for responsive Outlook emails?  Please share any comments you might have or if you’d like to know more about a particular topic.


JP Morgan Chase Sanctioned for a Failure to Preserve Skill Codes – eDiscovery Case Law

Last week, we discussed how the Equal Employment Opportunity Commission (EEOC) was sanctioned for failing to comply with a motion to compel production of social media data that they had been previously ordered to produce.  Now, the “shoe is on the other foot” as their opponent in another case has been sanctioned for spoliation of data.

In EEOC v. JP Morgan Chase Bank, 2:09-cv-864 (S.D. Ohio Feb. 28, 2013), District Judge Gregory L. Frost granted the EEOC’s motion for sanctions for spoliation of data, entitling the plaintiff to “a permissive adverse jury instruction related to the spoliation if this litigation proceeds to a jury trial”, and denied the defendant’s motion for summary judgment.

In this gender discrimination case, the plaintiff requested the skill codes that determined how the defendant routed calls, contending that statistical analysis of the skill code data would reveal discrimination by illustrating that skill codes resulted in the more lucrative calls being directed to male employees.  When the defendant did not provide the plaintiff with select skill code data records and other information, the plaintiff filed a motion to compel, which was granted (for most of the requested date range).  When the defendant again failed to produce the data, the plaintiff filed a second motion to compel, then withdrew it after the parties appeared to agree on a resolution (documented in the Magistrate Judge’s order).  When the defendant still failed to comply, the plaintiff filed the motion for sanctions, indicating that the defendant had purged data from July 8, 2006 through March 10, 2007.

Noting that it is “curious to this Court that defendant began to preserve some other electronic information” shortly after receiving class notices from the plaintiff in 2008 and 2009, “but not all skill login data until late 2010”, Judge Frost stated that “Defendant’s failure to establish a litigation hold is inexcusable. The multiple notices that should have triggered a hold and Defendant’s dubious failure if not outright refusal to recognize or accept the scope of this litigation and that the relevant data reaches beyond the statutory period present exceptional circumstances that remove the conduct here from the protections provided by Rule 37(e).”

As a result, indicating that “Defendant’s conduct constitutes at least negligence and reaches for willful blindness bordering on intentionality”, Judge Frost granted the EEOC’s motion for sanctions for spoliation of data, entitling the plaintiff to “a permissive adverse jury instruction related to the spoliation if this litigation proceeds to a jury trial”, and denied the defendant’s motion for summary judgment.

So, what do you think?  Did the defendant’s conduct warrant the sanctions?  Please share any comments you might have or if you’d like to know more about a particular topic.


Award to Apple in Samsung Case Cut Almost in Half, For Now – eDiscovery Case Law

In Apple Inc. v. Samsung Elecs. Co., Case No.: C 11-CV-01846-LHK (N.D. Cal. Mar. 1, 2013), District Judge Lucy Koh reduced the previous jury award against Samsung in this ongoing intellectual property case from nearly $1.05 billion to over $598 million, ordering a new trial on damages for several Samsung products and striking over $450 million from the jury’s award.

In August of last year, a jury of nine found that Samsung infringed all but one of the seven patents at issue and found all seven of Apple’s patents valid – despite Samsung’s attempts to have them thrown out. They also determined that Apple didn’t infringe any of the five patents Samsung asserted in the case.  Apple had been requesting $2.5 billion in damages.  Apple later requested additional damages of $707 million to be added to the $1.05 billion jury verdict.  This case was notable from an eDiscovery perspective due to the adverse inference instruction issued by California Magistrate Judge Paul S. Grewal against Samsung just prior to the start of trial for spoliation of data, though it appears that the adverse inference instruction did not have a significant impact on the verdict.

Notice of the Patents

A significant portion of this ruling was related to notice of the patents.  As Judge Koh noted in her ruling, “Under 35 U.S.C. § 287(a), there can be no damages award where a defendant did not have actual or constructive notice of the patent or registered trade dress at issue. Thus, it is improper to award damages for sales made before the defendant had notice of the patent, and an award that includes damages for sales made before notice of any of the intellectual property (“IP”) infringed is excessive as a matter of law.”  The parties disputed whether Apple had given Samsung notice of each of the patents prior to the filing of the complaint and the amended complaint.

Apple had provided to the Court numbers necessary to calculate Samsung’s profits and reasonable royalty awards based on damages numbers provided by Apple’s damages expert, but with later notice dates, enabling the Court, for some products, to calculate how much of the jury’s award compensated for the sales before Samsung had notice of the relevant IP.  However, as Judge Koh noted, “for other products, the jury awarded an impermissible form of damages for some period of time, because Samsung had notice only of utility patents for some period, but an award of infringer’s profits was made covering the entire period from August 4, 2010 to June 15, 2012. For these products, the Court cannot remedy the problem by simply subtracting the extra sales.” {emphasis added}  The Court had instructed the jury that infringer’s profits are not a legally permissible remedy for utility patent infringement.

Ruling

Therefore, Judge Koh ordered a new trial on damages for 14 products, striking a total of $450,514,650 from the jury’s award.  This left an award of $598,908,892 on the remaining products.
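
As a quick check on the math, using only the figures reported in the ruling: $450,514,650 stricken plus $598,908,892 remaining implies an original jury award of $1,049,423,542, consistent with the “nearly $1.05 billion” figure noted above.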

So, what do you think?  What will be the final award and how much will it cost to determine that?  Please share any comments you might have or if you’d like to know more about a particular topic.


EEOC Sanctioned for Failing to Comply with Motion to Compel Production – eDiscovery Case Law

As noted previously in this blog, the Equal Employment Opportunity Commission (EEOC) was ordered to turn over social media information related to a class action case alleging sexual harassment and retaliation.  Apparently, they were less than cooperative in complying with that order.

In EEOC v. Original Honeybaked Ham Co. of Georgia, 11-cv-02560-MSK-MEH, 2012 U.S. Dist. (D. Colo. Feb. 27, 2013), Colorado Magistrate Judge Michael E. Hegarty sanctioned the EEOC for failing to provide discovery of social media content.

This has been a busy case, with at least eight court rulings in 2013 alone, including a ruling in which Judge Hegarty barred the EEOC from asserting claims not specifically identified during pre-suit litigation and prohibited the EEOC from seeking relief on behalf of individuals whom the defendant could not reasonably identify from the information provided by the EEOC.

In this ruling, Judge Hegarty stated: “I agree that the EEOC has, on several occasions, caused unnecessary expense and delay in this case. In certain respects, the EEOC has been negligent in its discovery obligations, dilatory in cooperating with defense counsel, and somewhat cavalier in its responsibility to the United States District Court.”

Elaborating, Judge Hegarty stated as follows:

“The offending conduct has been demonstrated in several aspects of the EEOC’s discovery obligations. These include, without limitation, the following. First, the circumstances surrounding the EEOC’s representations to this Court concerning its decision to use its own information technology personnel to engage in forensic discovery of the Claimants’ social media (cell phones for texting, web sites for blogging, computers for emailing), for which I had originally appointed a special master. The EEOC unequivocally requested this change, which I made an Order of the Court on November 14, 2012 (docket #248). Weeks later, the EEOC reneged on this representation, requiring the Court and the Defendant to go back to the drawing board. Second, in a similar vein, the EEOC changed its position — again ostensibly because some supervisor(s) did not agree with the decisions that the line attorneys had made — after lengthy negotiation and agreement with Defendant concerning the contents of a questionnaire to be given to the Claimants in this case, designed to assist in identifying the social media that would be forensically examined. The EEOC’s change of mind in midstream (and sometimes well downstream) has required the Defendant to pay its attorneys more than should have been required and has multiplied and delayed these proceedings unnecessarily.”

Stating that he had “for some time, believed that the EEOC’s conduct was causing the Defendant to spend more money in this lawsuit than necessary”, Judge Hegarty granted (in part) the defendants’ Motion for Sanctions and required the EEOC to “pay the reasonable attorney’s fees and costs expended in bringing this Motion”.  Perhaps more to come.

So, what do you think?  Was the sanction sufficient?  Please share any comments you might have or if you’d like to know more about a particular topic.


Court Rules Production Must be TIFFs with Bates Numbers – eDiscovery Case Law

In Branhaven, LLC v. Beeftek, Inc., 2013 U.S. Dist. (D. Md. Jan. 4, 2013), Maryland Magistrate Judge Susan K. Gauvey sanctioned plaintiff’s attorneys for wrongfully certifying the completeness of their eDiscovery production and also ruled that defendants “demonstrated that without Bates stamping and .tiff format”, the plaintiff’s production “was not reasonably usable and therefore was insufficient under Rule 34”.

In this trademark infringement suit, the defendants alleged numerous instances of “discovery abuses intended to harass defendants, cause unnecessary delay, and needlessly increase the cost of litigation” by the plaintiff, resulting in $51,122 in legal fees and expenses related to the plaintiff’s “document dump” of 112,000 pages of electronically stored information (ESI).  The plaintiff produced its ESI in PDF format, which the defendants challenged because the production was untimely and not in TIFF format with Bates numbers on every page.

While noting that the court did not “want to micromanage discovery between counsel”, Judge Gauvey stated, however, that “neither does this judge want to endorse this ‘hands off’ approach in working with clients to meet discovery obligations and this casual and even reckless attitude of plaintiff’s counsel to opposing party’s right to timely and orderly discovery.”

With regard to the PDF production, Judge Gauvey referred to the plaintiff’s contention that “the Protocol for Discovery of Electronically Stored Information (Local Rules of District of Maryland) which states that TIFF is the preferred format is only advisory” as a “weak defense”.

Judge Gauvey also noted “as defendants point out, Fed. R. Civ. P. 34(b)(2)(E)(ii) provides two options regarding the form in which a party may produce documents and plaintiff did not satisfy either. The July 20 production was not in a form ‘in which it is ordinarily maintained’ or in ‘a reasonably usable form’…The Advisory Committee Notes to Rule 34 warn that: ‘[a] party that responds to a discovery request by simply producing electronically stored information in a form of its choice, without identifying that form in advance of the production in the response required by Rule 34(b) runs the risk that the requesting party can show that the produced form is not reasonably usable’…That is precisely what happened here…Defendant was blindsided by the volume of the documents (since the prior productions consisted of 388 pages). Moreover, defendants had every reason to think that the documents would be completely Bates-stamped, as prior productions were and further defendants had no reason to think that this production would be so incredibly voluminous, as to require special arrangements and explicit agreement.”

Judge Gauvey ordered the defendant to submit a bill of costs by January 15 for the technical fees they incurred to process the flawed production (which they did, for $2,200). The plaintiff also agreed to pay an undisclosed sum in attorneys’ fees related to the sanctions motion.

On the surface, the ruling that “without Bates stamping and .tiff format, the data was not reasonably usable and therefore was insufficient under Rule 34” appears to take a step backward with regard to production format expectations.  However, the ruling also notes that the production “was not in a form ‘in which it is ordinarily maintained’” and the plaintiff’s previous PDF productions (apparently Bates stamped) and the defendant’s productions in PDF format (also presumably Bates stamped) were allowed.  Perhaps, if the plaintiff had produced the files in native format instead of a poorly executed PDF format production, the ruling would have been different?

So, what do you think?  Does this ruling appear to be a setback for native productions?  Or merely reflection of a poorly executed PDF format production?  Please share any comments you might have or if you’d like to know more about a particular topic.


Five Common Myths About Predictive Coding – eDiscovery Best Practices

During my interviews with various thought leaders (a list of which can be found here, with links to each interview), we discussed various aspects of predictive coding and some of the perceived myths that exist regarding predictive coding and what it means to the review process.  I thought it would be a good idea to recap some of those myths and how they compare to the “reality” (at least as some of us see it).  Or maybe just me.  🙂

1.     Predictive Coding is New Technology

Actually, with all due respect to each of the various vendors that have their own custom algorithms for predictive coding, the technology behind predictive coding as a whole is not new.  Ever heard of artificial intelligence?  Predictive coding, in fact, applies artificial intelligence to the review process.  With all of the acronyms we use to describe predictive coding, here’s one more for consideration: “Artificial Intelligence for Review” or “AIR”.  It may not catch on, but I like it.

Maybe attorneys would be more receptive to it if they understood it as artificial intelligence?  As Laura Zubulake pointed out in my interview with her, “For years, algorithms have been used in government, law enforcement, and Wall Street.  It is not a new concept.”  With that in mind, Ralph Losey predicts that “The future is artificial intelligence leveraging your human intelligence and teaching a computer what you know about a particular case and then letting the computer do what it does best – which is read at 1 million miles per hour and be totally consistent.”

2.     Predictive Coding is Just Technology

Treating predictive coding as just the algorithm that “reviews” the documents is shortsighted.  Predictive coding is a process that includes the algorithm.  Without a sound approach for identifying appropriate example documents for the collection, ensuring that educated and knowledgeable reviewers appropriately code those documents, and testing and evaluating the results to confirm success, the algorithm alone would simply be another case of “garbage in, garbage out”, doomed to fail.

As discussed by both George Socha and Tom Gelbmann during their interviews with this blog, EDRM’s Search project has published the Computer Assisted Review Reference Model (CARRM), which has taken steps to define that sound approach.  Nigel Murray also noted that “The people who really understand computer assisted review understand that it requires a process.”  So, it’s more than just the technology.

3.     Predictive Coding and Keyword Searching are Mutually Exclusive

I’ve talked to some people who think that predictive coding and keyword searching are mutually exclusive, i.e., that you wouldn’t perform keyword searching on a case where you plan to use predictive coding.  Not necessarily.  Ralph Losey advocates a “multimodal” approach, describing it as: “more than one kind of search – using predictive coding, but also using keyword search, concept search, similarity search, all kinds of other methods that we have developed over the years to help train the machine.  The main goal is to train the machine.”

4.     Predictive Coding Eliminates Manual Review

Many people think of predictive coding as the death of manual review, with all attorney reviewers being replaced by machines.  Actually, manual review is a part of the predictive coding process in several aspects, including: 1) Subject matter knowledgeable reviewers are necessary to perform review to create a training set of documents for the technology, 2) After the process is performed, both sets (the included and excluded documents) are sampled and the samples are reviewed to determine the effectiveness of the process, and 3) The resulting responsive set is generally reviewed to confirm responsiveness and also to determine whether the documents are privileged.  Without manual review to train the technology and verify the results, the process would fail.
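
To make the sampling step above concrete, here is a minimal sketch (with purely hypothetical numbers, not tied to any particular review tool) of how reviewing random samples from the included and excluded sets yields rough estimates of the precision and recall of the predictive coding process:

```python
def estimate_precision_recall(included_total, excluded_total,
                              included_sample, included_sample_responsive,
                              excluded_sample, excluded_sample_responsive):
    """Rough point estimates of precision and recall from samples of the
    included (predicted responsive) and excluded (predicted non-responsive) sets."""
    # Extrapolate the number of responsive documents in each set from its sample.
    est_resp_included = included_total * (included_sample_responsive / included_sample)
    est_resp_excluded = excluded_total * (excluded_sample_responsive / excluded_sample)
    precision = est_resp_included / included_total
    recall = est_resp_included / (est_resp_included + est_resp_excluded)
    return precision, recall


# Hypothetical example: 100,000 documents predicted responsive, 400,000 excluded;
# 500-document random samples reviewed from each set.
precision, recall = estimate_precision_recall(
    included_total=100_000, excluded_total=400_000,
    included_sample=500, included_sample_responsive=400,
    excluded_sample=500, excluded_sample_responsive=10)
print(f"Estimated precision: {precision:.1%}, estimated recall: {recall:.1%}")
```

Real validation protocols add larger samples and confidence intervals, but the underlying arithmetic is no more complicated than this.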

5.     Predictive Coding Has to Be Perfect to Be Useful

Detractors of predictive coding note that predictive coding can miss plenty of responsive documents and is nowhere near 100% accurate.  In one recent case, the producing party estimated that as many as 31,000 relevant documents may have been missed by the predictive coding process.  However, they also estimated that a much more costly manual review would have missed as many as 62,000 relevant documents.

Craig Ball’s analogy about the two hikers that encounter the angry grizzly bear is appropriate – the one hiker doesn’t have to outrun the bear, just the other hiker.  Craig notes: “That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.”

So, what do you think?  Do you agree that these are myths?  Please share any comments you might have or if you’d like to know more about a particular topic.


Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 3

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What are your general observations about LTNY this year and how it fits into emerging trends?
  2. If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post.  Wednesday was part 1 and yesterday was part 2.  Today is the third and last part.  A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

What are you working on that you’d like our readers to know about?

I’m really trying to make 2013 the year of distilling an extensive but idiosyncratic body of work that I’ve amassed through years of writing and bringing it together into a more coherent curriculum.  I want to develop a no-cost casebook for law students and to structure my work so that it can be more useful for people in different places and phases of their eDiscovery education.  So, I’ll be working on that in the first six or eight months of 2013 as both an academic and a personal project.

I’m also trying to go back to roots and rethink some of the assumptions that I’ve made about what people understand.  It’s frustrating to find lawyers talking about, say, load files when they don’t really know what a load file is; they’ve never looked at a load file.  They’ve left it to somebody else, and so the resolution of difficulties has gone through so many hands and is plagued by so much miscommunication.  I’d like to put some things out there that will enable lawyers, in a non-threatening and accessible way, to gain comfort in having a dialog about the fundamentals of eDiscovery that you and I take for granted, so that we don’t have to have this reliance upon vendors for the simplest issues.  I don’t mean that vendors won’t do the work, but I don’t think we should have to bring a technical translator in for every phone call.

There should be a corpus of competence that every litigator brings to the party, enabling them to frame basic protocols and agreements that aren’t merely parroting something that they don’t understand, but enabling them to negotiate about issues in ways that the resolutions actually make sense.  Saying “I won’t give you 500 search terms, but I’ll give you 250” isn’t a rational resolution.  It’s arbitrary.

There are other kinds of cases where you can identify search terms “all the live long day” and they’re really never going to get you that much closer to the documents you want.  The best example in recent years was the Pippins v. KPMG case.  KPMG was arguing that they could use search terms against samples to identify forensically significant information about work day and work responsibility.  That didn’t make any sense to me at all.  The kind of data they were looking for wasn’t going to be easily found by using keyword search.  It was going to require finding data of a certain character and bringing a certain kind of analysis to it, not an objective culling method like search terms.  Search terms have become like the expression “if you have a hammer, the whole world looks like a nail”.  We need to get away from that.

I think a little education made palatable will go a long way.  We need some good solid education and I’m trying to come up with something that people will borrow and build on.  I want it to be something that’s good enough that people will say “let’s just steal his stuff”.  That’s why I put it out there – it’s nice that they credit me and I appreciate it; but if what you really want to do is teach people, you don’t do it for the credit, you do it for the education.  That’s what I’m about, more this year than ever before.

Thanks, Craig, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!


Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 2

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What are your general observations about LTNY this year and how it fits into emerging trends?
  2. If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post.  Yesterday was part 1.  Today is part 2 and part 3 will be published in the blog on Friday.  A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

I noticed that you are speaking at a couple of sessions here.  What would you like to tell me about those sessions?

{Interviewed the evening before the show}  I am on a Technology Assisted Review panel with Maura Grossman and Ralph Losey that should be as close to a barrel of laughs as one can have talking about technology assisted review.  It is based on a poker theme – which was actually Matt Nelson’s (of Symantec) idea.  I think it is a nice analogy, because a good poker player is a master or mistress of probabilities, whether intuitively or overtly performing mental arithmetic that is essentially a set of statistical and probability calculations.  Such calculations are key to quality assurance and quality control in modern review.

We have to be cautious not to require the standards for electronic assessments to be dramatically higher than the standards applied to human assessments.  It is one thing with a new technology to demand more of it to build trust.  That’s a pragmatic imperative.  It is another thing to demand so exalted a level of scrutiny that you essentially void all advantages of the new technology, including the cost savings and efficiencies it brings.  You know the old story about the two hikers that encounter the angry grizzly bear?  They freeze, and then one guy pulls out running shoes and starts changing into them.  His friend says “What are you doing? You can’t outrun a grizzly bear!” The other guy says “I know.  I only have to outrun you”.  That is how I look at technology assisted review.  It does not have to be vastly superior to human review; it only has to outrun human review.  It just has to be as good or better while being faster and cheaper.

We cannot let the vague uneasiness about the technology cause it to implode.  If we have to essentially examine everything in the discard pile, so that we not only pay for the new technology but also back it up with the old, that’s not going to work.  It will take a few pioneers who get the “arrows in the back” early on, people who spend more to build the trust around the technology that is missing at this juncture.  Eventually, people are going to say “I’ve looked at the discard pile for the last three cases and this stuff works.  I don’t need to look at all of that any more.”
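
As an illustration added here (not drawn from Craig’s remarks), “looking at the discard pile” in practice usually means reviewing a random sample of it and estimating an elusion rate rather than re-reviewing everything.  A minimal sketch, with hypothetical numbers and a simple normal-approximation confidence interval:

```python
import math

def estimate_elusion(discard_pile_size, sample_size, responsive_in_sample, z=1.96):
    """Estimate the discard pile's elusion rate (fraction of responsive documents
    left behind) from a random sample, with a normal-approximation interval."""
    p = responsive_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    interval = (max(0.0, p - margin), min(1.0, p + margin))
    projected_missed = discard_pile_size * p  # responsive documents projected missed
    return p, interval, projected_missed

# Hypothetical: 300,000 documents discarded, 1,500 sampled, 9 found responsive.
rate, (low, high), missed = estimate_elusion(300_000, 1_500, 9)
print(f"Elusion rate: {rate:.2%} (95% CI {low:.2%} to {high:.2%}), "
      f"~{missed:,.0f} documents projected missed")
```

Sampling of this kind is how the trust Craig describes gets built without paying for the old process on top of the new one.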

Even the best predictive coding systems are not going to be anywhere near 100% accurate.  They start from human judgment, where we’re not even sure what “100% accurate” is in the context of responsiveness and relevance.  There’s no “gold standard”.  Two different qualified people can look at the same document and give a different assessment, and approximately 40% of the time, they do.  And, the way we decide who’s right is that we bring in a third person.  We indulge the idea that the third person is the “topic authority” and what they say goes.  We define their judgment as right; but even their judgments are human.  To err being human, they’re going to make misjudgments based on assumptions, fatigue, inattention, whatever.

So, getting back to the topic at hand, I do think that the focus on quality assurance is going to prompt a larger and long overdue discussion about the efficacy of human review.  We’ve kept human review in this mystical world of work product for a very long time.  Honestly, the rationale for work product doesn’t naturally extend over to decisions about responsiveness and relevance, even though most of my colleagues would disagree with me out of hand.  They don’t want anybody messing with privilege or work product.  It’s like religion or gun control – you can’t even start a rational debate.

Things are still so partisan and bitter.  The notions of cooperation, collaboration, transparency, translucency, communication – they’re not embedded yet.  People come to these processes with animosity so deeply seated that you’re not really starting on a level playing field with an assessment of what’s best for our system of justice.  Justice is someone else’s problem.  The players just want to win.  That will be tough to change.

We “dinosaurs” will die off, and we won’t have to wait for the glaciers to advance.  I think we will have some meteoric events that will change the speed at which the dinosaurs die.  Technology assisted review is one.  We’ve seen a meteoric rise in the discussion of the topic, the interest in the topic, and I think it will have a meteoric effect in terms of more rapidly extinguishing very bad and very expensive practices that don’t carry with them any more superior assurance of quality.

More from Craig tomorrow!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!


Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 1

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series.  eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

  1. What are your general observations about LTNY this year and how it fits into emerging trends?
  2. If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
  3. What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball.  A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers.  Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post.  So, today is part 1.  Parts 2 and 3 will be published in the blog on Thursday and Friday.  A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?

I think this is the first year where I do not have a ready answer to that question.  It’s  like the wonderful movie Groundhog Day.  I am on the educational planning board for the show, and as hard as we try to find and present fresh ideas, technology assisted review is once again the dominant topic.

This year, we will see a change in the marketing language, with the (forgive the jargon) “value proposition” for the tools being sold continuing to move more towards the concept of information governance.  If knowledge management had a “hook up” here at LTNY with eDiscovery, their offspring would be information governance.  Information governance represents a way to spread the cost of eDiscovery infrastructure among different budgets.  It’s not a made-up value proposition.  Security and regulatory people do have a need, and many departments can ultimately benefit from more granular and regimented management of their unstructured and legacy information stores.

I remain something of a skeptic about what has come to be called “defensible deletion.”  Most in-house IT people do not understand that, even after you purchase a single-instance de-duplication solution, you’re still going to have as much as 40% “bloat” in your collection of data between local stores, embedded and encoded attachments, etc.  So, there are marked efficiencies we can achieve by implementing sensible de-duplication and indexing mechanisms that are effective, ongoing and systemic. Consider enterprise indexing models that basically let your organization and its information face an indexing mechanism in much the same way as the internet faces Google.  Almost all of us interact with the internet through Google, and often get the information we are seeking from the Google index or synopsis of the data without actually proceeding to the indexed site.  The index itself becomes the resource, and the indexed document a distinct (and often secondary) source.  We must ask ourselves: “if a document is indexed, does it ever leave our collection?”
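
To ground the single-instance idea Craig refers to, here is a simplified sketch (not any particular product’s approach): de-duplication typically reduces to hashing content and keeping one copy per hash, and note that it does nothing about the “bloat” hidden inside containers and attachments.  The file-store path is hypothetical.

```python
import hashlib
import os

def group_by_hash(root_dir):
    """Group files under root_dir by the SHA-256 hash of their content.
    Files sharing a hash are duplicates; only one instance needs to be stored."""
    by_hash = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1024 * 1024), b""):
                    digest.update(chunk)
            by_hash.setdefault(digest.hexdigest(), []).append(path)
    return by_hash

if __name__ == "__main__":
    groups = group_by_hash(r"\\fileserver\legacy_share")  # hypothetical file store
    dupes = {h: paths for h, paths in groups.items() if len(paths) > 1}
    print(f"{len(dupes)} families of duplicate files found")
```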

I also think eDiscovery education is changing and I am cautiously optimistic.  But, people are getting just enough better information about eDiscovery to be dangerous.  And, they are still hurting themselves by expecting there to be some simple “I don’t really need to know it” rule of thumb that will get them through.  And, that’s an enormous problem.  You can’t cross-examine from a script.  Advocates need to understand the answers they get and know how to frame the follow-up and the kill.  My cautious optimism respecting education is a function of my devoting so much more of my time to education at the law school and professional levels as well as for judicial organizations.  I am seeing a lot more students interested in the material at a deeper level, and my law class that just concluded in December impressed me greatly.  The level of enthusiasm the students brought to the topic and the quality and caliber of their questions were as good as any I get from my colleagues in the day to day practice of eDiscovery.  Not just from lawyers, but also from people like you who are deeply immersed in this topic.

That is not so much a credit to my teaching (although I hope it might be).  The greatest advantage that students have is that they haven’t yet acquired bad habits and don’t come with preconceived notions about what eDiscovery is supposed to be.  Conversely, many lawyers literally do not want to hear about certain topics – they “glaze” and immediately start looking for a way to say “this cannot be important, I cannot have to know this”.  Law students don’t waste their energy that way. If the professor says “you need to know this”, then they make it their mission to learn.  Yesterday, I had a conversation with a student where she said “I really wish we could have learned more about search strategies and more ways to apply sophisticated tools hands on”.  That’s exactly what I wish lawyers would say.

I wish lawyers were clamoring to better understand things like search or de-duplication or the advantages of one form of production over another.  Sometimes, I feel like I am alone in my assessment that these are crucial issues. If I am the only one thinking that settling on forms of productions early and embracing native forms of production is crucial to quality, what is wrong with me?

I am still surprised at how many people TIFF most of their collection or production.

They have no clue how really bad that is, not just in terms of cost but also in terms of efficiency.  I am hoping the dialogue about TAR will bring us closer to a serious discussion about quality in eDiscovery.  We never had much of a dialogue about the quality of human review or the quality of paper production.  Either we didn’t have the need, or, more likely, we were so immersed in what we were doing that we did not have the language to even begin the conversation.

I wrote recently in a blog post about an experiment discussed in my college Introductory Psychology course, in which kittens were raised so that they could only see, for a few hours a day, an environment composed entirely of horizontals or verticals.  Apparently, if you are raised from birth only seeing verticals, you do not learn to see horizontals, and vice-versa.  So, if I raise a kitten among the horizontals and take a black rod and put it in front of them, they see it when it is horizontal.  But, if I orient it vertically, it disappears in their brain.  That is kind of how we are with lawyers and eDiscovery.

There are just some topics that you and I and our colleagues see the importance of, but lawyers have been literally raised without the ability to see why those things matter.  They see what has long been presented to them in, say, Summation or Concordance, as an assemblage of lousy load files and error-ridden OCR and colorless images stripped of embedded commentary.  They see this information so frequently and so exclusively that they think that’s the document and, since they only have paper document frames of reference (which aren’t really that much better than TIFFs), they think this must be what electronic evidence looks like.  They can’t see the invisible plane they’ve been bred to overlook.

You can look at a stone axe and appreciate the merits of a bronze axe – if all that you’re comparing it to are prehistoric tools, a bronze axe looks pretty good.  But, today we have chainsaws. I want lawyers demanding chainsaws to deal with electronic information and to throw away those incredibly expensive stone axes; but, unfortunately, they make more money using stone axes.  But, not for long.  I am seeing the “house of cards” start to shake, and the house of cards I am talking about is the $100 to $300 (or more) per gigabyte pricing for eDiscovery.  I think that model is not only going to be short-lived, but will soon be seen as negligence by the lawyers who go that route and as exploitive gouging by service providers, like selling a bottle of water for $10 after Hurricane Sandy.  There is a point at which price gouging will be called out.  We can’t get there fast enough.

More from Craig tomorrow!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!
