eDiscoveryDaily Archives

Proximity Searches Can Be the Right Balance of Recall and Precision – eDiscovery Best Practices

October 8, 2012

When performing keyword searching, the challenge is to balance recall (retrieving the responsive documents) and precision (not retrieving too many non-responsive documents). A search that has 100% precision will contain only responsive documents; however, that does not mean that all of the responsive documents have been retrieved. A search that has 100% recall will contain all of the responsive documents in the collection; however, it may also contain a large number of non-responsive documents, which can drive up review costs. So, how do you perform searches that effectively balance recall and precision?
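To make the two measures concrete, here is a minimal Python sketch. The function name, the document IDs and the two sample searches are all hypothetical, made up purely for illustration.

```python
def precision_recall(retrieved, responsive):
    """Return (precision, recall) for a search result set.

    retrieved  -- IDs of the documents the search returned
    responsive -- IDs of the documents that are actually responsive
    """
    retrieved, responsive = set(retrieved), set(responsive)
    hits = retrieved & responsive  # responsive documents the search found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(responsive) if responsive else 0.0
    return precision, recall


# Hypothetical collection in which documents 1 through 4 are responsive.
responsive_docs = {1, 2, 3, 4}

narrow_search = {1, 2}                    # only responsive docs, but misses two
broad_search = {1, 2, 3, 4, 5, 6, 7, 8}   # finds everything, plus four extras

print(precision_recall(narrow_search, responsive_docs))  # (1.0, 0.5)
print(precision_recall(broad_search, responsive_docs))   # (0.5, 1.0)
```

The narrow search is perfectly precise but misses half of the responsive documents; the broad search finds them all, but half of what it returns would be wasted review effort.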

One way is through proximity searching. Proximity searching is simply looking for two or more words that appear close to each other in the document. It’s more precise than an AND search (i.e., termA and termB) with more recall than a phrase search (i.e., “termA termB”). Let’s look at an example.

You’re working for an oil company and you’re looking for documents related to “oil rights” (such as “oil rights”, “oil drilling rights”, “oil production rights”, etc.). You could perform phrase searches, but any variations that you didn’t think of would be missed (e.g., “rights to drill for oil”, etc.). You could perform an AND search (i.e., “oil” AND “rights”), and that could very well retrieve all of the files related to “oil rights”, but it would also retrieve a lot of files where “oil” and “rights” appear, but have nothing to do with each other. A search for “oil” AND “rights” throughout an oil company’s data stores may retrieve many published and copyrighted documents that mention the word “oil”, but have nothing to do with “oil rights”. Why? Because almost every published and copyrighted document will have the phrase “All Rights Reserved” in the document, so those will be retrieved, even though many of them will likely be non-responsive.

A proximity search like “oil within 5 words of rights” will only retrieve the document if those words are as close as specified to each other, in either order. Proximity searching helps reduce the result set to a more manageable number for review, by eliminating all of the files that happen to mention “oil” and “rights” somewhere in the document, but not in context with each other. Yet, it catches all of the variations of phrases containing “oil” and “rights” for which you may not think to search.
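The difference between the three search types can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular review platform works: the function names and sample documents are made up, and "within N words" is taken here to mean that the word positions differ by at most N. Real search engines define proximity and tokenization in their own ways.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def and_search(text, term_a, term_b):
    """True if both terms appear anywhere in the text."""
    words = set(tokenize(text))
    return term_a.lower() in words and term_b.lower() in words


def phrase_search(text, phrase):
    """True if the words of the phrase appear consecutively, in order."""
    words, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def within(text, term_a, term_b, distance):
    """True if the terms occur within `distance` words, in either order."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(a - b) <= distance for a in pos_a for b in pos_b)


# Hypothetical documents for illustration only.
docs = [
    "The lease grants oil drilling rights to the company.",
    "We hold the rights to drill for oil in Texas.",
    "Annual report on oil prices and refinery output for the year. "
    "Copyright 2012. All rights reserved.",
]

for doc in docs:
    print(
        phrase_search(doc, "oil rights"),
        and_search(doc, "oil", "rights"),
        within(doc, "oil", "rights", 5),
    )
# False True True
# False True True
# False True False
```

The phrase search misses all three documents, the AND search retrieves all three (including the "All rights reserved" false hit), and the proximity search retrieves only the two that discuss oil rights. The same `within` function with a distance of 3 matches "John Adams", "Adams, John" and "John Q. Adams".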

Proximity searches are great for searching people’s names, as well. For example, a phrase search for “John Adams” won’t retrieve “Adams, John”, but a proximity search for “John within 3 words of Adams” will retrieve “John Adams”, “Adams, John”, and even “John Q. Adams”.

When developing a search of two or more related words that effectively balances recall and precision, consider using a proximity search. It just might be the right search for the situation.

So, what do you think? Do you use proximity searching to make your searches more effective? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Sanctions Can Happen in Police Brutality Cases Too – eDiscovery Case Law

October 5, 2012

As reported in the Seattle Times, Pierce County (Washington) Superior Court Judge Stephanie Arend issued a $300,000 sanction against King County for failure to produce key documents illustrating the previous troubling behavior of a sheriff's deputy who tackled Christopher Sean Harris and left him permanently brain-damaged. Judge Arend also indicated that the county would be liable for attorneys' fees and possibly compensatory damages for the Harris family. This came after King County had settled with the Harris family for $10 million in January 2011 during a civil trial in King County Superior Court.

After being wrongly identified as a suspect in an earlier bar fight, Harris was tackled and pushed into a wall by Deputy Matthew Paul in Seattle's Belltown neighborhood in May 2009 and left brain-damaged, paralyzed and unable to speak. After reaching a settlement during the civil trial, Harris' attorneys claimed the Sheriff's Office and county withheld emails and other documents that outlined internal concerns about unnecessary or excessive force used by Paul in other incidents. They filed a motion at the end of last year asking Arend to sanction the county and order it to pay an additional $3.3 million.

Documents alleged by Harris’ attorneys to have been intentionally withheld by the King County Sheriff’s office include:

A thread of emails to Paul's supervisor about his behavior at the Basic Law Enforcement Academy, noting that Paul had "exhibited behaviors that were a concern" and had used force that was "far above the norm" when working with a smaller female trainee. While the county indicated that a search failed to locate these emails, Judge Arend, in the ruling, noted that "any competent electronic discovery effort would have located this email."
A citizen complaint against Paul in May 2010, filed after a Seattle resident who stopped to videotape Paul and other deputies dealing with an intoxicated person was tackled by Paul and suffered a broken nose. The resident has filed a federal civil-rights lawsuit against Paul and the county.
Documents about another use-of-force incident that were not put into Paul’s personnel file until after the Harris case was settled.

"This reckless indifference in its failure to produce these three documents — documents that were indisputably relevant — is the functional equivalent of intentional misconduct," Judge Arend noted, calling the county’s failure to produce these documents as “reprehensible”.

Because the family would have filed a civil-rights lawsuit if they had known about these other instances, Judge Arend said she will decide about further damages after a hearing for Harris' attorneys to attempt to show that they would have prevailed in a civil-rights case with the additional documents.

Amazingly, Paul remains on the force.

So, what do you think? Was the sanction severe enough? Please share any comments you might have or if you’d like to know more about a particular topic.


EDBP.com, A Lawyer Centric Work Flow Model for eDiscovery – eDiscovery Best Practices

October 4, 2012

Take a closer look – that’s not the EDRM model you see above. It’s the new EDBP model.

EDBP stands for Electronic Discovery Best Practices and is the brainchild of Ralph Losey, whose e-Discovery Team® blog is one of the must-read blogs (and one of the most in-depth) in the industry. Ralph is also National e-Discovery Counsel with the law firm of Jackson Lewis, LLP, an Adjunct Professor at the University of Florida College of Law teaching eDiscovery and advanced eDiscovery and has also previously been a thought leader interviewee on this blog. Other than all that, he’s not very busy.

As Ralph describes on his blog, “EDBP is a new reference of legal best practices for practicing attorneys and paralegals. It is also an open project where other specialists in the field are invited to make contributions.” He also notes that “The ten-step diagram…serves as the basic structure of the tasks performed by attorneys in electronic discovery practice. This structure may also change with time to keep up with evolving attorney practices.”

According to the EDBP site (ironically at EDBP.com), the stated mission is as follows:

“The purpose of EDBP is to provide a model of best practices for use by law firms and corporate law departments. EDBP is designed to be an educational resource for all lawyers striving to stay current with the latest thinking on excellence in legal services in electronic discovery law.”

Other notable aspects about EDBP:

It’s lawyer-centric, designed to address legal services, not the work of vendors. As a result, it’s different in scope from EDRM, which covers non-legal service activities as well. “The EDBP chart will focus solely on legal practice and legal services. It will be by and for lawyers only and the paralegals who assist their legal services”.
It does not address minimum standards for legal services, but instead “embodies an evolving understanding of excellence in legal services”. In other words, if it were a final exam, you’re expected to ace the exam, not just get a passing grade.

The EDBP site also provides linked detailed write ups of each of the color coded sections, entitled Pre-Suit (gray), Preservation (blue), Cooperation (red), C.A.R. (green), Productions (yellow) and Evidence (turquoise?). The sections include links to resources of information, such as The Sedona Conference® (including flowcharts) and case cites, as well as references to Federal Rules.

On his blog, Losey says “I am writing the beginning statements of best practices (about half-way through) and will serve as the first editor and gate-keeper for future contributions from others.” The site also provides a place to provide your email address to subscribe to updates and a comments section to leave a comment for suggestions on how to improve EDBP. It will be interesting to see how this site evolves – it promises to be an invaluable resource for eDiscovery best practices for lawyers and other legal services personnel.

So, what do you think? Do you think EDBP will be a useful resource? Please share any comments you might have or if you’d like to know more about a particular topic.


Twitter Turns Over Tweets in People v. Harris – eDiscovery Case Law

October 3, 2012

As reported by Reuters, Twitter has turned over Tweets and Twitter account user information for Malcolm Harris in People v. Harris, after their motion for a stay of enforcement was denied by the Appellate Division, First Department in New York and they faced a finding of contempt for not turning over the information. Twitter surrendered an “inch-high stack of paper inside a mailing envelope” to Manhattan Criminal Court Judge Matthew Sciarrino, which will remain under seal while a request for a stay by Harris is heard in a higher court.

Back in April, Harris, an Occupy Wall Street activist facing criminal charges, tried to quash a subpoena seeking production of his Tweets and his Twitter account user information in his New York criminal case. That request was rejected, so Twitter then sought to quash the subpoena themselves, claiming that the order to produce the information imposed an “undue burden” on Twitter and even forced it to “violate federal law”.

Then, on June 30, Judge Sciarrino ruled that Twitter must produce tweets and user information of Harris, noting: “If you post a tweet, just like if you scream it out the window, there is no reasonable expectation of privacy. There is no proprietary interest in your tweets, which you have now gifted to the world. This is not the same as a private email, a private direct message, a private chat, or any of the other readily available ways to have a private conversation via the internet that now exist…Those private dialogues would require a warrant based on probable cause in order to access the relevant information.” Judge Sciarrino indicated that his decision was “partially based on Twitter’s then terms of service agreement”, which was subsequently modified to add the statement “You Retain Your Right To Any Content You Submit, Post Or Display On Or Through The Service.”

Twitter filed an appeal of the trial court’s decision with the Appellate Division, First Department in New York, but, unfortunately for Twitter, it didn’t take long for the appellate court panel to rule, as they denied Twitter’s motion for a stay of enforcement of the Trial Court’s order to produce Malcolm Harris’s tweets. During a hearing on the District Attorney’s motion (for Twitter to show cause as to why they should not be held in contempt for failure to produce the tweets), the Trial Court ultimately gave Twitter a deadline to produce Harris’s information by Friday, September 14 or face a finding of contempt. Judge Sciarrino even went so far as to warn Twitter that he would review their most recent quarterly financial statements in determining the appropriate financial penalty if Twitter did not obey the order. Now they have, though the information has been kept under seal (at least for now).

As the Reuters article notes, “The case has drawn interest from privacy advocates, including the Electronic Frontier Foundation (EFF) and the American Civil Liberties Union (ACLU), which have filed an amicus brief in support of Twitter’s appeal. They are concerned the ruling could set a precedent putting the onus on social media companies to try to protect their users from criminal prosecution.”

So, what do you think? Will the stay be denied or will the information remain under seal? Please share any comments you might have or if you’d like to know more about a particular topic.


Don’t Be “Duped”, Files with Different HASH Values Can Still Be the Same – eDiscovery Best Practices

October 2, 2012

A couple of months ago, we published a post discussing how the number of pages in each gigabyte can vary widely. To illustrate the concept, we took one of our blog posts and saved it in several different file formats, showing how each file had the same content, yet was a different size. That’s not the only concept that example illustrates.

Content is Often Republished

How many of you have ever printed or saved a file to Adobe Acrobat PDF format? Personally, I do it all the time. For example, I “publish” marketing slicks created in Microsoft® Publisher, “publish” finalized client proposals created in Microsoft Word and “publish” presentations created in Microsoft PowerPoint to PDF format regularly. Microsoft now even includes Adobe PDF as one of the standard file formats to which you can save a file, and I have a free PDF print driver on my laptop, so I can conceivably create a PDF file for just about anything that I can print. In each case, I’m duplicating the content of the file, but in a different file format designed for publishing that content.

Another way content is republished is via the ubiquitous “copy and paste” capability that is used by so many to duplicate content to another file. Whether copying part or all of the content, “copy and paste” functionality is essentially available in just about every application to be able to duplicate content from one application to the next or even one file to the next in the same application.

Same Content, Different HASH

When publishing a file to PDF or copying the entire contents of a file to a new file, the contents of the file may be the same, but the HASH value, which is a digital fingerprint that reflects the contents and format of the file, will be different. So, a Word file and the PDF file published from the Word file may contain the same content, but the HASH value will be different. Even copying the content from one file to another in the same software program can result in different HASH values, or even different file sizes. For example, I copied the entire contents of yesterday’s blog post, written in Word, into a brand new Word file. Not only did they have different HASH values, but they were different sizes – the copied file was 8K smaller than the original. So, these files, while identical in content, won’t be considered “duplicates” based on HASH value and won’t be “de-duped” out of the collection. Instead, they are treated as “near-dupes” for analysis purposes, even though the content is essentially identical.
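A short Python sketch shows why this happens. It is an illustration only: the post does not say which hash algorithm is used, so SHA-256 is an assumption here, and the two byte strings are made-up stand-ins for real Word and PDF files, which differ far more than this.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Hex SHA-256 digest of a file's raw bytes (algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def text_hash(text: str) -> str:
    """Hash of the extracted text only, ignoring case and whitespace."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


content = "Proximity searches balance recall and precision."

# Two hypothetical "files" holding the same text in different containers.
file_a = b"<doc version=1>" + content.encode("utf-8") + b"</doc>"
file_b = content.encode("utf-16")

# The raw bytes differ, so the file-level hashes differ.
print(file_hash(file_a) == file_hash(file_b))  # False

# Hashing the text extracted from each file gives matching values.
text_from_a = content
text_from_b = "  Proximity searches balance recall   and precision.\n"
print(text_hash(text_from_a) == text_hash(text_from_b))  # True
```

A file-level hash changes if even one byte of the file changes, which is why the same words stored in two formats never de-duplicate on hash alone. Comparing the extracted text is one simple way to catch exact-content matches across formats.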

What to Do with the Near-Dupes?

Identifying and culling these essentially identical near-dupes isn’t necessary in every case, but if it is, you’ll need to perform a process that groups similar documents together so that those near-dupes can be identified and addressed. We call that “clustering”. For more on the benefits of clustering, check out this blog post.
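As a rough illustration of grouping similar documents, the Python sketch below compares documents by their overlapping three-word sequences ("shingles") and groups those above a similarity threshold. This is a swapped-in textbook technique (Jaccard similarity with greedy grouping), not the clustering method of any particular product; the function names, file names, sample text and the 0.8 threshold are all assumptions.

```python
import re


def shingles(text, size=3):
    """Set of overlapping word sequences ("shingles") of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a), shingles(text_b)
    return len(a & b) / len(a | b)


def group_near_dupes(docs, threshold=0.8):
    """Greedy grouping: each document joins the first group whose first
    member it resembles closely enough; otherwise it starts a new group."""
    groups = []
    for name, text in docs.items():
        for group in groups:
            if similarity(text, docs[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical extracted text from three files.
docs = {
    "proposal.docx": "The client agrees to pay the consultant a fixed fee "
                     "for the services described in this proposal.",
    "proposal.pdf": "THE CLIENT AGREES TO PAY THE CONSULTANT A FIXED FEE\n"
                    "for the services described in this proposal.",
    "memo.docx": "Lunch is at noon on Friday in the main conference room.",
}

print(group_near_dupes(docs))
# [['proposal.docx', 'proposal.pdf'], ['memo.docx']]
```

Lowering the threshold groups documents that differ by more than formatting, such as successive drafts, at the cost of more loosely related groupings.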

So, what do you think? What do you do with “dupes” that have different HASH values? Please share any comments you might have or if you’d like to know more about a particular topic.


Home Depot’s “Extremely Broad” Request for Social Media Posts Denied – eDiscovery Case Law

October 1, 2012

In Mailhoit v. Home Depot, CV 11 03892 DOC (SSx) (C.D. Cal.; Sept. 7, 2012), Magistrate Judge Suzanne Segal ruled that three of the defendant’s four discovery requests failed Federal Rule 34(b)(1)(A)’s “reasonable particularity” requirement and were therefore not reasonably calculated to lead to the discovery of admissible evidence; those three requests were denied.

Case Background

The plaintiff had been a manager of the defendant's store in Burbank, California, and filed a suit against her employer after being fired, charging unlawful discrimination based on gender, as well as failure to accommodate her known physical disability. The plaintiff testified at her deposition that she suffers from post traumatic stress disorder, depression and isolation, and has cut herself off from communication with friends because of Defendant’s alleged wrongdoing. The defendant argued “that it is entitled to Plaintiff’s communications posted on social networking sites (“SNS”) such as Facebook and LinkedIn to test Plaintiff’s claims about her mental and emotional state.”

Defendant’s Motion to Compel

The defendant filed a Motion to Compel Further Responses to Defendant’s Request for Production of Documents, which included a request for (among other things):

“Any profiles, postings or messages (including status updates, wall comments, causes joined, groups joined, activity streams, blog entries) from social networking sites from October 2005 (the approximate date Plaintiff claims she first was discriminated against by Home Depot), through the present, that reveal, refer, or relate to any emotion, feeling, or mental state of Plaintiff, as well as communications by or from Plaintiff that reveal, refer, or relate to events that could reasonably be expected to produce a significant emotion, feeling, or mental state”.

The defendant also requested “[t]hird-party communications to Plaintiff that place her own communications in context”, “[a]ll social networking communications between Plaintiff and any current or former Home Depot employees” and any pictures posted to the plaintiff’s profile or otherwise linked via tagging.

Judge Rules against Defendant in Three of Four Categories

Judge Segal noted that “while a party may conduct discovery concerning another party’s emotional state, the discovery itself must still comply with the general principles underlying the Federal Rules of Civil Procedure that govern discovery. A court can limit discovery if it determines, among other things, that the discovery is…unreasonably cumulative or duplicative”. Since Rule 34(b) requires the requesting party to describe the items to be produced with “reasonable particularity”, Judge Segal ruled that “three of the four categories of SNS communications sought by Defendant fail Rule 34(b)(1)(A)’s ‘reasonable particularity’ requirement”, only granting the defendant’s request for social networking communications between Plaintiff and any current or former Home Depot employees.

So, what do you think? Should the defendant’s requests have been denied, or were they “unreasonably cumulative”? Please share any comments you might have or if you’d like to know more about a particular topic.

Thanks to the Ride the Lightning blog for the tip on this case!


Inadvertent Disclosure? Got Clawback? – eDiscovery Best Practices

September 28, 2012

As discovery becomes more complex and voluminous, inadvertent disclosures of privileged documents seem to be becoming more common. In just the past couple of months, we’ve discussed two cases on this blog where the producing parties were forced to waive privilege on those documents when they failed the now popular five-factor test used to determine whether an inadvertent disclosure entitles the producing party to have the documents returned. Perhaps if they had a well-defined “clawback” agreement, the results would have been different?

What is a “Clawback” Agreement?

A clawback agreement enables the parties in a case to agree – in advance – that if privileged documents are inadvertently produced during discovery, privilege on those documents won’t be waived. The inadvertently produced documents are instead returned to the producing party, or destroyed by the receiving party – either way, they are not used by the receiving party. As part of that agreement, each party is able to identify documents that it has inadvertently produced and request that they be returned or destroyed by the opposing party. Per the clawback agreement, the opposing party agrees to comply with that request and not make a claim of waiver.

Protection from Waiver under FRE 502

Federal Rule of Evidence 502 (FRE 502) was enacted in 2008 to provide additional reassurances for parties dealing with the problem of inadvertent waiver. Under FRE 502, inadvertent disclosure of privileged material does not operate as a waiver if three conditions are met:

The disclosure is inadvertent
The holder of the privilege took reasonable steps to prevent disclosure
The holder promptly took reasonable steps to rectify the error

What a Clawback Agreement Should Include

To promote compliance with the three conditions, the clawback agreement should address each one, clearly stating that inadvertent production of privileged material will not waive privilege, and addressing the steps that should be taken to avoid inadvertent disclosure, as well as to rectify the error, if such inadvertent disclosure occurs. The definition of “reasonable steps” in each case is in the eye of the beholder, so it’s a good idea to establish and agree on specific steps, if possible. The clawback agreement should also clearly define the procedure to be followed if assertion of privilege is disputed.

Because of FRE 502, if a clawback agreement is incorporated into a protective order entered by the court early in the case, it ensures court approval of the process in case there are disagreements and is binding on the parties, including third parties. If you can’t agree on the terms of the clawback agreement with the opposing party early in the case and establishing the protection that it affords is important, you may need to file a motion with the court to get a clawback order entered.

So, what do you think? Do your cases typically include a court-filed clawback agreement? Have those agreements ever been used to protect inadvertently disclosed privileged information? Please share any comments you might have or if you’d like to know more about a particular topic.


Plaintiffs Should Pay for Extensive Discovery Prior to Class Certification – eDiscovery Case Law

September 27, 2012

Have you ever joined a health club, then later tried to cancel your membership? Did the health club make it easy to do so? If not, then this case is for you.

In Boeynaems v. LA Fitness International, LLC, No. 10-2326, 2012 U.S. Dist. (E.D. Pa. Aug. 16, 2012), Pennsylvania District Judge Michael Baylson held that “where (1) class certification is pending and (2) the plaintiffs have asked for very extensive discovery, compliance with which will be very expensive, that absent compelling equitable circumstances to the contrary, the plaintiffs should pay for the discovery they seek . . . . Where the burden of discovery expense is almost entirely on the defendant, principally because the plaintiffs seek class certification, then the plaintiffs should share the costs.”

This case emerged from two separate claims filed by plaintiffs who claimed they “encountered deception and breaches concerning their desire to terminate their membership” with the national gym chain LA Fitness. The two cases were consolidated, and the plaintiffs were seeking class certification so that other plaintiffs could join the suit.

After recounting the discovery history between the parties, Judge Baylson noted that this case arose because of an unresolved dispute, including who should bear the cost of continued discovery. To produce ESI requested by the plaintiffs, LA Fitness approximated it would cost the company hundreds of thousands of additional dollars. LA Fitness had already incurred expenses for discovery tasks previously undertaken, including the review of thousands of e-mails, review of “(1) over 500,000 Member Notes from five states for 30 months looking for certain terms, (2) over 1,000 boxes of cancellation requests, of which Plaintiffs reviewed only 70 boxes, (3) over 19,000 pages of documents, and (4) an electronic search of over 32,000 e-mails, maintained by five custodians.” Moreover, LA Fitness asserted that its review of “a sampling of these Member Notes has exhibited only an extremely small proportion with any evidence probative of Plaintiffs’ claims.”

Judge Baylson pointed out that as a result of the extensive review already undertaken by LA Fitness, the plaintiffs had “already amassed, mostly at Defendant’s expense, a very large set of documents that may be probative as to the class action issue.” In fairness, Judge Baylson concluded that the costs should now shift to the plaintiffs: “In other words, given the large amount of information defendant has already provided, plaintiffs need to assess the value of additional discovery for their class action motion. If plaintiffs conclude that additional discovery is not only relevant, but important to proving that a class should be certified, then plaintiffs should pay for that additional discovery from this date forward, at least until the class action determination is made.” Also, “if the plaintiffs have confidence in their contention that the court should certify the class, then the plaintiffs should have no objection to making an investment.” Moreover, Judge Baylson noted that the plaintiffs’ counsel could afford the investment, as the plaintiffs were represented by “the very successful and well-regarded Philadelphia firm of Berger & Montague. . . . If the Berger & Montague firm believes that this case is meritorious, it has the financial ability to make the investment in discovery.”

Therefore, for production of any requested documents going forward, the plaintiffs were found to have the responsibility for bearing the costs.

So, what do you think? Should the plaintiffs pay for additional discovery? Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required). For eDiscovery news and best practices, check out the Applied Discovery Blog here.


When is a Billion Dollars Not Enough? – eDiscovery Case Law

September 26, 2012

When it’s Apple v. Samsung, of course!

According to the Huffington Post, Apple Inc. requested a court order for a permanent U.S. sales ban on Samsung Electronics products found to have violated its patents, along with additional damages of $707 million on top of the $1.05 billion verdict won by Apple last month, already one of the largest intellectual-property awards on record.

Back in August, a jury of nine found that Samsung infringed all but one of the seven patents at issue and found all seven of Apple's patents valid – despite Samsung's attempts to have them thrown out. They also determined that Apple didn't violate any of the five patents Samsung asserted in the case. Apple had been requesting $2.5 billion in damages. Trial Judge Lucy Koh could still also triple the damage award because the jury determined Samsung had acted willfully.

Interviewed after the trial, some of the jurors cited video testimony from Samsung executives and internal emails as key to the verdict, which was returned after just 22 hours of deliberation, despite the fact that the verdict form contained as many as 700 points for the jury to decide (including charges brought against different subsidiaries of the two companies, addressing multiple patents and numerous products).

Role of Adverse Inference Sanction

As noted on this blog last month, Samsung received an adverse inference instruction from California Magistrate Judge Paul S. Grewal just prior to the start of trial. Samsung’s failure to turn “off” the auto-delete function in its proprietary “mySingle” email system resulted in spoliation of evidence, as potentially responsive emails were deleted after the duty to preserve began. As a result, Judge Grewal ordered instructions to the jury to indicate that Samsung had failed to preserve evidence and that the evidence could be presumed relevant and favorable to Apple. However, Judge Lucy Koh decided to modify the “adverse inference” instruction issued for the jury to include instructions that Apple had also failed to preserve evidence. Therefore, it appears as though the adverse inference instruction was neutralized and did not have a significant impact on the verdict; evidently, enough damning evidence was discovered to doom Samsung in this case.

Friday's Filings

In a motion filed on Friday, Apple sought approximately $400 million in additional damages for design infringement by Samsung; approximately $135 million for willful infringement of its utility patents; approximately $121 million in supplemental damages based on Samsung's product sales not covered in the jury's deliberation; and approximately $50 million of prejudgment interest on damages through December 31, for a total of $707 million requested. Apple also requested an injunction to cover "any of the infringing products or any other product with a feature or features not more than colorably different from any of the infringing feature or features in any of the Infringing Products."

Not surprisingly, Samsung submitted a filing on Friday, requesting a new trial “enabling adequate time and even-handed treatment of the parties”, stating “The Court's constraints on trial time, witnesses and exhibits were unprecedented for a patent case of this complexity and magnitude, and prevented Samsung from presenting a full and fair case in response to Apple's many claims.”

So, what do you think? Will Apple get more money? Will Samsung get a new trial? If so, will there be more discovery sanctions? Please share any comments you might have or if you’d like to know more about a particular topic.

Proper Wildcard Searching: Why You Should Give a Dam* – eDiscovery Best Practices

September 25, 2012

When we launched eDiscoveryDaily over two years ago, I relayed a story where I provided search strategy assistance to a client that had already agreed upon several searches with opposing counsel. One search related to mining activities, so the attorney decided to use a wildcard of “min*” to retrieve variations like “mine”, “mines” and “mining”. That one search retrieved over 300,000 files with hits.

Why? Because there are 269 words in the English language that begin with the letters “min”. Words like “mind”, “mingle”, “minimal”, “minuscule” and “minutia” were all being retrieved in this search for files related to “mining”. We ultimately had to go back to opposing counsel and negotiate a revised search that was more appropriate.

Recently, I encountered another client, who was trying to use “dam*” to retrieve variations of “damage” and “damages”. Unfortunately, they also retrieved “dame”, “damp” and, well, “damn”. There are 86 total words in the English language that begin with the letters “dam”. Darn it!

Methods to Retrieve the Correct Wildcard Variations

In that blog post, I talked about the benefits of stem searching (if your application’s search engine supports stem searches) to capture the specific variations of a word (like “mine” or “damage”) and Morewords.com, which shows a list of words that begin with your search string. For example, to get all 269 words beginning with “min”, go here. Substitute any characters for “min” in the URL to see the words that start with those characters. Choose the variations you want and incorporate them into the search instead of the wildcard; i.e., use “(mine or mines or mining)” instead of “min*” to retrieve a more precise result set without sacrificing recall. Personally, I almost never use wildcards. I prefer to identify the variations and just use them, because it’s more precise.
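
To make the difference concrete, here is a minimal Python sketch of the idea above. It is only an illustration: the small vocabulary and the helper names (expand_wildcard, build_or_query) are hypothetical, and a real review platform will have its own query syntax. It shows what a trailing wildcard would hit in a word list, and how the chosen variations can be turned into an explicit OR search instead.

```python
# Hypothetical vocabulary standing in for the distinct words in a collection.
VOCABULARY = ["mine", "mines", "mining", "mind", "mingle", "minimal",
              "minutia", "damage", "damages", "dame", "damp"]


def expand_wildcard(prefix, vocabulary):
    """Return every word that a trailing wildcard such as prefix* would hit."""
    prefix = prefix.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(prefix))


def build_or_query(terms):
    """Join the chosen variations into an explicit OR search string."""
    return "(" + " or ".join(terms) + ")"


# Everything "min*" would retrieve from this (assumed) vocabulary.
wildcard_hits = expand_wildcard("min", VOCABULARY)

# Keep only the variations that actually relate to mining.
wanted = [w for w in wildcard_hits if w in {"mine", "mines", "mining"}]
query = build_or_query(wanted)

print(wildcard_hits)
print(query)  # (mine or mines or mining)
```

In this sketch the wildcard also pulls in “mind”, “mingle”, “minimal” and “minutia”, while the explicit query contains only the three terms selected.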

Introducing Spelling Variations into the Mix

The above approaches assume that words are spelled correctly in the collection – if they are not, those misspellings won’t be retrieved. Misspellings can include Optical Character Recognition (OCR) errors, where the OCR application fails to render all words read from an image file with 100% accuracy (this is common, especially when the resolution of the image is less than optimal). So, you can get “words” in the collection such as “min1ng” or “MININ6”.

To combat this, you’ll need to identify the variations of the terms you wish to use, then you can use a search tool like CloudNine Discovery’s Early Case Assessment application (FirstPass®, powered by Venio FPR™) that supports "fuzzy" searching, which is a mechanism for finding alternate words that are close in spelling to the word you're looking for (usually one or two characters off). FirstPass will display all of the words in the collection that are close to the word you’re looking for, so if you’re looking for “mining”, you can find variations such as “min1ng”, “MININ6” or even “minig”, any of which could be relevant. Then, simply select the variations you wish to include in the search. You’ll need to repeat this for each of the variations of the terms you wish to use, but it will enable you to pick up those misspellings and OCR errors to ensure completeness.
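
For readers curious how “one or two characters off” can be measured, here is a rough Python sketch using edit (Levenshtein) distance. This is an assumption made for illustration only: it is not a description of how FirstPass or any other product implements fuzzy searching, and the word list and function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute (or match)
        previous = current
    return previous[-1]


def fuzzy_candidates(term, vocabulary, max_distance=2):
    """List words within max_distance edits of term (case-insensitive), excluding exact matches."""
    term = term.lower()
    hits = []
    for word in vocabulary:
        distance = edit_distance(term, word.lower())
        if 0 < distance <= max_distance:
            hits.append((word, distance))
    return sorted(hits, key=lambda pair: (pair[1], pair[0]))


# Hypothetical list of distinct words found in a collection, including OCR errors.
COLLECTION_WORDS = ["mining", "min1ng", "MININ6", "minig", "mingle", "minimal", "damage"]

candidates = fuzzy_candidates("mining", COLLECTION_WORDS)
print(candidates)  # [('MININ6', 1), ('min1ng', 1), ('minig', 1)]
```

The three OCR-style variations are each one edit away from “mining”, while “mingle” and “minimal” are three edits away and fall outside the threshold. A reviewer would still choose which candidates to include, just as described above.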

So, what do you think? Do you use wildcards in your searches? Are you sure you’re getting just the terms you want? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscoveryDaily