
Don’t Miss Today’s Webinar – How Automation is Revolutionizing eDiscovery!: eDiscovery Trends

Today is your chance to catch a terrific discussion about automation in eDiscovery and, in particular, an in-depth discussion about technology assisted review (TAR) and whether it lives up to the current hype!

Today, ACEDS will be conducting a webinar panel discussion, titled How Automation is Revolutionizing eDiscovery, sponsored by CloudNine.  Our panel discussion will provide an overview of eDiscovery automation technologies, and we will take a hard look at the technology and definition of TAR and the limitations associated with both.  This time, Mary Mack, Executive Director of ACEDS, will be moderating and I will be one of the panelists, along with Bill Dimm, CEO of Hot Neuron, and Bill Speros, Evidence Consulting Attorney with Speros & Associates, LLC.

The webinar will be conducted at 1:00 pm ET (which is 12:00 pm CT, 11:00 am MT and 10:00 am PT).  Oh, and 5:00 pm GMT (Greenwich Mean Time).  If you’re in any other time zone, you’ll have to figure it out for yourself.  Click on the link here to register.
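
If you do have to figure it out for yourself, the conversion is easy to script.  Here is a minimal sketch using Python's standard zoneinfo module.  The calendar date is an assumption made purely for illustration (the post only says "today"), and the function name is hypothetical.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumed date for illustration only; the post does not state the calendar date.
WEBINAR_START = datetime(2016, 8, 10, 13, 0, tzinfo=ZoneInfo("America/New_York"))


def local_start(tz_name: str) -> str:
    """Return the webinar start as an HH:MM wall-clock time in the given IANA zone."""
    return WEBINAR_START.astimezone(ZoneInfo(tz_name)).strftime("%H:%M")


if __name__ == "__main__":
    for zone in ("America/Chicago", "America/Denver", "America/Los_Angeles", "UTC"):
        print(zone, local_start(zone))
```

Passing any other IANA zone name (for example "Asia/Tokyo") gives the local start time there.  Note that on a summer date, London's local clock runs an hour ahead of GMT.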

If you’re interested in learning about various ways in which automation is being used in eDiscovery and getting a chance to look at the current state of TAR, possible warts and all, I encourage you to sign up and attend.  It should be an enjoyable and educational hour.  Thanks to our friends at ACEDS for presenting today’s webinar!

So, what do you think?  Do you think automation is revolutionizing eDiscovery?  As always, please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

English Court Rules that Respondents Can Use Predictive Coding in Contested Case: eDiscovery Case Law

In Brown v BCA Trading, et al. [2016] EWHC 1464 (Ch), Mr. Registrar Jones ruled that, with “nothing, as yet, to suggest that predictive coding will not be able to identify the documents which would otherwise be identified through, for example, keyword search”, “predictive coding must be the way forward” in this dispute between parties as to whether the Respondents could use predictive coding to respond to eDisclosure requests.

The May 17 order began by noting that “the question whether or not electronic disclosure by the Respondents should be provided, as they ask, using predictive coding or via a more traditional keyword approach instead” was “contested”.  With the “majority of the documents that may be relevant for the purposes of trial…in the hands of the First Respondent”, the order noted that fact is “relevant to take into account when considering the Respondents’ assertion, presented from their own view and on advice received professionally, that they think predictive coding will be the most reasonable and proportionate method of disclosure.”  The cost for predictive coding was estimated “in the region of £132,000” whereas the cost for a key word search approach was estimated to be “at least £250,000” and could “even reach £338,000 on a worst case scenario” (emphasis added).  The order acknowledged that the cost “is relevant and persuasive only to the extent that predictive coding will be effective and achieve the disclosure required.”
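
To put those estimates side by side, here is a small sketch of the arithmetic.  The three figures come from the order as quoted above; the function and constant names are just illustrative, and the percentages are only as good as the estimates themselves.

```python
# Cost estimates quoted in the order (pounds sterling).
PREDICTIVE_CODING = 132_000
KEYWORD_AT_LEAST = 250_000
KEYWORD_WORST_CASE = 338_000


def savings(keyword_cost: int, predictive_cost: int = PREDICTIVE_CODING) -> tuple[int, float]:
    """Return (pounds saved, fraction of the keyword cost saved)."""
    saved = keyword_cost - predictive_cost
    return saved, saved / keyword_cost


if __name__ == "__main__":
    for label, cost in (("at least", KEYWORD_AT_LEAST), ("worst case", KEYWORD_WORST_CASE)):
        saved, fraction = savings(cost)
        print(f"{label}: £{saved:,} saved ({fraction:.1%})")
```

On these numbers, predictive coding comes in at somewhere between roughly 47% and 61% below the keyword approach.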

With that in mind, Mr. Registrar Jones stated the following: “I reach the conclusion based on cost that predictive coding must be the way forward. There is nothing, as yet, to suggest that predictive coding will not be able to identify the documents which would otherwise be identified through, for example, keyword search and, more importantly, with the full cost of employees/agents having to carry out extensive investigations as to whether documents should be disclosed or not. It appears from the information received from the Respondents that predictive coding will be considerably cheaper than key word disclosure.”

The order also referenced the ten factors set out by Master Matthews in the Pyrrho Investments case (the first case in England to approve predictive coding) to help determine that predictive coding was appropriate for that case, with essentially all factors applying to this case as well, except for factor 10 (the parties have agreed on the use of the software, and also how to use it).

So, what do you think?  Do you think parties should always have the right to use predictive coding to support their production efforts absent strong evidence that it is not as effective as other means?  Please share any comments you might have or if you’d like to know more about a particular topic.

For more reading about this case, check out Chris Dale’s post here and Adam Kuhn’s post here.

Don’t forget that tomorrow at 1:00pm ET, ACEDS will be conducting a webinar panel discussion, titled How Automation is Revolutionizing eDiscovery, sponsored by CloudNine.  Our panel discussion will provide an overview of eDiscovery automation technologies, and we will take a hard look at the technology and definition of TAR and the limitations associated with both.  This time, Mary Mack, Executive Director of ACEDS, will be moderating and I will be one of the panelists, along with Bill Dimm, CEO of Hot Neuron, and Bill Speros, Evidence Consulting Attorney with Speros & Associates, LLC.  Click on the link here to register.

Judge Peck Refuses to Order Defendant to Use Technology Assisted Review: eDiscovery Case Law

We’re beginning to see more disputes between parties regarding the use of technology assisted review (TAR) in discovery.  Usually in these disputes, one party wants to use TAR and the other party objects.  In this case, the dispute was a bit different…

In Hyles v. New York City, No. 10 Civ. 3119 (AT)(AJP) (S.D.N.Y. Aug. 1, 2016), New York Magistrate Judge Andrew J. Peck, indicating that the key issue before the court in the discovery dispute between parties was whether (at the plaintiff’s request) the defendants can be forced to use technology assisted review, refused to force the defendant to do so, stating “The short answer is a decisive ‘NO.’”

Case Background

In this discrimination case by a former employee of the defendant, after several delays in discovery, the parties had several discovery disputes.  They filed a joint letter with the court, seeking rulings as to the proper scope of ESI discovery (mostly issues as to custodians and date range) and search methodology – whether to use keywords (which the defendants wanted to do) or TAR (which the plaintiff wanted the defendant to do).

With regard to date range, the parties agreed to a start date for discovery of September 1, 2005 but disagreed on the end date.  In the discovery conference held on July 27, 2016, Judge Peck ruled on a date in between those the plaintiff and defendants proposed (April 30, 2010), without prejudice to the plaintiff seeking documents or ESI from a later period, if justified, on a more targeted inquiry basis.  As to custodians, the City agreed to search the files of nine custodians, but not six additional custodians that the plaintiff requested.  The Court ruled that discovery should be staged, starting with the agreed upon nine custodians.  After reviewing the production from the nine custodians, if the plaintiff could demonstrate that other custodians had relevant, unique and proportional ESI, the Court would consider targeted searches from those custodians.

After the parties had initial discussions about the City using keywords, the plaintiff’s counsel consulted an ediscovery vendor and proposed that the defendants should use TAR as a “more cost-effective and efficient method of obtaining ESI from Defendants.”  The defendants declined, both because of cost and concerns that the parties, based on their history of scope negotiations, would not be able to collaborate to develop the seed set for a TAR process.

Judge’s Ruling

Judge Peck noted that “Hyles absolutely is correct that in general, TAR is cheaper, more efficient and superior to keyword searching” and referenced his “seminal” DaSilva Moore decision and also his 2015 Rio Tinto decision where he wrote that “the case law has developed to the point that it is now black letter law that where the producing party wants to utilize TAR for document review, courts will permit it.”  Judge Peck also noted that “Hyles’ counsel is correct that parties should cooperate in discovery”, but stated that “[c]ooperation principles, however, do not give the requesting party, or the Court, the power to force cooperation or to force the responding party to use TAR.”

Judge Peck, while acknowledging that he is “a judicial advocate for the use of TAR in appropriate cases”, noted that he is also “a firm believer in the Sedona Principles, particularly Principle 6, which clearly provides that:

Responding parties are best situated to evaluate the procedures, methodologies, and technologies appropriate for preserving and producing their own electronically stored information.”

Judge Peck went on to state: “Under Sedona Principle 6, the City as the responding party is best situated to decide how to search for and produce ESI responsive to Hyles’ document requests. Hyles’ counsel candidly admitted at the conference that they have no authority to support their request to force the City to use TAR. The City can use the search method of its choice. If Hyles later demonstrates deficiencies in the City’s production, the City may have to re-do its search.  But that is not a basis for Court intervention at this stage of the case.”  As a result, Judge Peck denied the plaintiff’s application to force the defendants to use TAR.

So, what do you think?  Are you surprised by that ruling?  Please share any comments you might have or if you’d like to know more about a particular topic.

Don’t forget that next Wednesday at 1:00pm ET, ACEDS will be conducting a webinar panel discussion, titled How Automation is Revolutionizing eDiscovery, sponsored by CloudNine.  Our panel discussion will provide an overview of eDiscovery automation technologies, and we will take a hard look at the technology and definition of TAR and the limitations associated with both.  This time, Mary Mack, Executive Director of ACEDS, will be moderating and I will be one of the panelists, along with Bill Dimm, CEO of Hot Neuron, and Bill Speros, Evidence Consulting Attorney with Speros & Associates, LLC.  Click on the link here to register.

ACEDS Adds its Weight to the eDiscovery Business Confidence Survey: eDiscovery Trends

We’ve covered two rounds of the quarterly eDiscovery Business Confidence Survey created by Rob Robinson and conducted on his terrific Complex Discovery site (previous results are here and here).  It’s time for the Summer 2016 Survey.  Befitting of the season, the survey has a HOT new affiliation with the Association of Certified eDiscovery Specialists (ACEDS).

As before, the eDiscovery Business Confidence Survey is a non-scientific survey designed to provide insight into the business confidence level of individuals working in the eDiscovery ecosystem. The term ‘business’ represents the economic factors that impact the creation, delivery, and consumption of eDiscovery products and services.  The purpose of the survey is to provide a subjective baseline for understanding the trajectory of the business of eDiscovery through the eyes of industry professionals.

Also as before, the survey asks how you rate general business conditions for eDiscovery in your segment of the eDiscovery market, both now and six months from now, where you think revenue and profits will be for your segment of the market in six months, and which issue you think will most impact the business of eDiscovery over the next six months, among other questions.  It’s a simple nine-question survey that literally takes about a minute to complete.  Who hasn’t got a minute to provide useful information?

Individual answers are kept confidential, with the aggregate results to be published on the ACEDS website (News & Press), on the Complex Discovery blog, and on selected ACEDS Affiliate websites and blogs (we’re one of those and we’ll cover the results as we have for the first two surveys) upon completion of the response period, which started on August 1 and goes through Wednesday, August 31.

What are experts saying about the survey?  Here are a couple of notable quotes:

Mary Mack, Executive Director of ACEDS stated: “The business of eDiscovery is an ever-present and important variable in the equation of legal discovery.  As financial factors are a primary driver in eDiscovery decisions ranging from sourcing and staffing to development and deployment, ACEDS sees value in regularly checking the business pulse of eDiscovery professionals. The eDiscovery Business Confidence Survey provides a tool to help take that pulse on a systematic basis and ACEDS looks forward to sponsoring, participating, and reporting on the results of this salient survey each quarter.”

George Socha, Co-Founder of EDRM and Managing Director of Thought Leadership of BDO stated: “In my experience, the successful conduct of eDiscovery is comprised of a balance of in-depth education, practical execution, and experience-based excellence.  The eDiscovery Business Confidence survey being highlighted by ACEDS is one of many industry surveys that positively contributes to this balance, as it provides a quarterly snapshot into the business of discovery. I highly encourage serious eDiscovery professionals to complete and consider this survey as a key tool for understanding the business challenges and opportunities in our profession.”

The more respondents there are, the more useful the results will be!  What more do you need?  Click here to take the survey yourself.  Don’t forget!

So, what do you think?  Are you confident in the state of business within the eDiscovery industry?  Share your thoughts in the survey and, as always, please share any comments you might have with us or let us know if you’d like to know more about a particular topic.

How Automation is Revolutionizing eDiscovery: eDiscovery Trends

I thought about titling this post “Less Than Half of Automation is Revolutionizing eDiscovery” to keep the streak alive, but (alas) all good streaks must come to an end… :o)

If you missed our panel session last month in New York City at The Masters Conference, you missed a terrific discussion about automation in eDiscovery and, in particular, an in-depth discussion about technology assisted review (TAR) and whether it lives up to the current hype.  Now, you get another chance to check it out, thanks to ACEDS.

Next Wednesday, ACEDS will be conducting a webinar panel discussion, titled How Automation is Revolutionizing eDiscovery, sponsored by CloudNine.  Our panel discussion will provide an overview of eDiscovery automation technologies, and we will take a hard look at the technology and definition of TAR and the limitations associated with both.  This time, Mary Mack, Executive Director of ACEDS, will be moderating and I will be one of the panelists, along with Bill Dimm, CEO of Hot Neuron, and Bill Speros, Evidence Consulting Attorney with Speros & Associates, LLC.

The webinar will be conducted at 1:00 pm ET (which is 12:00 pm CT, 11:00 am MT and 10:00 am PT).  Oh, and 5:00 pm GMT (Greenwich Mean Time).  If you’re in any other time zone, you’ll have to figure it out for yourself.  Click on the link here to register.

If you’re interested in learning about various ways in which automation is being used in eDiscovery and getting a chance to look at the current state of TAR, possible warts and all, I encourage you to sign up and attend.  It should be an enjoyable and educational hour.  Thanks to our friends at ACEDS for conducting the session!

So, what do you think?  Do you think automation is revolutionizing eDiscovery?  As always, please share any comments you might have or if you’d like to know more about a particular topic.

Cooperation in Predictive Coding Exercise Fails to Avoid Disputed Production: eDiscovery Case Law

In Dynamo Holdings v. Commissioner of Internal Revenue, Docket Nos. 2685-11, 8393-12 (U.S. Tax Ct. July 13, 2016), Texas Tax Court Judge Ronald Buch denied the respondent’s Motion to Compel Production of Documents Containing Certain Terms, finding that there is “no question that petitioners satisfied our Rules when they responded using predictive coding”.

Case Background

In this case involving various transfers from one entity to a related entity where the respondent determined that the transfers were disguised gifts to the petitioner’s owners and the petitioners asserted that the transfers were loans, the parties previously disputed the use of predictive coding for this case and, in September 2014 (covered by us here), Judge Buch ruled that “[p]etitioners may use predictive coding in responding to respondent’s discovery request. If, after reviewing the results, respondent believes that the response to the discovery request is incomplete, he may file a motion to compel at that time.”

At the outset of this ruling, Judge Buch noted that “[t]he parties are to be commended for working together to develop a predictive coding protocol from which they worked”.  As indicated by the parties’ joint status reports, the parties agreed to and followed a framework for producing the electronically stored information (ESI) using predictive coding: (1) restoring and processing backup tapes, (2) selecting and reviewing seed sets, (3) establishing and applying the predictive coding algorithm, and (4) reviewing and returning the production set.

While the petitioners were restoring the first backup tape, the respondent requested that the petitioners conduct a Boolean search and provided petitioners with a list of 76 search terms for the petitioners to run against the processed data.  That search yielded over 406,000 documents, from which two 1,000-document samples were drawn and provided to the respondent for review.  After the model was run against the second 1,000 documents, the petitioners’ technical professionals reported that the model was not performing well, so the parties agreed that the petitioners would select an additional 1,000 documents that the algorithm had ranked high for likely relevancy, and the respondent reviewed them as well.  The respondent declined to review one more validation sample of 1,000 documents when the petitioners’ technical professionals explained that the additional review would be unlikely to improve the model.

Ultimately, using the respondent’s selected recall rate of 95 percent, the petitioners ran the algorithm against the 406,000 documents to identify documents to produce (followed by a second algorithm to identify privileged materials) and, between January and March 2016, the petitioners delivered a production set of approximately 180,000 total documents on a portable device for the respondent to review and included a relevancy score for each document – ultimately, the respondent only found 5,796 to be responsive (barely over 3% of the production) and returned the rest.
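
For readers who want to see the numbers behind this paragraph, here is a rough sketch.  The first function computes what is usually called precision from the two figures reported above, taking the respondent's responsiveness calls at face value.  The second shows one simple way a recall target could be turned into a relevancy-score cutoff using a human-reviewed sample; it is an illustration only, not the method the parties or their vendor actually used, and the sample data is invented.

```python
import math

# Figures reported in the post.
PRODUCED = 180_000
FOUND_RESPONSIVE = 5_796


def precision(responsive: int, produced: int) -> float:
    """Share of produced documents that turned out to be responsive."""
    return responsive / produced


def cutoff_for_recall(scored_sample, target_recall: float) -> float:
    """Lowest score to produce so that documents at or above it capture at
    least target_recall of the relevant documents in a reviewed sample.

    scored_sample: iterable of (score, is_relevant) pairs from human review.
    """
    relevant_scores = sorted(
        (score for score, is_relevant in scored_sample if is_relevant),
        reverse=True,
    )
    if not relevant_scores:
        raise ValueError("sample contains no relevant documents")
    # Number of relevant documents that must sit at or above the cutoff.
    needed = max(math.ceil(target_recall * len(relevant_scores)), 1)
    return relevant_scores[needed - 1]


if __name__ == "__main__":
    print(f"precision: {precision(FOUND_RESPONSIVE, PRODUCED):.2%}")
    # Hypothetical reviewed sample, invented purely for illustration.
    sample = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
              (0.4, False), (0.3, True), (0.1, False)]
    print("cutoff for 75% recall:", cutoff_for_recall(sample, 0.75))
```

The general trade-off is that the higher the recall target, the lower the cutoff has to go, and the more non-responsive documents tend to come along for the ride.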

On June 17, 2016, the respondent filed a motion to compel production of the documents identified in the Boolean search that were not produced in the production set (1,353 of 1,645 documents containing those terms they claimed were not produced), asserting that those documents were “highly likely to be relevant.”  Ten days later, the petitioners filed an objection to the respondent’s motion to compel, challenging the respondent’s calculations of documents that were incorrectly produced by noting that only 1,360 documents actually contained those terms, that 440 of them had actually been produced and that many of the remaining documents predated or postdated the relevant time period.  They also argued that the documents were selected by the predictive coding algorithm based on selection criteria set by the respondent.

Judge’s Ruling

Judge Buch noted that “[r]espondent’s motion is predicated on two myths”: 1) the myth that “manual review by humans of large amounts of information is as accurate and complete as possible – perhaps even perfect – and constitutes the gold standard by which all searches should be measured”, and 2) the myth of a perfect response to the respondent’s discovery request, which the Tax Court Rules don’t require.  Judge Buch cited Rio Tinto where Judge Andrew Peck stated:

“One point must be stressed – it is inappropriate to hold TAR [technology assisted review] to a higher standard than keywords or manual review.  Doing so discourages parties from using TAR for fear of spending more in motion practice than the savings from using TAR for review.”

Stating that “[t]here is no question that petitioners satisfied our Rules when they responded using predictive coding”, Judge Buch denied the respondent’s Motion to Compel Production of Documents Containing Certain Terms.

So, what do you think?  If parties agree to the predictive coding process, should they accept the results no matter what?  Please share any comments you might have or if you’d like to know more about a particular topic.

Is Search Still Important in eDiscovery? I Say Yes: eDiscovery Best Practices

With the acceptance of predictive coding and other technology assisted review mechanisms growing over the past few years, some feel that keyword search is no longer important as a “key” (pun intended) component of the eDiscovery process.  In a new article published last week, I discussed why search is so important in eDiscovery and why law firms and e-discovery companies need better search solutions.

In Inside Counsel (3 reasons e-discovery companies needs better search solutions), the author (Amanda Cicatelli) sat down with me and also with Jeff Nace, VP of Product Management at ONE Discovery Inc. to discuss these and other topics regarding search in eDiscovery.  Both CloudNine and ONE Discovery (along with several other eDiscovery providers) are customers of dtSearch, a text retrieval engine which is embedded in many of the eDiscovery software platforms available on the market today.  I have personally used dtSearch with a handful of different eDiscovery platforms and it is ideal for supporting even very large multi-million document, multi-terabyte collections with effective and fast information retrieval.

Let’s face it, with the total amount of data being captured and stored by organizations doubling every 1.2 years, the ability to quickly and effectively search through data stores that are growing exponentially has become more important than ever to meet discovery obligations within reasonable costs.  Effective search solutions help manage and control discovery costs to help litigants stay within reasonable budgets.
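
To get a feel for what “doubling every 1.2 years” means, here is a quick sketch of the compounding.  It takes the 1.2-year doubling figure quoted above as given and assumes the rate stays constant, which is a simplification; the names are illustrative.

```python
DOUBLING_PERIOD_YEARS = 1.2  # figure quoted in the post


def growth_factor(years: float, doubling_period: float = DOUBLING_PERIOD_YEARS) -> float:
    """How many times larger the data store is after the given number of years,
    assuming it keeps doubling at a constant rate."""
    return 2 ** (years / doubling_period)


if __name__ == "__main__":
    for years in (1.2, 3, 5, 6):
        print(f"after {years} years: {growth_factor(years):.1f}x")
```

At that pace, a collection grows about 18-fold in five years, which is why search performance that is acceptable today may not stay acceptable for long.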

In the article, Nace and I talk about the reasons that companies need effective search solutions, the recurring problems with eDiscovery and ways to address the issues that companies face when trying to manage the growing sizes and sources of electronically stored information (ESI) out there.  Thanks to Inside Counsel and Amanda Cicatelli for the opportunity to discuss the state of eDiscovery searching today and thanks as well to the folks at dtSearch for coordinating the interview!

You can check out the article here.

So, what do you think?  How effectively does your eDiscovery platform search through large collections of ESI?  Please share any comments you might have or if you’d like to know more about a particular topic.

Court Settles Dispute Between Parties on Number of Custodians to Search and Produce: eDiscovery Case Law

In Family Wireless #1, LLC et al. v. Automotive Technologies, Inc., No. 15-01310 (D. Conn., May 19, 2016), Connecticut Magistrate Judge Sarah A. L. Merriam partially granted the plaintiff’s motion to compel the defendant to search and produce ESI from additional custodians, finding that “three of the six proposed custodians’ files are likely to include information relevant to this matter, and defendant has not met its burden of showing that inclusion of these three individuals would be unduly burdensome”.

Case Background

In this action for breach of contract, misrepresentation, unjust enrichment, and unfair trade practices between a franchisor and its franchisee, the parties met and conferred multiple times over the course of this litigation in an effort to come to a mutually agreeable list of ESI search terms and custodians.  The parties agreed to the search of the electronic files of seven custodians, but failed to agree on six additional custodians, leading to the plaintiff’s motion to compel.

The plaintiffs requested the inclusion of six additional custodians in the ESI search, arguing that, even though they were lower level employees, they “are believed to have been involved in both decision making and day to day operations relevant to the claims and defenses raised in the litigation”.  The defendants argued that a search of the emails of these individuals was duplicative and would not produce any relevant information that has not already been exchanged and that searching the files of the additional custodians would be overly burdensome, resulting in tens of thousands of additional documents and hours of costly review, partly based on a test search of two of the proposed custodians that “captured 51,583 e-mail family hits” to be reviewed for relevance.

Judge’s Ruling

Judge Merriam stated that she was “not persuaded that the addition of the six proposed custodians would be unduly burdensome for defendant. As defendant acknowledged during the conference, limitations on search parameters can be implemented so as to exclude the production of duplicative emails, addressing the concern that this production would consist of many emails that had been previously produced through the prior searches of the higher-level custodians. Using ‘de-duplication’ measures to limit the search should alleviate some of the cost and time concerns that defendant raises.”  Judge Merriam also was not swayed by the defendant’s arguments regarding relevance, indicating that “[t]he mere fact that many documents have already been produced is not sufficient to establish that there are no other relevant materials to be found.”
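
For anyone unfamiliar with how de-duplication works in principle, here is a minimal sketch.  It hashes a normalized version of each email and keeps only the first copy seen.  The field names (sender, subject, body) and the normalization are assumptions for illustration; actual eDiscovery platforms typically hash more fields and handle attachments and email families, and nothing here reflects the specific measures discussed in this case.

```python
import hashlib


def fingerprint(email: dict) -> str:
    """SHA-256 over lower-cased, whitespace-collapsed sender, subject and body."""
    parts = (email.get("sender", ""), email.get("subject", ""), email.get("body", ""))
    normalized = "\n".join(" ".join(part.lower().split()) for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(emails: list[dict]) -> list[dict]:
    """Keep the first occurrence of each distinct email, preserving order."""
    seen: set[str] = set()
    unique = []
    for email in emails:
        key = fingerprint(email)
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique


if __name__ == "__main__":
    # Hypothetical messages: the second is a copy found in another custodian's mailbox.
    collection = [
        {"custodian": "A", "sender": "pat@example.com", "subject": "Q3 plan", "body": "See attached."},
        {"custodian": "B", "sender": "pat@example.com", "subject": "Q3 plan", "body": "See  attached."},
        {"custodian": "B", "sender": "lee@example.com", "subject": "Re: Q3 plan", "body": "Thanks."},
    ]
    print(len(deduplicate(collection)), "unique of", len(collection))
```

Because the custodian is deliberately left out of the fingerprint, a message already collected from a higher-level custodian would not need to be reviewed again when it turns up in a lower-level custodian's files.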

However, while Judge Merriam found “that plaintiffs have established good cause for expanding the ESI search to include three additional custodians”, she found that “no showing of good cause has been made by plaintiffs to search the ESI of the other three proposed custodians” during the in-person Discovery Conference the Court held to discuss the issues.  Therefore, Judge Merriam partially granted the plaintiff’s motion to compel the defendant to search and produce ESI from three additional custodians.

So, what do you think?  Should the court have ordered production from all six custodians?  Or was a partial production appropriate?  Please share any comments you might have or if you’d like to know more about a particular topic.

Can You Figure Out How I Wrote this Blog Post?: Best of eDiscovery Daily

Even those of us at eDiscovery Daily have to take an occasional vacation (which, as you can see by the picture above, means taking the kids to their favorite water park); however, instead of “going dark” for a few days, we thought we would take a look back at some topics that we’ve covered in the past.  Today’s post takes a look back at a little experiment I performed (which was two phones ago for me, by the way).  Enjoy!


I have to be honest, this blog post contains quite a bit of content from one of the early posts from this blog.  However, there is something different about this version of the content – it looks a bit unusual.  Can you figure out how I wrote it?  See if you can figure it out before you get to the bottom.  I promise I haven’t lost my mind.

Types of exceptions file

It’s important to note that efforts to quote fix quote these files will often change the files parentheses and the meta data associated with them parentheses, so it’s important to establish with opposing counsel what measures to address the exceptions are acceptable. Some files may not be recoverable and you need to agree up front how far to go to attempt to recover them.

  • Corrupted files colon files can become corrupted 4 a variety of reasons, from application failures 2 system crashes to computer viruses. I recently had a case where 40 percent of the collection what’s contained in to corrupt Outlook PST file dash fortunately, we were able to repair those files and recover the messages. If you have read Lee accessible backups of the files, try to restore them from backup. If not, you will need to try using a repair utility. Outlook comes with a utility called scan PST. Exe that scans and repairs PST and OST file, and there are utilities parenthesis including freeware utilities parenthesis available via the web foremost file types. If all else fails, you can hire a-data recovery expert, but that can get very expensive.
  • Password protected files colon most collections usually contain at least some password protected files. Files can require a password to enable them to be edited, or even just to view them. As the most popular publication format, PDF files are often password protected from editing, but they can still be feud 2 support review parenthesis though some search engines May fail to index them parenthesis. If a file is password protected, you can try to obtain the password from the custodian providing the file dash if the custodian is unavailable or unable to remember the password, you can try a password cracking application, which will run through a series of character combinations to attempt to find the password. Be patient, it takes time, and doesn’t always succeed.
  • Unsupported file types corn in most collections, there are some unusual file types that art supported by the review application, such as file for legacy or specialized applications parenthesis E. G. AutoCAD for engineering drawing parenthesis. You may not even initially no what type of files they are semi colon if not, you can find out based on file extension by looking the file extension up in file ext. If your review application can’t read the file, it also can’t index the files for searching or display them 4 review. If those file maybe responses 2 discovery requests, review them with the natives application to determine they’re relevancy.
  • No dash text file colon files with no searchable text aren’t really exceptions dash they have to be accounted for, but they won’t be retrieved in searches, so it’s important to make sure they don’t quote slip through the cracks unquote. It’s common to perform optical character recognition parenthesis Boosie are parenthesis on Tiff files and image only PDF files, because they are common document 4 minutes. Other types of no text files, such as pictures in JTAG or PNG format, are usually not oser, unless there is an expectation that they will have significant text.

Did you figure it out?  I “dictated” the above content using speech-to-text on my phone, a Samsung Galaxy 3 (yes, that was three years and four versions ago; I will have to update the “experiment” soon to see if the speech-to-text is any better now on my Apple iPhone 6).  I duplicated the formatting from the earlier post, but left the text the way that the phone “heard” it.  Some of the choices it made were interesting: it understands “period” and “comma” as punctuation, but not “colon”, “quote” or “parenthesis”.  Words like “viewed” became “feud”, “readily” became “read Lee” and “OCR” became “Boosie are”.  It also often either dropped or added an “s” to words that I spoke.

These days, more ESI is discoverable from sources that are non-formalized, including texts and “tweets”.  Acronyms, abbreviations and frequent misspellings are common in these data sources (whether typed or through bad dictation), which makes searching them for responsive information very challenging.  You need to get creative when searching these sources and use mechanisms such as conceptual clustering to group similar documents together, as well as stemming and fuzzy searching to find variations and misspellings of words.
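To make the fuzzy searching idea a little more concrete, here is a minimal sketch in Python using the standard library's difflib module.  The function name, the sample message and the 0.8 similarity cutoff are all assumptions made up for illustration; this is not how any particular review platform implements fuzzy search, and it does not attempt stemming or conceptual clustering.

```python
import difflib
import re


def fuzzy_hits(term, text, cutoff=0.8):
    """Return the distinct words in `text` that closely resemble `term`.

    `cutoff` is an illustrative similarity threshold between 0 and 1;
    a real search tool would expose its own tuning options.
    """
    words = sorted(set(re.findall(r"[a-z']+", text.lower())))
    return difflib.get_close_matches(term.lower(), words, n=10, cutoff=cutoff)


# Hypothetical informal message with typos, as you might find in a text.
message = "pls send the contarct asap, the contracts team needs the contrct signed"
print(fuzzy_hits("contract", message))
```

A search for the exact word “contract” would miss every word in that sample message, while the fuzzy match picks up the plural and both misspellings.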

Want to see the original version of the post?  Here it is.

So, what do you think?  How do you handle informal communications, like texts and “tweets”, in your searching of ESI?   Please share any comments you might have or if you’d like to know more about a particular topic.


Data May Be Doubling Every Couple of Years, But How Much of it is Original?: Best of eDiscovery Daily

Even those of us at eDiscovery Daily have to take an occasional vacation (which, as you can see by the picture above, means taking the kids to their favorite water park); however, instead of “going dark” for a few days, we thought we would take a look back at some topics that we’ve covered in the past.  Today’s post takes a look back at the challenge of managing duplicative ESI during eDiscovery.  Enjoy!


According to the Compliance, Governance and Oversight Council (CGOC), information volume in most organizations doubles every 18-24 months (now, it’s more like every 1.2 years). However, just because it doubles doesn’t mean that it’s all original. Like a bad cover band singing Free Bird, the rendition may be unique, but the content is the same. The key is limiting review to unique content.

When reviewers are reviewing the same files again and again, it not only drives up costs unnecessarily, but it could also lead to problems if the same file is categorized differently by different reviewers (for example, inadvertent production of a duplicate of a privileged file if it is not correctly categorized).

Of course, we all know the importance of identifying exact duplicates (files that contain the exact same content in the same file format).  These can be identified through MD5 and SHA-1 hash values and then removed from the review population, saving considerable review costs.
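As a rough illustration of hash-based de-duplication, the sketch below computes a SHA-1 hash for each file's content and keeps only the first file seen for each hash.  The file names and contents are invented, and the files are held in memory as bytes to keep the example self-contained; a real collection would be read from disk and tracked per custodian.

```python
import hashlib


def dedupe_by_hash(files):
    """Split files into unique items and exact duplicates.

    `files` maps a file name to its raw bytes. Returns a list of the
    unique file names and a dict mapping each duplicate to the file
    it duplicates.
    """
    first_seen = {}   # hash value -> name of first file with that content
    unique = []
    duplicates = {}
    for name, content in files.items():
        digest = hashlib.sha1(content).hexdigest()
        if digest in first_seen:
            duplicates[name] = first_seen[digest]
        else:
            first_seen[digest] = name
            unique.append(name)
    return unique, duplicates


# Hypothetical collection: two byte-for-byte copies and one different file.
collection = {
    "smith/budget.xlsx": b"Q1 budget figures",
    "jones/budget_copy.xlsx": b"Q1 budget figures",
    "jones/notes.txt": b"Call opposing counsel on Friday",
}
unique, duplicates = dedupe_by_hash(collection)
print(unique)
print(duplicates)
```

Only the two unique files would go to review; the copy is recorded against the file it duplicates so that it can still be accounted for.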

Identifying near duplicates that contain the same (or almost the same) information (such as a Word document published to an Adobe PDF file where the content is the same, but the file format is different, so the hash value will be different) also reduces redundant review and saves costs.
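Here is a simplified sketch of why hashing alone misses near duplicates and how a text comparison can catch them.  It compares overlapping three-word “shingles” from the extracted text of two files using Jaccard similarity.  The sample text, the shingle size and the function names are assumptions for illustration only; commercial near-duplicate engines use their own, more sophisticated methods.

```python
import hashlib
import re


def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in `text`."""
    words = re.findall(r"\w+", text.lower())
    count = max(len(words) - size + 1, 1)
    return {" ".join(words[i:i + size]) for i in range(count)}


def jaccard(a, b):
    """Similarity between 0 and 1 based on shared shingles."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Hypothetical text extracted from a Word file and from its PDF version.
word_text = "The parties agree to the terms set out in Schedule A."
pdf_text = "The parties agree to the terms\nset out in Schedule A.\n"
revised = "The parties agree to the revised terms set out in Schedule A."

# The hashes differ even though the wording is the same.
print(hashlib.sha1(word_text.encode()).hexdigest()
      == hashlib.sha1(pdf_text.encode()).hexdigest())
print(jaccard(word_text, pdf_text))
print(jaccard(word_text, revised))
```

The Word and PDF versions score as identical on content despite having different hashes, while the revised version scores lower, which is the kind of signal a reviewer can use to group or compare the documents.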

Then, there is message thread analysis. Many email messages are part of a larger discussion, sometimes just between two parties, and, other times, between a number of parties in the discussion. To review each email in the discussion thread would result in much of the same information being reviewed over and over again. Pulling those messages together and enabling them to be reviewed as an entire discussion can eliminate that redundant review. That includes any side conversations within the discussion that may or may not be related to the original topic (e.g., a side discussion about the latest misstep by Anthony Weiner).
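One simple way to picture message thread analysis is to follow each message's “in reply to” reference back to the message that started the discussion.  The sketch below assumes each message is a small record with a hypothetical `id` and `in_reply_to` field; real email threading also has to deal with missing headers, forwarded messages and changed subject lines, which this sketch ignores.

```python
from collections import defaultdict


def group_threads(messages):
    """Group message ids by the id of the message that started the thread."""
    parent = {m["id"]: m.get("in_reply_to") for m in messages}

    def root(msg_id):
        # Walk up the reply chain until there is no known parent.
        # `seen` guards against a malformed, circular chain.
        seen = set()
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    threads = defaultdict(list)
    for m in messages:
        threads[root(m["id"])].append(m["id"])
    return dict(threads)


# Hypothetical messages: one discussion with a side conversation ("e"),
# plus one unrelated message ("d").
messages = [
    {"id": "a", "in_reply_to": None},
    {"id": "b", "in_reply_to": "a"},
    {"id": "c", "in_reply_to": "b"},
    {"id": "d", "in_reply_to": None},
    {"id": "e", "in_reply_to": "a"},
]
print(group_threads(messages))
```

Each group can then be presented to a reviewer as a single discussion, side conversations included, rather than as separate emails.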

Clustering is a process which pulls similar documents together based on content so that the duplicative information can be identified more quickly and eliminated to reduce redundancy. With clustering, you can minimize review of duplicative information within documents and emails, saving time and cost and ensuring consistency in the review. As a result, even if the data in your organization doubles every couple of years, the cost of your review shouldn’t.
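For a feel of what clustering does, the sketch below groups documents whose vocabulary overlaps heavily.  It is a deliberately simple stand-in (a greedy pass that compares word sets against the first document in each group), not the conceptual clustering used by review tools; the document names, text and 0.5 threshold are invented for the example.

```python
import re


def word_set(text):
    """Return the set of lowercase words in `text`."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(a, b):
    """Share of words the two texts have in common (0 to 1)."""
    wa, wb = word_set(a), word_set(b)
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def cluster(docs, threshold=0.5):
    """Greedily group documents that resemble a group's first document."""
    clusters = []  # each cluster is a list of names; the first is the seed
    for name, text in docs.items():
        for group in clusters:
            if similarity(docs[group[0]], text) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical documents: two versions of a memo and two lunch emails.
docs = {
    "memo_v1": "Quarterly sales report for the northeast region",
    "memo_v2": "Quarterly sales report for the northeast region, revised",
    "lunch": "Team lunch is moved to Friday at noon",
    "lunch_reply": "Team lunch is moved to Friday at noon, see you there",
}
print(cluster(docs))
```

With similar documents grouped together, a reviewer can make one consistent decision for the whole group instead of seeing each document at a different point in the review.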

So, what do you think? Does your review tool support clustering technology to pull similar content together for review? Please share any comments you might have or if you’d like to know more about a particular topic.
