
eDiscovery Trends: John Simek


This is the third of our Holiday Thought Leader Interview series.  I interviewed several thought leaders to get their perspectives on various eDiscovery topics.

Today’s thought leader is John Simek. John is the Vice President of Sensei Enterprises, a computer forensics firm in Fairfax, Va, where he has worked since 1997. He is an EnCase Certified Examiner and is a nationally known testifying expert in computer forensic issues. Together with his wife, Sharon Nelson, John has become a frequent speaker on eDiscovery topics and digital forensic issues. We have also interviewed Sharon, who serves as Sensei’s President, for this series, and her interview will appear this coming Wednesday.

You have been a forensic examiner for a long time. How has the business changed over that time? How much does the rate of change in computer technology make your job difficult? Have social media and mobile technology changed the nature of your work and the evidence in play?

Certainly the technology changes present a challenge for any forensic examiner. We are constantly investing in training and tools to deal with the changing landscape. Social media investigations and mobile devices are explosive forms of evidence for many of our cases. The constant changes in smartphones mean we must have dozens of tools to extract data from iPads, Androids, BlackBerrys, iPhones, tablets and other mobile devices. Access to social media data varies as well. Some is readily available in the public areas, some may reside on the actual computer used to access the social media sites and some data may be held by the providers themselves, where the user has no clue it is being collected.

There have been several cases of law firms and EDD providers suing each other of late. Why is there this seeming rise in conflict and how does it affect relationships in the industry?

I’ve only seen two such cases and they get ugly really quick. I think the primary reason is lack of transparency and adequate communication. The client should always know what the anticipated costs and effort will be. Should scope change then a new estimate needs to be communicated. I think all too often the EDD providers launch out of the gate and the costs spiral out of control. Obviously, if you are one of those providers that ended up in court over fees or even inadequate or improper processing of ESI, your reputation will be forever spoiled.

There are a lot of certifications a forensic examiner can obtain. What is the value of certification? How should buyers of EDD services evaluate their forensic examiners?

Certifications are a good starting point, although I think they have lost their value over the last several years. Perhaps the tests are getting easier, but I’m seeing folks with forensic certifications that shouldn’t be trusted with a mouse in their hand. Don’t just look to forensic certifications either. Other technology (network, operating system, database, etc.) certifications are also valuable. Check CVs. Do they speak, write and have previous experiences testifying? One of the best methods of evaluation is referrals. Did they do a quality job? Were they on time? Did the costs fall within budget?

You’ve done a lot of work in family law cases. In cases where emotions are running high, how do you counsel clients? Is there a way to talk to people about proportionality when they are angry?

You’ve hit the nail on the head. There is very little logic in family law cases, especially when emotions are running high. I’ve lost count of the number of times we’ve told clients NOT to spend their money on continuing or even starting a forensic analysis. Some listen and some don’t. The exception is where there are issues pertaining to the welfare of any children. We had one case where dad was into BDSM and exhibiting similar behavior towards the children. Mom had no job and was extremely brutalized from the abuse over the years. We completed that case pro bono as it was the right thing to do. Dad lost custody and was ordered supervised visitation only.

There has been a lot of hype about EDD services for small firms. In your experience, is this becoming a reality? Can small and solo firms compete with large firms for more EDD cases?

Electronic evidence plays a part in more and more cases. There is a crying need for better tools and methods to review ESI in the smaller cases. Thankfully, some vendors are listening. Products like Digital Warroom and Nextpoint’s products are very affordable for the smaller cases and don’t require a large investment by the solo or small firm attorney. These are hosted solutions, which means you are using the cloud. Large firms are also using hosted solutions, but may use other vendor products depending on the type of data (e.g. foreign language) and/or volume.

You testify in a lot of cases as an expert witness. What are the reasons your services might be needed in this area? What are common reasons that forensic evidence is being challenged, and how can legal teams avoid being challenged?

The good news is that less than 10% of our cases end up going to trial. As we say in the forensic world, “The truth is the truth.” Once we have had a chance to analyze the evidence and report the findings, there are rarely any challenges. That’s what a forensic exam is all about: being repeatable. The opposing party’s examiner better find the same results. The challenge may come from the interpretation of the results. This is where the experience and knowledge of the expert come into play. Many of the forensic examiners today have never used a computer without a graphical interface. Remember the Casey Anthony case? I cringed when I heard the prosecution testimony about the activity surrounding the Internet searches. It failed the smell test in my mind, which ended up being true since the expert later admitted there was a problem with the software that was used.

Would you recommend a similar career path to young technologists? What do you like about being a forensic examiner?

Some universities are now offering degrees in Digital Forensics or some similar name. I’m not sure I would go the route of computer forensics as a baseline. I’m seeing more activity in what I would call digital investigations. This includes network forensics and dealing with cases such as data breaches. We are doing more and more of these types of exams. It’s sort of like following the data trail. Probably the single best thing about being a forensic examiner is getting to the truth. Since we also do criminal defense work, there are many times that we’ve had to call the attorney and tell them that their client needs a new story.

Thanks, John, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Trends: Jason R. Baron


This is the first of the Holiday Thought Leader Interview series.  I interviewed several thought leaders to get their perspectives on various eDiscovery topics.

Today’s thought leader is Jason R. Baron. Jason has served as the National Archives' Director of Litigation since May 2000 and has been involved in high-profile cases for the federal government. His background in eDiscovery dates to the Reagan Administration, when he helped retain backup tapes containing Iran-Contra records from the National Security Council as the Justice Department’s lead counsel. Later, as director of litigation for the U.S. National Archives and Records Administration, Jason was assigned a request to review documents pertaining to tobacco litigation in U.S. v. Philip Morris.

He currently serves as The Sedona Conference Co-Chair of the Working Group on Electronic Document Retention and Production. Baron is also one of the founding coordinators of the TREC Legal Track, a search project organized through the National Institute of Standards and Technology to evaluate search protocols used in eDiscovery. This year, Jason was awarded the Emmett Leahy Award for Outstanding Contributions and Accomplishments in the Records and Information Management Profession.

You were recently awarded the prestigious Emmett Leahy Award for excellence in records management. Is it unusual that a lawyer wins such an award? Or is the job of the litigator and records manager becoming inextricably linked?

Yes, it was unusual: I am the first federal lawyer to win the Emmett Leahy award, and only the second lawyer to have done so in the 40-odd years that the award has been given out. But my career path in the federal government has been a bit unusual as well: I spent seven years working as lead counsel on the original White House PROFS email case (Armstrong v. EOP), followed by more than a decade worrying about records-related matters for the government as Director of Litigation at NARA. So with respect to records and information management, I long ago passed at least the Malcolm Gladwell test in "Outliers" where he says one needs to spend 10,000 hours working on anything to develop a level of "expertise."  As to the second part of your question, I absolutely believe that to be a good litigation attorney these days one needs to know something about information management and eDiscovery — since all evidence is "born digital" and lots of it needs to be searched for electronically. As you know, I also have been a longtime advocate of a greater linking between the fields of information retrieval and eDiscovery.

In your acceptance speech you spoke about the dangers of information overload and the possibility that it will make it difficult for people to find important information. How optimistic are you that we can avoid this dystopian future? How can the legal profession help the world avoid this fate?

What I said was that in a world of greater and greater retention of electronically stored information, we need to leverage artificial intelligence and specifically better search algorithms to keep up in this particular information arms race. Although Ralph Losey teased me in a recent blog post that I was being unduly negative about future information dystopias, I actually am very optimistic about the future of search technology assisting in triaging the important from the ephemeral in vast collections of archives. We can achieve this through greater use of auto-categorization and search filtering methods, as well as having a better ability in the future to conduct meaningful searches across the enterprise (whether in the cloud or not). Lawyers can certainly advise their clients how to practice good information governance to accomplish these aims.

You were one of the founders of the TREC Legal Track research project. What do you consider that project’s achievement at this point?

The initial idea for the TREC Legal Track was to get a better handle on evaluating various types of alternative search methods and technologies, to compare them against a "baseline" of how effective lawyers were in relying on more basic forms of keyword searching. The initial results were a wake-up call, in showing lawyers that sole reliance on simple keywords and Boolean strings sometimes results in a large quantity of relevant evidence going missing. But during the half-decade of research that now has gone into the track, something else of perhaps even greater importance has emerged from the results, namely: we have a much better understanding now of what a good search process looks like, which includes a human in the loop (known in the Legal Track as a topic authority) evaluating on an ongoing, iterative basis what automated search software kicks out by way of initial results. The biggest achievement however may simply be the continued existence of the TREC Legal Track itself, still going in its 6th year in 2011, and still producing important research results, on an open, non-proprietary platform, that are fully reproducible and that benefit both the legal profession as well as the information retrieval academic world. While I stepped away after 4 years from further active involvement in the Legal Track as a coordinator, I continue to be highly impressed with the work of the current track coordinators, led by Professor Doug Oard at the University of Maryland, who has remained at the helm since the very beginning.

To what extent has TREC’s research proven the reliability of computer-assisted review in litigation? Is there a danger that the profession assumes the reliability of computer-assisted review is a settled matter?

The TREC Legal Track results I am most familiar with through calendar year 2010 have shown computer-assisted review methods finding in some cases on the order of 85% of relevant documents (a .85 recall rate) per topic while only producing 10% false positives (a .90 precision rate). Not all search methods have had these results, and there has been in fact a wide variance in success achieved, but these returns are very promising when compared with historically lower rates of recall and precision across many information retrieval studies. So the success demonstrated to date is highly encouraging. Coupled with these results has been additional research reported by Maura Grossman & Gordon Cormack, in their much-cited paper Technology-Assisted Review in EDiscovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, which makes the case for the greater accuracy and efficiency of computer-assisted review methods.

Other research conducted outside of TREC, most notably by Herbert Roitblat, Patrick Oot and Anne Kershaw, also points in a similar direction (as reported in their article Mandating Reasonableness in a Reasonable Inquiry). All of these research efforts buttress the defensibility of technology-assisted review methods in actual litigation, in the event of future challenges. Having said this, I do agree that we are still in the early days of using many of the newer predictive types of automated search methods, and I would be concerned about courts simply taking on faith the results of past research as being applicable in all legal settings. There is no question however that the use of predictive analytics, clustering algorithms, and seed sets as part of technology-assisted review methods is saving law firms money and time in performing early case assessment and for multiple other purposes, as reported in a range of eDiscovery conferences and venues, and I of course support all of these good efforts.
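To make the recall and precision figures quoted in this answer concrete, here is a minimal sketch of how the two rates are computed. The document counts are hypothetical, chosen only so that the arithmetic lands on the .85 recall and .90 precision mentioned above; they are not TREC data. Note that "10% false positives" is read here as 10% of the documents the method retrieved being non-relevant.

```python
def recall(true_positives, false_negatives):
    """Share of all relevant documents that the search actually found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Share of the documents the search returned that are relevant."""
    return true_positives / (true_positives + false_positives)


# Hypothetical topic: 1,800 relevant documents exist in the collection.
found_relevant = 1530       # relevant documents the method retrieved
missed_relevant = 270       # relevant documents the method did not retrieve
found_not_relevant = 170    # non-relevant documents retrieved (false positives)

topic_recall = recall(found_relevant, missed_relevant)
topic_precision = precision(found_relevant, found_not_relevant)

print(f"recall:    {topic_recall:.2f}")
print(f"precision: {topic_precision:.2f}")
```

The two numbers answer different questions: recall asks how much relevant evidence went missing, while precision asks how much of the production set was noise, so a search method should be judged on both.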

You have discussed the need for industry standards in eDiscovery. What benefit would standards provide?

Ever since I served as Co-Editor in Chief on The Sedona Conference Commentary on Achieving Quality in eDiscovery (2009), I have been thinking about the process for conducting good eDiscovery. That paper focused on project management, sampling, and imposing various forms of quality controls on collection, review, and production. The question is, is a good eDiscovery process capable of being fit into a maturity model of sorts, and might it be useful to consider whether vendors and law firms would benefit from having their in-house eDiscovery processes audited and certified as meeting some common baseline of quality? To this end, the DESI IV workshop ("Discovery of ESI") held in Pittsburgh last June, as part of the Thirteenth International AI and Law Conference (ICAIL 2011), had as its theme exploring what types of model standards could be imposed on the eDiscovery discipline, so that we all would be able to work from some common set of benchmarks. Some 75 people attended and 20-odd papers were presented. I believe the consensus in the room was that we should be pursuing further discussions as to what an ISO 9001-type quality standard would look like as applied to the specific eDiscovery sector, much as other industry verticals have their own ISO standards for quality. Since June, I have been in touch with some eDiscovery vendors that have actually undergone an audit process to achieve ISO 9001 certification. This is an area where no consensus has yet emerged as to the path forward, but I will be pursuing further discussions with DESI workshop attendees in the coming months and promise to report back in this space as to what comes of these efforts.

What sort of standards would benefit the industry? Do we need standards for pieces of the eDiscovery process, like a defensible search standard, or are you talking about a broad quality assurance process?

DESI IV started by concentrating on what would constitute a defensible search standard; however, it became clear at the workshop and over the course of the past few months that we need to think bigger, in looking across the eDiscovery life cycle as to what constitutes best practices through automation and other means. We need to remember however that eDiscovery is a very young discipline, as we're only five years out from the 2006 Rules Amendments. I don't have all the answers, by any means, on what would constitute an acceptable set of standards, but I like to ask questions and believe in a process of continuous, lifelong learning. As I said, I promise I'll let you know about what success has been achieved in this space.

Thanks, Jason, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

eDiscovery Best Practices: When is it OK to Produce without Linear Review?


At eDiscoveryDaily, the title of our daily post usually reflects some eDiscovery news and/or analysis that we are providing our readers.  However, based on a comment I received from a colleague last week, I thought I would ask a thought-provoking question for this post.

There was an interesting post in the EDD Update blog a few days ago entitled Ediscovery Production Without Review, written by Albert Barsocchini, Esq.  The post noted that due to “[a]dvanced analytics, judicial acceptance of computer aided coding, claw back/quick-peek agreements, and aggressive use of Rule 16 hearings”, many attorneys are choosing to produce responsive ESI without spending time and money on a final linear review.

A colleague of mine sent me an email with a link to the post and stated, “I would not hire a firm if I knew they were producing without a doc by doc review.”

Really?  What if:

  • You collected the equivalent of 10 million pages* and still had 1.2 million potentially responsive pages after early data assessment/first pass review? (reducing 88% of the population, which is a very high culling percentage in most cases)
  • And your review team could review 60 pages per hour, requiring 20,000 hours to complete the responsiveness review?
  • And their average rate was a very reasonable $75 per hour to review, resulting in a total cost of $1.5 million to perform a doc by doc review?
  • And you had a clawback agreement in place so that you could claw back any inadvertently produced privileged files?

“Would you insist on a doc by doc review then?”, I asked.

Let’s face it, $1.5 million is a lot of money.  That may seem like an inordinate amount of money to spend on linear review and the data volume for some large cases may be so voluminous that an effective argument might be made to rely on technology to identify the files to produce.
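For readers who want to plug in their own numbers, the arithmetic behind the example above can be sketched in a few lines. The inputs are the figures from this post (including the 50,000 pages per GB average from the footnote); the function and variable names are only illustrative.

```python
def linear_review_estimate(pages_to_review, pages_per_hour, hourly_rate):
    """Return (review hours, total cost) for a doc by doc review."""
    hours = pages_to_review / pages_per_hour
    return hours, hours * hourly_rate


# Figures from the example in this post
collected_pages = 10_000_000
pages_after_culling = 1_200_000
pages_per_hour = 60
hourly_rate = 75            # dollars per reviewer hour
pages_per_gb = 50_000       # average used in the footnote

# Percentage of the collection removed by early data assessment
culled_percent = (collected_pages - pages_after_culling) * 100 // collected_pages

hours, cost = linear_review_estimate(pages_after_culling, pages_per_hour, hourly_rate)
collection_gb = collected_pages / pages_per_gb

print(f"Culled: {culled_percent}% of the collection")
print(f"Review effort: {hours:,.0f} hours")
print(f"Review cost: ${cost:,.0f}")
print(f"Collection size: about {collection_gb:,.0f} GB")
```

Changing the review speed or hourly rate shows how quickly the total moves: the cost scales linearly with the number of pages that survive culling.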

On the other hand, if you’re a company like Google and you inadvertently produced a document in a case potentially worth billions of dollars, $1.5 million doesn’t seem near as big an amount to spend given the risk associated with potential mistakes.  Also, as the Google case and this case illustrate, there are no guarantees with regards to the ability to claw back inadvertently produced files.  The cost of linear review will, especially in larger cases, need to be weighed against the potential risk of not conducting that review for the organization to determine what’s the best approach for them.

So, what do you think?  Do you produce in cases where not all of the responsive documents are reviewed before production? Are there criteria that you use to determine when to conduct or forego linear review?  Please share any comments you might have or if you’d like to know more about a particular topic.

*I used pages in the example to provide a frame of reference to which most attorneys can relate.  While 10 million pages may seem like a large collection, at an average of 50,000 pages per GB, that is only 200 total GB.  Many laptops and desktops these days have a drive that big, if not larger.  Depending on your review approach, most, if not all, original native files would probably never be converted to a standard paginated document format (i.e., TIFF or PDF).  So, it is unlikely that the total page count of the collection would ever be truly known.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Trends: Announcing Holiday Thought Leader Series!


eDiscoveryDaily thought quite a bit about what to get for our readers to celebrate these holidays, and what better to give you than interviews with some of the most influential thought leaders in eDiscovery today!  We haven’t had this much fun since the last round of thought leader interviews we conducted at Legal Tech New York earlier this year!  For a recap of those interviews, click here.

Jason Krause has been working hard and “chased” down several well respected individuals and, as a result, we’re pleased to introduce the schedule for the series, which will begin this Wednesday, December 14.

Here are the interviews that we will be publishing over the next two weeks:

Wednesday, December 14: Jason Baron, National Archives' Director of Litigation since 2000 and Co-Chair of the Working Group on Electronic Document Retention and Production for the Sedona Conference.  Jason is also one of the founding coordinators of the TREC Legal Track, a search project organized through the National Institute of Standards and Technology to evaluate search protocols used in eDiscovery. This year, Jason was awarded the Emmett Leahy Award for Outstanding Contributions and Accomplishments in the Records and Information Management Profession.

Thursday, December 15: Bennett Borden, Co-Chair of Williams Mullen’s eDiscovery and Information Governance Section.  Based in Richmond, Va., Bennett’s practice is focused on Electronic Discovery and Information Law. Bennett has published several papers on the use of predictive coding in litigation and is a frequent speaker on eDiscovery topics.

Friday, December 16: John Simek, Vice President of Sensei Enterprises, a computer forensics firm in Fairfax, Va, where he has worked since 1997. He is an EnCase Certified Examiner and is a nationally known testifying expert in computer forensic issues.

Monday, December 19: Joshua Poje, Research Specialist with the American Bar Association’s Legal Technology Resource Center, which publishes the Annual Legal Technology Survey. He is a graduate of DePaul University College of Law and Augustana College.

Tuesday, December 20: Joseph Collins, co-founder and president of VaporStream, which provides recordless communications. Collins previously worked in the energy marketplace, but has become an advocate for private communication in business, even within the legal community.

Wednesday, December 21: Sharon Nelson, President of Sensei Enterprises, where she has worked on the front lines of computer forensics and EDD, topics she also discusses on the blog Ride the Lightning (one of my favorites!).  She is a graduate of the Georgetown University Law Center and is the president-elect of the Virginia Bar Association.

Thanks to everyone for their time in participating in these interviews!  And, thanks to Jason for securing interviews with these key individuals for eDiscoveryDaily.

So, what do you think?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Law: Another Losing Plaintiff Taxed for eDiscovery Costs

As noted yesterday and back in May, prevailing defendants are becoming increasingly successful in obtaining awards against plaintiffs for reimbursement of eDiscovery costs.

An award of costs to the successful defendants in a patent infringement action included $64,295 in costs for conversion of data to TIFF format and $5,950 for an eDiscovery project manager in Jardin v. DATAllegro, Inc., No. 08-CV-1462-IEG (WVG), (S.D. Cal. Oct. 12, 2011).

Defendants in a patent infringement action obtained summary judgment of non-infringement and submitted bills of costs that included $64,295 in costs for conversion of data to TIFF format and $5,950 for an eDiscovery project manager. Plaintiff contended that the costs should be denied because he had litigated the action and its difficult issues in good faith and there was a significant economic disparity between him and the corporate parent of one of the defendants.

The court concluded that plaintiff had failed to rebut the presumption in Fed. R. Civ. P. 54 in favor of awarding costs. The action was resolved through summary judgment rather than a complicated trial, and there was no case law suggesting that the assets of a parent corporation should be considered in assessing costs. The financial position of the party having to pay the costs might be relevant, but it appeared plaintiff was the founder of a company that had been sold for $500 million.

Taxing of costs for converting files to TIFF format was appropriate, according to the court, because the Federal Rules required production of electronically stored information and “a categorical rule prohibiting costs for converting data into an accessible, readable, and searchable format would ignore the practical realities of discovery in modern litigation.” The court stated: “Therefore, where the circumstances of a particular case necessitate converting e-data from various native formats to the .TIFF or another format accessible to all parties, costs stemming from the process of that conversion are taxable exemplification costs under 28 U.S.C. § 1920(4).”

The court also rejected plaintiff’s argument that costs associated with an eDiscovery “project manager” were not taxable because they related to the intellectual effort involved in document production:

Here, the project manager did not review documents or contribute to any strategic decision-making; he oversaw the process of converting data to the .TIFF format to prevent inconsistent or duplicative processing. Because the project manager’s duties were limited to the physical production of data, the related costs are recoverable.

So, what do you think?  Will more prevailing defendants seek to recover eDiscovery costs from plaintiffs? Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required).

eDiscovery Case Law: Plaintiff Responsible for Taxation of eDiscovery Costs

Back in May, we discussed a case where the plaintiff, after losing its lawsuit, was responsible for repaying the defendant more than $367,000 in eDiscovery costs.  It appears that making plaintiffs responsible for eDiscovery costs when they lose is becoming a trend.

In In re Aspartame Antitrust Litig., No. 2:06-CV-1732-LDD, (E.D. Pa. Oct. 5, 2011), a case with a “staggering” volume of discovery, successful defendants were awarded about $500,000 of their electronic discovery costs for a litigation database, imaging hard drives, keyword searches, de-duplication, and data extraction that allowed for cost-effective discovery. However, the court refused to award costs for defendants’ use of an eDiscovery program that provided visual clustering of documents and went beyond necessary keyword search and filtering functions.

Defendants in an artificial sweetener market allocation and price fixing class action obtained summary judgment against two representative plaintiffs that had not purchased the sweetener within the four-year statute of limitations. Defendants filed bills of costs, and the plaintiffs asked the court to deny or reduce those costs.

The court granted about $500,000 in disputed costs, most of which were incurred by defendants during electronic discovery. The volume of discovery was “staggering,” according to the court, and “in cases of this complexity, eDiscovery saves costs overall by allowing discovery to be conducted in an efficient and cost-effective manner.” Defendants’ use of third party vendors for keyword searches and culling of duplicates allowed one defendant to reduce over 366 gigabytes of potentially responsive data by 85%. The court stated:

“We therefore award costs for the creation of a litigation database, storage of data, imaging hard drives, keyword searches, de-duplication, data extraction and processing. Because a privilege screen is simply a keyword search for potentially privileged documents, we award that cost as well. In addition, we award costs associated with hosting data that accrued after defendants produced documents to plaintiffs because, as the plaintiffs themselves acknowledged earlier in the proceedings, discovery was ongoing in this case up until summary judgment was issued.”

The court also awarded costs for technical support and the creation of load files. However, it would “draw the line” at awarding costs for use of a “sophisticated eDiscovery program” that provided concept-based visual clustering of document collections. Such a service was “undoubtedly helpful,” but it was “squarely within the realm of costs that are not necessary for litigation but rather are acquired for the convenience of counsel.”

So, what do you think?  Should plaintiffs have to reimburse eDiscovery costs to defendants if they lose? Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery (free subscription required).

eDiscovery Project Management: “Belt and Suspenders” Approach for Effective Communication


eDiscovery Daily has published 57 posts to date related to Project Management principles (including this one).  Those include two excellent series by Jane Gennarelli, one covering a range of eDiscovery Project Management best practice topics from October through December last year, and another covering management of a contract review team, which ran from January to early March this year.

Effective communication is a key part of effective project management, whether that communication is internally within the project team or externally with your client.  It is so easy for miscommunications to occur that can derail your project and cause deadlines to be missed, or work product to be incomplete or not meet the client’s expectations.

I like to employ a “belt and suspenders” approach to communication with clients as much as possible, by discussing requirements or issues with the client and then following up with documentation to confirm the understanding.  That seems obvious and many project managers start out that way – they discuss project requirements and services with a client and then formally document them in a contract or other binding agreement.  However, as time progresses, many PMs become lax in following up to document changes to scope, or approaches to handling specific exceptions, that they have discussed with clients.  Often, it’s the little day to day discussions and decisions that aren’t documented that can come back to haunt you. Or PMs communicate solely via email and keep the project team waiting for the client to respond to the latest email.  Unless there is a critical decision for which documented agreement is required to proceed, discussing and documenting keeps the project moving while ensuring each decision gets documented.

I can think of several instances where this approach helped avoid major issues, especially with the follow-up agreement or email.  If nothing else, it gives you something to point back to if miscommunication occurs.  Years ago, I met with a client and reviewed a set of hard copy documents that they wanted scanned, processed and loaded into a database (we had a Master Services Agreement in place to cover those services).  The client said they had “sticky notes” on the documents that they wanted.  I took the time to go through those, ask questions and verbally confirm my understanding of which documents they wanted processed.  I then documented in an email what services they wanted and the ranges of documents they requested to be processed and they confirmed the services and those documents in their response (evidently without looking too closely at the list of document ranges).

What the client didn’t know is that one of their paralegals had removed “sticky notes” from some of the documents, so I didn’t have all of the document ranges they intended to process.  When they later started asking why certain documents weren’t processed, I was able to point back to the email showing their approval of the document ranges to process, verifying that we had processed the documents as instructed.  The client realized the mistake was theirs, not ours, and we helped them get the remaining documents processed and loaded.  Our reputation with that client remained strong – thanks to the “belt and suspenders” approach!

So, what do you think?  Have you had miscommunications with clients because of inadequate documentation? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Best Practices: Search “Gotchas” Still Get You


A few days ago, I reviewed search syntax that one of my clients had prepared and noticed a couple of “gotchas” that typically cause problems.  While we’ve discussed them on this blog before, it was over a year ago (when eDiscovery Daily was still in its infancy and had a fraction of the readers it has today), so they bear covering again.

Letting Your Wildcards Run Wild

This client liberally used wildcards to catch variations of words in their hits.  As noted previously, sometimes you can retrieve WAY more with your wildcards than you expect.  In this case, one of the wildcard terms was “win*” (presumably to catch win, wins, winner, winning, etc.).  Unfortunately, there are 253 words that begin with “win”, including wince, winch, wind, windbag, window, wine, wing, wink, winsome, winter, etc.

How do I know that there are 253 words that begin with “win”?  Am I an English professor?  No.  But, I did stay at a Holiday Inn Express last night.  Just kidding.

Actually, there is a site for that.  Morewords.com shows a list of words that begin with your search string (e.g., to get all 253 words beginning with “win”, go here – simply substitute any characters for “win” in the URL to see the words that start with those characters).  This site enables you to test out your wildcard terms before using them in searches and substitute the variations you want if the wildcard search is likely to retrieve too many false hits.  Or, if you use an application like FirstPass™, powered by Venio FPR™, for first pass review, you can type the wildcard string in the search form, display all the words – in your collection – that begin with that string, and select the variations on which to search.  Either way enables you to avoid retrieving a lot of false hits you don’t want.
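If you’d rather check a wildcard yourself, the same idea can be sketched in a few lines of Python.  This is only an illustration, not how Morewords.com or FirstPass work: the function name expand_wildcard and the short sample word list are made up for the example, and it assumes “*” means “zero or more characters”, which may differ from the wildcard rules of your search tool.

```python
import re

def expand_wildcard(term, vocabulary):
    """Return the sorted, de-duplicated words that a wildcard term would match.

    Assumption: '*' matches zero or more characters and the whole word
    must match, which is how many (not all) search tools treat wildcards.
    """
    # Escape everything except '*', which becomes the regex '.*'
    regex = ".*".join(re.escape(part) for part in term.lower().split("*"))
    pattern = re.compile("^" + regex + "$")
    return sorted({w.lower() for w in vocabulary if pattern.match(w.lower())})

# Hypothetical sample vocabulary; in practice this would be a dictionary
# word list or the list of distinct words in your document collection.
vocabulary = ["win", "wins", "winner", "winning", "wind",
              "window", "wine", "winter", "twin", "won"]

matches = expand_wildcard("win*", vocabulary)

# The variations the searcher actually intended to catch
wanted = {"win", "wins", "winner", "winning"}
false_hits = [w for w in matches if w not in wanted]

print("All matches:", matches)
print("Likely false hits:", false_hits)
```

Running the expansion against the words actually in your collection, rather than a general dictionary, shows what the wildcard will really retrieve, so you can replace it with the specific variations you want.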

Those Stupid Word “Smart” Quotes

As many attorneys do, this client used Microsoft Word to prepare his proposed search syntax.  The last few versions of Microsoft Word, by default, automatically change straight quotation marks ( ' or " ) to curly quotes as you type. When you copy that text to a format that doesn’t support the smart quotes (such as HTML or a plain text editor), the quotes will show up as garbage characters because they are not supported ASCII characters.  So:

“smart quotes” aren’t very smart

will look like this…

âsmart quotesâ arenât very smart

And, your search will either return an error or some very odd results.
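For those comfortable with a little scripting, here is a minimal Python sketch of one way to catch the problem before a search is run.  The names straighten_quotes and find_non_ascii are invented for this example, and the “garbled” line shows just one plausible way the garbage characters can arise (text saved as UTF-8 and then read as Latin-1); your tools may mangle the quotes differently.

```python
# Map Word's curly "smart" quotes to their straight ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote / apostrophe
}

def straighten_quotes(text):
    """Replace curly quotes with straight quotes; leave everything else alone."""
    return text.translate(str.maketrans(SMART_QUOTES))

def find_non_ascii(text):
    """Return (position, character) pairs for non-ASCII characters, for review."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]

# The example phrase as Word would produce it, with curly quotes
query = "\u201csmart quotes\u201d aren\u2019t very smart"

# One plausible cause of the garbage: UTF-8 bytes read back as Latin-1
garbled = query.encode("utf-8").decode("latin-1")

clean = straighten_quotes(query)

print(find_non_ascii(query))  # positions of the three curly quotes
print(clean)                  # the phrase with straight quotes only
```

Checking a search string with something like find_non_ascii before you paste it into a search form is a quick way to spot smart quotes (or other stray characters) that came along from Word.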

To learn how to disable the automatic changing of quotes to smart quotes or replace smart quotes already in a file, refer to this post from last year.  And, be careful, there are a lot of “gotchas” out there that can cause search problems.  That’s why it’s always best to be a “STARR” and test your searches, refine and repeat them until they yield expected results.

So, what do you think?  Have you run into these “gotchas” in your searches? Please share any comments you might have or if you’d like to know more about a particular topic.

LitigationWorld Pick of the Week: Could This Be the Most Expensive eDiscovery Mistake Ever?


We’re pleased to announce that our blog post “eDiscovery Best Practices: Could This Be the Most Expensive eDiscovery Mistake Ever?”, regarding Google’s inadvertent disclosure during its litigation with Oracle, was selected as the Pick of the Week from TechnoLawyer in the November 21, 2011 issue of LitigationWorld.  LitigationWorld is a free weekly email newsletter that provides helpful tips regarding electronic discovery, litigation strategy, and litigation technology.  It’s also a great source of ideas for blog posts!  😉

In each issue, the editorial team at LitigationWorld links to the most noteworthy articles on the litigation Web published during the previous week. From these articles, they then select one as their Pick of the Week.

Thanks to the folks at TechnoLawyer for this recognition.  We appreciate it!

eDiscovery Case Law: New York Supreme Court Requires Production of Software to Review Files

The petitioner – in TJS of New York, Inc. v. New York State Dep’t of Taxation and Fin., 932 N.Y.S.2d 243 (N.Y. App. Div. Nov. 3, 2011) – brought an Article 78 proceeding to compel the Department of Taxation and Finance to produce records that were responsive to the petitioner’s request under the Freedom of Information Law (FOIL) for records related to a sales tax audit.  Some of the records, however, could not be reviewed without a copy of the Department’s Audit Framework Extension software, which the Department refused to provide.  The petitioner then moved to compel production of the software program in order to install it on his computer and view the electronic files.  The court denied the petitioner’s motion, concluding that the software program was exempt from disclosure, and also denied the petitioner’s subsequent motion to renew.

The court determined that the term “record” was broadly defined as “any information kept, held, filed, produced or reproduced by, with or for an agency …, in any physical form whatsoever, including, but not limited to, reports, statements, examinations, memoranda, opinions, folders, files, books, manuals, pamphlets, forms, papers, designs, drawings, maps, photos, letters, microfilms, computer tapes or discs, rules, regulations or codes”.  However, the petitioner disagreed, “citing the Department’s own description of the software as well as advisory opinions in which the Committee on Open Government concludes that software can constitute a record under FOIL”.

The court agreed with that argument, noting:

  • “The description of the software submitted by the Department and the reasoning and analysis contained in the advisory opinions relied on by petitioner lead us to conclude that the software at issue contains information and, thus, constitutes a record for FOIL purposes.”
  • “Specifically, the affidavit submitted by the Department from an auditor involved in the design and development of the software program, as well as the attached training manual for the software, reveals that the software is the means for conducting an audit and that, based on data entered by an auditor, the program does reconciliations, creates letters, produces forms, determines taxes due or refunds owed and creates a comprehensive audit report.  The June 1998 advisory opinion cited by petitioner concludes that software that enables an agency to manipulate data is a record pursuant to FOIL in the same way that a written manual describing a series of procedures would be subject to disclosure under FOIL”.
  • “The 2001 advisory opinion references a definition of software as ‘a series of instructions designed to produce information that can be seen on a screen, printed, stored, transferred and transmitted’ and concludes that it is a record subject to FOIL”
  • “Given these opinions and the Department’s own description of the capabilities of the program, we conclude that it is more than just a delivery system or data warehouse and, instead, falls within FOIL’s broad definition of a record subject to disclosure”

So, what do you think?  Should producing parties be required to produce specialized software to review produced records? Please share any comments you might have or if you’d like to know more about a particular topic.