Case Study

Exploring the Future of Legal Innovation at The Masters Conference: Thought Leadership in D.C. and Social Media in Discovery and Investigations

On Thursday, April 10, 2025, legal and technology professionals gathered at Arnold & Porter in Washington, D.C. for an inspiring day of discussion, collaboration, and community during The Masters Conference Thought Leadership event. Hosted at Arnold & Porter’s offices at 601 Massachusetts Ave NW, this full-day conference promised a deep dive into the latest challenges and advancements in eDiscovery, legal tech, investigations, and career development.

The conference featured a wide range of insightful sessions, covering topics from artificial intelligence and custodian interviews with modern data challenges to case law updates and social media collection and analysis. For this blog, I’m focusing on the session that explored the power and process of social media collection and analysis, which stood out as particularly timely and impactful.

The session on social media was titled “Unlocking Social Media Data,” sponsored by SMI Aware, and examined the investigative value of social media evidence. Josh Janow and Paige Hansen (SMI Aware) walked through data preservation strategies across platforms like Facebook, Instagram, LinkedIn, TikTok, Venmo, Strava and over 500 other accounts.

This session actually kicked off at the beginning of the conference, when Josh invited volunteers to have their social media presence assessed live. Using SMI Aware’s platform, the team conducted real-time OSINT (Open Source Intelligence) research on those individuals, compiling reports to present at the 11 a.m. breakout session. Initially, only a few attendees stepped forward, but when the findings were revealed, those volunteers were genuinely surprised by what had been uncovered in just a couple of hours. The reticence on the faces of many in the room underscored the power of this tool in the hands of e-discovery professionals. What began as a novel and engaging activity quickly shifted in tone during the session, as attendees began to recognize social media research as a “must-have” component in litigation, compliance, and due diligence strategies.

Why Social Media Data Matters in Discovery

Since so many of our life events are journaled online (as Paige put it), critical evidence is often found in unexpected places: Instagram posts, Venmo transactions, Reddit threads, and business collaboration tools like Slack and GitHub. The session opened with a challenge: What if your case hinges on something someone posted online and then deleted?

Social media can tell a story that contradicts a claim, verifies an alibi, or reveals patterns that shift the legal narrative. Whether it’s a workers’ comp investigation or a high-stakes wrongful termination suit, open-source data is no longer a “nice to have”—it’s a necessity.

When and Where to Search

The first key takeaway? Timing is everything.

Social media content can be altered or deleted. That’s why early case assessment should now include an OSINT component. From public Facebook profiles to lesser-known platforms like Discord or SoundCloud, relevant content often exists in plain sight—if you know where and how to look.

In one powerful case example shared, a claimant in a workplace injury lawsuit posted photos of themselves competing in a dance competition—at a time they were allegedly too injured to work. That evidence was found publicly, but only for a short window before it was removed.

The Legal and Ethical Imperatives

Attorneys and investigators have both a professional and ethical obligation to understand where potential evidence may exist, even if it lies outside traditional custodians and repositories.

The presentation emphasized that collecting this data isn’t about “digging for dirt”—it’s about diligence. When done properly, it involves secure data collection methods, legal defensibility, and a clear chain of custody. Not doing so could mean missing key facts, or worse—compromising the admissibility of your findings.
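One way to picture a defensible capture is a manifest entry that fingerprints each collected item at the moment of collection. The sketch below is illustrative only and is not SMI Aware’s method: the `custody_record` and `verify` functions, the field names, and the example URL are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def custody_record(content: bytes, source_url: str, collector: str) -> dict:
    """Build a manifest entry that fingerprints one captured item."""
    return {
        "source_url": source_url,    # where the item was captured (hypothetical field)
        "collector": collector,      # who performed the capture (hypothetical field)
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),  # changes if even one byte changes
        "size_bytes": len(content),
    }


def verify(content: bytes, record: dict) -> bool:
    """Return True if the content still matches the hash recorded at capture."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Hypothetical captured page and collector ID, for illustration only.
capture = b"<html>public post captured on collection day</html>"
record = custody_record(capture, "https://example.com/post/123", "analyst-01")

print(json.dumps(record, indent=2))
print(verify(capture, record))          # untouched copy matches the record
print(verify(capture + b" ", record))   # any alteration breaks the match
```

Recording the hash and a UTC timestamp at capture time gives a later reviewer a simple way to confirm that the copy being produced is the copy that was collected.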

Challenges and Limitations

Despite its power, social media evidence isn’t without hurdles:

  • Deleted or ephemeral content (think Stories or temporary posts)
  • Private settings that restrict access
  • Platform-specific restrictions on what can be scraped or reviewed

This is where specialized tools and experienced teams like SMI Aware’s come in. Their approach combines automated tools with human analysis to ensure data is gathered ethically, interpreted contextually, and structured into actionable insights. The deliverable is a report plus the structured data, which can then be imported into a review platform like CloudNine Review.

Real-World Impact: Case Studies in Action

The session walked attendees through several real-world investigations, including:

  • Workers’ Compensation fraud
  • Wage and hour disputes
  • Wrongful termination claims
  • Workplace compliance investigations
  • Pre-employment screening

Each case underscored the same point: social media and OSINT data can change the course of an investigation—or the outcome of litigation.

Key Takeaways

  • Social media is critical to modern discovery. If you’re not using it, you’re behind.
  • Data disappears quickly. Timely collection is key.
  • You need technical tools and expert interpretation to turn raw data into usable evidence.
  • Ethical and professional rules require attorneys to understand how OSINT fits into their duty of competence.
  • The report generated by SMI Aware’s software and service is ready for an expert, which is its main use, but the service can also create the proper load files for review in an eDiscovery review platform.

This conference was yet another testament to the evolving digital landscape of discovery—where artificial intelligence, modern data collection, and advanced review technologies are increasingly aligned to meet today’s challenges.

Are you ready to take control of eDiscovery costs? Let’s talk about how CloudNine can help you save money while optimizing efficiency: Contact Us today.

The CFO’s Perspective: Why Culling Data First in eDiscovery Saves Big Money

By Abhishek Jhaver, CloudNine GM and CFO

As a CFO, managing costs without sacrificing efficiency is always a priority. One of the biggest cost drivers in eDiscovery is data volume: the more data you have, the more expensive it is to process, host, and especially review. That is why culling data before review isn’t just a best practice; it’s a financial imperative.

The High Cost of Excess Data

The traditional approach often involves collecting terabytes of data, much of which is irrelevant or redundant. Legal teams then spend significant resources on processing, hosting, and reviewing data that isn’t needed.

Industry studies show that document review can account for up to 70% of total eDiscovery costs. If we can significantly reduce the volume of data before review, we can directly impact the bottom line. The key is leveraging technology to remove unnecessary data early in the process before it becomes a cost burden. This culling process is typically called Early Data Assessment or Early Case Assessment.

CloudNine LAW: The Smarter Way to Cull Data Early

CloudNine LAW is a powerful solution designed to dramatically reduce data volume before review. By using its robust deduplication and filtering, along with processing support for over 5,000 file types, organizations can cut down on avoidable data, saving time and money in the process. Here’s how LAW helps keep corporate legal and finance teams happy:

  • Deduplication: A significant percentage of collected data consists of duplicate files. LAW automatically identifies and removes redundant documents, ensuring legal teams aren’t reviewing the same information multiple times.
  • Filtering by Date, Custodian, and Keywords: LAW allows legal teams to filter data based on date ranges, specific custodians, and relevant keywords—removing unnecessary files before they ever reach review.
  • DeNISTing: System files and other non-relevant data clog up eDiscovery workflows. LAW uses DeNISTing to eliminate these non-user-generated files, reducing overall data volume.
  • Efficient Processing for Faster Turnaround: LAW processes large data sets efficiently, enabling legal teams to move to review with only the most relevant data.
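The culling steps in the list above can be sketched in a few lines of Python. This is a minimal illustration, not how CloudNine LAW is implemented: the `Doc` class, the `cull` function, and the sample data are hypothetical, and the small `known_system_hashes` set stands in for the much larger hash sets published in the NIST National Software Reference Library that DeNISTing relies on.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    path: str
    custodian: str
    modified: date
    content: bytes


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def cull(docs, known_system_hashes, start, end, custodians, keywords):
    """Apply DeNISTing, date/custodian/keyword filters, then deduplication."""
    seen = set()
    kept = []
    for doc in docs:
        digest = sha256(doc.content)
        if digest in known_system_hashes:        # DeNIST: drop known system files
            continue
        if not (start <= doc.modified <= end):   # date-range filter
            continue
        if doc.custodian not in custodians:      # custodian filter
            continue
        text = doc.content.decode("utf-8", errors="ignore").lower()
        if not any(k.lower() in text for k in keywords):  # keyword filter
            continue
        if digest in seen:                       # dedupe: keep the first copy only
            continue
        seen.add(digest)
        kept.append(doc)
    return kept


# Hypothetical collection, for illustration only.
docs = [
    Doc("a/contract.txt", "alice", date(2024, 3, 1), b"Draft contract for the merger"),
    Doc("b/contract_copy.txt", "bob", date(2024, 3, 2), b"Draft contract for the merger"),
    Doc("a/help.chm", "alice", date(2024, 3, 1), b"vendor help file"),
    Doc("a/old.txt", "alice", date(2019, 1, 1), b"Old contract notes"),
    Doc("c/lunch.txt", "carol", date(2024, 3, 3), b"contract lunch"),
    Doc("b/memo.txt", "bob", date(2024, 4, 1), b"Lunch menu"),
]
known_system_hashes = {sha256(b"vendor help file")}

kept = cull(docs, known_system_hashes, date(2024, 1, 1), date(2024, 12, 31),
            {"alice", "bob"}, ["contract", "merger"])
print([d.path for d in kept])
```

In this sketch deduplication runs after the filters, so a copy held by an in-scope custodian is not discarded just because an out-of-scope copy was seen first.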

The Financial Impact of Early Culling

By culling data early with CloudNine LAW, organizations can realize substantial cost savings. Fewer documents to review mean lower processing, hosting, and attorney review costs—potentially cutting review-related expenses by 30-50%. Additionally, streamlined processes allow legal teams to work faster while reducing billable hours and improving case timelines.

For CFOs, the equation is simple:

More data = higher costs.

Relevant data = Significant savings in cost, time, and resources.
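To make the equation concrete, here is a small worked calculation. Every input is a hypothetical placeholder chosen for round numbers (collection size, cull rate, review speed, hourly rate), not a CloudNine benchmark, and the `review_cost` helper is only a sketch of a linear cost model.

```python
def review_cost(docs: int, docs_per_hour: float, hourly_rate: float) -> float:
    """Linear review cost: hours needed multiplied by the hourly rate."""
    return docs / docs_per_hour * hourly_rate


# All figures below are hypothetical inputs for illustration.
collected = 1_000_000      # documents collected
cull_rate = 0.40           # share of documents removed before review
docs_per_hour = 50         # reviewer throughput
hourly_rate = 60.0         # cost per reviewer hour

remaining = round(collected * (1 - cull_rate))
before = review_cost(collected, docs_per_hour, hourly_rate)
after = review_cost(remaining, docs_per_hour, hourly_rate)
savings = before - after

print(f"Review cost without culling: ${before:,.0f}")
print(f"Review cost after culling:   ${after:,.0f}")
print(f"Savings: ${savings:,.0f} ({savings / before:.0%})")
```

Under a linear model like this one, the percentage saved on review simply tracks the percentage of documents culled, which is why volume is the lever that matters.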

A CFO’s Call to Action

If your legal team is not leveraging early-stage culling through tools like CloudNine LAW, you are likely overspending. Investing in proactive data reduction is one of the smartest financial decisions you can make in eDiscovery—because in the end, the best way to cut eDiscovery costs is to cut the irrelevant data first.

Are you ready to take control of eDiscovery costs? Let’s talk about how CloudNine LAW can help you save money while optimizing efficiency: Contact Us today.

CloudNine & United Litigation: Tackling Even the Trickiest eDiscovery Data

United Litigation is a fast-growing legal services firm founded 15 years ago, with national and global reach through its offices in Los Angeles, San Francisco, and Taiwan, offering 24/7 support. They pride themselves on having a stable of industry veterans on staff to handle eDiscovery and other litigation support services for their law firm and corporate clients.

For over 20 years, eDiscovery teams at United Litigation have relied on CloudNine and CloudNine LAW to provide fast eDiscovery processing for their clients, with the ability to handle 4,500+ file types and built-in OCR, scanning and printing capabilities, email threading, deduplication and more. Morgan Caparaz, President and CEO of United Litigation, also spoke to CloudNine LAW’s unique flexibility in accommodating general use cases as well as exceptions, which can make up to 90% of United Litigation’s work. “LAW handles the data exceptions for us,” Caparaz said. “We use it as is but also have access to the back end and can write custom automation scripts and databases that help us to handle exceptions. It’s the only solution around for us to use when data is suspect or bad. No matter the issue, I am confident we can handle it with LAW.”

Caparaz also spoke about CloudNine LAW’s Turbo module. Turbo is an ingestion engine designed to handle data with reduced infrastructure and to process data up to 25% faster. “Turbo is a sledgehammer when it comes to tackling large amounts of complex data but also allows us to get surgical on the data with the metadata. It’s this combination of large-scale processing and granular metadata analysis that sets it apart from other solutions. With LAW and its Turbo module, we gain tremendous efficiencies and can process terabytes of data every month with a lean team,” explained Caparaz.

CloudNine LAW is available on-premise. To learn more about how we can scale your team’s work, contact us today.

Reconstructed Text Message Chains And A Telling Voicemail Tips The Scales

When a partner of a commercial real estate company caught wind of fraudulent deals being penned by the other partners of the firm, he decided to take action. However, one of the partners involved realized that there might be trouble and quickly began a coordinated effort to destroy the digital trail of evidence. Text messages were removed from devices and cloud-based backups were destroyed.

The deletion of the digital evidence trail from multiple devices presented a significant hurdle to investigators attempting to piece conversations back together. The answer lay in reconstructing the fragmented message threads across devices. Investigators leveraged the power of active threading functionality to reconstruct conversations from multiple data sources ranging from backups to forensic images of each individual’s phone. Interestingly enough, it was a voicemail capturing the group’s intention to delete the incriminating data that became the key piece of evidence.
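The general idea of rebuilding a conversation from partial sources can be sketched as follows. This is a simplified illustration and not the active threading feature itself: the `reconstruct_threads` function, the message fields, and the sample messages are hypothetical. Messages found in more than one source are counted once, and a thread is defined here simply as the set of participants.

```python
from collections import defaultdict
from datetime import datetime


def reconstruct_threads(sources):
    """Merge messages from several sources, drop duplicates, and group them into threads."""
    seen = set()
    threads = defaultdict(list)
    for source_name, messages in sources.items():
        for msg in messages:
            key = (msg["sent"], msg["sender"], msg["body"])  # same message in two sources
            if key in seen:
                continue
            seen.add(key)
            participants = frozenset([msg["sender"], *msg["recipients"]])
            threads[participants].append({**msg, "source": source_name})
    for msgs in threads.values():
        msgs.sort(key=lambda m: m["sent"])  # chronological order within each thread
    return dict(threads)


def message(sent, sender, recipient, body):
    return {"sent": datetime.fromisoformat(sent), "sender": sender,
            "recipients": [recipient], "body": body}


# Hypothetical data: the phone image is missing one message that survives in a backup.
m1 = message("2024-02-01T09:00:00", "partner_a", "partner_b", "Is the deal signed?")
m2 = message("2024-02-01T09:05:00", "partner_b", "partner_a", "Yes. Clear your messages.")
m3 = message("2024-02-01T09:07:00", "partner_a", "partner_b", "Done.")

sources = {
    "phone_image": [m1, m3],
    "cloud_backup": [dict(m1), m2],
}

threads = reconstruct_threads(sources)
thread = threads[frozenset({"partner_a", "partner_b"})]
for m in thread:
    print(m["sent"].time(), m["sender"], m["body"], f"[{m['source']}]")
```

Keeping the source name on each message preserves where every piece of the rebuilt conversation came from.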

While the whistleblower possessed text messages and other communications surrounding the corruption, fraud, and dishonesty, other key messages were found through restoration of iCloud backups spanning laptops and two smartphones. The technology was able to reveal key information captured from a multitude of metadata sources, including:

  • EXIF Data from several key photos
  • Geolocations revealing travel patterns
  • Text Messages, WhatsApp and Facebook Messenger

Using active threading, counsel saved an incredible amount of time putting the story together with all of the relevant data. In the end, with the timeline functionality and its ability to support multiple disparate data types, counsel was able to clearly piece together the fraud as well as the attempt to hide the evidence.

Phones Analyzed For Message Content And Cell Tower Location Data Foil Criminals

Today’s criminals do not plan their crimes via email and Zoom, exchanging PowerPoint slides. And when it comes to solving today’s complex crimes, identifying technological data points that can prove innocence or guilt beyond a reasonable doubt is at the heart of every investigation. From Fitbits to burner phones, today’s criminals leave a trail of digital fingerprints that require modern investigative technology.

Mobile devices are often considered a window into one’s daily life. A cell phone’s location can be detected through cell site location information, often referred to as CSLI. This data, when available, can be quickly analyzed through technological means to place a person at or near specific locations. However, geolocation is often just one piece of a complex puzzle. When you send an SMS or MMS or place a call, your phone’s location is often recorded as it connects to a nearby cell tower. Dates and times are also recorded down to the second, including the moment a call or message was sent, as well as the service being engaged, such as SMS or 4G LTE.

Recently, one of our Partners needed to analyze and cross-reference messages received by one device in conjunction with cell tower data from two other devices subpoenaed as part of a criminal investigation. Mobile messages from the victim’s device were loaded into our platform and combined with the cell tower data captured from the suspects’ devices. Leveraging the Actor Matching Technology, each message could be mapped to its participants, showing who was engaged in conversations at specific times.

Investigators were trying to piece together as much evidence as possible to determine where each suspect was located in correlation to the crime scene. The Actor Matching Technology quickly linked actors to the phone calls, text messages, and the cell tower geolocations recorded. Timelines were then built for each matter actor to show where each suspect was located in correlation to the victim. Combining this evidence into a single investigative solution quickly revealed patterns of activity surrounding the night of the incident, helping investigators present a clear picture of the night in question.
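The workflow described above, linking identifiers to actors and then building a per-actor timeline, can be sketched like this. It is a minimal illustration, not the Actor Matching Technology itself: the `ACTORS` lookup, the phone numbers, the tower IDs, and the event fields are all hypothetical.

```python
from datetime import datetime

# Hypothetical mapping from phone numbers to named actors.
ACTORS = {"+15550001": "Victim", "+15550002": "Suspect A", "+15550003": "Suspect B"}


def build_timeline(messages, tower_pings):
    """Combine messages and cell tower records into one chronological event list."""
    events = []
    for m in messages:
        events.append({
            "time": m["time"],
            "actor": ACTORS.get(m["from"], m["from"]),  # fall back to the raw number
            "kind": "message",
            "detail": f"to {ACTORS.get(m['to'], m['to'])}",
        })
    for p in tower_pings:
        events.append({
            "time": p["time"],
            "actor": ACTORS.get(p["number"], p["number"]),
            "kind": "tower",
            "detail": p["tower_id"],
        })
    return sorted(events, key=lambda e: e["time"])


def timeline_for(events, actor):
    """Filter the combined timeline down to one actor."""
    return [e for e in events if e["actor"] == actor]


# Hypothetical records, for illustration only.
messages = [
    {"time": datetime(2024, 5, 1, 21, 15, 30), "from": "+15550002", "to": "+15550001"},
    {"time": datetime(2024, 5, 1, 21, 40, 0), "from": "+15550003", "to": "+15550001"},
]
tower_pings = [
    {"time": datetime(2024, 5, 1, 21, 14, 55), "number": "+15550002", "tower_id": "TWR-17"},
    {"time": datetime(2024, 5, 1, 21, 45, 10), "number": "+15550003", "tower_id": "TWR-17"},
    {"time": datetime(2024, 5, 1, 21, 30, 0), "number": "+15550002", "tower_id": "TWR-22"},
]

timeline = build_timeline(messages, tower_pings)
for e in timeline_for(timeline, "Suspect A"):
    print(e["time"].time(), e["kind"], e["detail"])
```

Once every record carries an actor name and a timestamp, comparing one suspect’s movements against the victim’s messages becomes a matter of filtering and sorting.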

A Ticking Time Clock Tells the Tale of Hours Worked

Wage and hour class actions require claims to be aligned specifically to the class. Much of the relevant data usually resides across a multitude of platforms, ranging from off-the-shelf software to proprietary timekeeping systems, that often prove challenging to align and fully understand.

A telemarketing firm faced a claim that several years of unpaid time was owed to employees because the timekeeping software used to register employees’ time was not properly recording the start and end of their workday. The claim alleged that the system did not synchronize time entries properly with the actual time each employee started their workday, taking anywhere from 15 to 30 minutes to register their initial punch-in time. There were also claims by the class that their workday often extended beyond the time they clocked out. Overall, the claim alleged the employees had been underpaid by an average of thirty minutes each day over a period of approximately three years and sought reparations for these unpaid hours.

Data available included exports from the timekeeping system, which reflected each employee’s time in and out for each day worked, as well as data from an internal ticketing system that leveraged email notifications and responses, reflecting when each user was actively resolving open requests. CloudNine technology was used to combine the timecard entries with each employee’s email activity over the period of the claim. The digital trail revealed that periods of work activity were in sync with the timekeeping system.
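A comparison of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the CloudNine workflow: the `activity_outside_shift` function, the five-minute tolerance, and the sample records are hypothetical. It flags email activity that falls before punch-in or after punch-out, which is the pattern the claim would predict.

```python
from datetime import datetime, timedelta


def activity_outside_shift(punch_in, punch_out, activity, tolerance=timedelta(minutes=5)):
    """Return activity timestamps before punch-in or after punch-out, beyond a tolerance."""
    return [t for t in activity if t < punch_in - tolerance or t > punch_out + tolerance]


def at(day, hhmm):
    """Build a datetime from 'YYYY-MM-DD' and 'HH:MM' strings."""
    return datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")


# Hypothetical timecard and email activity for one employee.
records = [
    {"day": "2024-03-04", "in": "09:00", "out": "17:00", "emails": ["09:02", "12:30", "16:58"]},
    {"day": "2024-03-05", "in": "09:00", "out": "17:00", "emails": ["08:40", "11:15", "17:20"]},
]

report = {}
for r in records:
    flagged = activity_outside_shift(
        at(r["day"], r["in"]),
        at(r["day"], r["out"]),
        [at(r["day"], e) for e in r["emails"]],
    )
    report[r["day"]] = [t.strftime("%H:%M") for t in flagged]

print(report)
```

A day with an empty list is a day where work activity lines up with the timecard; a day with entries is one worth a closer look.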

At its core, the CloudNine technology is designed to combine a multitude of disparate data types in a simple and intuitive fashion, allowing case teams to quickly identify differing activities and filter by the correlating individual or group. Sets of data can then be easily identified and time-lined to demonstrate clear patterns of work activity across a variety of differing data types.

eDiscovery Trends: New York Pilot Program Requires Joint Electronic Discovery Submission for Cases Involving ESI

On November 1, 2011, the Southern District of New York implemented a new Pilot Program for Complex Cases in "response to the federal bar's concerns about the high costs of litigating complex civil cases." The program is "designed to improve judicial case management of these disputes and reduce costs and delay" and will run for eighteen months.

Fourteen types of civil lawsuits are designated as "complex civil cases," including "stockholder's suits, patent and trademark claims, product liability disputes, multi-district litigation, and class actions." District court judges have the power to add or remove a case from the pilot, even if it does not fall in these categories.

Parties to complex cases must submit Exhibit B, Joint Electronic Discovery Submission, if they believe that relevant ESI potentially responsive to current or future discovery requests exists. In addition, parties must certify that "they are sufficiently knowledgeable in matters relating to their clients' technological systems to discuss competently issues relating to electronic discovery, or have involved someone competent to address these issues on their behalf." They must also meet and confer prior to the Rule 16 conference on preservation; methodologies for search and review; sources of ESI; limitations on the scope of production; form of production; managing privileged material, including inadvertent production, clawback and quick peek agreements, and Rule 502(d) orders; and the costs of production, cost-saving measures, and cost allocation.

So, what do you think?  Should more jurisdictions adopt such a program? Or should they wait until the results of this pilot are published?  Please share any comments you might have or if you’d like to know more about a particular topic.

Case Summary Source: Applied Discovery.  For eDiscovery news and best practices, check out the Applied Discovery Blog here.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

eDiscovery Best Practices: Judges’ Guide to Cost-Effective eDiscovery

Last week at LegalTech, I met Joe Howie at the blogger’s breakfast on Tuesday morning.  Joe is the founder of Howie Consulting and is the Director of Metrics Development and Communications for the eDiscovery Institute, which is a 501(c)(3) nonprofit research organization for eDiscovery.

eDiscovery Institute has just released a new publication that is a vendor-neutral guide for approaches to considerably reduce discovery costs for ESI.  The Judges’ Guide to Cost-Effective E-Discovery, co-written by Anne Kershaw (co-Founder and President of the eDiscovery Institute) and Joe Howie, also contains a foreword by the Hon. James C. Francis IV, Magistrate Judge for the Southern District of New York.  Joe gave me a copy of the guide, which I read during my flight back to Houston and found to be a terrific publication that details various mechanisms that can reduce the volume of ESI to review by up to 90 percent or more.  You can download the publication here (for personal review, not re-publication), and also read a summary article about it from Joe in InsideCounsel here.

Mechanisms for reducing costs covered in the Guide include:

  • DeNISTing: Excluding files known to be associated with commercial software, such as help files, templates, etc., as compiled by the National Institute of Standards and Technology, can eliminate a high number of files that will clearly not be responsive;
  • Duplicate Consolidation (aka “deduping”): Deduping across custodians, rather than only within each custodian, reduces costs by 38%, compared with 21% for within-custodian deduping;
  • Email Threading: The ability to review the entire email thread at once reduces costs by 36% compared with having to review each email in the thread;
  • Domain Name Analysis (aka Domain Categorization): As noted previously in eDiscoveryDaily, the ability to classify items based on the domain of the sender of the email can significantly reduce the collection to be reviewed by identifying emails from parties that are clearly not responsive to the case.  It can also be a great way to quickly identify some of the privileged emails;
  • Predictive Coding: As noted previously in eDiscoveryDaily, predictive coding is the use of machine learning technologies to categorize an entire collection of documents as responsive or non-responsive, based on human review of only a subset of the document collection. According to this report, “A recent survey showed that, on average, predictive coding reduced review costs by 45 percent, with several respondents reporting much higher savings in individual cases”.
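Two of the mechanisms in the list above, across-custodian deduping and domain name analysis, are easy to illustrate on a toy data set. The sketch below is not taken from the Guide: the email records, the hash labels, and the two domain lists are hypothetical, and the tiny sample is not meant to reproduce the percentages the Guide reports.

```python
from collections import Counter

# Hypothetical email records; "hash" stands in for a content hash of each message.
emails = [
    {"custodian": "alice", "sender": "news@dailydeals.example", "hash": "h1"},
    {"custodian": "bob", "sender": "news@dailydeals.example", "hash": "h1"},
    {"custodian": "alice", "sender": "jane@outsidecounsel.example", "hash": "h2"},
    {"custodian": "alice", "sender": "jane@outsidecounsel.example", "hash": "h2"},
    {"custodian": "bob", "sender": "cfo@acme.example", "hash": "h3"},
]

# Hypothetical domain lists a case team might build after reviewing the counts.
NON_RESPONSIVE_DOMAINS = {"dailydeals.example"}
PRIVILEGE_DOMAINS = {"outsidecounsel.example"}


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def categorize(email: dict) -> str:
    domain = domain_of(email["sender"])
    if domain in NON_RESPONSIVE_DOMAINS:
        return "likely non-responsive"
    if domain in PRIVILEGE_DOMAINS:
        return "potentially privileged"
    return "review"


# Within-custodian deduping keeps one copy per custodian; across-custodian keeps one overall.
within_custodian = len({(e["custodian"], e["hash"]) for e in emails})
across_custodian = len({e["hash"] for e in emails})

domain_counts = Counter(domain_of(e["sender"]) for e in emails)
categories = Counter(categorize(e) for e in emails)

print(f"{len(emails)} emails -> {within_custodian} within-custodian, {across_custodian} across-custodian")
print(domain_counts.most_common())
print(dict(categories))
```

Sorting senders by domain count is often the quickest way to spot bulk mailers worth excluding and law firm domains worth routing to privilege review.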

The publication also addresses concepts such as focused sampling, foreign language translation costs and searching audio records and tape backups.  It even addresses some of the most inefficient (and therefore, costly) practices of ESI processing and review, such as wholesale printing of ESI to paper for review (either in paper form or ultimately converted to TIFF or PDF), which is still more common than you might think.  Finally, it references some key rules of the ABA Model Rules of Professional Conduct to address the ethical duty of attorneys in effective management of ESI.  It’s a comprehensive publication that does a terrific job of explaining best practices for efficient discovery of ESI.

So, what do you think?  How many of these practices have been implemented by your organization?  Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Case Study: Term List Searching for Deadline Emergencies!

A few weeks ago, I was preparing to conduct a Friday morning training session for a client to show them how to use FirstPass™, powered by Venio FPR™, to conduct a first pass review of their data when I received a call from the client.  “We thought we were going to have a month to review this data, but because of a judge’s ruling in the case, we now have to start depo prep for two key custodians on Monday for depositions now scheduled next week”, said Megan Moore, attorney with Steele Sturm, PLLC, in Houston.  “We have to complete our review of their files this weekend.”

So, what do you do when you have to conduct both a first pass and final review of the data in a weekend?

It was determined that Steele Sturm had to complete first pass review that Friday, so that we could prepare the potentially responsive files for an attorney review starting Saturday morning.  Steele Sturm identified a list of responsive search terms and Trial Solutions worked with the attorneys to include variations of the terms (such as proximity searches and synonyms) to finalize a list of terms to apply to the data to identify potentially responsive files.  Because FirstPass provides the ability to import and search an entire term list at once, we were able to identify potentially responsive files in a simple, two-step process.  “Using FirstPass, Trial Solutions helped us cull out 75% of the collection as non-responsive, enabling our review team to focus review on the remaining 25%”, said Moore.
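For readers who have not run a term list before, the idea can be sketched as follows. This is not FirstPass syntax or code: the `a w/N b` proximity notation, the `compile_term` and `search` functions, and the sample documents are assumptions made for illustration. Here `w/N` is taken to mean that the two words appear in either order with at most N words between them.

```python
import re


def compile_term(term: str) -> re.Pattern:
    """Turn one term-list entry into a regex. Supports 'a w/N b' proximity and plain phrases."""
    m = re.fullmatch(r"(\w+)\s+w/(\d+)\s+(\w+)", term.strip(), flags=re.IGNORECASE)
    if m:
        a, n, b = re.escape(m.group(1)), int(m.group(2)), re.escape(m.group(3))
        gap = rf"(?:\W+\w+){{0,{n}}}\W+"  # up to n intervening words
        return re.compile(rf"\b{a}{gap}{b}\b|\b{b}{gap}{a}\b", re.IGNORECASE)
    return re.compile(rf"\b{re.escape(term.strip())}\b", re.IGNORECASE)


def search(docs: dict, terms: list) -> dict:
    """Run every term against every document; return the terms that hit, per document."""
    patterns = {t: compile_term(t) for t in terms}
    hits = {}
    for doc_id, text in docs.items():
        matched = [t for t, p in patterns.items() if p.search(text)]
        if matched:
            hits[doc_id] = matched
    return hits


# Hypothetical documents and term list, including a proximity search and a synonym.
docs = {
    "DOC-001": "The vendor was in breach of the contract as of June.",
    "DOC-002": "Lunch order for Friday.",
    "DOC-003": "Please forward the indemnification clause to counsel.",
    "DOC-004": "The breach happened long before anyone signed the final contract.",
}
terms = ["breach w/3 contract", "indemnification", "indemnity"]

hits = search(docs, terms)
for doc_id, matched in hits.items():
    print(doc_id, matched)
```

Documents with at least one hit move on as potentially responsive, and the per-term hit lists make it easier to spot overbroad terms later.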

Once the potentially responsive files were identified, they were imported into OnDemand™, powered by ImageDepot™, for linear attorney review.  During review, the attorneys identified that some of the terms used in identifying potentially responsive files were overbroad, so additional searches were performed in OnDemand to “group tag” those files as non-responsive.  “Trial Solutions provided training and support throughout the weekend to enable our review team to quickly ‘tag’ each file using OnDemand as to responsiveness and privilege to enable us to meet our deadline”, said Moore.

So, what do you think?  Do you have any “emergency” war stories to share?  Please share any comments you might have or if you’d like to know more about a particular topic.