eDiscovery Daily Blog

Here’s a New Dataset Option, Thanks to EDRM: eDiscovery Trends

For several years, the Enron data set (converted to Outlook by the EDRM Data Set team back in November of 2010) has been the only viable set of public domain data available for testing and demonstration of eDiscovery processing and review applications.  Chances are, if you’ve seen a demo of an eDiscovery application in the last few years, it was using Enron data.  Now, the EDRM Data Set team has begun to offer some new dataset options.

Yesterday, EDRM announced the release of the first of its “Micro Datasets.”  As noted in the announcement, the datasets are designed for eDiscovery data testing and process validation. Software vendors, litigation support organizations, law firms and others may use these smaller sets to qualify support, test speed and accuracy in indexing and search, and conduct more forensically oriented analytics exercises throughout the eDiscovery workflow.

The initial offering is a 136.9 MB zip file containing the latest versions of everything from Microsoft Office and Adobe Acrobat files to image files, along with EDRM-specific work product files and data from public websites. There are even some uncommon formats, including .mbox email storage files and .gz archive files!  The EDRM Dataset group has scoured the internet and found usable, freely available data at universities, government sites and elsewhere, a selection of which is included in the zip file.
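
If you want a quick look at which file types are in the zip file before loading it into a processing tool, a short script can tally the files by extension. The following is a minimal sketch in Python using only the standard library; the function name `count_extensions` and the archive path passed on the command line are illustrative assumptions, not anything provided by EDRM.

```python
import sys
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_extensions(zip_path):
    """Return a Counter of lowercase file extensions found in the archive.

    count_extensions is a hypothetical helper name used for illustration.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directory entries; only count actual files.
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(no extension)"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # The archive path is supplied by the user; no file name is assumed here.
    for ext, n in count_extensions(sys.argv[1]).most_common():
        print(f"{ext}\t{n}")
```

Run it with the path to the downloaded zip file as the only argument. Note that it counts only the top-level entries of the zip file, so files nested inside the .gz archives or .mbox containers are counted as single items and would need to be expanded separately.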

The first EDRM Micro Dataset zip file is available now for download here.  Although this first set is small, EDRM has promised “advanced” data sets to come.  Those advanced data sets, to be released in the near future, will be available exclusively to EDRM members, who will be notified by email with download instructions.  Organizations interested in EDRM membership will find information at https://www.edrm.net/join/.  Now, there is more reason than ever to join!

So, what do you think?  Are you tired of using the Enron data set and looking forward to alternatives?  If so, today is your lucky day!  Please share any comments you might have, or let us know if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine. eDiscovery Daily is made available by CloudNine solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscovery Daily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.
