eDiscovery Best Practices: Data Mapping Doesn’t Have to be Complicated

November 18, 2011

Some time ago, we talked about the importance of preparing a data map of your organization’s data to be ready when litigation strikes.

Back then, we talked about four steps to create and maintain an effective data map, including:

Obtain early “buy-in” from various departments throughout the organization;
Document and educate to develop logical and comprehensive practices for managing data;
Communicate regularly so that new data stores (or changes to existing ones) can be addressed as they occur;
Update periodically to keep up with changes in technology that create new data sources.

The data map itself doesn’t have to be complicated. It can be as simple as a spreadsheet (or series of spreadsheets, one for each department or custodian, depending on what level of information is likely to be requested). Here are examples of types of information that you might see in a typical data map spreadsheet:

Type of Data: Prepare a list and continue to add to it to ensure that all of the types of data are considered. These can include email, work product documents, voice mail, databases, web site content, social media content, hard copy documents, and any other type of data in use within your organization.
Department/Custodian: A data map is no good unless you identify the department or custodian responsible for the data. Some of these may be kept by IT (e.g., Exchange servers for the entire organization) while others could be down to the individual level (e.g., Access databases kept on an individual’s laptop).
Storage Classification: The method(s) by which the data is stored by the department or custodian is important to track. You’ll typically have Online, Nearline, Offline and Inaccessible Data. A type of data can apply to multiple or even all storage classifications. For example, email can be stored Online in Exchange servers, Nearline in an email archiving system, Offline in backup tapes and Inaccessible in a legacy format. Therefore, you’ll need a column in your spreadsheet for each storage classification.
Retention Policy: Track the normal retention policy for each type of data stored by each department or custodian (e.g., retain email for 5 years). While a spreadsheet won’t automatically identify when specific data is “expired”, a regular process of looking for data older than the retention time period will enable your organization to purge “expired” data.
Litigation Hold Applied: Data past its retention period should not be purged if it is subject to an active litigation hold. For data under hold, you’ll want to identify the case(s) for which the hold is applied and be prepared to remove those cases from the list once the hold obligation is released. If all holds are released on normally “expired” data and no additional hold obligations are expected, that may be the opportunity to purge that data.
Last Update Date: It’s always a good idea to keep track of when the information in the data map was last updated. If it’s been a while since that last update, it might be time to coordinate with that department or custodian to bring their portion of the data map current.
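To make the interplay between retention policy, litigation holds and the last update date concrete, here is a minimal Python sketch. It is only an illustration: the `DataMapEntry` class, its field names, the `purge_cutoff` and `needs_refresh` helpers and the 180-day refresh threshold are all hypothetical choices made for this example, not part of any particular product or standard.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class DataMapEntry:
    """One row of a data map (hypothetical field names)."""
    data_type: str
    custodian: str
    retention_years: int
    last_updated: date
    # Case names for which a litigation hold is currently applied.
    litigation_holds: list = field(default_factory=list)


def purge_cutoff(entry, today):
    """Return the date before which data is past its retention period,
    or None if any litigation hold is active (nothing may be purged)."""
    if entry.litigation_holds:
        return None
    target_year = today.year - entry.retention_years
    try:
        return today.replace(year=target_year)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=target_year, day=28)


def needs_refresh(entry, today, max_age_days=180):
    """True if this row has not been updated within max_age_days
    (the 180-day default is an assumption for illustration)."""
    return (today - entry.last_updated) > timedelta(days=max_age_days)


today = date(2011, 11, 18)
email = DataMapEntry("Email", "IT", retention_years=5,
                     last_updated=date(2011, 1, 1))
held = DataMapEntry("Email", "IT", retention_years=5,
                    last_updated=date(2011, 11, 1),
                    litigation_holds=["Example Case A"])

print(purge_cutoff(email, today))   # 2006-11-18
print(purge_cutoff(held, today))    # None
print(needs_refresh(email, today))  # True
```

The point of returning `None` when a hold is active is that the hold overrides the retention policy entirely: the cutoff only becomes meaningful again once every case has been removed from the list.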

As you can see, a fairly simple nine- or ten-column spreadsheet might be all you need to start gathering information about the data stores in your organization.
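As a rough starting point, the spreadsheet described above could be generated as a CSV file with a few lines of Python. The column names below are simply one way to label the categories discussed in this post (with one column per storage classification), and the sample row reuses the email example; your own labels and values will differ.

```python
import csv
import io

# One column per item discussed above; the four storage
# classifications each get their own column (nine columns total).
COLUMNS = [
    "Type of Data",
    "Department/Custodian",
    "Online",
    "Nearline",
    "Offline",
    "Inaccessible",
    "Retention Policy",
    "Litigation Hold Applied",
    "Last Update Date",
]


def write_data_map(rows):
    """Return the data map as CSV text, header row first."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative row based on the email example in this post.
example_row = {
    "Type of Data": "Email",
    "Department/Custodian": "IT",
    "Online": "Exchange servers",
    "Nearline": "Email archiving system",
    "Offline": "Backup tapes",
    "Inaccessible": "Legacy format",
    "Retention Policy": "5 years",
    "Litigation Hold Applied": "",
    "Last Update Date": "2011-11-18",
}

csv_text = write_data_map([example_row])
print(csv_text)
```

The resulting text can be saved with a `.csv` extension and opened in any spreadsheet program, with one file per department or custodian if that level of detail is needed.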

So, what do you think? Has your organization implemented a data mapping program? If not, why not? Please share any comments you might have, or let us know if you’d like to know more about a particular topic.

eDiscovery Daily Blog