Archiving101.com; in depth no nonsense information about archiving and related technologies.
11th January 2010

EDRM data set project

Posted in Categorization, eDiscovery


This was made available earlier this week. The EDRM Data Set project (http://edrm.net/activities/projects/data-set) offers three different data sets for free:

 

  • The EDRM Data Set Enron PST files: Enron e-mail messages and attachments organized in 32 zipped files, each less than 700 MB in size, containing 168 .pst files.
  • The EDRM File Formats Data Set: 381 files covering 200 file formats.
  • The EDRM Internationalization Data Set: A snapshot of selected Ubuntu localization mailing list archives covering 23 languages in 724 MB of email.

 

This might be useful for anyone who would like to test discovery against real-world content instead of manually created data.
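One way to sanity-check a download before running any discovery tests is to tally the file types inside the zipped archives, for example to confirm that the Enron set really yields 168 .pst files. The following is only a minimal sketch using Python's standard library; the folder name `edrm-enron-pst` and the function name `count_extensions` are assumptions for illustration, not something supplied by the EDRM project.

```python
import zipfile
from collections import Counter
from pathlib import Path


def count_extensions(archive_dir):
    """Tally file extensions across every .zip archive in archive_dir."""
    counts = Counter()
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue  # skip directory entries inside the archive
                # Normalise case so ".PST" and ".pst" are counted together.
                suffix = Path(info.filename).suffix.lower() or "(none)"
                counts[suffix] += 1
    return counts


if __name__ == "__main__":
    # "edrm-enron-pst" is a hypothetical local folder holding the downloaded zips.
    for ext, n in count_extensions("edrm-enron-pst").most_common():
        print(f"{ext}\t{n}")
```

The sketch reads only the archive listings, so nothing is extracted to disk; that keeps the check fast even when each zip is several hundred MB.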
