Archiving101.com; in depth no nonsense information about archiving and related technologies.
30th September 2009

Why should you archive before you migrate

This is a question that has been asked many times. During an upgrade to Exchange 2007, or even Exchange 2010 in the future, all mailbox data within the legacy Exchange 2003 Server would typically be migrated into the new Exchange 2007 Server. Since a migration will copy old, infrequently accessed, or fixed/static email data, it may be a time-consuming and inefficient process, depending on the volume of inactive data. The excessive bulk of this data will also impact performance and capacity on the new servers, and increase backup and restore times.

Using an email archiving solution prior to a migration delivers some distinct advantages:

  • It identifies old or unnecessary email data so administrators can take informed, intelligent actions on whether to remove, archive, or migrate the data. This simplifies and streamlines migration to Exchange 2007.
  • It retains archived email data in a format that can be accessed in the future to comply with various regulatory requirements and for e-discovery purposes. This eliminates any need for support or updates to legacy Exchange 2003 systems.
  • It avoids having to re-architect the storage tied to the Exchange 2007 system. When the archiving system is installed after the migration, the Exchange 2007 storage will already have been sized to include the inactive data, which could contribute to 80% of the total storage. Deploying the archive beforehand ensures that the storage for the Exchange Servers is adequately sized.
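
To make the sizing point concrete, here is a minimal back-of-the-envelope sketch. The 2,000 GB total is a made-up figure, and the function name is purely illustrative; only the 80% share comes from the text above.

```python
def active_storage_gb(total_gb: float, inactive_fraction: float) -> float:
    """Storage still needed on the new servers once inactive data is archived."""
    if not 0.0 <= inactive_fraction <= 1.0:
        raise ValueError("inactive_fraction must be between 0 and 1")
    return total_gb * (1.0 - inactive_fraction)


# Hypothetical environment: 2,000 GB of mailbox data, 80% of it inactive.
total_gb = 2000.0
to_migrate = active_storage_gb(total_gb, 0.80)
print(f"Migrate {to_migrate:.0f} GB instead of {total_gb:.0f} GB")
```

With those assumed numbers, archiving first means sizing and migrating roughly a fifth of the original volume.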

After a migration with email archiving, Exchange 2007 servers will be much more efficient, making better use of data storage and performing better. The reduction of the storage footprint of the Exchange Server will also significantly improve backup and restore times. Email archiving provides ongoing benefits after the migration, such as mailbox size and quota management, so users no longer need to store data in PST files, and sophisticated search capabilities to enforce policies on inappropriate email content. While Exchange 2007 includes limited archiving in the form of Managed Folders and Journaling, these features do not provide a full solution for compliance or storage management.

posted in exchange 2010, exchange 2007, migration, storage | 0 Comments

8th October 2008

Google now offers 10 year retention on its archiving solution

Today Google announced that it has added a new option for archiving messages for up to ten years to its Google Message Discovery solution, which is based on its 2007 acquisition of email archiving services vendor Postini. This new option is available for a flat fee of $45 USD per user per year, which instantly raises the biggest question of all: isn’t Google pricing itself out of the market with this steep price?

Pricing is not the only obstacle: a recent Forrester Research report also attributed the relatively slow adoption of email archiving SaaS to network latency in accessing off-site archived messages and searching them for e-Discovery.

Google will continue to offer a one-year retention period for the existing fee of $25 per user per year. Both packages also include spam and virus filtering, policy management tools and, of course, search.

posted in SaaS, storage, vendor selection, competition, eDiscovery | 0 Comments

26th August 2008

GFI’s upcoming release

From the comment that was posted on the ‘Microsoft is against stubbing’ article, it seems that GFI is following the recommended path: it not only fixes a huge hole in their product by capturing data that is already in the mailbox, but also does it without stubbing.

Now if only the rest of the industry would start to move away from stubbing, the world would be a better place.

posted in storage, vendor selection, competition | 2 Comments

4th July 2008

Microsoft now has a negative stance against stubbing

Looks like people are finally listening and seeing that stubbing/shortcutting truly is ‘evil’ and that it has a negative impact on an Exchange environment. Quoting from:

http://technet.microsoft.com/en-us/library/cc671168(EXCHG.80).aspx

Third-party archiving solutions have become popular as corporate compliance requirements and mailbox quota management have gained importance. Many of these archiving solutions offer the ability to leave a small stub file in place of the archived message that can be used by end users to retrieve archived messages from the archival system. Some organizations use the stub file solution as a workaround to offering large mailboxes. One of the goals of stub archiving solutions is to reduce the aggregate mailbox and database size, thereby reducing recovery time objectives (RTOs). On the surface, this appears to be a good idea. However, stub-based archiving solutions have the following technical problems:

  • Server performance: Removing the message bodies and attachments from Exchange reduces the mailbox size, but it does not significantly change the server performance for users accessing Exchange via Outlook in online mode and Outlook Web Access. Item counts are the primary performance driver for the Exchange store, and not aggregate size. For example, server performance with a folder containing 100 KB of full e-mail message items is similar to a folder containing 100 KB of stub files.
  • Client complexity: Because the use of stub files with a third-party archiving solution requires the deployment and use of Outlook add-ins, a significant amount of time must be spent by administrators to deploy and manage these add-ins. Administrator time is also required to assist end users with technical difficulties using the add-ins. Not deploying stub files removes all of this additional administrative work that must be performed, thereby allowing more time to administrators and end users.

Today I raise a glass and toast …. for the non-believers .. time to change opinion. Archiving vendors need to change their ways to capture information.

posted in storage, competition | 18 Comments

9th June 2008

More on the max item count in your mailbox

Microsoft recently updated the MSDN article that talks about the maximum number of items you should have in your mailbox to maintain optimal performance. I’ve said it many times before, but I’m against stubbing/shortcutting/extending, or whatever name the archiving vendor may give to the technology, as it increases the risk of performance problems.

After all … it isn’t the size of the Exchange Database that dictates how well it performs .. it’s the number of items.

So I would say that this is recommended reading for all of you:

http://tinyurl.com/5o6vku

(shortened it since it is a crazy long MSFT link)

posted in storage | 2 Comments

6th May 2008

Mimosa Announces Next-Generation File System Archiving

SANTA CLARA, Calif.–(BUSINESS WIRE)–Mimosa Systems, a leader in live content archiving solutions, today unveiled Mimosa NearPoint File System Archiving (FSA), a powerful archiving option that enables users to retain, retrieve and recover critical, free-range files alongside millions of emails, attachments, instant messages, and content from backup tapes stored in an integrated content-aware archive. Built on the award-winning Mimosa NearPoint platform, this new offering provides Mimosa customers with the most advanced content archive that addresses critical requirements for storage optimization, end-user information access, eDiscovery, content monitoring, and recovery in a single solution.

The widespread growth of unmanaged, free-range electronic files, such as Microsoft® Office files, Adobe® PDF files, and others, creates a critical challenge for enterprises trying to control storage costs, maintain content according to their retention and deletion policies, and find and preserve these files in the face of stringent discovery requirements.

Mimosa NearPoint FSA moves files from premium production storage to lower cost archive storage. FSA utilizes advanced stubbing technology that coordinates the use of network and backup processes while preserving the look and feel of the original file that was stored on the production disk. End-users can access the files as they normally would, while administrators see immediate benefits of storage reduction and optimized primary storage performance.

Mimosa FSA allows enterprises to significantly reduce storage costs while improving storage management and the backup efficiency of file servers containing thousands of free-range files. Legal staff can now rapidly search both files and email with a single search query, and preserve these files in place, lowering the cost of collection, and facilitating a quick export to downstream review or analytics applications.

“Mimosa NearPoint FSA promises to deliver advanced information archiving for retention, electronic discovery, auditing and access, further aligning our IT systems with core operational requirements,” said Bruce Lorimer, Information Systems Supervisor for the County of Madera. “With the NearPoint archiving solution in place, we will be able to satisfy legal and operational demands for a more compliant and efficient organization.”

Within organizations, users typically store content on a network file system because it is easy to do. However, this storage strategy makes central management very difficult because file systems are not managed content repositories with check-in services, search indexes, and version control. File system content presents a major challenge to organizations for several reasons:

  • Production storage costs can soar out of control as users routinely store multiple copies of the same content.
  • Retention and disposition policies are difficult to enforce, negatively impacting the cost and time to comply with an eDiscovery request.
  • Sensitive information such as customer information, trade secrets and intellectual property that was saved to the file system before the organization had policies directing usage is difficult to locate.

“Mimosa’s new file system archive allows organizations to store more information at lower costs while addressing retention and legal preservation concerns,” said Brian Babineau, Senior Analyst, Enterprise Strategy Group. “NearPoint FSA allows employees to access files as they normally would because administrators can leave stubbed files on the file system providing them seamless access to their information. Compliance officers and records managers are ensured that information is properly retained and IT administrators can easily configure and operate the solution with enhanced management capabilities.”

Mimosa NearPoint FSA: The Industry’s Next-Generation File System Archive

With Mimosa NearPoint FSA, administrators can:

  • Manage production storage costs with policies that move files to lower cost archive storage devices based on attributes such as type, size, age, and last access date.
  • Consistently apply retention and disposition policies across files, emails, attachments, and instant messages from one administrative interface.
  • Maintain a seamless end-user experience with advanced placeholders that point to archived content while preserving the look and feel of the original files when browsing with Microsoft Windows Explorer®.
  • Lower archive storage costs with advanced de-duplication across multiple copies of files, files that are sent as attachments, and files contained on backup tapes.
  • Expedite eDiscovery by searching and preserving relevant files alongside email and instant messages using a single search and cull-down user interface.
  • Lower risk by creating alerts for content at rest on the file system that was created before a policy for sensitive information was implemented.
  • Enable faster, more granular file system recovery, and reduce the need to restore individual items from backup tapes.
  • Avoid backup issues typically associated with first-generation file system archiving products by coordinating with the backup process to optionally protect content behind the end-user stub.
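
As a rough illustration of what a policy based on type, size, age, and last access date might look like, here is a minimal sketch. It is not Mimosa's implementation; the class names, fields, and thresholds are all hypothetical, and it works on in-memory metadata rather than a real file system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    """Metadata for one file; all fields are illustrative."""
    path: str
    size_bytes: int
    modified: datetime
    last_access: datetime


@dataclass
class ArchivePolicy:
    """Hypothetical policy: every condition must hold for a file to be archived."""
    extensions: tuple      # lowercase suffixes, e.g. (".doc", ".pdf")
    min_size_bytes: int
    min_age_days: int      # time since last modification
    min_idle_days: int     # time since last access


def should_archive(info: FileInfo, policy: ArchivePolicy, now: datetime) -> bool:
    """Return True when the file matches the type, size, age, and idle rules."""
    type_ok = info.path.lower().endswith(policy.extensions)
    size_ok = info.size_bytes >= policy.min_size_bytes
    age_ok = now - info.modified >= timedelta(days=policy.min_age_days)
    idle_ok = now - info.last_access >= timedelta(days=policy.min_idle_days)
    return type_ok and size_ok and age_ok and idle_ok


now = datetime(2008, 5, 6)
policy = ArchivePolicy((".doc", ".pdf"), 1_000_000, 365, 180)
old_report = FileInfo("reports/2006-q1.PDF", 4_000_000,
                      datetime(2006, 4, 1), datetime(2007, 1, 15))
recent_doc = FileInfo("plans/roadmap.doc", 2_000_000,
                      datetime(2008, 4, 20), datetime(2008, 5, 1))
print(should_archive(old_report, policy, now))   # old, large, untouched file
print(should_archive(recent_doc, policy, now))   # recently modified file
```

Requiring all four conditions at once is just one possible design; a real product may well combine the attributes differently.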

“NearPoint File System Archiving gives enterprises a powerful archive to take control of distributed file stores to reduce storage costs and retain, retrieve, and recover critical information to meet stringent compliance and eDiscovery requirements,” said T.M. Ravi, CEO, Mimosa Systems. “With the introduction of this solution, Mimosa provides companies with the most comprehensive and advanced content archiving platform for email, files, and instant messages stored in a unified repository.”

posted in storage, vendor selection, competition, search | 0 Comments

9th April 2008

Interact 2008: Day 2

I’m glad to hear that Microsoft is taking a hard stance against stubbing/shortcutting of data within a mailbox. The obvious performance problems it causes are starting to hit home. Not only is end user performance affected by stubbing, it can also have a significant impact on larger environments leveraging Standby Continuous Replication (SCR). The problem lies in the fact that when a message is replaced with a shortcut, the original message is deleted and replaced with a stub. This change itself generates a transaction in the transaction log. So when you leverage SCR, these transaction logs have to be shipped to your remote site, and while you are trying to reduce the volume of information, stubbing actually increases the volume of transaction data that you have to ship over your WAN.

I’m glad to see in at least a few presentations Microsoft customers recommending AGAINST stubbing.  Applause!!

posted in storage, competition | 0 Comments

13th March 2008

How stubbing can hurt your performance on Exchange Server

To follow up on a previous post, The Death of Store Management: many archiving vendors claim to help the overall performance of Exchange by stubbing email messages and offering customers the option to reduce the footprint of Exchange databases, but the reality is that they actually hurt Exchange Server performance by doing this.

It comes down to the fact that the performance degradation with large mailboxes is really related to the number of items in a given folder and not the overall size of the items. As the number of items in a folder approaches around 5000, the size of the 11 default view folders increases to the point where they are no longer retained in the DB cache, which means that they must be generated on the fly each time you view the folder. Microsoft’s Knowledge Base has a good article on this: http://support.microsoft.com/kb/905803
For online mode clients this means increased I/O on the Exchange server, while for cached mode clients it means increased I/O on the client (when you’re in cached mode the read I/O doesn’t go away, it’s just shifted to the client).
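
A toy model makes the point: stubbing shrinks each item but leaves the item count untouched, so a folder that is over the item threshold stays over it. This is only a sketch; the 2 KB stub size and the folder contents are assumptions for illustration, and the 5000 figure is the approximate number mentioned above.

```python
from dataclasses import dataclass

ITEM_COUNT_THRESHOLD = 5000  # approximate figure from the paragraph above


@dataclass
class Folder:
    name: str
    item_sizes_kb: list  # one entry per item in the folder


def stub_all(folder: Folder, stub_kb: int = 2) -> Folder:
    """Simulate stubbing: every item shrinks to a small stub but remains an item."""
    return Folder(folder.name, [min(size, stub_kb) for size in folder.item_sizes_kb])


def over_threshold(folder: Folder) -> bool:
    """True when the item count, not the size, has reached the threshold."""
    return len(folder.item_sizes_kb) >= ITEM_COUNT_THRESHOLD


inbox = Folder("Inbox", [100] * 6000)   # hypothetical: 6,000 items of 100 KB each
stubbed = stub_all(inbox)

print(sum(inbox.item_sizes_kb), sum(stubbed.item_sizes_kb))  # total size shrinks
print(len(inbox.item_sizes_kb), len(stubbed.item_sizes_kb))  # item count does not
print(over_threshold(inbox), over_threshold(stubbed))        # same answer for both
```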

The overall size of the mailbox, combined with what Microsoft recommends as limits on store size, determines how many users you can put on a given store and the overall scalability (scale up) of a given mailbox server. You can always scale out (add more mailbox servers). Larger mailboxes generally mean more mailbox servers, more licenses, more storage, bigger backups, and higher costs/user for your messaging environment. However … as I wrote before .. a far better solution (and one that Microsoft would really like you to use) is NOT to stub. Go towards an age-based policy where you delete data from Exchange after a set period of time, as all of your information is already in the archive.
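
An age-based policy of this kind could be sketched as follows. The message structure and the 90-day window are hypothetical, and the check that a message is already archived is an extra safety assumption of mine rather than something a particular product is known to do.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    message_id: str
    received: datetime
    archived: bool  # True once a copy is safely in the archive


def expired_messages(messages, now: datetime, max_age_days: int):
    """Messages old enough to delete from the mailbox store.

    Only messages that are already archived are returned, so nothing is
    removed from Exchange before a copy exists elsewhere.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [m for m in messages if m.archived and m.received < cutoff]


now = datetime(2008, 3, 13)
mailbox = [
    Message("a", datetime(2007, 6, 1), archived=True),    # old and archived
    Message("b", datetime(2007, 6, 1), archived=False),   # old, not yet archived
    Message("c", datetime(2008, 3, 1), archived=True),    # still recent
]
to_delete = expired_messages(mailbox, now, max_age_days=90)
print([m.message_id for m in to_delete])
```

Unlike stubbing, deleting the expired messages actually reduces the item count in the folder.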

And to give an example … a previous post on PST consolidation describes a bad example as well that will hurt your performance.

posted in storage, competition | 6 Comments

21st February 2008

Companies can force archiving to go ‘underground’

I spent some time yesterday moderating a webinar with Mark Diamond from Contoural, and there was an interesting statement that really hits home for organizations that haven’t implemented a proper retention and disposition policy in their environment. End users are very predictable when it comes to email .. they all are packrats … big ones. Organizations that in the past enforced strict mailbox limits started to see people leverage PST files to store their data, something that for a while was actually encouraged.

Years later, these organizations realize that they have a few TB worth of PST files out there that are a legal time bomb waiting to explode. Without knowing what is in there, it is a nightmare scenario.

The other scenario is still in use today and reasonably popular: enforce a strict deletion policy on Exchange (e.g. delete everything after 90 days). What users often do in that case is, for instance, leverage Gmail as their private archive. They set up a mailbox rule to forward all of their emails to an outside account, either Gmail or maybe their private email account.

Basically, both of these options have resulted in underground archives that could be targeted in litigation, even though the data might be in a private mailbox. Forcing data archiving ‘underground’ is very dangerous, as all of a sudden the organization no longer has control over this data and doesn’t know what kind of data is out there. It isn’t that simple either, as the data could now be stored in many different locations, from PCs to iPods and more.

A proper data retention policy comes with a way to store and dispose of the data and not force it underground. In my opinion an archiving solution is going to help you with this as you allow the data to go somewhere under your rules.

posted in storage, search, eDiscovery | 0 Comments

30th January 2008

Old data is toxic waste?

I ran into this quote on the blog of Hu Yoshida, the CTO of Hitachi Data Systems, among his predictions for 2008:

“Data more than 60 days old on production systems will be considered toxic waste. Structured data such as databases and semi-structured data such as e-mail and document management data are increasing dramatically as they are required to hold more data, longer, for compliance reasons. This will call for new types of archiving systems that can scale to petabytes and provide the ability to search for content across different modalities of data.”

Interesting .. I never thought about calling it ‘toxic waste’ before.

posted in storage | 3 Comments