Infobelt
Archiving vs. Backup: Which Do You Really Need for Compliance?
Everyone who has worked in records management has seen it before: Organizations keeping their backup copies of production data “because it’s needed for compliance.” This, however, turns out to be a costly move…and one that does not really address data retention needs. What is really needed for data retention is a proper data archiving system.
Which prompts the question: What is the difference? Why is backup not suitable for compliance, and what is gained from investing in a true enterprise data archive?
Archiving vs. Backup: Two Different Missions
The short answer to the above is that archiving solutions and backup solutions were created with two different goals in mind:
- Backup makes a copy of data (both active and inactive) so that, should that data become damaged, corrupted, or missing, it can be recovered quickly.
- Archiving makes a copy of inactive or historical data so it can be stored in an unalterable, cost-effective way for legal or compliance reasons.
Backup is an important part of a business continuity plan. Should a piece of hardware fail, or a database become corrupted, it still will be possible to recover the necessary data to keep business operations going.
Maintaining a backup system can be costly, however. The data in the system needs to be updated often, and made easily recoverable, should a disaster happen. The space and cost required to do so can become quite large as an organization’s data grows.
Archiving stems from the realization that not all data an organization has is needed for daily operations—it is not production data. Examples include old forms, transaction records, old email communications, closed accounts, and other historical data. But while this data has no ongoing use, it has to be kept to comply with laws having to do with data retention.
It’s easy to see how the two might be confused—after all, both kinds of technology are, in essence, making a copy of the organization’s data.
But whenever you have two different goals or purposes for two different pieces of technology, you are going to have some important differences as well. If those differences are large enough, you won’t be able to simply swap one technology for the other. At least, not without some major problems.
First Major Difference: The Cost of Space
When a bit of data is stored, there is a cost associated with it. That’s true whether that data sits in the cloud, on an on-prem server, or on a tape drive in a closet somewhere.
Not all storage costs are equal. Take cloud providers like AWS, Microsoft (Azure), and Google, for example. These big players tier their storage offerings, basing the price on things like accessibility, security, and optimization for computations. “Hot storage” holds data that might be used day-to-day and needs to be optimized for computing, and so is considerably more expensive. “Cool” or “cold” storage is for data that is rarely used, and so does not need to be optimized or accessed quickly. Thus, it tends to be cheaper, sometimes by half or more.
The same goes for on-prem storage. Some data needs to be readily accessible, and so must be located on a server that is maintained and secured. There are many more options for data that does not need to be readily accessible, like historical data.
The longer an organization stays up and running, the greater its older, inactive historical data is in proportion to its active data. This is why archiving is important: It saves this inactive data in a much more cost-efficient way, freeing up the systems that traffic in active data (and freeing up storage budget).
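To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. The prices are made-up placeholders (2 cents per GB per month for hot storage, and half that for cold, in line with the "half or more" figure above), not quotes from any provider, and the function names are hypothetical.

```python
# Illustrative only: these prices are assumed placeholders, not real
# provider pricing. Values are in whole cents so the arithmetic stays exact.
HOT_CENTS_PER_GB_MONTH = 2   # assumed price of "hot" storage
COLD_CENTS_PER_GB_MONTH = 1  # assumed price of "cold" storage (half of hot)


def monthly_cost_cents(active_gb: int, inactive_gb: int, archive_inactive: bool) -> int:
    """Return the monthly storage bill in cents.

    If archive_inactive is False, all data sits in hot storage.
    If True, the inactive data is held in the cheaper cold tier instead.
    """
    inactive_price = COLD_CENTS_PER_GB_MONTH if archive_inactive else HOT_CENTS_PER_GB_MONTH
    return active_gb * HOT_CENTS_PER_GB_MONTH + inactive_gb * inactive_price


def monthly_savings_cents(active_gb: int, inactive_gb: int) -> int:
    """Return how much is saved per month by archiving the inactive data."""
    keep_everything_hot = monthly_cost_cents(active_gb, inactive_gb, archive_inactive=False)
    archive_the_rest = monthly_cost_cents(active_gb, inactive_gb, archive_inactive=True)
    return keep_everything_hot - archive_the_rest
```

Under these assumed prices, the savings scale with the amount of inactive data and nothing else, which is why the case for archiving gets stronger the longer an organization has been accumulating history.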
Second Major Difference: Immutability
An important part of compliance with data retention laws is keeping the data in an unaltered, and unalterable, state. This is where the idea of immutable storage comes into play. Immutable storage, such as a WORM (write once, read many) datastore, cannot be altered, even by an administrator. The data is, in a sense, “frozen in time.”
This is important for legal purposes. If data is needed for any reason, it is important to show that it has been stored in a way that resists any sort of tampering or altering. In short, immutability is built into most data archiving solutions, because immutability is important for the very tasks for which archives were engineered. The same might not always be true for data backups.
Another benefit of immutability: It provides built-in protection against ransomware attacks.
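For readers who like to see the idea in code, the write-once behavior can be sketched with a toy in-memory store. This is only an illustration of the concept: the class and method names are hypothetical, and a real WORM datastore enforces immutability in the storage layer itself, not in application code like this.

```python
import hashlib


class WormStore:
    """Toy write-once, read-many store (in memory, for illustration only)."""

    def __init__(self):
        self._records = {}  # record_id -> stored bytes
        self._digests = {}  # record_id -> SHA-256 digest taken at write time

    def write(self, record_id, data):
        """Store data under record_id once; any later write to that ID is refused."""
        if record_id in self._records:
            raise PermissionError(f"{record_id!r} is already written and cannot be altered")
        self._records[record_id] = bytes(data)
        digest = hashlib.sha256(data).hexdigest()
        self._digests[record_id] = digest
        return digest

    def read(self, record_id):
        """Reads are unrestricted: the record can be read as many times as needed."""
        return self._records[record_id]

    def verify(self, record_id):
        """Check that the stored bytes still match the digest recorded at write time."""
        current = hashlib.sha256(self._records[record_id]).hexdigest()
        return current == self._digests[record_id]
```

Note that the sketch deliberately offers no update or delete method. The digest recorded at write time is what lets someone later show that the record is unchanged.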
Third Major Difference: Logging and Tracking
Along with immutability comes the idea of logging or tracking who has accessed a particular bit of data. Having a log of who accessed which data, and when, leaves an important trail of breadcrumbs when it comes to audits, as well as data privacy incidents. Most backup systems do not need this level of logging and tracking; they usually carry just enough information to verify when backup or recovery has been run, and how successful it was. Archiving provides a much more granular level of detail.
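As a minimal sketch of what "who accessed which data, and when" looks like in practice, consider the append-only log below. The field names and the AccessLog class are assumptions made for illustration; they do not describe any particular archiving product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One entry in the trail: who did what to which record, and when."""
    user: str
    record_id: str
    action: str
    at: datetime


class AccessLog:
    """Append-only access log: events can be added and queried, never edited."""

    def __init__(self):
        self._events = []

    def record(self, user, record_id, action, at=None):
        """Append an event, timestamped in UTC unless a time is supplied."""
        event = AccessEvent(user, record_id, action, at or datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def history(self, record_id):
        """Return every event for one record, in the order it happened."""
        return [e for e in self._events if e.record_id == record_id]
```

During an audit or a privacy incident, a call like history("acct-42") would return the full list of people who touched that record. A typical backup job log, by contrast, only says whether the job ran and succeeded.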
Fourth Major Difference: Scheduled Destruction
Once data is no longer needed for compliance purposes, it should be destroyed. That way, it no longer takes up space, nor runs the risk of being compromised (which can be a data privacy issue).
Best-in-class archives, because they are focused on compliance needs, have such scheduled destruction built in. Backup systems usually do not, as that would be antithetical to their purpose of saving data. (At best, backup systems overwrite previous backups, and some let the user determine how many backup copies need to stay current.)
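Scheduled destruction boils down to a retention schedule plus a date check. The sketch below shows the idea under stated assumptions: the retention periods are hypothetical placeholders (real ones come from whichever regulations apply to the organization), and the record fields and function names are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by record category.
# Real values depend on the laws and policies that apply to each record type.
RETENTION_DAYS = {"email": 3 * 365, "transaction": 7 * 365}


@dataclass(frozen=True)
class ArchivedRecord:
    record_id: str
    category: str
    archived_on: date


def destroy_after(record):
    """Return the first date on which the record may be destroyed."""
    return record.archived_on + timedelta(days=RETENTION_DAYS[record.category])


def due_for_destruction(records, today):
    """Return the records whose retention period has ended as of today."""
    return [r for r in records if today >= destroy_after(r)]
```

An archive would run a check like due_for_destruction on a schedule and destroy whatever it returns. A backup system has no equivalent notion of a per-record expiry date, since its job is to keep copies, not retire them.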
Archiving and Backup: Which Does Your Organization Need? (And How Do You Know?)
Really, most enterprise-sized organizations need both. Business continuity plans need to include solutions for backup.
But those solutions make for a very costly, and mostly inadequate, archiving solution for compliance purposes. Different technology is needed for this.
So, if your organization is discussing disaster recovery and prioritizing things like speed to get up and running again with your production data intact, it’s best to explore a backup solution.
But if you are looking to retain records or other data for compliance purposes, invest in a data archive.
Barry Burke, storage guru and CTO of Dell EMC for years, has a great way of conceptualizing the difference between the two technologies, looking not at what is done, but what the intent behind the action is:
In explaining the word “archive” we came up with two separate Japanese words. One was “katazukeru,” and the other was “shimau”…Both words mean “to put away,” but the motivation that drives this activity changes the word usage. The first reason, katazukeru, is because the table is important; you need the table to be empty or less cluttered to use it for something else, perhaps play a card game, work on arts and crafts, or pay your bills. The second reason, shimau, is because the plates are important; perhaps they are your best tableware, used only for holidays or special occasions, and you don't want to risk having them broken.
If plates are data and the table is your production storage system, then backup is shimau: The data is important to save, even at a high cost. Archiving is katazukeru: It’s the system itself that must be cleared so you can get on with the next activity…but, of course, you still want and need to save the plates.
Interested in what an archiving solution can do for your organization above and beyond backup? Take a look at our Omni Archive Manager, or reach out to talk to one of our specialists.