Backup and OpenStack

2016-06-30

Watching OpenStack as an emerging platform, it took me some time to sort out the various offerings for backup. If you are trying to sort out the different offerings yourself, hopefully you can use the following as something of a guide.

(This is a bit of a background piece, for somewhere I want to go later.)

Before we can talk about backups and the cloud, we need to cover the general case of backup.

Backup storage by kind

Factored by present technology choices, there are three primary forms of backup:

1. Backups in primary storage.
2. Backups in storage specialized for backup.
3. Backup to the cloud.

There are also hybrids of the above three variants.

For (1), your highest-performance and highest-cost choice is to keep backups in primary storage. High-end storage uses references and hashes to store volumes, not bare indices to raw blocks. That level of abstraction on the storage side means creating a read-only clone of an entire volume (as needed for backup) can be a sub-second operation. Restores can also be sub-second. If you need (and can afford) the highest levels of backup/restore performance, keep your backups in smart primary storage.
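To make the "references and hashes" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular storage product works: the names BlockStore and Volume are hypothetical, and a real array would share a copy-on-write tree rather than copy the whole map. The point is only that a snapshot copies references, never block data.

```python
import hashlib


class BlockStore:
    """Hypothetical content-addressed store: one copy per unique block."""

    def __init__(self):
        self.blocks = {}  # sha256 hex digest -> block bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical blocks hash to the same key, so they are stored once.
        self.blocks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self.blocks[digest]


class Volume:
    """Hypothetical volume: a map from block index to block hash."""

    def __init__(self, store, block_map=None):
        self.store = store
        self.block_map = dict(block_map or {})

    def write(self, index: int, data: bytes) -> None:
        # A write only repoints this volume's reference for that index.
        self.block_map[index] = self.store.put(data)

    def read(self, index: int) -> bytes:
        return self.store.get(self.block_map[index])

    def snapshot(self) -> "Volume":
        # Copies the small map of hashes (metadata), not the blocks.
        # Treat the result as read-only by convention.
        return Volume(self.store, self.block_map)


store = BlockStore()
vol = Volume(store)
vol.write(0, b"alpha")
vol.write(1, b"alpha")   # duplicate content, stored once
snap = vol.snapshot()    # the "backup"
vol.write(0, b"beta")    # later change to the live volume
```

After the last write, the snapshot still reads the old block at index 0 while the live volume reads the new one, and the store holds only two unique blocks. A restore is the same operation in reverse: point the live volume at the snapshot's map.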

There are problems with this approach.

The first is cost. Your smart/primary storage is expensive. In reality, almost all the data written into backups will never be read. Even with deduplication (we are assuming smart storage), the larger part of your expensive storage is occupied by data that will never be used. If you need - and can afford - the highest performance levels, this is what you will use.

The second is fragility. Failure of primary storage means you lose not just your current data, but also all your backups. For most folk, this is not acceptable, so you need a hybrid solution, with some form of replication.

Note also that if you are not using "smart" storage, performance with this approach is poor.

For (2) the vendor can take advantage of the fact most backups are written, but never read.

Some form of deduplication is a staple. The algorithms used for both ingest and expiration are critical. Performance is less than (1), and cost is less as well.
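As a rough illustration of why ingest and expiration go together, the following Python sketch deduplicates fixed-size blocks on ingest and uses reference counts to decide what expiration can free. Everything here is an assumption made for the example: the class name DedupBackupStore, the fixed block size, and reference counting itself. Real products typically use variable-size chunking and more elaborate garbage collection.

```python
import hashlib
from collections import Counter


class DedupBackupStore:
    """Hypothetical backup store with block-level deduplication."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}            # digest -> block bytes
        self.refcount = Counter()   # digest -> references held by backups
        self.backups = {}           # backup name -> ordered list of digests

    def ingest(self, name: str, data: bytes) -> None:
        if name in self.backups:
            raise ValueError(f"backup {name!r} already exists")
        digests = []
        for offset in range(0, len(data), self.block_size):
            chunk = data[offset:offset + self.block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # store new blocks only
            self.refcount[digest] += 1
            digests.append(digest)
        self.backups[name] = digests

    def restore(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.backups[name])

    def expire(self, name: str) -> None:
        # Free a block only when no remaining backup references it.
        for digest in self.backups.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                del self.refcount[digest]
                del self.blocks[digest]


store = DedupBackupStore(block_size=4)
store.ingest("monday", b"aaaabbbb")
store.ingest("tuesday", b"aaaacccc")   # shares the "aaaa" block
blocks_after_ingest = len(store.blocks)
store.expire("monday")                 # frees "bbbb", keeps shared "aaaa"
```

Expiring Monday's backup must not damage Tuesday's, which is why expiration is as critical as ingest: a cheap ingest path with a careless expiration path either leaks space or loses data.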

For (3) the cost is lowest, as performance is lowest.

Performance is much lower than with (1) or (2) (or at least it should be). As the data is off-site, this approach is more robust in the disaster-recovery case.

There is, of course, an entire assortment of hybrid solutions.

Quality of implementation

Whatever the approach, the algorithms chosen and the quality of the implementation can either meet the optimum, or not. As backups move a lot of data, these algorithms matter.

Replication

If all your backups are in one place, and something bad happens to that place, you could lose all your backups. Most likely, you want to replicate all or some of your backups to a distinct place. Again, algorithms matter.
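One example of an algorithm that matters here: if backup blocks are identified by hash, replication only needs to ship the blocks the remote site does not already hold. The Python sketch below assumes both sites are modelled as plain dictionaries keyed by SHA-256 digest; the function names are hypothetical, and a real replicator would also have to handle transport, verification, and failure midway.

```python
import hashlib


def block_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def replicate(source: dict, replica: dict) -> int:
    """Copy blocks missing from the replica; return how many were sent."""
    missing = source.keys() - replica.keys()
    for digest in missing:
        replica[digest] = source[digest]
    return len(missing)


primary = {block_digest(b): b for b in (b"one", b"two", b"three")}
offsite = {block_digest(b"one"): b"one"}   # already holds one block

first_pass = replicate(primary, offsite)   # ships only the two missing blocks
second_pass = replicate(primary, offsite)  # nothing left to ship
```

The second pass sends nothing, which is the property you want when the link to the second site is slow or expensive.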

Cloud backup in general

When cloud backup is mentioned, there are quite distinct cases.

1. Backup from on-premise to the cloud.
2. Backup of applications in the cloud to on-premise.
3. Backup of applications in the cloud, within the cloud.

You could buy distinct solutions for each of the above cases. That does not sound like fun.

So the interesting question becomes: Can you design backup that works for all three cases?

Note that (3) in particular constrains the solution.

Cloud backup in OpenStack

When backup is mentioned in OpenStack forums, there are distinct cases.

1. Backup of the cloud infrastructure.
2. Backup of the bulk data in the cloud.
3. Backup of tenant applications and data within the cloud.

You hear (1) and (2) rather a lot from folk who are deploying OpenStack. Backing up your cloud infrastructure (1) is critical and modest in scope. Conventional approaches to backup for Linux applications work well.

Bulk data backup (2) is simple, but not very useful. If you have any active tenant applications, then the tenant data captured this way is very likely in an inconsistent state.

The more interesting case is (3). Tenants need an on-demand and scheduled service to back up their applications and data in the cloud.