Both developers and DBAs know that MySQL backups are critical for restoring data in case of a loss: backups should be taken and tested frequently. However, there might be cases where some MySQL backups will be required to be kept for a longer time – that’s where MySQL backup archiving can help.
What is MySQL Backup Archiving?
Simply put, MySQL backup archiving ensures that older backup versions can be retained as needed. MySQL backup archiving differs from MySQL backups because a backup is a copy of data that is taken so it can be used in case of data loss or data corruption Backup archiving, on the other hand, refers to the actions that keep a log of MySQL backups so that older copies of the backups can be restored when needed. Before archiving your backups, identify the data about to be archived, determine whether you need to access the data frequently or not, and also ensure that all applicable regulations are being followed.
Making use of Percona Toolkit
Percona Toolkit’s pt-archiver allows you to archive rows from a MySQL table into another table or a file. To use the tool, you first have to install it:
- Navigate to the Percona Toolkit’s download page;
- Choose what version of the tool you want to install and on what operating system;
- Install the tool by running the following command (e.g. For RedHat-Based OS, replace the percona-toolkit-3.2.1-1.el7.x86_64.rpm with the file you chose):
$ rpm -ivh percona-toolkit-3.2.1-1.el7.x86_64.rpm
- Finally, check if the installation is successful: the command pt-archiver –help should provide you with the help information. If it does, you’ve installed the tool successfully.
Here are a few sample commands (here the –h means the host, the
--D option is the name of your database, and the
--t option is your table). The records will be archived in a file specified in the
--file option. The –where option specifies the WHERE clause, the
--limit option specifies the LIMIT clause, and the
--commit-each commits each set of fetched and archived rows):
- To archive a set of records to a file:
$ pt-archiver --source --h=localhost, --D=database,--t=table --file ’/home/user/%Y-%m-%d-backup’ --where “column=’value’” --limit-5000
- To archive a set of records from one server (oltp_server) to another server (olap_server):
$ pt-archiver --source h=oltp_server, D=database, t=table --dest h=olap_server --file '/backups/%Y-%m-%d-%D.%t' --where "1=1" --limit 1000 --commit-each
Replace the database and table with the names of your database and table.
You might also want to add the
--no-delete option to tell the tool to not delete archived rows.
If you want to see queries the pt-archive tool was running, turn on the general log before running pt-archiver:
SET GLOBAL general_log = ON;
Then, after the backup process is complete, open the file set in the general_log_file parameter in
Note: Keep in mind that this SET GLOBAL command will log all the database activity, so, depending on the traffic, it could affect your database performance.
The ARCHIVE Storage Engine
The ARCHIVE storage engine, judging by the name, should be one of the options too, right? The engine was first introduced in MySQL 4.1.3 – its main advantages include:
- The engine produces special-purpose tables that store large amounts of data not taking up a lot of space on the disk;
- The engine supports compressed data – it uses zlib for data compression.
With that being said, the engine also has disadvantages:
- The engine does not support B-tree indexes, clustered indexes, full-text search indexes, or foreign keys;
- Geographic spatial reference systems are not supported;
- The engine is not ACID compliant.
Also, note that data cannot be deleted from this table – it can be only added. As the engine does not support indexes, so it’s not a good choice for data retrieval if you have a lot of rows.
Archiving Backups with ClusterControl
ClusterControl allows you to set a backup policy and upload data off-site to your desired cloud storage location. The tool supports AWS, GCP, and Microsoft Azure. To archive your backups using ClusterControl, follow the following steps:
- Create a backup using ClusterControl setting the backup method, backup host, if needed, select only the databases and tables you want to archive, and check “Upload Backup to the cloud”. Click “Continue”:
- Set the backup settings, click “Continue”:
After that, define the cloud settings.
- Now, ensure that you have setup lifecycle management and storage class transitions for your selected bucket – transitions should be transitioned to Amazon S3 Glacier because it is designed to archive data that will be rarely accessed.
MySQL data can be archived when the ARCHIVE engine is in use, using Percona’s pt-archiver, or using ClusterControl by Severalnines. As always, research the tools you are going to use, know their benefits and disadvantages, and choose wisely.