As Zeebe fully manages the state of your process instances, consider taking backups of Zeebe data; this is crucial to prevent data loss, roll back application-level errors, and more.
Zeebe is fault-tolerant and replicates state internally. Backups are only necessary if you'd like to protect against the loss of entire replica sets or data corruption bugs.
The state of other components, such as Operate and Tasklist, is not managed by Zeebe and must be backed up separately.
Taking backups is a manual process that is highly dependent on your infrastructure and deployment. Camunda does not provide an automated backup mechanism or tool. However, we do offer the following guidance to create and execute a successful backup.
Cold backups, also called offline backups, require downtime.
During the downtime, processes don't make progress and clients can't communicate with Zeebe. To make sure that the downtime doesn't cause issues for your clients, you should test how your clients behave during the downtime, or shut them down as well.
Shutting down all brokers in the cluster
To take a consistent backup, all brokers must be shut down first.
As soon as brokers shut down, partitions become unhealthy and clients lose connections to Zeebe or experience full backpressure. To prevent unnecessary failovers during the shutdown process, we recommend shutting down all brokers at the same time instead of a gradual shutdown.
Wait for all brokers to fully shut down before proceeding to the next step.
Creating the backup
The data folder contains symbolic and hard links, which may require special attention when copying, depending on your environment.
To create the backup, take the following steps:
- Each broker has a data folder where all state is persisted. The location of the data folder is configured via zeebe.broker.data.directory. Create a copy of the data folder and store it in a safe location.
If you have direct access to the broker, for example in a bare-metal setup, you can do this by creating a tarball like this:
tar caf backup.tar.gz data/
You may also use filesystem snapshots or Kubernetes volume snapshots if that fits your environment better.
- Double-check that your tool of choice supports symbolic and hard links.
- Do not merge or otherwise modify data folders as this might result in data loss and unrestorable backups.
- Save the broker configuration to ensure the replacement cluster can process the backed-up data.
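The steps above can be sketched as a small shell function for a bare-metal setup. This is a minimal illustration, not an official tool: the function name backup_broker, its arguments, and the archive name data.tar.gz are assumptions made for this example, and it assumes the broker is already fully shut down.

```shell
#!/bin/sh
# Sketch only: back up one broker that has already been shut down.
# Function name, arguments, and file names are illustrative assumptions.
set -eu

# Usage: backup_broker DATA_DIR CONFIG_FILE DEST_DIR
backup_broker() {
  data_dir=$1      # the folder configured via zeebe.broker.data.directory
  config_file=$2   # the broker configuration file to save with the data
  dest_dir=$3      # where this broker's backup is stored

  mkdir -p "$dest_dir"

  # Archive the whole data folder unmodified. tar stores symbolic links as
  # links (it does not follow them) and records hard links between files
  # inside the archive.
  tar -czf "$dest_dir/data.tar.gz" \
    -C "$(dirname "$data_dir")" "$(basename "$data_dir")"

  # Save the configuration next to the archive.
  cp "$config_file" "$dest_dir/"
}
```

Calling the function once per broker, each with its own destination folder, keeps the data folders separate, so they are never merged. The paths you pass in depend entirely on your deployment.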
See the following example on how a backup may look:
$ tree zeebe-backup-*
After taking the backup, brokers can be started again and will automatically resume processing.
Restore from backup
Prepare replacement cluster
Always use the same version of Zeebe that you were using when taking the backup, or the next minor version. Using a different version may result in data corruption or data loss. See the update guide for more details.
Ensure your replacement cluster has the same number of brokers as the old cluster and uses the same node IDs.
Shutting down all brokers in the replacement cluster
Before installing the backup, ensure all brokers are fully shut down.
Installing the backup
To install the backup, take the following steps:
- Delete the existing data folder on each broker of your replacement cluster.
- For each broker, copy over the configuration and the data folder.
- You may need to slightly adjust the configuration for your replacement cluster, for example to update IP addresses.
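As a rough sketch, the installation steps for a single broker might look like the following shell function. The function name restore_broker and its arguments are assumptions made for this example. It also assumes the backup is a gzip-compressed tarball with one top-level folder named like the target data folder, as produced by the tar command shown earlier, and that the broker is fully shut down.

```shell
#!/bin/sh
# Sketch only: install a backup on one broker that is fully shut down.
# Function name and arguments are illustrative assumptions.
set -eu

# Usage: restore_broker ARCHIVE SAVED_CONFIG DATA_DIR CONFIG_FILE
restore_broker() {
  archive=$1        # tarball holding one top-level data folder
  saved_config=$2   # configuration file saved with the backup
  data_dir=$3       # data folder of the replacement broker
  config_file=$4    # configuration file of the replacement broker

  # Delete the existing data folder so old and restored state never mix.
  rm -rf "$data_dir"

  # Unpack next to the old location; the archive's top-level folder is
  # assumed to have the same name as the target data folder.
  parent_dir=$(dirname "$data_dir")
  mkdir -p "$parent_dir"
  tar -xzf "$archive" -C "$parent_dir"

  if [ ! -d "$data_dir" ]; then
    echo "archive did not contain $(basename "$data_dir")/" >&2
    return 1
  fi

  # Copy the saved configuration; edit it afterwards if, for example,
  # IP addresses differ in the replacement cluster.
  cp "$saved_config" "$config_file"
}
```

Run it once per broker, and make sure each broker receives the backup taken from the broker with the same node ID.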
Starting the Zeebe cluster
After replacing the data folders, brokers can be started again and will automatically resume processing.