How the data is stored and archived
Operate imports data from Zeebe and stores it in Elasticsearch indices with a defined prefix (default:
- deployed processes, including the diagrams
- the state of process instances, including variables, flow nodes that were activated during instance execution, incidents, etc.
It additionally stores some Operate-specific data:
- operations performed by the user
- list of users
- technical data, such as the state of the Zeebe import
The data that represents process instance state becomes immutable once the process instance is finished. At that point the data may be archived, meaning that it is moved to a dated index, e.g. operate_variables_2020-01-01, where the date suffix is the date on which the process instance finished.
The same applies to user operations: once they are finished, the related data is moved to dated indices.
Note: All Operate data present in Elasticsearch (from both "main" and dated indices) will be visible from the UI.
Under intensive Zeebe usage the amount of data can grow significantly over time, so you should plan a data cleanup strategy. Dated indices may be safely removed from Elasticsearch. "Safely" here means that only finished process instances are deleted, together with all related data, and the rest of the data stays consistent. You can use Elasticsearch Curator or other tools or scripts to delete old data.
Attention: Only indices that contain dates in their suffix may be deleted.
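As an illustration of the rule above, the following is a minimal sketch of how a cleanup script might choose which indices to delete. It is not part of Operate: the function name `expired_dated_indices`, the 30-day retention window, and the sample index names are hypothetical, and the `operate` prefix is assumed from the example index name shown earlier. The script only selects names; it does not contact Elasticsearch.

```python
import re
from datetime import date, timedelta

# Matches names that end in _YYYY-MM-DD, e.g. operate_variables_2020-01-01.
DATED_SUFFIX = re.compile(r"_(\d{4})-(\d{2})-(\d{2})$")


def expired_dated_indices(index_names, prefix, today, retention_days):
    """Return dated indices whose suffix date is older than the retention window.

    Hypothetical helper: indices without a date suffix are never returned.
    """
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # index belongs to something else
        match = DATED_SUFFIX.search(name)
        if match is None:
            continue  # "main" index without a date suffix: must be kept
        try:
            finished_on = date(*map(int, match.groups()))
        except ValueError:
            continue  # suffix looks like a date but is not a valid one
        if finished_on < cutoff:
            expired.append(name)
    return sorted(expired)


# Sample names for illustration only (assumed, not taken from a real cluster).
names = [
    "operate_variables",             # main index, kept
    "operate_variables_2020-01-01",  # archived and older than the window
    "operate_variables_2020-03-01",  # archived but still inside the window
    "other_index_2019-01-01",        # different prefix, ignored
]
to_delete = expired_dated_indices(names, "operate", date(2020, 3, 10), 30)
print(to_delete)  # ['operate_variables_2020-01-01']
```

Each selected name could then be removed with the Elasticsearch delete index API (`DELETE /<index>`) or handed to Curator. Keeping the selection step separate from the deletion step makes it easy to review the list before anything is removed.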