Data retention
The Orchestration Cluster manages retention for all of its data centrally, using unified storage and policy configuration.
All cluster data, including deployed process definitions, process instance state, user operations, and technical metadata, is written to secondary storage. Depending on your configuration, this secondary storage is backed by Elasticsearch/OpenSearch or an RDBMS. Once a process instance finishes, the data representing its state becomes immutable and eligible for archiving.
Secondary storage is configurable. Choose the backend that best fits your requirements for indexing, querying, retention, and operations. See configuring secondary storage for setup guidance, and refer to secondary storage for terminology and conceptual context.
When using Elasticsearch/OpenSearch, finished data is moved to a dated index (for example, operate-variable_2020-01-01), with the suffix representing the completion date of the associated process or operation. Data from both main and dated indices remains searchable and visible in the UI. For RDBMS backends, the exporter does not create dated indices. Data remains in the same tables and stays visible until retention policies delete it.
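The naming scheme for dated indices can be illustrated with a minimal sketch. The function name dated_index_name is hypothetical and not part of any product API, and the YYYY-MM-DD suffix format is assumed from the example above; the actual format may depend on your configuration.

```python
from datetime import date


def dated_index_name(base_index: str, completed_on: date) -> str:
    """Build a dated index name from a main index name and a completion date.

    Hypothetical helper for illustration only. The suffix format
    (YYYY-MM-DD) is assumed from the documented example.
    """
    return f"{base_index}_{completed_on.isoformat()}"


# Reproduces the example name used in the text.
print(dated_index_name("operate-variable", date(2020, 1, 1)))
# prints "operate-variable_2020-01-01"
```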
Archive period
The time between a process instance finishing and its data being moved to a dated index can be configured using the waitPeriodBeforeArchiving parameter. Refer to the documentation for that parameter for the current default value.
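The effect of the wait period can be sketched as a simple time comparison. This is an illustration of the concept, not the archiver's actual implementation: the function name is_eligible_for_archiving is hypothetical, and the one-hour wait period below is an arbitrary value chosen for the example, not the product default.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_eligible_for_archiving(
    finished_at: Optional[datetime],
    wait_period: timedelta,
    now: datetime,
) -> bool:
    """Return True once the wait period has elapsed since the instance finished.

    Hypothetical helper: a running instance (finished_at is None) is never
    eligible, because only finished data is archived.
    """
    if finished_at is None:
        return False
    return now - finished_at >= wait_period


# Arbitrary illustrative value, NOT the default of waitPeriodBeforeArchiving.
wait_period = timedelta(hours=1)
finished_at = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)

# 30 minutes after completion: still within the wait period.
early = is_eligible_for_archiving(
    finished_at, wait_period, datetime(2020, 1, 1, 12, 30, tzinfo=timezone.utc)
)
# 2 hours after completion: the wait period has elapsed.
late = is_eligible_for_archiving(
    finished_at, wait_period, datetime(2020, 1, 1, 14, 0, tzinfo=timezone.utc)
)
```

Here, early is False and late is True.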
Data cleanup
The amount of stored data can grow significantly over time. Therefore, we recommend implementing a data cleanup strategy. Dated indices, which contain only finished process instances, may be safely removed from Elasticsearch/OpenSearch.
In the Orchestration Cluster, you can define deletion strategies for archived data through the retention configuration.
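To make the idea of age-based cleanup concrete, the following sketch selects dated indices whose date suffix is older than a retention window. It is an illustration under stated assumptions, not how the retention configuration is implemented: the function name expired_dated_indices, the 30-day window, and the YYYY-MM-DD suffix format are all assumptions. The function only returns candidate names; it does not contact Elasticsearch/OpenSearch or delete anything.

```python
from datetime import date, timedelta
from typing import Iterable, List


def expired_dated_indices(
    index_names: Iterable[str],
    today: date,
    retention: timedelta,
) -> List[str]:
    """Return dated indices whose suffix date is older than the retention window.

    Hypothetical helper. Names without a parseable YYYY-MM-DD suffix after
    the last underscore (for example, main indices) are skipped.
    """
    cutoff = today - retention
    expired = []
    for name in index_names:
        _, separator, suffix = name.rpartition("_")
        if not separator:
            continue  # no underscore, so this is not a dated index
        try:
            completed_on = date.fromisoformat(suffix)
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if completed_on < cutoff:
            expired.append(name)
    return expired


indices = [
    "operate-variable",             # main index, never selected
    "operate-variable_2020-01-01",  # older than the cutoff
    "operate-variable_2020-02-15",  # still within the retention window
]

# Arbitrary 30-day window, evaluated as of 2020-03-01.
candidates = expired_dated_indices(indices, date(2020, 3, 1), timedelta(days=30))
```

With these inputs, candidates contains only operate-variable_2020-01-01. Skipping names that do not parse as dated indices is a deliberately conservative choice, so that main indices holding active data are never selected.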