Historical Metrics
Management Center persists metrics so that you can view historical data for each cluster. You can customize how long data is persisted by changing the time-to-live (TTL).
Persisted metrics allow you to check the status of a cluster at a time in the past. Metrics cover various time series data, such as CPU load, memory consumption, and operation counters.
How Metrics are Persisted
When Management Center receives metrics from a Hazelcast cluster, it groups them by type into one-minute buckets before compressing them and saving them to disk in the hazelcast-mc/metrics directory. For example, metrics for CPU load go into one bucket, while metrics for memory consumption go into another bucket.
You can customize where metrics data is persisted. See Configuring Management Center.
Every 10 seconds, Management Center starts a metrics-persistence thread that checks for one-minute buckets. To allow for metrics that arrive late because of a poor connection or other errors, the metrics-persistence thread waits until one-minute buckets are 70 seconds old. This way, each one-minute bucket has a 10-second grace period in which late metrics can still be added before the bucket is saved to disk.
For example:
- Management Center receives the first chunk of metrics at 9:00:00 AM.
- A one-minute bucket is created in memory for each metric type, and each bucket's timestamp is set to 9:00:00 AM. At this point, nothing is saved to disk.
- The next chunk of metrics is received at 9:00:03 AM. No new one-minute buckets are created; the new metrics are added to the existing one-minute buckets.
- The first metrics-persistence thread run starts at 9:00:10 AM. The thread finds no one-minute buckets that are 70 seconds old, so nothing is saved to disk yet.
- At 9:01:10 AM, when the buckets are 70 seconds old, the metrics-persistence thread saves the compressed one-minute buckets to disk.
Each in-memory minute bucket consumes around 0.5 KB of memory.
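The timing in the example above can be reproduced with a small simulation. This is only a sketch based on the behavior described in this section, not Management Center's actual implementation. The class and method names (`PersistenceTimeline`, `firstPersistingRun`) are hypothetical, and the date used is arbitrary.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class PersistenceTimeline {
    // Values taken from the description in this section.
    static final Duration RUN_INTERVAL = Duration.ofSeconds(10);
    static final Duration MIN_BUCKET_AGE = Duration.ofSeconds(70);

    /**
     * Returns the time of the first persistence run at which a bucket
     * with the given timestamp is at least 70 seconds old.
     * Assumes runs happen at firstRun, firstRun + 10 s, firstRun + 20 s, and so on.
     */
    static LocalDateTime firstPersistingRun(LocalDateTime bucketStart, LocalDateTime firstRun) {
        LocalDateTime run = firstRun;
        while (Duration.between(bucketStart, run).compareTo(MIN_BUCKET_AGE) < 0) {
            run = run.plus(RUN_INTERVAL);
        }
        return run;
    }

    public static void main(String[] args) {
        // Arbitrary date; only the time of day matters for this illustration.
        LocalDateTime bucketStart = LocalDateTime.of(2024, 1, 1, 9, 0, 0);
        LocalDateTime firstRun = LocalDateTime.of(2024, 1, 1, 9, 0, 10);
        System.out.println(firstPersistingRun(bucketStart, firstRun).toLocalTime()); // prints 09:01:10
    }
}
```

Runs at 9:00:10 through 9:01:00 all see a bucket younger than 70 seconds, so the first run that persists it is the one at 9:01:10.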
Changing the Metrics Time-to-Live
To customize the number of days for which Management Center persists metrics, start Management Center with the hazelcast.mc.metrics.disk.ttl.duration property.
This property takes a duration in days, represented in the ISO-8601 format:
P(n)D
- P is the duration designator (referred to as "period"), and is always placed at the beginning of the duration.
- (n) is the number of days.
- D is the day designator that follows the value for the number of days.
This example represents a duration of four days:
P4D
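To sanity-check a TTL value before starting Management Center, the same ISO-8601 day notation can be parsed with the standard `java.time.Period` class. The sketch below is an illustration only: this page does not state how Management Center parses the property internally, and the `TtlCheck` class and `ttlDays` method are hypothetical names.

```java
import java.time.Period;
import java.time.format.DateTimeParseException;

public class TtlCheck {
    /**
     * Parses a day-based ISO-8601 duration such as "P4D" and returns the number of days.
     * Throws IllegalArgumentException if the value is not a positive number of days.
     */
    static int ttlDays(String value) {
        Period period;
        try {
            period = Period.parse(value);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Not an ISO-8601 period: " + value, e);
        }
        // Reject year and month components so that only day-based values pass.
        if (period.getYears() != 0 || period.getMonths() != 0 || period.getDays() <= 0) {
            throw new IllegalArgumentException("Expected a positive number of days, for example P4D: " + value);
        }
        return period.getDays();
    }

    public static void main(String[] args) {
        System.out.println(ttlDays("P4D")); // prints 4
    }
}
```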
This setting is a soft limit, which gives you indirect control over Management Center disk usage. The actual disk usage depends on the volume of metrics that your clusters generate, which in turn depends on factors such as the number of cluster members and the number of data structures with statistics enabled in the connected clusters.
Persistence Logs
Every hour, you will see the following message in the logs:
Known time series
Tracked minute buckets
Number of persistence runs
Total persistence run time
Max persistence run time
Average persistence run time
Last hour average persistence run time
Total persisted minute buckets
Max persisted minute buckets per run
Average persisted minute buckets per run
Last hour average persisted minute buckets per run
Total dropped data points that arrived when the target minute bucket is no longer tracked, but not yet released(persisted)
Total evicted dangling minute series
Persistent store TTL
Persistent store size on disk
Data point memory compression ratio
These logs provide some basic persistence statistics and details about any active persistence configuration. All time values are logged in HH:mm:ss.SSS format.
Data point memory compression ratio shows how many times less space the data uses on disk than in memory.
Troubleshooting
Use this section to find suggestions for resolving errors that may occur with metrics persistence.
RocksDB File Permissions
Management Center uses the RocksDB native library to persist metrics. By default, the RocksDB binary is extracted into the java.io.tmpdir directory. If file system permissions prevent this, you may need to override this default directory by setting the ROCKSDB_SHAREDLIB_DIR environment variable to the absolute path of another directory. See the following example:
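# Create the target directory, including any missing parent directories.
mkdir -p "$HOME/tmp/rocksdb"
# Point RocksDB at the absolute path of that directory.
export ROCKSDB_SHAREDLIB_DIR="$(realpath "$HOME/tmp/rocksdb")"
java -jar hazelcast-management-center-5.0.4.jar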
Slow Background Persistence Runs
If Management Center tries to persist more metrics than the underlying persistent storage or disk throughput can handle, you will see the following warning:
Detected slow background persistence runs: 6 runs took 61000 ms. Consider decreasing count of collected metrics.
To resolve this issue, you can do the following:
- Decrease the number of metrics that the members send. See the Metrics section of the Hazelcast documentation for details about configuring metrics collection.
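To judge how far behind persistence is, it can help to work out the average run time from the warning. The sketch below extracts the two numbers from a log line in the format shown above. The `SlowRunWarning` class and `averageRunMillis` method are hypothetical names, and the exact rule that Management Center uses to raise the warning is not documented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlowRunWarning {
    // Matches the "<runs> runs took <millis> ms" part of the warning shown above.
    private static final Pattern RUNS = Pattern.compile("(\\d+) runs took (\\d+) ms");

    /** Returns the average run time in milliseconds, or -1 if the line does not match. */
    static double averageRunMillis(String logLine) {
        Matcher m = RUNS.matcher(logLine);
        if (!m.find()) {
            return -1;
        }
        long runs = Long.parseLong(m.group(1));
        long totalMillis = Long.parseLong(m.group(2));
        return runs == 0 ? -1 : (double) totalMillis / runs;
    }

    public static void main(String[] args) {
        String line = "Detected slow background persistence runs: 6 runs took 61000 ms. "
                + "Consider decreasing count of collected metrics.";
        System.out.printf("%.0f ms per run%n", averageRunMillis(line));
    }
}
```

For the sample warning, the average is roughly 10.2 seconds per run, which is slightly longer than the 10-second interval at which persistence runs start.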