Date: 2025-01-24
Accepted
To evaluate code performance, we need to collect metrics such as DB and HTTP query execution time, CPU consumption, and RAM usage.
For performance measurement: Uptrace.
For collecting internal JVM metrics: Prometheus and Grafana (see their demo).
Ability to evaluate the performance of all project subsystems.
This is a popular and powerful solution. The vendor website offers a cloud option; self-hosting is free.
Uptrace accumulates metrics in ClickHouse and stores metric metadata in Postgres. The graphs are useful because they display percentiles (p50, p90, p99; see the definition below) instead of average values. This helps answer questions such as how long it takes to process 90% of incoming REST requests, leaving the slowest 10% (the peaks) out of scope. Accordingly, business requirements for speed apply not to 100% of requests, which is technically impossible to guarantee, but to 90% or 99% of them. Here’s an example graph from the vendor website:
A percentile is a value that a given percentage of the sample does not exceed. For example, p90 for query execution time means that 90% of queries take no longer than this number of seconds.
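
To make the definition concrete, here is a minimal sketch of a percentile calculation using the nearest-rank method. It is only an illustration: the class name `Percentiles`, the method `percentile`, and the sample durations are hypothetical, and Uptrace may compute its percentiles differently (for example, with approximate quantile functions in ClickHouse).

```java
import java.util.Arrays;

public class Percentiles {

    /**
     * Nearest-rank percentile: the smallest sample such that at least
     * p percent of all samples are less than or equal to it.
     */
    public static long percentile(long[] samples, int p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("samples must not be empty");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be between 1 and 100");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // 1-based rank = ceil(p * n / 100), computed with integer arithmetic.
        int rank = (p * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical request durations in milliseconds: nine fast requests, one slow peak.
        long[] durationsMs = {12, 15, 11, 14, 13, 16, 12, 15, 14, 950};

        double mean = Arrays.stream(durationsMs).average().orElse(0);
        System.out.println("mean = " + mean + " ms");                        // 107.2
        System.out.println("p50  = " + percentile(durationsMs, 50) + " ms"); // 14
        System.out.println("p90  = " + percentile(durationsMs, 90) + " ms"); // 16
        System.out.println("p99  = " + percentile(durationsMs, 99) + " ms"); // 950
    }
}
```

In this made-up sample a single 950 ms peak pulls the average to 107.2 ms, while p90 stays at 16 ms. This is why a speed requirement stated against p90 or p99 describes typical behavior better than one stated against the average.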