Memory leak on "api/v1/series" endpoint with high cardinality metrics on 0.0.63+ versions #415

@Skunnyk

Hi,

We are using promxy in front of VictoriaMetrics.
After upgrading from 0.0.60 to 0.0.68, we have seen huge memory usage by the promxy process (dozens of GB!).
After some tests, the problem first appears in promxy 0.0.63 (with the Prometheus 2.24 merge).

It seems that Grafana "variable" queries (label_values) are the cause when they target high cardinality metrics.

For example, to retrieve all my kafka clusters, I use label_values(kafka_topic_partition_current_offset, job)
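
For context, here is a minimal sketch of what such a variable boils down to, assuming Grafana resolves label_values(metric, label) by fetching every matching series and keeping only the distinct values of one label. The helper name and the sample payload are made up for illustration; only the response shape follows the Prometheus series API.

```python
def distinct_label_values(series_response, label):
    """Return the sorted distinct values of `label` across all returned series."""
    values = set()
    for labels in series_response.get("data", []):
        # Each entry is one series, expressed as a label-name -> value mapping.
        if label in labels:
            values.add(labels[label])
    return sorted(values)


# Hypothetical payload in the shape of a /api/v1/series response.
sample = {
    "status": "success",
    "data": [
        {"__name__": "kafka_topic_partition_current_offset", "job": "kafka-a", "topic": "t1", "partition": "0"},
        {"__name__": "kafka_topic_partition_current_offset", "job": "kafka-a", "topic": "t1", "partition": "1"},
        {"__name__": "kafka_topic_partition_current_offset", "job": "kafka-b", "topic": "t2", "partition": "0"},
    ],
}

print(distinct_label_values(sample, "job"))
```

Every series has to be transferred and decoded even though only a handful of distinct values is kept, which is why the cost grows with the cardinality of the metric rather than with the number of clusters.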

The resulting query received by promxy is:
http://localhost:8082/api/v1/series?match%5B%5D=kafka_topic_partition_current_offset&start=1617353041&end=1617374641
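
To reproduce this without Grafana, the same request can be rebuilt by hand. The sketch below uses only the Python standard library; build_series_url is a hypothetical helper name, and the base URL, metric and timestamps are the ones from the request above.

```python
from urllib.parse import urlencode


def build_series_url(base_url, metric, start, end):
    """Build a /api/v1/series URL with one match[] selector and a time range."""
    # urlencode percent-encodes the brackets in "match[]" as %5B%5D.
    query = urlencode([("match[]", metric), ("start", start), ("end", end)])
    return f"{base_url}/api/v1/series?{query}"


url = build_series_url(
    "http://localhost:8082",
    "kafka_topic_partition_current_offset",
    1617353041,
    1617374641,
)
print(url)

# Length of the queried window in hours.
window_hours = (1617374641 - 1617353041) / 3600
print(window_hours)
```

The window works out to 6 hours, which matches the start and end times in the debug logs below.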

From that point on, promxy RAM usage skyrockets.

promxy[9525]: time="2021-04-02T17:02:40+02:00" level=debug msg="http://promlts02.tsdb02.production.int:8428" api=GetValue end="2021-04-02 14:44:01 +0000 UTC" matchers="[__name__=\"kafka_topic_partition_current_offset\"]" start="2021-04-02 08:44:01 +0000 UTC" took=7.837578603s
promxy[9525]: time="2021-04-02T17:02:40+02:00" level=debug msg="http://promlts01.tsdb02.production.int:8428" api=GetValue end="2021-04-02 14:44:01 +0000 UTC" matchers="[__name__=\"kafka_topic_partition_current_offset\"]" start="2021-04-02 08:44:01 +0000 UTC" took=7.847713652s
promxy[9525]: time="2021-04-02T17:02:40+02:00" level=debug msg=Select matchers="[__name__=\"kafka_topic_partition_current_offset\"]" selectHints="&{1617353041000 1617374641000 0 series [] false 0}" took=7.929695972s
promxy[9525]: 127.0.0.1 - - [02/Apr/2021 15:02:40] "GET /api/v1/series HTTP/1.1 200 403911" 7.941467 match%5B%5D=kafka_topic_partition_current_offset&start=1617353041&end=1617374641

I have some metrics with huge cardinality (> 100000, like mysql_perf_schema_table_io_waits_total from the mysqld_exporter). If I run http://localhost:8082/api/v1/series?match%5B%5D=mysql_perf_schema_table_io_waits_total&start=1617353041&end=1617374641 against promxy 0.0.68, RAM usage of the promxy process can exceed 20GB :D

The same query against promxy 0.0.62 works fine.

Thank you!
