Describe the bug
When I use Grafana to view Loki logs, I have found that in certain situations Grafana displays incomplete logs. For example, when I query logs from 0:00 to 0:30 sorted by "newest first", it shows logs from 0:15 to 0:30 followed by logs from 0:00 to 0:10, which makes me mistakenly think there are no logs between 0:10 and 0:20 (but there actually are).
To Reproduce
Steps to reproduce the behavior:
- Started Loki (3.5.0).
- Application logs are collected by the vector process and sent to Kafka; Alloy then consumes the Kafka data and sends it to Loki.
- The application is actually multiple similar apps (all called gas), but they have different tags, such as gas1, gas2, etc.
- Query:

```
{cluster="g1009", app="gas"} |= `LuaServerPlayerCharacter:onHeroSpawnAvailable`
```
Expected behavior
All logs that match the query should be returned, sorted by time.
Environment:
- Infrastructure: app-log --> vector --> kafka --> alloy --> loki
- Deployment tool: helm
Screenshots, Promtail config, or terminal output

Then I selected "oldest first" and found that logs were still missing; more accurately, the sorting is abnormal.

Loki config:

```yaml
limits_config:
  ingestion_rate_strategy: "local"
  ingestion_rate_mb: 50
  ingestion_burst_size_mb: 200
  max_label_name_length: 1024
  max_label_value_length: 2048
  max_label_names_per_series: 30
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  creation_grace_period: 10m
  discover_service_name: [app, component]
  discover_log_levels: true
  log_level_fields: [loglevel, level, LEVEL, Level]
  use_owned_stream_count: false
  max_streams_per_user: 0
  max_global_streams_per_user: 50000
  unordered_writes: true
  per_stream_rate_limit: 128MB
  per_stream_rate_limit_burst: 521MB
  max_chunks_per_query: 2000000
  max_query_series: 500
  max_query_length: 30d1h
  max_query_range: 0
  max_query_parallelism: 32
  cardinality_limit: 50000
  max_streams_matchers_per_query: 1000
  max_concurrent_tail_requests: 10
  max_entries_limit_per_query: 5000
  max_cache_freshness_per_query: 10m
  query_timeout: 300s
  split_queries_by_interval: 15m
  split_metadata_queries_by_interval: 12h
  split_instant_metric_queries_by_interval: 1h
  min_sharding_lookback: 0s
  deletion_mode: filter-and-delete
  retention_period: 360h
  volume_enabled: true
```
Related analysis
Because my configured split_queries_by_interval is 15m and Grafana's limit is set to 3000, a 30-minute log query is split into two 15-minute segments. Since the first 15-minute segment does not reach the 3000-row limit, Loki continues on to the second 15-minute segment. However, for some unknown reason, the sorting of the logs in the second segment is not executed as expected.
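The splitting described above can be sketched as follows. This is only an illustration of the split_queries_by_interval behaviour, not Loki's actual code, and the dates are placeholders:

```python
from datetime import datetime, timedelta, timezone

def split_by_interval(start, end, interval):
    """Split [start, end) into consecutive sub-ranges of at most `interval`."""
    ranges = []
    cursor = start
    while cursor < end:
        upper = min(cursor + interval, end)
        ranges.append((cursor, upper))
        cursor = upper
    return ranges

# a 30-minute window with split_queries_by_interval: 15m
start = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)  # placeholder date
subqueries = split_by_interval(start, start + timedelta(minutes=30),
                               timedelta(minutes=15))
for lo, hi in subqueries:
    print(lo.time(), "->", hi.time())
# 00:00:00 -> 00:15:00
# 00:15:00 -> 00:30:00
```

Each sub-range is then executed as its own query, and the frontend is responsible for stitching the partial results back together in order.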
I then conducted another experiment: when there is only one log stream (in my example, only gas1), i.e. all labels are identical and there are no unique labels, this situation does not occur.
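For comparison, merging results from several independently sorted streams should behave like a k-way merge that preserves global timestamp order. A minimal sketch, with hypothetical timestamps and stream names:

```python
import heapq

# hypothetical entries per stream: (unix_ns_timestamp, log_line)
gas1 = [(100, "gas1: spawn a"), (300, "gas1: spawn b"), (500, "gas1: spawn c")]
gas2 = [(200, "gas2: spawn a"), (400, "gas2: spawn b")]

# k-way merge keeps global timestamp order across streams
merged = list(heapq.merge(gas1, gas2, key=lambda entry: entry[0]))
print([ts for ts, _ in merged])  # [100, 200, 300, 400, 500]
```

If the merge step were skipped or applied per-segment only, the combined output could interleave out of order, which matches the "sorting is abnormal" symptom with multiple gas streams.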
Then I captured the query requests sent from Grafana to Loki and found the following request:

```
http://lokiXXXXXXXX/loki/api/v1/query_range?direction=forward&end=1757062799000000000&limit=3000&query=%7Bcluster%3D%22g1009%22%2C+app%3D%22gas%22%7D+%7C%3D+%60LuaServerPlayerCharacter%3AonHeroSpawnAvailable%60&start=1757061000000000000&step=1000ms
```

The corresponding results are as follows:

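As a sanity check, the start and end parameters in that captured request can be decoded; this is a quick Python sketch, nothing Loki-specific:

```python
from datetime import datetime, timezone

# start/end taken from the captured query_range request (nanosecond Unix timestamps)
start_ns = 1757061000000000000
end_ns = 1757062799000000000

start = datetime.fromtimestamp(start_ns // 10**9, tz=timezone.utc)
end = datetime.fromtimestamp(end_ns // 10**9, tz=timezone.utc)
print(end - start)  # 0:29:59, i.e. the 30-minute window from the report
```

This confirms the single request covers the full 30-minute window, so the 15-minute splitting happens inside Loki's query frontend, not in Grafana.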