Request
It would be great if Alloy provided an option (perhaps enabled by default) for logging prometheus.scrape errors per target.
This could produce a lot of logs, so some level of rate limiting may be necessary. But users often see scrapes failing without seeing the reason for the failures.
Use case
Finding the reason a scrape failed.
The failure reason can currently be discovered in the UI, but that experience is poor, especially in a large cluster. The UI can be improved, but logging still seems like a good idea even if the UI gained a good tool for finding target status across the cluster.
Please correct me if I am wrong, but after a brief look into what it would take to add this kind of logging, this is what I have found so far:
Every scrape job is handled by a scrape.Manager that comes from the Prometheus project. It runs the scrape loops and internally records scrape errors. That suggests two options:
1. Run a job that periodically checks for recorded errors on scrape targets and logs them. This may not be the best solution because it would add more read locks on targets.
2. Reuse the mechanism that is already there to log scrape errors. We don't have to supply a logger that logs to a file here, but we do need to set ScrapeFailureLogFile to something for it to work. It is not clear how to combine both settings, i.e. log scrape errors to stdout and also to a file when one is configured.
I think this should also be possible to set up without any code changes:
prometheus.scrape "scrape" {
// ... other config
scrape_failure_log_file = "some-file.log"
}
local.file_match "local_files" {
path_targets = [{"__path__" = "some-file.log"}]
sync_period = "5s"
}
// other config to relabel / push logs
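The tail of that pipeline might look roughly like the following. This is an unverified sketch: the component labels and the Loki endpoint URL are placeholders, not tested configuration.

```alloy
loki.source.file "scrape_failures" {
  targets    = local.file_match.local_files.targets
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    // placeholder URL; point this at your Loki instance
    url = "http://loki:3100/loki/api/v1/push"
  }
}
```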