
Log scrape errors #3292


Open

thampiotr opened this issue Apr 11, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@thampiotr
Contributor

thampiotr commented Apr 11, 2025

Request

It would be great if Alloy provided an option (enabling it by default may be a good idea) to log prometheus.scrape errors for targets.

This could produce a lot of logs, so some level of rate limiting may be necessary. But users often see scrapes failing without being able to see the reason.
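
To make the request concrete, here is a hypothetical sketch of what such an option could look like. Neither attribute exists today; the names are invented purely for illustration:

```
prometheus.scrape "example" {
  targets    = discovery.relabel.example.output
  forward_to = [prometheus.remote_write.default.receiver]

  // Hypothetical attributes, invented for this sketch:
  log_scrape_errors       = true  // proposed; could be enabled by default
  scrape_error_log_period = "1m"  // proposed rate limit: at most one log line per target per period
}
```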

Use case

Finding the reason for scrape failures.

The failure reason can currently be discovered in the UI, but the experience is poor, especially in a large cluster. The UI can be improved, but logging still seems like a good idea, even if the UI had a nice tool for finding target status across the cluster.

@kalleep
Contributor

kalleep commented Apr 14, 2025

Please correct me if I am wrong, but after a brief look into what it would take to add this kind of logging, this is what I have found so far:

  1. Every scrape job is handled by a scrape.Manager that comes from the Prometheus project. It runs the scrape loops and internally records scrape errors.
  2. There is already an option to log scrape errors to a file; we use the JSON logger from Prometheus for this, but we need to configure the file name for it: https://github.com/grafana/alloy/blob/main/internal/component/prometheus/scrape/scrape.go#L82

Possible solutions:

  1. Run a job that periodically checks for any recorded errors on scrape targets and logs them. I am not sure this would be the best solution, because it would add more read locks on targets.
  2. Reuse the mechanism that is already there to log scrape errors. We don't have to supply a logger that logs to a file here, but we need to set ScrapeFailureLogFile to something for it to work. I am not sure how you would combine both settings, i.e. log scrape errors to stdout and also to a file if one is configured.
  3. I think this should also be possible to set up without any code changes, with something like the config below (a sketch of the remaining pieces follows after the snippet):

```
prometheus.scrape "scrape" {
  // ... other config
  scrape_failure_log_file = "some-file.log"
}

local.file_match "local_files" {
  path_targets = [{"__path__" = "some-file.log"}]
  sync_period  = "5s"
}

// other config to relabel / push logs
```
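
For completeness, here is a minimal sketch of what that remaining config could look like, assuming the failure log should end up in Loki. The component labels and the Loki endpoint URL are placeholders:

```
loki.source.file "scrape_failures" {
  // Tail the failure log file discovered by local.file_match above.
  targets    = local.file_match.local_files.targets
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    // Placeholder endpoint; point this at your Loki instance.
    url = "http://loki:3100/loki/api/v1/push"
  }
}
```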
