Skip to content

feat(processors.round): Add plugin #17078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

alexgokhale
Copy link
Contributor

Summary

This PR adds a denoise plugin, which allows numeric field values to be rounded to the specified number of significant figures. This is particularly useful for metrics that fluctuate often, as after they have been rounded they can be deduplicated effectively.

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #17077

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label May 30, 2025
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @alexgokhale! Some comments in the code...

@srebhan srebhan changed the title feat(processors): add Denoise plugin feat(processors.denoise): Add plugin Jul 1, 2025
@srebhan srebhan self-assigned this Jul 1, 2025
@srebhan srebhan added the waiting for response waiting for response from contributor label Jul 7, 2025
@alexgokhale
Copy link
Contributor Author

Thanks for the review! I was away last week but will actions your comments as soon as possible :)

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jul 7, 2025
@srebhan
Copy link
Member

srebhan commented Jul 8, 2025

@alexgokhale please let me know once you are ready for another review round...

@alexgokhale alexgokhale changed the title feat(processors.denoise): Add plugin feat(processors.round): Add plugin Jul 9, 2025
@alexgokhale
Copy link
Contributor Author

@srebhan Hey -- just wanted to check in before I make the last changes (to switch to Precision and then re-organise the file):

The use case that prompted this feature request from myself was for different voltage metrics being captured -- one is a 24V line and another is a 5V line. In both cases we want 3 s.f., but this would be a different number of decimal places for each (e.g. 23.8V for the 24V rail, 5.21V for the 5V rail).

I appreciate that when creating a processor for Telegraf the desire is to build for what will be most useful across all cases, so I think it would be reasonable to switch to Precision instead -- would the recommended flow for a use case like my own be to have two different instances of the processor (one with precision -1, the other with -2, each targeting different metrics/fields?

@alexgokhale alexgokhale requested a review from srebhan July 9, 2025 08:56
@alexgokhale
Copy link
Contributor Author

Also one clarification on the meaning of positive/negative numbers on precision – might it make more sense to have a positive number meaning to the right of the decimal separator and negative meaning to the left?

I think this would probably better reflect the common use cases for a plugin like this, which are (probably) more likely to be rounding to a number of decimal places than to be rounding large numbers to a certain number of significant figures?

One potential risk of the precision concept as a whole is that it could be very difficult to define for larger numbers – for example if you're dealing with values in the millions/billions range, it could be more intuitive to define the rounding by the number of significant figures, rather than having to provide a precision of 7 for example.

E.g. for a metric with value 12345678, providing 2 and getting an output of 12000000 might be clearer than needing to provide 7, especially if the numbers are going to be on the border of rolling over/under to a new digit.

Hopefully this makes sense, happy to clarify anything but thought it could be useful to lay out how I've been thinking about usage so far! 😅

@srebhan
Copy link
Member

srebhan commented Jul 9, 2025

[...] would the recommended flow for a use case like my own be to have two different instances of the processor (one with precision -1, the other with -2, each targeting different metrics/fields?

Yes. Alternatively you could change the processor such that you can specify multiple "subsections" with a precision and a field filter to stay with one instance. However, this would probably make things more complicated for other people as the would need another subsection.

One additional question regarding your use-case: Why do you round in Telegraf and not on the query/visualization side?

@srebhan
Copy link
Member

srebhan commented Jul 9, 2025

Also one clarification on the meaning of positive/negative numbers on precision – might it make more sense to have a positive number meaning to the right of the decimal separator and negative meaning to the left?

Yeah you are right. I was thinking of it as 10^precision that's why I used this nomenclature but your mileage may vary... :-) I'm fine either way as long as the setting description reflects what it does.

@srebhan
Copy link
Member

srebhan commented Jul 9, 2025

One potential risk of the precision concept as a whole is that it could be very difficult to define for larger numbers – for example if you're dealing with values in the millions/billions range, it could be more intuitive to define the rounding by the number of significant figures, rather than having to provide a precision of 7 for example.

I don't agree as usually you do have use-case specific requirements e.g. I need everything in a precision of 1/10th Volt or 1kV for my application. It seems like your use-case is more "visualization" driven and I would really suggest to do that type of manipulation on the query/visualization side...

@alexgokhale
Copy link
Contributor Author

Alternatively you could change the processor such that you can specify multiple "subsections" with a precision and a field filter to stay with one instance. However, this would probably make things more complicated for other people as the would need another subsection.

I agree – will leave as is, I think it's reasonable to just use more than one processor.

Why do you round in Telegraf and not on the query/visualization side?

We collect these values from a number of devices in a fleet, and naturally they can be very noisy (due to the precision we capture them at). We use the dedup processor to prevent us collecting values that haven't changed for other metrics, which significantly reduces the amount of data we need to store, but this currently isn't possible for our voltage/other noisy values because they fluctuate so much.

Because we only care about understanding these values to a certain precision (i.e. it's not useful to have the voltage to 10 d.p., nor is it probably that accurate), it would be ideal to avoid having to ingest this data in the first place.

I need everything in a precision of 1/10th Volt or 1kV for my application.

Good point, I think we're aligned on implementation 👍

@alexgokhale
Copy link
Contributor Author

@srebhan Think this should be ready for a re-review now 🙏

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alexgokhale! Almost there. Some more comments from my side.

@alexgokhale alexgokhale requested a review from srebhan July 15, 2025 08:33
@alexgokhale
Copy link
Contributor Author

@srebhan Have actioned your comments - there's one test case which I'm not sure on the behaviour of:

Rounding -34457 as an int with precision -2 (i.e. factor 100) results in an output of -34400, because the value is being truncated when dividing as an integer. Is this the behaviour you would expect?

I would have expected to want to round half away from zero here, resulting in -34500, although this would require handling part of the calculation as a float.

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alexgokhale! I think your integer rounding is not correct. Please find a suggestion in the code. Furthermore, we can merge the two tests IMO and need to add a test for non-numeric fields. Altogether, only some minor changes...

@alexgokhale alexgokhale requested a review from srebhan July 16, 2025 13:31
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks @alexgokhale! One last suggestion regarding the field_include parameter in the sample-configuration... Don't forget to run make docs afterwards...

@alexgokhale alexgokhale requested a review from srebhan July 17, 2025 09:28
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alexgokhale! Fixing the tag to push the PR forward...

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Jul 17, 2025
@srebhan srebhan assigned skartikey and mstrandboge and unassigned srebhan Jul 17, 2025
@telegraf-tiger
Copy link
Contributor

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

☺️ This pull request doesn't significantly change the Telegraf binary size (less than 1%)

📦 Click here to get additional PR build artifacts

Artifact URLs

. DEB . RPM . TAR . GZ . ZIP
[[amd64.deb aarch64.rpm darwin_amd64.tar.gz windows_amd64.zip] [arm64.deb armel.rpm darwin_arm64.tar.gz windows_arm64.zip] [armel.deb armv6hl.rpm freebsd_amd64.tar.gz windows_i386.zip] [armhf.deb i386.rpm freebsd_armv7.tar.gz ] [i386.deb ppc64le.rpm freebsd_i386.tar.gz ] [mips.deb riscv64.rpm linux_amd64.tar.gz ] [mipsel.deb s390x.rpm linux_arm64.tar.gz ] [ppc64el.deb x86_64.rpm linux_armel.tar.gz ] [riscv64.deb linux_armhf.tar.gz ] [s390x.deb linux_i386.tar.gz ] [ linux_mips.tar.gz ] [ linux_mipsel.tar.gz ] [ linux_ppc64le.tar.gz ] [ linux_riscv64.tar.gz ] [ linux_s390x.tar.gz ]]

Copy link
Contributor

@skartikey skartikey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexgokhale Great work on this PR, the type conversion is handled correctly, and the logic is functionally sound. Thanks for putting this together!

One small suggestion regarding code clarity: the current integer rounding logic works correctly, but it's a bit more complex than it needs to be. We might consider simplifying it by using math.Round, which also helps ensure consistent behavior across numeric types.

Here's a proposed refactor using generics to clean up the rounding logic:

// Simplified rounding for integers only
func roundInt[V constraints.Integer](value V, factor float64) V {
	// For positive precision (factor < 1.0), integers don't need rounding
	if factor < 1.0 {
		return value
	}
	
	// Use standard math.Round for consistent behavior
	rounded := math.Round(float64(value) / factor) * factor
	return V(rounded)
}

// Simplified rounding for floats only  
func roundFloat[V constraints.Float](value V, factor float64) V {
	// Always round floats regardless of factor
	rounded := math.Round(float64(value) / factor) * factor
	return V(rounded)
}

// The round() method with corrected function calls:
func (p *Round) round(value interface{}) interface{} {
	switch v := value.(type) {
	case int:
		return roundInt(v, p.factor)
	case int8:
		return roundInt(v, p.factor)
	case int16:
		return roundInt(v, p.factor)
	case int32:
		return roundInt(v, p.factor)
	case int64:
		return roundInt(v, p.factor)
	case uint:
		return roundInt(v, p.factor)
	case uint8:
		return roundInt(v, p.factor)
	case uint16:
		return roundInt(v, p.factor)
	case uint32:
		return roundInt(v, p.factor)
	case uint64:
		return roundInt(v, p.factor)
	case float32:
		return roundFloat(v, p.factor)
	case float64:
		return roundFloat(v, p.factor)
	default:
		p.Log.Tracef("Invalid type %T for value '%v'", value, value)
		return value
	}
}

I haven’t tested this refactor, but in theory it should work as intended. Let me know your thoughts, happy to discuss or tweak the approach.

Comment on lines +21 to +22
IncludeFields []string `toml:"include_fields"`
ExcludeFields []string `toml:"exclude_fields"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: The README shows fields_include and fields_exclude, but the struct uses IncludeFields and ExcludeFields. The TOML tags should match the documentation.

return V((v - r) * f)
}

func roundFloat[V constraints.Integer | constraints.Float](value V, factor float64) V {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows integers, but looking at the usage, only float32 and float64 call this function. The constraint should be:

Suggested change
func roundFloat[V constraints.Integer | constraints.Float](value V, factor float64) V {
func roundFloat[V constraints.Float](value V, factor float64) V {

@alexgokhale
Copy link
Contributor Author

@skartikey On the rounding logic   @srebhan suggested keeping the logic for the two of these separate due to previous issues experienced with sharing the same rounding logic for floats and integers. Your suggestion makes the functions almost identical, is there a shared preference for whether to keep the logic for these distinct vs just using math.Round? I will then update to reflect that decision.

For fields_include/exclude it seems like some processors use include_fields (e.g. noise), but the recommendation here was to use fields_include. Which of these is the most up to date preference across telegraf? I'll then make the TOML keys, sample config, and variable names consistent with this.

@alexgokhale alexgokhale requested review from skartikey and srebhan July 21, 2025 11:14
@skartikey
Copy link
Contributor

@alexgokhale Thanks for the context. I think let’s keep the rounding logic as it is for now, we can always revisit and improve it later if needed.

Copy link
Contributor

@skartikey skartikey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexgokhale Nice work! Thanks for your contribution!

@skartikey skartikey assigned mstrandboge and unassigned skartikey Jul 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin new plugin plugin/processor ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add denoise plugin
4 participants