Buffered metric support #240

cretz · 2025-04-01T13:37:49Z

What was changed

Added Temporalio::Runtime::MetricBuffer users can create with a size and set on Runtime to buffer metrics. They can call retrieve_updates on the buffer to get buffered metrics. This gives more freedom for users to do what they want with the metrics.

Checklist

Closes [Feature Request] Buffered metrics for user use #176

Fixes temporalio#176

Sushisource · 2025-04-01T16:07:57Z

temporalio/ext/src/runtime.rs

        }
    }
+
+    pub fn retrieve_buffered_metrics(&self, durations_as_seconds: bool) -> Result<RArray, Error> {


Why not store whether durations are seconds on the buffer itself? Feels weird to pass it every time

Yeah, no real reason, just seemed to be a conversion-time parameter as opposed to struct state and is how we did it in Python. It is class state from a user POV though.

Sushisource · 2025-04-01T16:13:50Z

temporalio/ext/src/metric.rs

+// We can't use Ruby Opaque because it doesn't protect the object from being
+// GC'd, but we trust ourselves not to access this value outside of Ruby
+// context (which has global GVL to ensure thread safety).
+unsafe impl Send for BufferedMetricRef {}
+unsafe impl Sync for BufferedMetricRef {}


It's clear to me this is only ever used from one thread at a time (Sync), but it's not clear to me that it will always be from the same thread (Send).

Since it's stored on the runtime, and then cloned in the retrieve_buffered_metrics call, if a user calls that from a different thread, the Rc could have counting errors. Ideally we somehow force it to always be the same thread, or failing that we need to warn the user that they can only ever call it from the same thread that the runtime was created on, I think.

So with GVL yes we can trust only one thread at a time. As for which thread, for our use case, any Magnus Ruby Value is accessible via any Ruby thread. There is a utility that does mark it as Send called Opaque (https://docs.rs/magnus/latest/magnus/value/struct.Opaque.html). But that has other downsides in that it doesn't prevent GC'ing and we can't combine them due to their constraints.

So we have to decide, do we want BoxValue which prevents GC but we have to mark Send/Sync ourselves, or do we want Opaque which has Send/Sync but does not prevent GC. Looking at the implementations of these and seeing that Opaque was just a transparent Send/Sync to prevent access without a Ruby object (so requires the GVL to be captured), I chose the BoxValue approach. But I could have used Opaque and manually did a rb_gc_register_address on new and rb_gc_unregister_address on drop, but this seems better.

Is there a Rust-side concern with Send I am missing if I trust how I use the value itself to be safe?

Yeah, so Opaque can be Send because it's only declared as such when wrapping a Ruby value - and it makes the statement that Ruby values are fine to move across threads - cool.

The problem here with our thing is almost unrelated to Ruby - it's about the Rc. The implementation of Rc's refcount relies on the fact that it never leaves the thread it was created on (IE: it uses something like thread local storage for refcounting). So, if our handler that is using it gets called from a thread that wasn't the one it was created on, since clone changes the refcount, that's UB

Oh, yes of course, I should not use Rc if I want to be Send safe, forgot I used it. This is my bad coding where I forced Rc to be send/sync when it's not. I should change my code to:

#[derive(Clone, Debug)] pub struct BufferedMetricRef { value: Arc<SendSyncBoxValue<Value>>, } struct SendSyncBoxValue<T>(BoxValue<T>); unsafe impl<T> Send for SendSyncBoxValue<T> {} unsafe impl<T> Sync for SendSyncBoxValue<T> {}

Or something like that. That way I can inherit send/sync as a property of the Ruby child only. I will fix this.

Yeah, that would be fine

Ok, I added this utility in 7ab900b, but since I applied it to another part where we were doing the same thing, it kinda expanded the Rust code that was changed. Basically, like Opaque, we require a &Ruby reference to prove that they are on a Ruby thread before accessing the value.

Sushisource

Cool! The safe box value makes sense to me

Buffered metric support

3acd1b5

Fixes temporalio#176

cretz requested a review from a team April 1, 2025 13:37

Sushisource reviewed Apr 1, 2025

View reviewed changes

Add Ruby Rust utility for Send/Sync BoxValue

7ab900b

Sushisource approved these changes Apr 9, 2025

View reviewed changes

Merge branch 'main' into metric-buffer

9fea660

cretz merged commit 89945c2 into temporalio:main Apr 9, 2025
7 checks passed

cretz deleted the metric-buffer branch April 9, 2025 20:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Buffered metric support #240

Buffered metric support #240

Uh oh!

cretz commented Apr 1, 2025

Uh oh!

Sushisource Apr 1, 2025

Uh oh!

cretz Apr 1, 2025

Uh oh!

Sushisource Apr 1, 2025

Uh oh!

cretz Apr 1, 2025

Uh oh!

Sushisource Apr 1, 2025

Uh oh!

cretz Apr 1, 2025 •

edited

Loading

Uh oh!

Sushisource Apr 1, 2025

Uh oh!

cretz Apr 2, 2025 •

edited

Loading

Uh oh!

Sushisource left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Buffered metric support #240

Buffered metric support #240

Uh oh!

Conversation

cretz commented Apr 1, 2025

What was changed

Checklist

Uh oh!

Sushisource Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

cretz Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

Sushisource Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

cretz Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

Sushisource Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

cretz Apr 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Sushisource Apr 1, 2025

Choose a reason for hiding this comment

Uh oh!

cretz Apr 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Sushisource left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cretz Apr 1, 2025 •

edited

Loading

cretz Apr 2, 2025 •

edited

Loading