Skip to content

Per-callsite collect trait object #1907

@carllerche

Description

@carllerche

This RFC proposes altering the Collect trait to split out per-callsite
operations into a second trait. Additionally, all thread-local operations are
removed in favor of registering traits statically. Thread-local behavior can be
layered on in tracing-subscriber.

trait RegisterCallsite: 'static {
    fn register_callsite(&self, metadata: &'static Metadata<'static>) -> DynCollect;
}

trait Collect: 'static {
    fn interest(&self, metadata: &'static Metadata<'static>) -> Interest;

    // Rest of the `Collect trait methods are identical, except there is no `reigster_callsite`
}

Note, all type names are mostly placeholder at this point. Naming suggestion are
welcome.

Motivation

It is common for Collect implementations to perform logic based on the type of
callsite being instrumented. This sometimes takes the form of a sequence of if
blocks checking the Metadata when handling an event. For [Tokio Console], each
event results in a linear
scan
.

Supporting per-callsite Collect traits would enable removing this overhead by
enabling the Collect implementor to define separate trait implementation for
each case. Theoretically, a single dynamic dispatch would be required for
instrumentation.

Details

Each callsite will hold a single DynCollect instance. Think of DynCollect as
Arc<dyn Collect> for now. The static callsite struct used by the macro
will look something like:

struct MacroCallsite {
    collect: AtomicDynCollect,

    // The rest stays the same. Roughly....
    interest: AtomicUsize,
    meta: &'static Metadata<'static>,

}

where AtomicDynCollect is a wrapper around AtomicPtr with some added logic
to handle the trait-object (more later). The macro will generate something like:

static CALLSITE: tracing_core::callsite::Entry<MacroCallsite> = default();

The Entry decorator is a node in an intrusive linked list tracking every known
callsite in the system. When the callsite instance is first used, it is appended
to the global linked list. This is similar to the current behavior.

The next step is different. After the callsite is registered with the global
list, it acquires an instance of DynCollect by calling register_callsite on
the global RegisterCallsite instance.

If the collector has no interest in the callsite (equivalent to today's
Interest::Never), then a no-op DynCollect instance is returned. Otherwise,
an instance of DynCollect specific to the callsite is returned by the
RegisterCallsite. The DynCollect is stored in the static callsite instance
using an atomic operation.

The process of instrumenting events is the same as today. The same methods are
called.

DynCollect

The DynCollect is roughly equivalent to Arc<dyn Collect>. However, we would
like to support storing a collect instance in a callsite using an atomic
operation and avoid locking. This will help cases like rustc where the initial
registration process carries significant total cost as callsites are often used
only a handful of times. An Arc<dyn Collect> consists of two pointers: one to
the data and one to the v-table. This means we cannot use an AtomicPtr.

To enable atomic operations for storing Collect instances, a different virtual
dispatch strategy is used. The VTable is embedded as a header to the data. The
DynCollect type looks like:

struct DynCollect {
    ptr: *const DynCollectHeader,
}

struct DynCollectHeader {
    ref_count: AtomicUsize,
    vtable: &'static DynCollectVtable,
}

This is similar to the struct layout and strategy used by Tokio tasks. More
details can be found in that
code

The downside of this strategy is there is no zero-cost conversion between
Arc<dyn Collect> and DynCollect. That said, most tracing users use the
Subscribe API and not the tracing-core::Collect API, so end user impact should
be minimal.

Locking

Currently, the callsite registration process is synchronized using a lock
similar to OnceCell. We could theoretically remove the lock if we are OK w/
optimistic registration. Concurrent first calls to a callsite could both
call register_callsite() and only one would win the compare-and-swap. This may
be confusing the user and benefits are not clear.

Subscribe API

The Subscribe trait in tracing-subscriber should also have a callsite
specific trait. The split would be similar to RegisterCallsite and Collect,
except that the register_callsite method probably could just return regular
trait object like Arc<dyn Subscribe> or Box<dyn Subscribe>.

Open questions

All the naming is TBD. Also, more details might come to light during
implementation. The proposal is mostly a high level sketch.

Metadata

Metadata

Assignees

Labels

crate/coreRelated to the `tracing-core` cratekind/rfcA request for comments to discuss future changes

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions