Skip to content

Commit 41f773f

Browse files
kenfinniganarminruOberon00
authored
Resolve semantic inconsistencies for non traditional messaging (open-telemetry#1027)
Co-authored-by: Armin Ruech <armin.ruech@dynatrace.com> Co-authored-by: Christian Neumüller <christian+github@neumueller.me>
1 parent 2acbd8d commit 41f773f

File tree

2 files changed

+129
-19
lines changed

2 files changed

+129
-19
lines changed

semantic_conventions/trace/messaging.yaml

Lines changed: 48 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ groups:
2626
brief: "A message sent to a queue"
2727
- id: topic
2828
value: "topic"
29-
brief: "A message broadcasted to the subscribers of the topic"
29+
brief: "A message sent to a topic"
3030
required:
3131
conditional: 'Required only if the message destination is either a `queue` or `topic`.'
3232
brief: 'The kind of message destination'
@@ -67,10 +67,16 @@ groups:
6767
type: number
6868
brief: 'The compressed size of the message payload in bytes.'
6969
examples: 2048
70+
- ref: net.peer.name
71+
note: >
72+
This should be the IP/hostname of the broker (or other network-level peer) this specific message is sent to/received from.
73+
required:
74+
conditional: If available.
75+
- ref: net.peer.ip
76+
tag: connection-level
77+
required:
78+
conditional: If available.
7079
constraints:
71-
- any_of:
72-
- 'net.peer.name'
73-
- 'net.peer.ip'
7480
- include: network
7581

7682
- id: messaging.producer
@@ -114,3 +120,41 @@ groups:
114120
brief: >
115121
Semantic convention for servers that consume messages received from messaging systems
116122
and always send back replies directed to the producers of these messages.
123+
124+
- id: messaging.kafka
125+
prefix: messaging.kafka
126+
extends: messaging
127+
brief: >
128+
Attributes for Apache Kafka
129+
attributes:
130+
- id: message_key
131+
type: string
132+
brief: >
133+
Message keys in Kafka are used for grouping alike messages to ensure they're processed on the same partition.
134+
They differ from `messaging.message_id` in that they're not unique.
135+
If the key is `null`, the attribute MUST NOT be set.
136+
note: >
137+
If the key type is not string, it's string representation has to be supplied for the attribute.
138+
If the key has no unambiguous, canonical string form, don't include its value.
139+
examples: 'myKey'
140+
- id: consumer_group
141+
type: string
142+
brief: >
143+
Name of the Kafka Consumer Group that is handling the message.
144+
Only applies to consumers, not producers.
145+
examples: 'my-group'
146+
- id: client_id
147+
type: string
148+
brief: >
149+
Client Id for the Consumer or Producer that is handling the message.
150+
examples: 'client-5'
151+
- id: partition
152+
type: number
153+
brief: >
154+
Partition the message is sent to.
155+
examples: 2
156+
- id: tombstone
157+
type: boolean
158+
required:
159+
conditional: 'If missing, it is assumed to be false.'
160+
brief: 'A boolean that is true if the message is a tombstone.'

specification/trace/semantic_conventions/messaging.md

Lines changed: 81 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -16,8 +16,10 @@
1616
- [Messaging attributes](#messaging-attributes)
1717
* [Attributes specific to certain messaging systems](#attributes-specific-to-certain-messaging-systems)
1818
+ [RabbitMQ](#rabbitmq)
19+
+ [Apache Kafka](#apache-kafka)
1920
- [Examples](#examples)
2021
* [Topic with multiple consumers](#topic-with-multiple-consumers)
22+
* [Apache Kafka Example](#apache-kafka-example)
2123
* [Batch receiving](#batch-receiving)
2224
* [Batch processing](#batch-processing)
2325

@@ -27,18 +29,30 @@
2729

2830
Although messaging systems are not as standardized as, e.g., HTTP, it is assumed that the following definitions are applicable to most of them that have similar concepts at all (names borrowed mostly from JMS):
2931

30-
A *message* usually consists of headers (or properties, or meta information) and an optional body. It is sent by a single message *producer* to:
32+
A *message* is an envelope with a potentially empty payload.
33+
This envelope may offer the possibility to convey additional metadata, often in key/value form.
3134

32-
* Physically: some message *broker* (which can be e.g., a single server, or a cluster, or a local process reached via IPC). The broker handles the actual routing, delivery, re-delivery, persistence, etc. In some messaging systems the broker may be identical or co-located with (some) message consumers.
35+
A message is sent by a message *producer* to:
36+
37+
* Physically: some message *broker* (which can be e.g., a single server, or a cluster, or a local process reached via IPC). The broker handles the actual delivery, re-delivery, persistence, etc. In some messaging systems the broker may be identical or co-located with (some) message consumers.
38+
With Apache Kafka, the physical broker a message is written to depends on the number of partitions, and which broker is the *leader* of the partition the record is written to.
3339
* Logically: some particular message *destination*.
3440

41+
Messages can be delivered to 0, 1, or multiple consumers depending on the dispatching semantic of the protocol.
42+
3543
### Destinations
3644

37-
A destination is usually identified by some name unique within the messaging system instance, which might look like an URL or a simple one-word identifier.
38-
Two kinds of destinations are distinguished: *topic*s and *queue*s.
39-
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcasted to all *subscribers* of the topic.
45+
A destination is usually identified by some name unique within the messaging system instance, which might look like a URL or a simple one-word identifier.
46+
Traditional messaging, such as JMS, involves two kinds of destinations: *topic*s and *queue*s.
47+
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcasted to all consumers that have *subscribed* to the topic.
4048
A message submitted to a queue is processed by a message *consumer* (usually exactly once although some message systems support a more performant at-least-once mode for messages with [idempotent][] processing).
4149

50+
In a messaging system such as Apache Kafka, all destinations are *topic*s.
51+
Each record, or message, is sent to a single consumer per consumer group.
52+
Consumer groups provide *deliver once* semantics for consumers of a topic within a group.
53+
Whether a specific message is processed as if it was sent to a topic or queue entirely depends on the consumer groups and their composition.
54+
For instance, there can be multiple consumer groups processing records from the same topic.
55+
4256
[idempotent]: https://en.wikipedia.org/wiki/Idempotence
4357

4458
### Message consumption
@@ -47,11 +61,10 @@ The consumption of a message can happen in multiple steps.
4761
First, the lower-level receiving of a message at a consumer, and then the logical processing of the message.
4862
Often, the waiting for a message is not particularly interesting and hidden away in a framework that only invokes some handler function to process a message once one is received
4963
(in the same way that the listening on a TCP port for an incoming HTTP message is not particularly interesting).
50-
However, in a synchronous conversation, the wait time for a message is important.
5164

5265
### Conversations
5366

54-
In some messaging systems, a message can receive a reply message that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*.
67+
In some messaging systems, a message can receive one or more reply messages that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*.
5568
The grouping usually happens through some sort of "In-Reply-To:" meta information or an explicit *conversation ID* (sometimes called *correlation ID*).
5669
Sometimes a conversation can span multiple message destinations (e.g. initiated via a topic, continued on a temporary one-to-one queue).
5770

@@ -74,6 +87,7 @@ The span name SHOULD be set to the message destination name and the operation be
7487

7588
The destination name SHOULD only be used for the span name if it is known to be of low cardinality (cf. [general span name guidelines](../api.md#span)).
7689
This can be assumed if it is statically derived from application code or configuration.
90+
Wherever possible, the real destination names after resolving logical or aliased names SHOULD be used.
7791
If the destination name is dynamic, such as a [conversation ID](#conversations) or a value obtained from a `Reply-To` header, it SHOULD NOT be used for the span name.
7892
In these cases, an artificial destination name that best expresses the destination, or a generic, static fallback like `"(temporary)"` for [temporary destinations](#temporary-destinations) SHOULD be used instead.
7993

@@ -122,20 +136,19 @@ The following operations related to messages are defined for these semantic conv
122136
| `messaging.conversation_id` | string | The [conversation ID](#conversations) identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | `MyConversationId` | No |
123137
| `messaging.message_payload_size_bytes` | number | The (uncompressed) size of the message payload in bytes. Also use this attribute if it is unknown whether the compressed or uncompressed payload size is reported. | `2738` | No |
124138
| `messaging.message_payload_compressed_size_bytes` | number | The compressed size of the message payload in bytes. | `2048` | No |
139+
| [`net.peer.ip`](span-general.md) | string | Remote address of the peer (dotted decimal for IPv4 or [RFC5952](https://tools.ietf.org/html/rfc5952) for IPv6) | `127.0.0.1` | Conditional<br>If available. |
140+
| [`net.peer.name`](span-general.md) | string | Remote hostname or similar, see note below. [2] | `example.com` | Conditional<br>If available. |
125141

126142
**[1]:** Required only if the message destination is either a `queue` or `topic`.
127143

128-
**Additional attribute requirements:** At least one of the following sets of attributes is required:
129-
130-
* [`net.peer.name`](span-general.md)
131-
* [`net.peer.ip`](span-general.md)
144+
**[2]:** This should be the IP/hostname of the broker (or other network-level peer) this specific message is sent to/received from.
132145

133146
`messaging.destination_kind` MUST be one of the following:
134147

135148
| Value | Description |
136149
|---|---|
137150
| `queue` | A message sent to a queue |
138-
| `topic` | A message broadcasted to the subscribers of the topic |
151+
| `topic` | A message sent to a topic |
139152
<!-- endsemconv -->
140153

141154
Additionally `net.peer.port` from the [network attributes][] is recommended.
@@ -176,6 +189,26 @@ In RabbitMQ, the destination is defined by an _exchange_ and a _routing key_.
176189
`messaging.destination` MUST be set to the name of the exchange. This will be an empty string if the default exchange is used.
177190
The routing key MUST be provided to the attribute `messaging.rabbitmq.routing_key`, unless it is empty.
178191

192+
#### Apache Kafka
193+
194+
For Apache Kafka, the following additional attributes are defined:
195+
196+
<!-- semconv messaging.kafka -->
197+
| Attribute | Type | Description | Example | Required |
198+
|---|---|---|---|---|
199+
| `messaging.kafka.message_key` | string | Message keys in Kafka are used for grouping alike messages to ensure they're processed on the same partition. They differ from `messaging.message_id` in that they're not unique. If the key is `null`, the attribute MUST NOT be set. [1] | `myKey` | No |
200+
| `messaging.kafka.consumer_group` | string | Name of the Kafka Consumer Group that is handling the message. Only applies to consumers, not producers. | `my-group` | No |
201+
| `messaging.kafka.client_id` | string | Client Id for the Consumer or Producer that is handling the message. | `client-5` | No |
202+
| `messaging.kafka.partition` | number | Partition the message is sent to. | `2` | No |
203+
| `messaging.kafka.tombstone` | boolean | A boolean that is true if the message is a tombstone. | | Conditional<br>If missing, it is assumed to be false. |
204+
205+
**[1]:** If the key type is not string, it's string representation has to be supplied for the attribute. If the key has no unambiguous, canonical string form, don't include its value.
206+
<!-- endsemconv -->
207+
208+
For Apache Kafka producers, [`peer.service`](./span-general.md#general-remote-service-attributes) SHOULD be set to the name of the broker or service the message will be sent to.
209+
The `service.name` of a Consumer's Resource SHOULD match the `peer.service` of the Producer, when the message is directly passed to another service.
210+
If an intermediary broker is present, `service.name` and `peer.service` will not be the same.
211+
179212
## Examples
180213

181214
### Topic with multiple consumers
@@ -199,12 +232,45 @@ Process CB: | Span CB1 |
199232
| Status | `Ok` | `Ok` | `Ok` |
200233
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` |
201234
| `net.peer.port` | `1234` | `1234` | `1234` |
202-
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` |
235+
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
203236
| `messaging.destination` | `"T"` | `"T"` | `"T"` |
204237
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` |
205238
| `messaging.operation` | | `"process"` | `"process"` |
206239
| `messaging.message_id` | `"a1"` | `"a1"`| `"a1"` |
207240

241+
### Apache Kafka Example
242+
243+
Given is a process P, that publishes a message to a topic T1 on Apache Kafka.
244+
One process, CA, receives the message and publishes a new message to a topic T2 that is then received and processed by CB.
245+
246+
```
247+
Process P: | Span Prod1 |
248+
--
249+
Process CA: | Span Rcv1 |
250+
| Span Proc1 |
251+
| Span Prod2 |
252+
--
253+
Process CB: | Span Rcv2 |
254+
```
255+
256+
| Field or Attribute | Span Prod1 | Span Rcv1 | Span Proc1 | Span Prod2 | Span Rcv2
257+
|-|-|-|-|-|-|
258+
| Span name | `"T1 send"` | `"T1 receive"` | `"T1 process"` | `"T2 send"` | `"T2 receive`" |
259+
| Parent | | Span Prod1 | Span Rcv1 | | Span Prod2 |
260+
| Links | | | | Span Prod1 | |
261+
| SpanKind | `PRODUCER` | `CONSUMER` | `CONSUMER` | `PRODUCER` | `CONSUMER` |
262+
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
263+
| `peer.service` | `"myKafka"` | | | `"myKafka"` | |
264+
| `service.name` | | `"myConsumer1"` | `"myConsumer1"` | | `"myConsumer2"` |
265+
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
266+
| `messaging.destination` | `"T1"` | `"T1"` | `"T1"` | `"T2"` | `"T2"` |
267+
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` | `"topic"` | `"topic"` |
268+
| `messaging.operation` | | `"receive"` | `"process"` | | `"receive"` |
269+
| `messaging.kafka.message_key` | `"myKey"` | `"myKey"` | `"myKey"` | `"anotherKey"` | `"anotherKey"` |
270+
| `messaging.kafka.consumer_group` | | `"my-group"` | `"my-group"` | | `"another-group"` |
271+
| `messaging.kafka.client_id` | | `"5"` | `"5"` | `"5"` | `"8"` |
272+
| `messaging.kafka.partition` | | `"1"` | `"1"` | | `"3"` |
273+
208274
### Batch receiving
209275

210276
Given is a process P, that sends two messages to a queue Q on messaging system MS, and a process C, which receives both of them in one batch (Span Recv1) and processes each message separately (Spans Proc1 and Proc2).
@@ -228,7 +294,7 @@ Process C: | Span Recv1 |
228294
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
229295
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
230296
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
231-
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
297+
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
232298
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
233299
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
234300
| `messaging.operation` | | | `"receive"` | `"process"` | `"process"` |
@@ -261,7 +327,7 @@ Process C: | Span Recv1 | Span Recv2 |
261327
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
262328
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
263329
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
264-
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
330+
| `messaging.system` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` | `"rabbitmq"` |
265331
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
266332
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
267333
| `messaging.operation` | | | `"receive"` | `"receive"` | `"process"` |

0 commit comments

Comments
 (0)