Commit d7ced57

Resolve semantic inconsistencies for non traditional messaging
Fixes #977
1 parent dc5b511 commit d7ced57

1 file changed

+35
-23
lines changed

specification/trace/semantic_conventions/messaging.md

Lines changed: 35 additions & 23 deletions
@@ -27,16 +27,18 @@
2727

2828
Although messaging systems are not as standardized as, e.g., HTTP, it is assumed that the following definitions are applicable to most of them that have similar concepts at all (names borrowed mostly from JMS):
2929

30-
A *message* usually consists of headers (or properties, or meta information) and an optional body. It is sent by a single message *producer* to:
31-
32-
* Physically: some message *broker* (which can be e.g., a single server, or a cluster, or a local process reached via IPC). The broker handles the actual routing, delivery, re-delivery, persistence, etc. In some messaging systems the broker may be identical or co-located with (some) message consumers.
33-
* Logically: some particular message *destination*.
30+
A *message* is an envelope around a potentially empty payload.
31+
This envelope may offer the possibility to convey additional metadata, often in key/value form.
32+
Messages can be delivered to 0, 1, or multiple consumers depending on the dispatching semantics of the protocol.
33+
Traditional messaging brokers, such as JMS, use the concept of topics when a message is dispatched to potentially multiple consumers and queues when a message is dispatched to a single consumer.
34+
In a messaging system such as Apache Kafka, consumer groups are used. Each record, or message, is sent to a single consumer per consumer group.
35+
Whether a specific message is processed as if it were sent to a topic or a queue depends entirely on the consumer groups and their composition.
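
The consumer-group behaviour described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Kafka client code: the `dispatch` function, the group names, and the round-robin choice of consumer are hypothetical stand-ins for whatever assignment a real broker performs.

```python
from collections import defaultdict
from itertools import cycle


def dispatch(messages, consumer_groups):
    """Deliver each message to exactly one consumer in every group.

    `consumer_groups` maps a group name to its list of consumer names.
    Round-robin selection is an assumption made for this sketch only.
    """
    deliveries = defaultdict(list)
    pickers = {group: cycle(members) for group, members in consumer_groups.items()}
    for message in messages:
        for picker in pickers.values():
            deliveries[next(picker)].append(message)
    return dict(deliveries)


# One group with two members: each message is handled once (queue-like).
queue_like = dispatch(["m1", "m2"], {"billing": ["b1", "b2"]})

# Two groups with one member each: every message reaches both (topic-like).
topic_like = dispatch(["m1", "m2"], {"billing": ["b1"], "audit": ["a1"]})
```

With a single group the destination behaves like a queue, and with several single-member groups it behaves like a topic, even though the producer side is identical in both cases.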
3436

3537
### Destinations
3638

37-
A destination is usually identified by some name unique within the messaging system instance, which might look like an URL or a simple one-word identifier.
38-
Two kinds of destinations are distinguished: *topic*s and *queue*s.
39-
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcasted to all *subscribers* of the topic.
39+
A destination is usually identified by some name unique within the messaging system instance, which might look like a URL or a simple one-word identifier.
40+
Traditional messaging involves two kinds of destinations: *topic*s and *queue*s.
41+
A message that is sent (the send-operation is often called "*publish*" in this context) to a *topic* is broadcast to all consumers that have *subscribed* to the topic.
4042
A message submitted to a queue is processed by a message *consumer* (usually exactly once although some message systems support a more performant at-least-once mode for messages with [idempotent][] processing).
4143

4244
[idempotent]: https://en.wikipedia.org/wiki/Idempotence
@@ -47,11 +49,10 @@ The consumption of a message can happen in multiple steps.
4749
First, the lower-level receiving of a message at a consumer, and then the logical processing of the message.
4850
Often, the waiting for a message is not particularly interesting and hidden away in a framework that only invokes some handler function to process a message once one is received
4951
(in the same way that the listening on a TCP port for an incoming HTTP message is not particularly interesting).
50-
However, in a synchronous conversation, the wait time for a message is important.
5152

5253
### Conversations
5354

54-
In some messaging systems, a message can receive a reply message that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*.
55+
In some messaging systems, a message can receive a reply message, or possibly multiple, that answers a particular other message that was sent earlier. All messages that are grouped together by such a reply-relationship are called a *conversation*.
5556
The grouping usually happens through some sort of "In-Reply-To:" meta information or an explicit *conversation ID* (sometimes called *correlation ID*).
5657
Sometimes a conversation can span multiple message destinations (e.g. initiated via a topic, continued on a temporary one-to-one queue).
5758

@@ -74,6 +75,7 @@ The span name SHOULD be set to the message destination name and the operation be
7475

7576
The destination name SHOULD only be used for the span name if it is known to be of low cardinality (cf. [general span name guidelines](../api.md#span)).
7677
This can be assumed if it is statically derived from application code or configuration.
78+
Wherever possible, the preference is to use real destination names over logical or aliased names.
7779
If the destination name is dynamic, such as a [conversation ID](#conversations) or a value obtained from a `Reply-To` header, it SHOULD NOT be used for the span name.
7880
In these cases, an artificial destination name that best expresses the destination, or a generic, static fallback like `"(temporary)"` for [temporary destinations](#temporary-destinations) SHOULD be used instead.
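
As a rough illustration of the naming guidance above, the helper below picks the destination part of a span name. It is a sketch under stated assumptions: `destination_for_span_name` and its parameters are hypothetical, the caller is assumed to know whether the name is statically derived, and falling back to `"(temporary)"` for every dynamic name is a simplification (the text only names it for temporary destinations).

```python
TEMPORARY_FALLBACK = "(temporary)"


def destination_for_span_name(destination, statically_known, artificial_name=None):
    """Return the destination text that is safe to use in a span name.

    statically_known: True if the name comes from application code or
        configuration, so it can be assumed to be of low cardinality.
    artificial_name: optional low-cardinality label chosen by the
        instrumentation for a dynamic destination (hypothetical parameter).
    """
    if statically_known:
        return destination
    # Dynamic names (conversation IDs, Reply-To values) are never used as-is.
    return artificial_name or TEMPORARY_FALLBACK


static_name = destination_for_span_name("orders", statically_known=True)
dynamic_name = destination_for_span_name("reply-7f3a9c", statically_known=False)
labelled_name = destination_for_span_name(
    "reply-7f3a9c", statically_known=False, artificial_name="order-replies"
)
```

The example destination names (`orders`, `reply-7f3a9c`, `order-replies`) are made up for illustration.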
7981

@@ -118,14 +120,15 @@ The following operations related to messages are defined for these semantic conv
118120
| `messaging.protocol` | string | The name of the transport protocol. | `AMQP`<br>`MQTT` | No |
119121
| `messaging.protocol_version` | string | The version of the transport protocol. | `0.9.1` | No |
120122
| `messaging.url` | string | Connection string. | `tibjmsnaming://localhost:7222`<br>`https://queue.amazonaws.com/80398EXAMPLE/MyQueue` | No |
123+
| `messaging.service` | string | Name of the external broker, or name of the service being interacted with. See note below for a definition. | | No |
121124
| `messaging.message_id` | string | A value used by the messaging system as an identifier for the message, represented as a string. | `452a7c7c7c7048c2f887f61572b18fc2` | No |
122125
| `messaging.conversation_id` | string | The [conversation ID](#conversations) identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | `MyConversationId` | No |
123126
| `messaging.message_payload_size_bytes` | number | The (uncompressed) size of the message payload in bytes. Also use this attribute if it is unknown whether the compressed or uncompressed payload size is reported. | `2738` | No |
124127
| `messaging.message_payload_compressed_size_bytes` | number | The compressed size of the message payload in bytes. | `2048` | No |
125128

126129
**[1]:** Required only if the message destination is either a `queue` or `topic`.
127130

128-
**Additional attribute requirements:** At least one of the following sets of attributes is required:
131+
**Additional attribute recommendations:** At least one of the following sets of attributes is recommended:
129132

130133
* [`net.peer.name`](span-general.md)
131134
* [`net.peer.ip`](span-general.md)
@@ -140,6 +143,7 @@ The following operations related to messages are defined for these semantic conv
140143

141144
Additionally `net.peer.port` from the [network attributes][] is recommended.
142145
Furthermore, it is strongly recommended to add the [`net.transport`][] attribute and follow its guidelines, especially for in-process queueing systems (like [Hangfire][], for example).
146+
`messaging.service` refers to the logical name of the external broker or messaging system to which a message was sent, or from which it was received. In an environment such as Kubernetes, it would be the Kubernetes Service Name.
143147
These attributes should be set to the broker to which the message is sent/from which it is received.
144148

145149
[network attributes]: span-general.md#general-network-connection-attributes
@@ -176,6 +180,17 @@ In RabbitMQ, the destination is defined by an _exchange_ and a _routing key_.
176180
`messaging.destination` MUST be set to the name of the exchange. This will be an empty string if the default exchange is used.
177181
The routing key MUST be provided to the attribute `messaging.rabbitmq.routing_key`, unless it is empty.
178182

183+
#### Apache Kafka
184+
185+
For Apache Kafka, the following additional attributes are defined:
186+
187+
| Attribute name | Notes and examples |
188+
| -------------- | ---------------------------------------------------------------------- |
189+
| `messaging.kafka.message_key` | Differs from `messaging.message_id` in that it's not unique, and can be `null`. The type is a String representation of the type of the actual value. |
190+
| `messaging.kafka.consumer_group` | Name of the Kafka Consumer Group that is handling the message. Only applies to consumers, not producers. |
191+
| `messaging.kafka.client_id` | Client Id for the Consumer or Producer that is handling the message. |
192+
| `messaging.kafka.partition` | Partition the message is sent to. |
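
A minimal sketch of how an instrumentation might assemble the Kafka-specific attributes from the table above. The function name, its parameters, and the choice to omit a `null` key entirely are assumptions made for illustration; only the attribute names come from the table.

```python
def kafka_span_attributes(role, client_id, partition, message_key=None, consumer_group=None):
    """Build the Kafka-specific span attributes for a producer or consumer."""
    if role not in ("producer", "consumer"):
        raise ValueError("role must be 'producer' or 'consumer'")
    attributes = {
        "messaging.system": "kafka",
        "messaging.kafka.client_id": client_id,
        "messaging.kafka.partition": partition,
    }
    # The key may be null; this sketch omits the attribute in that case
    # and otherwise records a string representation of the key.
    if message_key is not None:
        attributes["messaging.kafka.message_key"] = str(message_key)
    # The consumer group only applies to consumers, not producers.
    if role == "consumer" and consumer_group is not None:
        attributes["messaging.kafka.consumer_group"] = consumer_group
    return attributes


producer_attrs = kafka_span_attributes("producer", "client-1", 0, message_key=42, consumer_group="ignored")
consumer_attrs = kafka_span_attributes("consumer", "client-2", 0, consumer_group="billing")
```

Passing a consumer group for a producer is silently dropped here, and the integer key `42` is recorded as the string `"42"`.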
193+
179194
## Examples
180195

181196
### Topic with multiple consumers
@@ -197,17 +212,16 @@ Process CB: | Span CB1 |
197212
| Links | | | |
198213
| SpanKind | `PRODUCER` | `CONSUMER` | `CONSUMER` |
199214
| Status | `Ok` | `Ok` | `Ok` |
200-
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` |
201-
| `net.peer.port` | `1234` | `1234` | `1234` |
202215
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` |
203216
| `messaging.destination` | `"T"` | `"T"` | `"T"` |
204217
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` |
218+
| `messaging.service` | `"ms"` | `"ms"` | `"ms"` |
205219
| `messaging.operation` | | `"process"` | `"process"` |
206-
| `messaging.message_id` | `"a1"` | `"a1"`| `"a1"` |
220+
| `messaging.kafka.message_key` | `"a1"` | `"a1"` | `"a1"` |
207221

208222
### Batch receiving
209223

210-
Given is a process P, that sends two messages to a queue Q on messaging system MS, and a process C, which receives both of them in one batch (Span Recv1) and processes each message separately (Spans Proc1 and Proc2).
224+
Given is a process P, that sends two messages to a topic Q on messaging system MS, and a process C, which receives both of them in one batch (Span Recv1) and processes each message separately (Spans Proc1 and Proc2).
211225

212226
Since a span can only have one parent and the propagated trace and span IDs are not known when the receiving span is started, the receiving span will have no parent and the processing spans are correlated with the producing spans using links.
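
The parent and link relationships for this batch-receiving scenario can be modelled with a small data structure. This is a plain-Python sketch, not the OpenTelemetry API: the `Span` class is hypothetical, and parenting the processing spans under `Recv1` is an assumption that the text shown here does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str
    kind: str
    parent: Optional["Span"] = None
    links: List["Span"] = field(default_factory=list)


# Process P produces two messages.
prod1 = Span("Prod1", "PRODUCER")
prod2 = Span("Prod2", "PRODUCER")

# Process C receives both in one batch; the producer contexts are not yet
# known when this span starts, so it has no parent.
recv1 = Span("Recv1", "CONSUMER")

# Each processing span is correlated with its producing span through a link.
proc1 = Span("Proc1", "CONSUMER", parent=recv1, links=[prod1])
proc2 = Span("Proc2", "CONSUMER", parent=recv1, links=[prod2])
```

Links are used because a span can have only one parent, while each processing span still needs to be correlated with the span that produced its message.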
213227

@@ -226,17 +240,16 @@ Process C: | Span Recv1 |
226240
| Links | | | | Span Prod1 | Span Prod2 |
227241
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
228242
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
229-
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
230-
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
231243
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
232244
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
233-
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
245+
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` | `"topic"` | `"topic"` |
246+
| `messaging.service` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
234247
| `messaging.operation` | | | `"receive"` | `"process"` | `"process"` |
235-
| `messaging.message_id` | `"a1"` | `"a2"` | | `"a1"` | `"a2"` |
248+
| `messaging.kafka.message_key` | `"a1"` | `"a2"` | | `"a1"` | `"a2"` |
236249

237250
### Batch processing
238251

239-
Given is a process P, that sends two messages to a queue Q on messaging system MS, and a process C, which receives both of them separately (Span Recv1 and Recv2) and processes both messages in one batch (Span Proc1).
252+
Given is a process P, that sends two messages to a topic Q on messaging system MS, and a process C, which receives both of them separately (Span Recv1 and Recv2) and processes both messages in one batch (Span Proc1).
240253

241254
Since each span can only have one parent, C3 should not choose a random parent out of C1 and C2, but rather rely on the implicitly selected parent as defined by the [tracing API spec](../api.md).
242255
Similarly, only one value can be set as `message_id`, so C3 cannot report both `a1` and `a2`, and therefore the attribute is left out.
@@ -259,10 +272,9 @@ Process C: | Span Recv1 | Span Recv2 |
259272
| Links | | | | | Span Prod1 + Prod2 |
260273
| SpanKind | `PRODUCER` | `PRODUCER` | `CONSUMER` | `CONSUMER` | `CONSUMER` |
261274
| Status | `Ok` | `Ok` | `Ok` | `Ok` | `Ok` |
262-
| `net.peer.name` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
263-
| `net.peer.port` | `1234` | `1234` | `1234` | `1234` | `1234` |
264275
| `messaging.system` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` | `"kafka"` |
265276
| `messaging.destination` | `"Q"` | `"Q"` | `"Q"` | `"Q"` | `"Q"` |
266-
| `messaging.destination_kind` | `"queue"` | `"queue"` | `"queue"` | `"queue"` | `"queue"` |
277+
| `messaging.destination_kind` | `"topic"` | `"topic"` | `"topic"` | `"topic"` | `"topic"` |
278+
| `messaging.service` | `"ms"` | `"ms"` | `"ms"` | `"ms"` | `"ms"` |
267279
| `messaging.operation` | | | `"receive"` | `"receive"` | `"process"` |
268-
| `messaging.message_id` | `"a1"` | `"a2"` | `"a1"` | `"a2"` | |
280+
| `messaging.kafka.message_key` | `"a1"` | `"a2"` | `"a1"` | `"a2"` | |
