Commit 1c34706

Protocol Buffer Team authored and committed
This documentation update includes the following:

* Fixes formatting in Cargo Cult topic
* Adds a news article and a new topic for the upcoming 31.x release
* Updates the encoding and field presence topics to default to editions rather than proto2/proto3
* Adds information to the JSON topic to include information about what happens when numerical data is cast to a smaller type (such as 64-bit data being stored in a 32-bit field)

PiperOrigin-RevId: 738934320
Change-Id: Iaf063ca7479668f5d0639c4ac3a3e7519257830b
1 parent c14731f commit 1c34706

6 files changed

+123
-79
lines changed

content/best-practices/no-cargo-cults.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -7,9 +7,9 @@ type = "docs"
77

88
Do not
99
[cargo cult](https://en.wikipedia.org/wiki/Cargo_cult_programming)
10-
settings in proto files. If \
11-
you are creating a new proto file based on existing schema definitions, don't
12-
apply option settings except for those that you understand the need for.
10+
settings in proto files. If you are creating a new proto file based on existing
11+
schema definitions, don't apply option settings except for those that you
12+
understand the need for.
1313

1414
## Best Practices Specific to Editions {#editions}
1515

content/news/2025-03-18.md

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
1+
+++
2+
title = "Changes Announced on March 18, 2025"
3+
linkTitle = "March 18, 2025"
4+
toc_hide = "true"
5+
description = "Changes announced for Protocol Buffers on March 18, 2025."
6+
type = "docs"
7+
+++
8+
9+
## Dropping Ruby 3.0 Support
10+
11+
As per our official
12+
[Ruby support policy](https://cloud.google.com/ruby/getting-started/supported-ruby-versions),
13+
we will be dropping support for Ruby 3.0 and lower in Protobuf version 31, due
14+
to release in April, 2025. The minimum supported Ruby version will be 3.1.

content/news/v31.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
+++
2+
title = "News Announcements for Version 31.x"
3+
linkTitle = "Version 31.x"
4+
toc_hide = "true"
5+
description = "Changes announced for Protocol Buffers version 31.x."
6+
type = "docs"
7+
+++
8+
9+
The following announcements are specific to Version 31.x. For information
10+
presented chronologically, see [News](/news).
11+
12+
The following sections cover planned breaking changes in the v31 release,
13+
expected in 2025 Q2. Also included are some changes that aren't breaking but may
14+
require action on your part. These describe changes as we anticipate them being
15+
implemented, but due to the flexible nature of software some of these changes
16+
may not land or may vary from how they are described in this topic.
17+
18+
### Dropping Ruby 3.0 Support
19+
20+
As per our official
21+
[Ruby support policy](https://cloud.google.com/ruby/getting-started/supported-ruby-versions),
22+
we will be dropping support for Ruby 3.0. The minimum supported Ruby version
23+
will be 3.1.

content/programming-guides/encoding.md

Lines changed: 52 additions & 66 deletions
@@ -29,13 +29,15 @@ discuss aspects of the wire format.
2929
The Protoscope tool can also dump encoded protocol buffers as text. See
3030
https://github.com/protocolbuffers/protoscope/tree/main/testdata for examples.
3131

32+
All examples in this topic assume that you are using Edition 2023 or later.
33+
3234
## A Simple Message {#simple}
3335

3436
Let's say you have the following very simple message definition:
3537

3638
```proto
3739
message Test1 {
38-
optional int32 a = 1;
40+
int32 a = 1;
3941
}
4042
```
4143

@@ -241,7 +243,7 @@ Consider this message schema:
241243

242244
```proto
243245
message Test2 {
244-
optional string b = 2;
246+
string b = 2;
245247
}
246248
```
247249

@@ -275,7 +277,7 @@ an embedded message of our original example message, `Test1`:
275277

276278
```proto
277279
message Test3 {
278-
optional Test1 c = 3;
280+
Test1 c = 3;
279281
}
280282
```
281283

@@ -293,36 +295,49 @@ and a length of 3, exactly the same way as strings are encoded.
293295
In Protoscope, submessages are quite succinct. ` ``1a03089601`` ` can be written
294296
as `3: {1: 150}`.
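
As a rough illustration of the submessage encoding described above, the following Python sketch builds the bytes for `Test3` by hand. It is not part of the commit and does not use the protobuf library; the helper names (`encode_varint`, `tag`, `encode_len`) are hypothetical, and the sketch assumes non-negative integer values only.

```python
VARINT, LEN = 0, 2  # wire type numbers for VARINT and LEN records


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low group first)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """A record's tag is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_len(field_number: int, payload: bytes) -> bytes:
    """A LEN record: tag, payload length, then the payload bytes."""
    return tag(field_number, LEN) + encode_varint(len(payload)) + payload


# Test1 { a = 150 } is a single VARINT record for field 1.
test1 = tag(1, VARINT) + encode_varint(150)

# Test3 { c = Test1 { a = 150 } } wraps those bytes in a LEN record for field 3.
test3 = encode_len(3, test1)

print(test1.hex())  # 089601
print(test3.hex())  # 1a03089601
```

The outer record is just the inner message's bytes prefixed with a tag and a length, which is why the doc notes that submessages are encoded the same way as strings.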
295297

296-
## Optional and Repeated Elements {#optional}
298+
## Missing Elements {#optional}
297299

298-
Missing `optional` fields are easy to encode: we just leave out the record if
300+
Missing fields are easy to encode: we just leave out the record if
299301
it's not present. This means that "huge" protos with only a few fields set are
300302
quite sparse.
301303

302-
`repeated` fields are a bit more complicated. Ordinary (not [packed](#packed))
303-
repeated fields emit one record for every element of the field. Thus, if we have
304+
<span id="packed"></span>
305+
306+
## Repeated Elements {#repeated}
307+
308+
Starting in Edition 2023, `repeated` fields of a primitive type
309+
(any [scalar type](/programming-guides/proto2#scalar)
310+
that is not `string` or `bytes`) are ["packed"](/editions/features#repeated_field_encoding) by default.
311+
312+
Packed `repeated` fields, instead of being encoded as one
313+
record per entry, are encoded as a single `LEN` record that contains each
314+
element concatenated. To decode, elements are decoded from the `LEN` record one
315+
by one until the payload is exhausted. The start of the next element is
316+
determined by the length of the previous, which itself depends on the type of
317+
the field. Thus, if we have:
304318

305319
```proto
306320
message Test4 {
307-
optional string d = 4;
308-
repeated int32 e = 5;
321+
string d = 4;
322+
repeated int32 e = 6;
309323
}
310324
```
311325

312326
and we construct a `Test4` message with `d` set to `"hello"`, and `e` set to
313-
`1`, `2`, and `3`, this *could* be encoded as `` `220568656c6c6f280128022803`
314-
``, or written out as Protoscope,
327+
`1`, `2`, and `3`, this *could* be encoded as `` `3206038e029ea705` ``, or
328+
written out as Protoscope,
315329

316330
```proto
317331
4: {"hello"}
318-
5: 1
319-
5: 2
320-
5: 3
332+
6: {3 270 86942}
321333
```
322334

323-
However, records for `e` do not need to appear consecutively, and can be
324-
interleaved with other fields; only the order of records for the same field with
325-
respect to each other is preserved. Thus, this could also have been encoded as
335+
However, if the repeated field is set to expanded (overriding the default packed
336+
state) or is not packable (strings and messages) then an entry for each
337+
individual value is encoded. Also, records for `e` do not need to appear
338+
consecutively, and can be interleaved with other fields; only the order of
339+
records for the same field with respect to each other is preserved. Thus, this
340+
could look like the following:
326341

327342
```proto
328343
5: 1
@@ -331,6 +346,24 @@ respect to each other is preserved. Thus, this could also have been encoded as
331346
5: 3
332347
```
333348

349+
Only repeated fields of primitive numeric types can be declared "packed". These
350+
are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
351+
352+
Note that although there's usually no reason to encode more than one key-value
353+
pair for a packed repeated field, parsers must be prepared to accept multiple
354+
key-value pairs. In this case, the payloads should be concatenated. Each pair
355+
must contain a whole number of elements. The following is a valid encoding of
356+
the same message above that parsers must accept:
357+
358+
```proto
359+
6: {3 270}
360+
6: {86942}
361+
```
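
The two parser requirements in this section (accept several packed records for one field, and accept packed and expanded encodings interchangeably) can be sketched as a minimal decoder. The function names below are hypothetical, the sketch handles only the `VARINT` and `LEN` wire types, and it is meant as an illustration rather than a description of how any real protobuf parser is written.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_repeated_varints(buf: bytes, field_number: int) -> list[int]:
    """Collect every element of a repeated varint field, packed or expanded."""
    values = []
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # VARINT: one expanded element (or another field)
            value, pos = decode_varint(buf, pos)
            if number == field_number:
                values.append(value)
        elif wire_type == 2:  # LEN: a packed run (or another field's payload)
            length, pos = decode_varint(buf, pos)
            end = pos + length
            if number == field_number:
                while pos < end:
                    value, pos = decode_varint(buf, pos)
                    values.append(value)
                if pos != end:
                    raise ValueError("packed payload must hold whole elements")
            pos = end  # skip payloads that belong to other fields
        else:
            raise ValueError("wire type not handled in this sketch")
    return values


one_record = bytes.fromhex("3206038e029ea705")       # 6: {3 270 86942}
two_records = bytes.fromhex("3203038e0232039ea705")  # 6: {3 270} 6: {86942}
expanded = bytes.fromhex("3003308e02309ea705")       # 6: 3  6: 270  6: 86942

for encoding in (one_record, two_records, expanded):
    print(decode_repeated_varints(encoding, 6))  # [3, 270, 86942] each time
```

Appending elements in the order they are encountered is what makes the concatenation rule and the packed/expanded equivalence fall out of the same loop.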
362+
363+
Protocol buffer parsers must be able to parse repeated fields that were compiled
364+
as `packed` as if they were not packed, and vice versa. This permits adding
365+
`[packed=true]` to existing fields in a forward- and backward-compatible way.
366+
334367
### Oneofs {#oneofs}
335368

336369
[`Oneof` fields](/programming-guides/proto2#oneof) are
@@ -368,53 +401,6 @@ message.MergeFrom(message2);
368401
This property is occasionally useful, as it allows you to merge two messages (by
369402
concatenation) even if you do not know their types.
370403

371-
### Packed Repeated Fields {#packed}
372-
373-
Starting in v2.1.0, `repeated` fields of a primitive type
374-
(any [scalar type](/programming-guides/proto2#scalar)
375-
that is not `string` or `bytes`) can be declared as "packed". In proto2 this is
376-
done using the field option `[packed=true]`. In proto3 it is the default.
377-
378-
Instead of being encoded as one record per entry, they are encoded as a single
379-
`LEN` record that contains each element concatenated. To decode, elements are
380-
decoded from the `LEN` record one by one until the payload is exhausted. The
381-
start of the next element is determined by the length of the previous, which
382-
itself depends on the type of the field.
383-
384-
For example, imagine you have the message type:
385-
386-
```proto
387-
message Test5 {
388-
repeated int32 f = 6 [packed=true];
389-
}
390-
```
391-
392-
Now let's say you construct a `Test5`, providing the values 3, 270, and 86942
393-
for the repeated field `f`. Encoded, this gives us `` `3206038e029ea705` ``, or
394-
as Protoscope text,
395-
396-
```proto
397-
6: {3 270 86942}
398-
```
399-
400-
Only repeated fields of primitive numeric types can be declared "packed". These
401-
are types that would normally use the `VARINT`, `I32`, or `I64` wire types.
402-
403-
Note that although there's usually no reason to encode more than one key-value
404-
pair for a packed repeated field, parsers must be prepared to accept multiple
405-
key-value pairs. In this case, the payloads should be concatenated. Each pair
406-
must contain a whole number of elements. The following is a valid encoding of
407-
the same message above that parsers must accept:
408-
409-
```proto
410-
6: {3 270}
411-
6: {86942}
412-
```
413-
414-
Protocol buffer parsers must be able to parse repeated fields that were compiled
415-
as `packed` as if they were not packed, and vice versa. This permits adding
416-
`[packed=true]` to existing fields in a forward- and backward-compatible way.
417-
418404
### Maps {#maps}
419405

420406
Map fields are just a shorthand for a special kind of repeated field. If we have
@@ -430,8 +416,8 @@ this is actually the same as
430416
```proto
431417
message Test6 {
432418
message g_Entry {
433-
optional string key = 1;
434-
optional int32 value = 2;
419+
string key = 1;
420+
int32 value = 2;
435421
}
436422
repeated g_Entry g = 7;
437423
}

content/programming-guides/field_presence.md

Lines changed: 24 additions & 10 deletions
@@ -13,12 +13,6 @@ are two different manifestations of presence for protobufs: *implicit presence*,
1313
where the generated message API stores field values (only), and *explicit
1414
presence*, where the API also stores whether or not a field has been set.
1515

16-
Historically, proto2 has mostly followed *explicit presence*, while proto3
17-
exposes only *implicit presence* semantics. Singular proto3 fields of basic
18-
types (numeric, string, bytes, and enums) which are defined with the `optional`
19-
label have *explicit presence*, like proto2 (this feature is enabled by default
20-
as release 3.15).
21-
2216
{{% alert title="Note" color="note" %}} We
2317
recommend always adding the `optional` label for proto3 basic types. This
2418
provides a smoother path to editions, which uses explicit presence by
@@ -179,10 +173,8 @@ affirmatively expose presence, although the same set of hazzer methods may not
179173
generated as in proto2 APIs.
180174

181175
This default behavior of not tracking presence without the `optional` label is
182-
different from the proto2 behavior. We reintroduced
183-
[explicit presence](/editions/features#field_presence) as
184-
the default in edition 2023. We recommend using the `optional` field with proto3
185-
unless you have a specific reason not to.
176+
different from the proto2 behavior. We recommend using the `optional` label with
177+
proto3 unless you have a specific reason not to.
186178

187179
Under the *implicit presence* discipline, the default value is synonymous with
188180
"not present" for purposes of serialization. To notionally "clear" a field (so
@@ -195,6 +187,28 @@ required to have an enumerator value which maps to 0. By convention, this is an
195187
the domain of valid values for the application, this behavior can be thought of
196188
as tantamount to *explicit presence*.
197189

190+
### Presence in Editions APIs
191+
192+
This table outlines whether presence is tracked for fields in editions APIs
193+
(both for generated APIs and using dynamic reflection):
194+
195+
Field type | Explicit Presence
196+
-------------------------------------------- | -----------------
197+
Singular numeric (integer or floating point) | ✔️
198+
Singular enum | ✔️
199+
Singular string or bytes | ✔️
200+
Singular message&#8224; | ✔️
201+
Repeated |
202+
Oneofs&#8224; | ✔️
203+
Maps |
204+
205+
&#8224; Messages and oneofs have never had implicit presence, and editions
206+
doesn't allow you to set `field_presence = IMPLICIT`.
207+
208+
Editions-based APIs track field presence explicitly, similarly to proto2, unless
209+
`features.field_presence` is set to `IMPLICIT`. Similar to proto2 APIs,
210+
editions-based APIs do not track presence explicitly for repeated fields.
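
The difference between the two presence disciplines can be summarized with a toy model. The sketch below is not generated protobuf code and does not use the protobuf library; `Presence` and `is_serialized` are hypothetical names, and the model covers only a singular scalar field whose default is assumed to be `0`.

```python
from enum import Enum


class Presence(Enum):
    EXPLICIT = "EXPLICIT"  # the API remembers whether the field was set
    IMPLICIT = "IMPLICIT"  # only the value is stored


def is_serialized(presence: Presence, was_set: bool, value: int, default: int = 0) -> bool:
    """Toy rule for whether a singular scalar field is written to the wire."""
    if presence is Presence.EXPLICIT:
        # Explicit presence: a field that was set is written, even if it
        # holds the default value.
        return was_set
    # Implicit presence: the default value is synonymous with "not present".
    return value != default


print(is_serialized(Presence.EXPLICIT, was_set=True, value=0))   # True
print(is_serialized(Presence.IMPLICIT, was_set=True, value=0))   # False
print(is_serialized(Presence.IMPLICIT, was_set=True, value=5))   # True
print(is_serialized(Presence.EXPLICIT, was_set=False, value=0))  # False
```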
211+
198212
## Semantic Differences {#semantic-differences}
199213

200214
The *implicit presence* serialization discipline results in visible differences

content/programming-guides/json.md

Lines changed: 7 additions & 0 deletions
@@ -40,6 +40,13 @@ field in any edition of protobuf supports field presence and if set will appear
4040
in the output. Proto3 implicit-presence scalar fields will only appear in the
4141
JSON output if they are not set to the default value for that type.
4242

43+
When representing numerical data in a JSON file, if the number that is parsed
44+
from the wire doesn't fit in the corresponding type, you will get the same
45+
effect as if you had cast the number to that type in C++ (for example, if a
46+
64-bit number is read as an int32, it will be truncated to 32 bits).
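
To make the truncation concrete, the following Python sketch mimics a C++-style cast to `int32`. The helper name `cast_to_int32` is hypothetical, and the sketch assumes two's-complement wraparound (keep the low 32 bits, then reinterpret them as signed), which is what mainstream C++ compilers do for this conversion.

```python
def cast_to_int32(value: int) -> int:
    """Keep the low 32 bits of value and reinterpret them as a signed int32."""
    low = value & 0xFFFFFFFF
    # If bit 31 is set, the 32-bit pattern represents a negative number.
    return low - (1 << 32) if low & 0x80000000 else low


print(cast_to_int32(5))           # 5 (fits, so it is unchanged)
print(cast_to_int32(2**32 + 5))   # 5 (the high bits are dropped)
print(cast_to_int32(2**31))       # -2147483648 (wraps to the minimum int32)
print(cast_to_int32(-1))          # -1
```

The practical consequence is that an out-of-range value does not raise an error in this model; it silently becomes a different number.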
47+
48+
The following table shows how data is represented in JSON files.
49+
4350
<table>
4451
<tbody>
4552
<tr>
