From 613d75c97203c1440754aaf0916bb5d4999c20db Mon Sep 17 00:00:00 2001
From: Protocol Buffer Team
Date: Fri, 3 Jan 2025 16:22:40 +0000
Subject: [PATCH 1/3] This documentation change includes the following:

* Removes from `api.md` a tag used internally that prevents rendering
* Removes `.md` from all links to fix those links that it breaks
* Splits the scalar values tables into two to make them easier to read (proto2, proto3, editions topics)
* Clarifies that Extensions are supported in proto2 and editions, but not proto3

PiperOrigin-RevId: 711758660
Change-Id: I34c0585b825731cc729c7c4bf74d9a94ac6f370c
---
 content/best-practices/api.md              |   2 -
 content/getting-started/kotlintutorial.md  |   2 +-
 content/programming-guides/editions.md     | 123 +++++++++++++++-----
 content/programming-guides/proto-limits.md |   3 +-
 content/programming-guides/proto2.md       | 123 +++++++++++++++-----
 content/programming-guides/proto3.md       | 125 ++++++++++++++++-----
 content/programming-guides/style.md        |   2 +-
 content/support/migration.md               |   6 +-
 content/support/version-support.md         |   2 +-
 9 files changed, 289 insertions(+), 99 deletions(-)

diff --git a/content/best-practices/api.md b/content/best-practices/api.md
index 2b1c0cda..3ba341af 100644
--- a/content/best-practices/api.md
+++ b/content/best-practices/api.md
@@ -1,5 +1,3 @@
-
-
 +++
 title = "API Best Practices"
 weight = 100
diff --git a/content/getting-started/kotlintutorial.md b/content/getting-started/kotlintutorial.md
index f16b6a53..d2b67fcc 100644
--- a/content/getting-started/kotlintutorial.md
+++ b/content/getting-started/kotlintutorial.md
@@ -37,7 +37,7 @@ ways to solve this problem:
 - Use kotlinx.serialization. This does not work very well if you need to share data with applications written in C++ or Python.
kotlinx.serialization has a - [protobuf mode](https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/formats.md#protobuf-experimental), + [protobuf mode](https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/formats#protobuf-experimental), but this does not offer the full features of protocol buffers. - You can invent an ad-hoc way to encode the data items into a single string -- such as encoding 4 ints as "12:3:-23:67". This is a simple and diff --git a/content/programming-guides/editions.md b/content/programming-guides/editions.md index fe873282..fbc0c9d3 100644 --- a/content/programming-guides/editions.md +++ b/content/programming-guides/editions.md @@ -329,12 +329,91 @@ A scalar message field can have one of the following types – the table shows t type specified in the `.proto` file, and the corresponding type in the automatically generated class: -
- +
+
- + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
.proto TypeProto Type Notes
double
float
int32Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint32 + instead.
int64Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint64 + instead.
uint32Uses variable-length encoding.
uint64Uses variable-length encoding.
sint32Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int32s.
sint64Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int64s.
fixed32Always four bytes. More efficient than uint32 if values are often + greater than 2<sup>28</sup>.
fixed64Always eight bytes. More efficient than uint64 if values are often + greater than 2<sup>56</sup>.
sfixed32Always four bytes.
sfixed64Always eight bytes.
bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot + be longer than 2<sup>32</sup>.
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>.
+
+ +
+ + + + @@ -347,7 +426,6 @@ automatically generated class: - @@ -360,7 +438,6 @@ automatically generated class: - @@ -373,9 +450,6 @@ automatically generated class: - @@ -388,9 +462,6 @@ automatically generated class: - @@ -403,7 +474,6 @@ automatically generated class: - @@ -416,7 +486,6 @@ automatically generated class: - @@ -429,8 +498,6 @@ automatically generated class: - @@ -443,8 +510,6 @@ automatically generated class: - @@ -457,8 +522,6 @@ automatically generated class: - @@ -471,8 +534,6 @@ automatically generated class: - @@ -485,7 +546,6 @@ automatically generated class: - @@ -498,7 +558,6 @@ automatically generated class: - @@ -511,7 +570,6 @@ automatically generated class: - @@ -524,8 +582,6 @@ automatically generated class: - @@ -538,10 +594,9 @@ automatically generated class: - - + @@ -1440,7 +1495,17 @@ language in the relevant [API reference](/reference/). ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last member seen is used in the parsed message. + only the last run of the last member seen is used in the parsed message. + When parsing data on the wire, starting at the beginning of the bytes, + evaluate the next value, and apply the following parsing rules: + + * First, check if a *different* field in the same oneof is currently set, + and if so clear it. + + * Then apply the contents as though the field was not in a oneof: + + * A primitive will overwrite any value already set + * A message will merge into any value already set * Extensions are not supported for oneof. diff --git a/content/programming-guides/proto-limits.md b/content/programming-guides/proto-limits.md index e306f828..6b56da55 100644 --- a/content/programming-guides/proto-limits.md +++ b/content/programming-guides/proto-limits.md @@ -24,8 +24,7 @@ Empty message extended by singular fields (such as Boolean): * ~4100 fields (proto2) -Extensions are supported -[only by proto2](/programming-guides/version-comparison#extensionsany). 
+Extensions are not supported in proto3. To test this limitation, create a proto message with more than the upper bound number of fields and compile using a Java proto rule. The limit comes from JVM diff --git a/content/programming-guides/proto2.md b/content/programming-guides/proto2.md index d0499654..47852fc0 100644 --- a/content/programming-guides/proto2.md +++ b/content/programming-guides/proto2.md @@ -359,12 +359,91 @@ A scalar message field can have one of the following types – the table shows t type specified in the `.proto` file, and the corresponding type in the automatically generated class: -
-
Proto Type C++ Type Java/Kotlin Type[1] Python Type[3]
double double double float
float float float float
int32Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint32 - instead. int32_t int int
int64Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint64 - instead. int64_t long int/long[4]
uint32Uses variable-length encoding. uint32_t int[2] int/long[4]
uint64Uses variable-length encoding. uint64_t long[2] int/long[4]
sint32Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int32s. int32_t int int
sint64Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int64s. int64_t long int/long[4]
fixed32Always four bytes. More efficient than uint32 if values are often - greater than 2<sup>28</sup>. uint32_t int[2] int/long[4]
fixed64Always eight bytes. More efficient than uint64 if values are often - greater than 2<sup>56</sup>. uint64_t long[2] int/long[4]
sfixed32Always four bytes. int32_t int int
sfixed64Always eight bytes. int64_t long int/long[4]
bool bool boolean bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot - be longer than 2<sup>32</sup>. string String str/unicode[5]
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>. string ByteStringstr (Python 2)
bytes (Python 3)
str (Python 2), bytes (Python 3) []byte String (ASCII-8BIT) ByteString
+
+
- + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
.proto TypeProto Type Notes
double
float
int32Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint32 + instead.
int64Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint64 + instead.
uint32Uses variable-length encoding.
uint64Uses variable-length encoding.
sint32Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int32s.
sint64Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int64s.
fixed32Always four bytes. More efficient than uint32 if values are often + greater than 2<sup>28</sup>.
fixed64Always eight bytes. More efficient than uint64 if values are often + greater than 2<sup>56</sup>.
sfixed32Always four bytes.
sfixed64Always eight bytes.
bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot + be longer than 2<sup>32</sup>.
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>.
+
+ +
+ + + + @@ -377,7 +456,6 @@ automatically generated class: - @@ -390,7 +468,6 @@ automatically generated class: - @@ -403,9 +480,6 @@ automatically generated class: - @@ -418,9 +492,6 @@ automatically generated class: - @@ -433,7 +504,6 @@ automatically generated class: - @@ -446,7 +516,6 @@ automatically generated class: - @@ -459,8 +528,6 @@ automatically generated class: - @@ -473,8 +540,6 @@ automatically generated class: - @@ -487,8 +552,6 @@ automatically generated class: - @@ -501,8 +564,6 @@ automatically generated class: - @@ -515,7 +576,6 @@ automatically generated class: - @@ -528,7 +588,6 @@ automatically generated class: - @@ -541,7 +600,6 @@ automatically generated class: - @@ -554,11 +612,9 @@ automatically generated class: - - + @@ -568,7 +624,6 @@ automatically generated class: - @@ -1515,7 +1570,17 @@ for your chosen language in the relevant ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last member seen is used in the parsed message. + only the last run of the last member seen is used in the parsed message. + When parsing data on the wire, starting at the beginning of the bytes, + evaluate the next value, and apply the following parsing rules: + + * First, check if a *different* field in the same oneof is currently set, + and if so clear it. + + * Then apply the contents as though the field was not in a oneof: + + * A primitive will overwrite any value already set + * A message will merge into any value already set * Extensions are not supported for oneof. diff --git a/content/programming-guides/proto3.md b/content/programming-guides/proto3.md index b1d0b797..f22a60a2 100644 --- a/content/programming-guides/proto3.md +++ b/content/programming-guides/proto3.md @@ -363,12 +363,91 @@ A scalar message field can have one of the following types – the table shows t type specified in the `.proto` file, and the corresponding type in the automatically generated class: -
-
Proto Type C++ Type Java/Kotlin Type[1] Python Type[3]
double double double float
float float float float
int32Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint32 - instead. int32_t int int
int64Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint64 - instead. int64_t long int/long[4]
uint32Uses variable-length encoding. uint32_t int[2] int/long[4]
uint64Uses variable-length encoding. uint64_t long[2] int/long[4]
sint32Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int32s. int32_t int int
sint64Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int64s. int64_t long int/long[4]
fixed32Always four bytes. More efficient than uint32 if values are often - greater than 2<sup>28</sup>. uint32_t int[2] int/long[4]
fixed64Always eight bytes. More efficient than uint64 if values are often - greater than 2<sup>56</sup>. uint64_t long[2] int/long[4]
sfixed32Always four bytes. int32_t int int
sfixed64Always eight bytes. int64_t long int/long[4]
bool bool boolean bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot - be longer than 2<sup>32</sup>. string Stringunicode (Python 2) or str (Python 3)unicode (Python 2), str (Python 3) *string String (UTF-8) string
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>. string ByteString bytes
+
+
- + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
.proto TypeProto Type Notes
double
float
int32Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint32 + instead.
int64Uses variable-length encoding. Inefficient for encoding negative + numbers – if your field is likely to have negative values, use sint64 + instead.
uint32Uses variable-length encoding.
uint64Uses variable-length encoding.
sint32Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int32s.
sint64Uses variable-length encoding. Signed int value. These more + efficiently encode negative numbers than regular int64s.
fixed32Always four bytes. More efficient than uint32 if values are often + greater than 2<sup>28</sup>.
fixed64Always eight bytes. More efficient than uint64 if values are often + greater than 2<sup>56</sup>.
sfixed32Always four bytes.
sfixed64Always eight bytes.
bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot + be longer than 2<sup>32</sup>.
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>.
+
+ +
+ + + + @@ -381,7 +460,6 @@ automatically generated class: - @@ -394,7 +472,6 @@ automatically generated class: - @@ -407,9 +484,6 @@ automatically generated class: - @@ -422,9 +496,6 @@ automatically generated class: - @@ -437,7 +508,6 @@ automatically generated class: - @@ -450,7 +520,6 @@ automatically generated class: - @@ -463,9 +532,6 @@ automatically generated class: - - @@ -477,9 +543,6 @@ automatically generated class: - - @@ -491,8 +554,6 @@ automatically generated class: - @@ -505,8 +566,6 @@ automatically generated class: - @@ -519,7 +578,6 @@ automatically generated class: - @@ -532,7 +590,6 @@ automatically generated class: - @@ -545,7 +602,6 @@ automatically generated class: - @@ -558,8 +614,6 @@ automatically generated class: - @@ -572,10 +626,9 @@ automatically generated class: - - + @@ -1116,7 +1169,17 @@ language in the relevant [API reference](/reference/). ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last member seen is used in the parsed message. + only the last run of the last member seen is used in the parsed message. + When parsing data on the wire, starting at the beginning of the bytes, + evaluate the next value, and apply the following parsing rules: + + * First, check if a *different* field in the same oneof is currently set, + and if so clear it. + + * Then apply the contents as though the field was not in a oneof: + + * A primitive will overwrite any value already set + * A message will merge into any value already set * A oneof cannot be `repeated`. 
diff --git a/content/programming-guides/style.md b/content/programming-guides/style.md index 8542aad8..a505c70d 100644 --- a/content/programming-guides/style.md +++ b/content/programming-guides/style.md @@ -135,7 +135,7 @@ For more service-related guidance, see and [Don't Include Primitive Types in a Top-level Request or Response Proto](/programming-guides/api#dont-include-primitive-types) in the API Best Practices topic, and -[Define Messages in Separate Files](/best-practices/dos-donts.md#separate-files) +[Define Messages in Separate Files](/best-practices/dos-donts#separate-files) in Proto Best Practices. ## Things to Avoid {#avoid} diff --git a/content/support/migration.md b/content/support/migration.md index 947f00c0..53468a84 100644 --- a/content/support/migration.md +++ b/content/support/migration.md @@ -41,7 +41,7 @@ In v22.0, we removed all Autotools support from the protobuf compiler and the C++ runtime. If you're using Autotools to build either of these, you must migrate to [CMake](http://cmake.org) or [Bazel](http://bazel.build). We have some -[dedicated instructions](https://github.com/protocolbuffers/protobuf/blob/main/cmake/README.md) +[dedicated instructions](https://github.com/protocolbuffers/protobuf/blob/main/cmake/README) for setting up protobuf with CMake. ### Abseil Dependency {#abseil} @@ -94,7 +94,7 @@ notable changes include: * For CMake builds, we will first look for an existing Abseil installation pulled in by the top-level CMake configuration (see - [instructions](https://github.com/abseil/abseil-cpp/blob/master/CMake/README.md#traditional-cmake-set-up)). + [instructions](https://github.com/abseil/abseil-cpp/blob/master/CMake/README#traditional-cmake-set-up)). Otherwise, if `protobuf_ABSL_PROVIDER` is set to `module` (its default) we will attempt to build and link Abseil from our git [submodule](https://github.com/protocolbuffers/protobuf/tree/main/third_party). 
@@ -109,7 +109,7 @@ Prior to v22.x, Protobuf incorrectly removed the macro definition for including ``. Starting with v22.x, Protobuf preserves the macro definition. This may break customer code relying on the previous behavior, such as if they use the expression -[`google::protobuf::util::TimeUtil::GetCurrentTime()`](/reference/cpp/api-docs/google.protobuf.util.time_util.md#TimeUtil). +[`google::protobuf::util::TimeUtil::GetCurrentTime()`](/reference/cpp/api-docs/google.protobuf.util.time_util#TimeUtil). To migrate your app to the new behavior, change your code to do one of the following: diff --git a/content/support/version-support.md b/content/support/version-support.md index c0a2ecb8..f3ae4400 100644 --- a/content/support/version-support.md +++ b/content/support/version-support.md @@ -468,7 +468,7 @@ For specific versions supported, see On Android, Protobuf supports the minimum SDK version that is supported by [Google Play services](https://developers.google.com/android/guides/setup) and is the default in -[Jetpack](https://android.googlesource.com/platform/frameworks/support/+/refs/heads/androidx-main/docs/api_guidelines/modules.md#module-minsdkversion). +[Jetpack](https://android.googlesource.com/platform/frameworks/support/+/refs/heads/androidx-main/docs/api_guidelines/modules#module-minsdkversion). If both versions differ, the lower version is supported. ## Objective-C {#objc} From 27b3acb898a88ebaa72e97277bd40ca7258e1477 Mon Sep 17 00:00:00 2001 From: Protocol Buffer Team Date: Mon, 6 Jan 2025 15:34:30 +0000 Subject: [PATCH 2/3] This documentation change includes the following: * Updated language in `/programming-guides/editions.md` * Fixing a link in `/reference/go/go-generated-opaque.md` * Adding an explanation of default behavior for presence to the `/programming-guides/field_presence.md` topic. 
PiperOrigin-RevId: 712523304 Change-Id: Ia2dd5518a91f8cd77431c924c6dbdb56b4189ad7 --- content/programming-guides/editions.md | 6 +++--- content/programming-guides/field_presence.md | 6 ++++++ content/reference/go/go-generated-opaque.md | 2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/content/programming-guides/editions.md b/content/programming-guides/editions.md index fbc0c9d3..fc98b7b0 100644 --- a/content/programming-guides/editions.md +++ b/content/programming-guides/editions.md @@ -1495,9 +1495,9 @@ language in the relevant [API reference](/reference/). ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last run of the last member seen is used in the parsed message. - When parsing data on the wire, starting at the beginning of the bytes, - evaluate the next value, and apply the following parsing rules: + only the last member seen is used in the parsed message. When parsing data + on the wire, starting at the beginning of the bytes, evaluate the next + value, and apply the following parsing rules: * First, check if a *different* field in the same oneof is currently set, and if so clear it. diff --git a/content/programming-guides/field_presence.md b/content/programming-guides/field_presence.md index cc30c2fb..5e7b5b76 100644 --- a/content/programming-guides/field_presence.md +++ b/content/programming-guides/field_presence.md @@ -179,6 +179,12 @@ basic types (numeric, string, bytes, and enums), either. Oneof fields affirmatively expose presence, although the same set of hazzer methods may not generated as in proto2 APIs. +This default behavior of not tracking presence without the `optional` label is +different from the proto2 behavior. We reintroduced +[explicit presence](/editions/features#field_presence) as +the default in edition 2023. We recommend using the `optional` field with proto3 +unless you have a specific reason not to. 
+ Under the *implicit presence* discipline, the default value is synonymous with "not present" for purposes of serialization. To notionally "clear" a field (so it won't be serialized), an API user would set it to the default value. diff --git a/content/reference/go/go-generated-opaque.md b/content/reference/go/go-generated-opaque.md index c68b6d33..b745a1b5 100644 --- a/content/reference/go/go-generated-opaque.md +++ b/content/reference/go/go-generated-opaque.md @@ -175,7 +175,7 @@ protoc […] --go_opt=default_api_level=API_HYBRID To override the default API level for a specific file (instead of all files), use the `apilevelM` mapping flag (similar to [the `M` flag for import -paths](/reference/go/go-generated/#package)): +paths](#package)): ``` protoc […] --go_opt=apilevelMhello.proto=API_HYBRID From b0d0efcb278b022f8a53b72f3cbe3b769e602f54 Mon Sep 17 00:00:00 2001 From: Protocol Buffer Team Date: Mon, 6 Jan 2025 19:55:12 +0000 Subject: [PATCH 3/3] This documentation change includes updated language in `/programming-guides/proto2.md` and `/programming-guides/proto3.md` PiperOrigin-RevId: 712609450 Change-Id: Ie277449ddc556b1a773211cb06c9a5a53332704f --- content/programming-guides/proto2.md | 6 +++--- content/programming-guides/proto3.md | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/content/programming-guides/proto2.md b/content/programming-guides/proto2.md index 47852fc0..dcdfdd78 100644 --- a/content/programming-guides/proto2.md +++ b/content/programming-guides/proto2.md @@ -1570,9 +1570,9 @@ for your chosen language in the relevant ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last run of the last member seen is used in the parsed message. - When parsing data on the wire, starting at the beginning of the bytes, - evaluate the next value, and apply the following parsing rules: + only the last member seen is used in the parsed message. 
When parsing data + on the wire, starting at the beginning of the bytes, evaluate the next + value, and apply the following parsing rules: * First, check if a *different* field in the same oneof is currently set, and if so clear it. diff --git a/content/programming-guides/proto3.md b/content/programming-guides/proto3.md index f22a60a2..2a0aab30 100644 --- a/content/programming-guides/proto3.md +++ b/content/programming-guides/proto3.md @@ -1169,9 +1169,9 @@ language in the relevant [API reference](/reference/). ``` * If the parser encounters multiple members of the same oneof on the wire, - only the last run of the last member seen is used in the parsed message. - When parsing data on the wire, starting at the beginning of the bytes, - evaluate the next value, and apply the following parsing rules: + only the last member seen is used in the parsed message. When parsing data + on the wire, starting at the beginning of the bytes, evaluate the next + value, and apply the following parsing rules: * First, check if a *different* field in the same oneof is currently set, and if so clear it.
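
The oneof parsing rules touched by these three patches (clear any different member of the same oneof, then apply the value as though the field were not in a oneof) can be sketched as a toy model. This is not the protobuf runtime: the field names in `ONEOF_MEMBERS`, the helper names, and the use of plain dicts to stand in for messages are all assumptions made for illustration, and repeated-field concatenation during a real merge is deliberately left out.

```python
# Toy model of the oneof parsing rules; NOT the protobuf library.
# Hypothetical schema: these field names all belong to one oneof.
ONEOF_MEMBERS = {"name", "id", "sub_message"}


def merge_message(existing, incoming):
    """Merge two dict-shaped 'messages': nested dicts merge, primitives overwrite."""
    merged = dict(existing)
    for key, value in incoming.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_message(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_wire_values(wire_values):
    """Apply (field_name, value) pairs in wire order and return the parsed fields."""
    parsed = {}
    for field, value in wire_values:
        if field in ONEOF_MEMBERS:
            # Rule 1: clear any *different* member of the same oneof.
            for other in ONEOF_MEMBERS - {field}:
                parsed.pop(other, None)
        # Rule 2: apply the value as though the field were not in a oneof.
        if isinstance(value, dict) and isinstance(parsed.get(field), dict):
            parsed[field] = merge_message(parsed[field], value)  # message: merge
        else:
            parsed[field] = value  # primitive: overwrite


    return parsed


# Two consecutive occurrences of the same message member are merged.
merged_run = apply_wire_values([("sub_message", {"a": 1}), ("sub_message", {"b": 2})])

# An intervening different member clears the earlier value, so only the
# final occurrence of sub_message survives.
interrupted = apply_wire_values(
    [("sub_message", {"a": 1}), ("name", "x"), ("sub_message", {"b": 2})]
)
```

In this model the second case drops `{"a": 1}` because `name` cleared `sub_message` before it reappeared, which is one way to read the difference between the "last member seen" and "last run of the last member seen" wordings that patches 2 and 3 revert.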
Proto Type C++ Type Java/Kotlin Type[1] Python Type[3]
double double double float
float float float float
int32Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint32 - instead. int32_t int int
int64Uses variable-length encoding. Inefficient for encoding negative - numbers – if your field is likely to have negative values, use sint64 - instead. int64_t long int/long[4]
uint32Uses variable-length encoding. uint32_t int[2] int/long[4]
uint64Uses variable-length encoding. uint64_t long[2] int/long[4]
sint32Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int32s.int32_t int int int32
sint64Uses variable-length encoding. Signed int value. These more - efficiently encode negative numbers than regular int64s.int64_t long int/long[4] int64
fixed32Always four bytes. More efficient than uint32 if values are often - greater than 2<sup>28</sup>. uint32_t int[2] int/long[4]
fixed64Always eight bytes. More efficient than uint64 if values are often - greater than 2<sup>56</sup>. uint64_t long[2] int/long[4]
sfixed32Always four bytes. int32_t int int
sfixed64Always eight bytes. int64_t long int/long[4]
bool bool boolean bool
stringA string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot - be longer than 2<sup>32</sup>. string String str/unicode[5]
bytesMay contain any arbitrary sequence of bytes no longer than 2<sup>32</sup>. string ByteStringstr (Python 2)
bytes (Python 3)
str (Python 2), bytes (Python 3) []byte String (ASCII-8BIT) ByteString
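
The size notes in the scalar tables (negative `int32` values are inefficient, `sint32` handles them better, and `fixed32`/`fixed64` win above 2<sup>28</sup>/2<sup>56</sup>) follow from how base-128 varints and ZigZag encoding work. The sketch below is a rough illustration that counts payload bytes only; the function names are hypothetical and it does not use or reproduce the protobuf library.

```python
def varint_size(value: int) -> int:
    """Bytes needed to store a non-negative integer as a base-128 varint."""
    size = 1
    while value >= 0x80:  # each varint byte carries 7 payload bits
        value >>= 7
        size += 1
    return size


def int32_wire_size(n: int) -> int:
    """Payload size of an int32: negatives are sign-extended to 64 bits."""
    return varint_size(n & 0xFFFFFFFFFFFFFFFF)


def zigzag32(n: int) -> int:
    """ZigZag-map a signed 32-bit value: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def sint32_wire_size(n: int) -> int:
    """Payload size of a sint32: ZigZag first, then varint."""
    return varint_size(zigzag32(n))


FIXED32_SIZE = 4  # fixed32/sfixed32 are always four bytes
FIXED64_SIZE = 8  # fixed64/sfixed64 are always eight bytes

negative_as_int32 = int32_wire_size(-1)    # ten bytes
negative_as_sint32 = sint32_wire_size(-1)  # one byte
below_threshold = varint_size(2**28 - 1)   # four bytes, same as fixed32
at_threshold = varint_size(2**28)          # five bytes, larger than fixed32
```

The same arithmetic gives the 64-bit threshold: `varint_size(2**56)` is nine bytes, one more than the eight bytes `fixed64` always uses.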