Skip to content

Conversation

@blizzardfinnegan
Copy link

For more information see Issue #2

This PR is largely similar to #574 , rebased to main, and with edits suggested by the maintainer on said PR implemented to the best of my ability.

Blizzard Finnegan added 18 commits September 22, 2025 11:14
Does not compile, incomplete.
- Rename UnknownFieldSet to UnknownFieldList, as requested by maintainer
  caspermeijn
- Change fixed32 and fixed64 representation to be consistent with C++
  implementation, as requested by caspermeijn
- start resolving build-time errors
    - Currently failing with reexported crate issues
- TODO: What endianness is protobuf?
This section should be added back (and/or modified to fit new
standards), but commenting out for now to reduce points of failure to
resolve creating bad Copy derives
Instead of checking against empty, which can fail differently.
Attempting to force fields with unknown to not have a derived Copy.
While this is a change made by rustfmt/clippy, it's outside the scope of
this PR, so it's been removed
Seems like local clippy check runs in rust edition 2021, CI runs in 2024
@blizzardfinnegan blizzardfinnegan changed the title Support for Unknown Fields Support for Unknown Fields (Rebase of #574) Oct 8, 2025
@blizzardfinnegan
Copy link
Author

@caspermeijn
If there are any objections or additional comments on this, I would love to know. This is an important feature, and one the company I work with is dependent on.

Unfortunately, Github doesn't have a clear way to request reviews on a PR for someone outside the org and/or not a direct developer on the project, otherwise I would have done so. Apologies for the ping if that isn't the right channel.

@alebar42
Copy link

alebar42 commented Oct 9, 2025

@blizzardfinnegan AFAIK you could remove or comment out the ignored tests here in order to verify the implementation: https://github.com/blizzardfinnegan/prost/blob/feature/unknown_fields_support/conformance/failing_tests.txt

@blizzardfinnegan
Copy link
Author

blizzardfinnegan commented Oct 9, 2025

@alebar42 Thank you for the correction and clarification!
I made the relevant changes, and pushed them to my branch. It doesn't seem to have passed through to the PR yet? Might be a CI issue... 🤷

Either way, the tests do in fact pass (locally, at least).

Edit: Unclear what was the issue, but said changes are now in the PR, which is the important part.

@blizzardfinnegan
Copy link
Author

@caspermeijn
Any chance there's some movement on this?

@aidandj
Copy link

aidandj commented Nov 14, 2025

Ran into an issue using pbjson_build with this. Because when you generate unknown fields, pbjson_build doesn't know what to do with them and it errors:

error[E0063]: missing field `unknown_fields` in initializer of `google::protobuf::generated_code_info::Annotation`
    --> /out/pbjson/google.protobuf.serde.rs:3929:20
     |
3929 |                 Ok(super::tonic::google::protobuf::generated_code_info::Annotation {

Copy link
Contributor

@caspermeijn caspermeijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking this forward. It is a large PR, so it takes some time for me to respond.

I miss user facing documentation. Could you add a section to the README that describes this new functionality?


for attr in attrs {
if word_attr("unknown_fields", attr) {
set_bool(&mut unknown, "duplicate message attribute")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is incorrect

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that? If the same attribute (unknown_fields) is found twice, that should be a duplicate attribute. I do think that it should probably have different behaviour though; it should probably just pull all the unknown fields in, I would assume.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the miscommunication, I do agree the error should be here.

The error text doesn't match the error situation. I think it should be something like: "multiple unknown_fields in one message". Do you have an example of what the complete error message that the users sees is?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can definitely make the error message something more descriptive, sure! I don't have an example off-hand though, unfortunately.

});
let merge_fallback = match fields.iter().find(|&(_, f)| f.is_unknown()) {
Some((field_ident, field)) => {
let merge = field.merge(&prost_path, quote!(value));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in seperate section? The fields are already sorted with the unknown fields as last, so this can be in the first loop, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it's in a separate iterator call so it can check if there are any unknown fields, and if there aren't, add the skip_field call at the end of the iteration. There's definitely a better way to do it though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great (but not required) if you could find a better way. My suggestion would be to do a single loop for all field and then always do the skip. If unknown fields is enabled, then the skip branch is never triggered. I don't know whether the compiler accepts a double catch all

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me!

@blizzardfinnegan
Copy link
Author

Thank you for the review! I pretty much did as clean of a direct uplift as I could with the original MR, without changing too much. I'll go through and add docs and resolve the issues you pointed out.

Apologies for the repeated pings. It's definitely a bear of a PR, I wasn't sure if you had seen it or not.

- Remove Option wrapper on unknown fields
- Actually utilise custom iterator instead of nested for loops
- use Iterator.sum() instead of manual counter
- use Buf.get_u64_le() and Buf.get_u32_le() instead of manual copies
- Update conformance to not require failing_tests.txt
- remove empty failing_tests.txt
- Add user-facing documentation for `prost` and `prost-build` for
  enabling unknown fields, along with limitations
- Modify `prost_build::Config::include_unknown_fields` to have a default
  value for the field in which unknown fields are stored. (defaults to
  `unknown_fields`)
@blizzardfinnegan
Copy link
Author

CI is currently failing, for reasons unclear to me. All CI failures are due to this error:

no function or associated item named `new` found for struct `error::DecodeError` in the current scope

Which is not accurate. It's only failing in CI (this failure does not occur if you run the same commands locally, afaict), and only on the files I've changed.

@blizzardfinnegan
Copy link
Author

blizzardfinnegan commented Nov 19, 2025

Ran into an issue using pbjson_build with this. Because when you generate unknown fields, pbjson_build doesn't know what to do with them

I think that might be a separate uplift for the pbjson to support unknown fields as currently written in this PR. There might be some changes to make in the prost implementation to help with that change, but I think getting initial support in to prost is more important than making sure the entire ecosystem works all at once.

If this PR goes through, I would highly recommend opening an issue in pbjson to that effect. I'm not familiar with the project personally, and the devs there would have a better sense of what, if anything, would need to change in prost to get their project to work properly, and could stack a patch on top of this PR.

@aidandj
Copy link

aidandj commented Nov 19, 2025

The change was quite minor, and it worked for me, but I wanted to call out how much of a breaking change this is. And how it is kind of awkward to have to initialize the unknown field field during struct creation.

In all other languages, unknown fields are accessed after the fact, but the user doesn't need to think about them unless needed. I think this is partially an issue with rust not allowing an inherent default value (like other languages) and prost allowing struct literal creation.

In the issue I suggested moving to a constructor with a private unknown fields or something, but those are also breaking changes.

@blizzardfinnegan
Copy link
Author

Part of the issue is also the fact that, the PR as originally written and as reimplemented here, is almost entirely for round trip consistency, and largely doesn't expose the unknown fields. You get an iterator, so it's not impossible to check based on tags, but it's not intuitive or ergonomic, especially since other structs just directly expose the internal fields. This PR also doesn't fully interpret fields beyond the bare minimum to parse the raw message, so there would be more logic necessary to fully interpret contained information.

I like the suggestion of introducing constructors, I think that would help a lot in terms of new stuff being written. That being said, I think you could build a transition plan to introduce constructors, and still allow but deprecate manually creating struct literals, and then loop back around to disallowing struct construction directly later on.

@aidandj
Copy link

aidandj commented Nov 19, 2025

That's fair. I'm new to the rust world and was mostly hoping there was some magic that could be used to include something in a struct, but not have it be in a struct literal. As my use case (and most unknown fields use cases) are just making sure they don't get dropped in transit. I'm not actually doing anything with the unknown fields.

@blizzardfinnegan
Copy link
Author

There is the ..Default::default() call, which sets any not already set values to their default values, but you do have to know that that call is necessary. I've added that to the documentation in the README, but it's not an obvious solution if you aren't already aware of the function.

@aidandj
Copy link

aidandj commented Nov 19, 2025

Yes, that's what I used in my pbjson update, but either way, it's a breaking change if you want to support unknown Fields

1 similar comment
@aidandj
Copy link

aidandj commented Nov 19, 2025

Yes, that's what I used in my pbjson update, but either way, it's a breaking change if you want to support unknown Fields

@caspermeijn
Copy link
Contributor

CI is currently failing, for reasons unclear to me. All CI failures are due to this error:

no function or associated item named `new` found for struct `error::DecodeError` in the current scope

I recently merge a change to DecodeError: #1330

I think you can reproduce this problem if you rebase your branch to the latest main

@caspermeijn
Copy link
Contributor

Yes, that's what I used in my pbjson update, but either way, it's a breaking change if you want to support unknown Fields

No, this PR is not a breaking change. You have to explicitly enable the unknown fields, so existing users are not broken. Only the users that choose for unknown fields are broken.

@aidandj
Copy link

aidandj commented Nov 21, 2025

Right. I was saying that enabling unknown fields is a breaking change, and in all other languages it is just implicitly supported. I don't think there is a current way to make this not the case, but I originally was starting a discussion to see if one could be teased out.


for attr in attrs {
if word_attr("unknown_fields", attr) {
set_bool(&mut unknown, "duplicate message attribute")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the miscommunication, I do agree the error should be here.

The error text doesn't match the error situation. I think it should be something like: "multiple unknown_fields in one message". Do you have an example of what the complete error message that the users sees is?

});
let merge_fallback = match fields.iter().find(|&(_, f)| f.is_unknown()) {
Some((field_ident, field)) => {
let merge = field.merge(&prost_path, quote!(value));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great (but not required) if you could find a better way. My suggestion would be to do a single loop for all field and then always do the skip. If unknown fields is enabled, then the skip branch is never triggered. I don't know whether the compiler accepts a double catch all


/// Gets an iterator over the fields contained in this set.
pub fn iter(&self) -> impl Iterator<Item = (u32, &UnknownField)> {
UnknownFieldIter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to replace the custom iterator by some kind of Iterator::flatten() call? Please let me know if you don't understand what I mean with that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried tinkering with it for a while, but since the second iterator is inside a tuple, I wasn't having much luck. I'll have another go at it this coming Tuesday, see if I can manage something better.

@aidandj
Copy link

aidandj commented Nov 21, 2025

Random thought. Protobuf naming guide recommends not using underscores as the initial character. Although it's not enforced, suggesting or using a default struct field name with a underscore prefix would be much more unlikely to have naming conflicts.

https://protobuf.dev/programming-guides/style/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants