Fix and enhance support for different bazel metadata versions #4194

@abraemer abraemer commented Mar 17, 2025

I noticed that the conditions deciding between the two versions of .bzl files are somewhat wrong. In Python, 'a' and 'b' in some_string parses as 'a' and ('b' in some_string), because in binds more tightly than and. Since the non-empty string 'a' is always truthy, the expression simplifies to 'b' in some_string. In this instance, I think it shouldn't make a huge difference because, fortunately, the strings upstream_address and vcs_commit_hash still somewhat distinguish between the versions.
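To illustrate the pitfall described above, here is a minimal sketch. The function names and the sample mapping are hypothetical and are not taken from the actual code; only the two key names come from this description.

```python
def buggy_has_all(fields):
    # Parsed as: 'upstream_address' and ('vcs_commit_hash' in fields).
    # The first operand is a non-empty (truthy) string, so only the
    # membership test on the last key decides the result.
    return 'upstream_address' and 'vcs_commit_hash' in fields


def fixed_has_all(fields):
    # Test every key explicitly.
    return all(key in fields for key in ('upstream_address', 'vcs_commit_hash'))


# Hypothetical metadata where 'upstream_address' is missing.
sample = {'vcs_commit_hash': 'abc123'}
print(buggy_has_all(sample))  # True, although one key is absent
print(fixed_has_all(sample))  # False
```

The chained form silently checks only the last key, which is why the two conditions still happened to tell the versions apart.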

I think just fixing the conditions would make them too restrictive (AFAIK there is no real standard for these METADATA.bzl files), so I combined both paths into one. This ensures that we always extract as much information as possible.

I also added the option to use information from a package_url field. This is important to me because we often have METADATA.bzl files that are generated from Maven dependencies, so their name is invalid according to the PURL specification. I wrote a somewhat realistic test case for this.
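As a rough illustration of why a package_url field helps, the sketch below splits a PURL of the simple form pkg:type/namespace/name@version into its parts. The helper split_simple_purl is hypothetical and deliberately simplified (it ignores qualifiers, subpaths, and percent-encoding); it is not the code from this PR, and a real implementation would more likely rely on a dedicated PURL parsing library.

```python
def split_simple_purl(purl):
    """Split 'pkg:type/namespace/name@version' into its parts.

    Simplified on purpose: qualifiers ('?...'), subpaths ('#...') and
    percent-encoding are not handled.
    """
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg" or not rest:
        raise ValueError(f"not a package URL: {purl!r}")
    path, _, version = rest.partition("@")
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2:
        raise ValueError(f"missing type or name: {purl!r}")
    return {
        "type": segments[0],
        # Everything between the type and the name is the namespace.
        "namespace": "/".join(segments[1:-1]) or None,
        "name": segments[-1],
        "version": version or None,
    }


parts = split_simple_purl("pkg:maven/androidx.compose.animation/animation@0.0.1")
print(parts["type"], parts["namespace"], parts["name"], parts["version"])
```

With the Maven coordinates split this way, the namespace and name are available separately instead of being fused into a single name such as androidx.compose.animation:animation.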

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁

Closes #4196

Signed-off-by: Adrian Braemer <[email protected]>
@abraemer abraemer force-pushed the fix-conditions-for-bzl-package branch from 1647116 to bdb908c Compare March 17, 2025 14:36
* by combining the paths we ensure to extract maximal information
* I also added the possibility to extract information from a 'package_url' field

Signed-off-by: Adrian Braemer <[email protected]>
@abraemer abraemer changed the title Fix conditions for bzl package versions Fix and enhance support for different bazel metadata versions Mar 17, 2025
@abraemer
Contributor Author

Test failures seem unrelated to my changes

Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Thanks a lot @abraemer, a couple of nits for your consideration. Looking good otherwise.

"name": "androidx.compose.animation:animation",
"upstream_address": "https://developer.android.com/jetpack/androidx/releases/compose-animation#0.0.1",
"version": "0.0.1",
"package_url": "pkg:maven/androidx.compose.animation/animation@0.0.1"
Member

Could you link to some examples of this type of manifest with package_url fields and add one of them as a test? Real-world examples are probably best. If you got this from a real example, you can also link that file here.

Contributor Author

Unfortunately, I cannot link you one of our internal files. However my example file is very close to an actual file (the version number is wrong but that's all).

Member

Unfortunately, I cannot link you one of our internal files.

No worries on that obviously, but can you find some other real examples with package_url anywhere on github/elsewhere, or is this relatively new?

Contributor Author

I searched GitHub but did not find any other files that use that field. Then again, I did not find many examples of bzl files at all, so perhaps it is not very common (yet?).
As this doesn't interfere with anything else, would you be okay with merging this anyway? After all, it just adds another possible data source for bzl files that could be useful to others :)

# TODO: Store 'upstream_hash` somewhere
)
if 'vcs_commit_hash' in metadata_fields:
    package_data["extra_data"] = dict(vcs_commit_hash=metadata_fields['vcs_commit_hash'])
Member

Wouldn't it be better to do a get() rather than checking if this exists in the mapping?

Contributor Author

When including the upstream_hash, I rewrote this part. Now we always create the extra_data dict with all keys and I use .get to extract the values.
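The approach described here could look roughly like the following sketch. The helper name build_extra_data and the sample input are hypothetical; only the key names vcs_commit_hash and upstream_hash and the use of .get come from this thread.

```python
def build_extra_data(metadata_fields):
    # Always create the dict with all keys; .get returns None for any
    # key that is absent, so no membership branches are needed.
    return {
        "vcs_commit_hash": metadata_fields.get("vcs_commit_hash"),
        "upstream_hash": metadata_fields.get("upstream_hash"),
    }


# Hypothetical parsed metadata that only carries one of the two hashes.
metadata_fields = {"name": "example", "upstream_hash": "deadbeef"}
print(build_extra_data(metadata_fields))
```

One consequence of this design is that the resulting dict always has the same shape, with None marking values that were not present in the metadata file.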

@abraemer
Contributor Author

abraemer commented Apr 4, 2025

Friendly ping, since it has been 2 weeks now and I think this is ready :) What do you think, @AyanSinhaMahapatra?

Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Thanks for the changes @abraemer, this is almost ready with a couple small changes, apologies for the late review 😅

@abraemer abraemer force-pushed the fix-conditions-for-bzl-package branch from 65d2995 to 1051aa6 Compare April 7, 2025 07:51
@abraemer
Contributor Author

@AyanSinhaMahapatra friendly ping :)

Member

@AyanSinhaMahapatra AyanSinhaMahapatra left a comment

Thanks++ @abraemer, last nit for your consideration. Ready to merge otherwise. Thanks for your patience.

Could you also merge from develop again? That should get rid of the flaky test failures.

* always create extra_data dictionary
* use get to extract information from metadatafields instead of branches
* also extract upstream_hash

Signed-off-by: Adrian Braemer <[email protected]>
@abraemer abraemer force-pushed the fix-conditions-for-bzl-package branch from 1051aa6 to 658f0d8 Compare April 17, 2025 14:04
@abraemer
Contributor Author

@AyanSinhaMahapatra last nit is fixed and branch is up-to-date with develop. Let's see what CI says :)

As we now store the upstream hash in the field `extra_data`, we need to make the test aware of that.

Signed-off-by: Adrian Braemer <[email protected]>
@AyanSinhaMahapatra AyanSinhaMahapatra merged commit 98fa3fd into aboutcode-org:develop Apr 23, 2025
38 checks passed
@AyanSinhaMahapatra
Member

Thanks++ @abraemer, all green and merging!

@abraemer abraemer deleted the fix-conditions-for-bzl-package branch April 25, 2025 06:08
Development

Successfully merging this pull request may close these issues.

Extraction of package data from Bazel Metadata files is too strict
2 participants