Skip to content

Refactor scripts/create_repository.py, fix some issues #1639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

mbland
Copy link
Contributor

@mbland mbland commented Oct 30, 2024

Description

Reformats and refactors scripts/create_repository.py, prevents artifact downgrades, and improves performance. Part of #1482.

This change is comprised of several commits:

  • The first is a pure reformatting and refactoring that produces the same output as before.
  • The second is further reformatting and refactoring that still produces the same output.
  • The third is further reformatting and refactoring still, but with a performance improvement, and prevents downgrading protobuf-java.
  • The fourth is a small refactoring, but one I thought worth separating from previous ones.

Commits after that proceed to resolve various issues and improve performance:

  • Ensure Scala 2 libs get "_2" repo names in Scala 3
  • Bump Scalafmt to 3.8.3 in to match the version set in Update to Scalafmt 3.8.3 #1631
  • Prevent jar downgrades by comparing new and existing versions
  • Write all artifact deps to repo files, not just ones already existing in the original repository file

If this is too much for one pull request, I'm happy to tease it into as many smaller ones as requested.

I tested along the way by first restoring previous repository files at specific $COMMIT values via:

for f in third_party/repositories/scala_*.bzl; do git show $COMMIT:$f > $f; done

then I'd:

  • run the script the first time to ensure the repository files remained the same as before (for the pure refactorings), or that the output was as expected
  • git add third_party/repositories/scala_*.bzl if the updates were as expected
  • run the script a second time to ensure no new changes appeared in the working directory

Motivation

Some of the trickiest, most time consuming work I've done while resolving #1482 has been in resolving repository versions by hand. This script is now starting to save me soooo much time. However, it needed some extra love not to undo some of the changes I've needed to make.

The easiest way for me to do that was to start with some pure reformatting and easy refactoring. After a couple iterations, I understood the script better, and could see where to add or update the logic to resolve some issues.

There's more I could do with this script to improve it further, I think, but that's for future pull requests. Many thanks and kudos to @lm1nrt for contributing the original in #1608.

Preserves the behavior of the previous version in preparation for
further improvements and eventual behavior changes. Part of bazel-contrib#1482.

Tested by running the original script, calling `git stash
third_party/...`, then running the updated script. Confirmed that the
updated script produced no new changes in the working tree compared to
the indexed changes.
Preserves the behavior of the previous version in preparation for further
improvements and eventual behavior changes. Part of bazel-contrib#1482.

This change eliminates a lot of duplication, and replaces the previous
functions to add trailing commas to the output with a `re.sub()` call.

Tested by running the original script, calling `git stash third_party/...`,
then running the updated script. Confirmed that the updated script produced no
new changes in the working tree compared to the indexed changes.

Also:

- Added `#!/usr/bin/env python3` and set file mode 0755 to enable
  running the script directly instead of as an argument to `python3`.

- Used `pathlib.Path` on `__file__` to enable running the script from
  any directory, not just from inside `scripts/`.

- Updated `scripts/README.md` to reflect some of the script changes, and
  to improve the Markdown formatting in general.

- Added the `DOWNLOADED_ARTIFACTS_FILE` variable, added its value to
  `.gitignore`, and added a `Path.unlink()` at the end of the script to
  clean it up.
Preserves the behavior of the previous version in preparation for
further improvements and eventual behavior changes. Part of bazel-contrib#1482.

This vastly improves performance by _not_ downloading and hashing
artifacts that are already present in the current configuration.

Added `PROTOBUF_JAVA_VERSION = "4.28.2"` as a core artifact to avoid
downgrades, and to signal the importance of this particular artifact.

Tested by running the original script, calling `git stash
third_party/...`, then running the updated script. Confirmed that the
updated script produced no new changes in the working tree compared to
the indexed changes, modulo the previous `protobuf-java` downgrade.
Added missing traliling comma in `COORDINATE_GROUPS[0]`.

Updated `get_label` to check for groups starting with `org.scala-lang.`

Sorted `deps` in `to_rules_scala_compatible_dict` and always sets `deps`
in the return value, even when empty.
Without this change, it was possible for Scala 3 core library versions
to become overwritten with Scala 2 versions specified as dependencies of
other jars. Specifically, all the `@io_bazel_rules_scala_scala_*_2` deps
explicitly added to the `scala_3_{1,2,3,4,5}.bzl` files in bazel-contrib#1631 for
Scalafmt get stripped of the `_2` suffix before this change.

Also computed `is_scala_3` in `create_file` and passed it through where
needed. At some point it might be worth refactoring the script into a
proper object instead.
The script will now only update an existing artifact if the newly
resolved version is greater than that already in the
third_party/repositories file. Part of bazel-contrib#1482.

Tested by running on `master` both before and after bazel-contrib#1631. The script
produced no new changes, and is even faster now, since it also won't try
to downgrade existing artifact versions. More specifically, this change
prevents the following downgrades:

- scalap: from latest Scala version to an earlier version, including
  2.13.14 to 2.13.11
- com.lihaoyi.pprint: from pprint_3:0.9.0 to pprint_2.13:0.6.4
- compiler-interface: 1.10.1 to 1.3.5 or 1.9.6
- com.lihaoyi.geny: from geny_3:1.1.1 to geny_2.13:0.6.5
Updates the main loop in `create_file` to iterate over the newly
generated artifact dictionary, not the original dictionary parsed from
the existing file. This ensures that all dependencies of the updated
artifacts are written to the file.

This also improves performance when rerunning, since the script will no
longer keep trying to fetch the previously missing artifacts.

Tested by running once, `git add`ing the results, and running again to
ensure the second run was a no-op.
Fixes a bug in the previous implementation whereby the `scala3_`
components of org.scala-lang artifact labels listed in an artifact's
`deps` weren't transformed to `scala_`. Improved the handling of
org.scala-lang artifacts more generally via the new
`adjust_scala_lang_label` function.

Refactored the `COORDINATE_GROUPS` array into a series of separate `set`
variables named after their `get_label` transformations. Makes the
`get_label` implementation read a little more easily.
I finally noticed that `get_maven_coordinates` already did the same
parsing, essentially. I migrated the comment and implementation from
`split_artifact_and_version` into it.
Updates `create_artifact_version_map` to
`create_current_resolved_artifacts_map` to set up future improvements.

Specifically, an existing artifact's dependency coordinates aren't
updated unless the resolved artifact version is newer. A small change
after this will ensure all existing dependency coordinates get updated.
This could be a one-off change, or we could decide to keep it; either
way, it would be a very small one to make.

Also:

- Added `MavenCoordinates.artifact_name` to replace
  `artifact_name_from_maven_coordinates`.

- Renamed `is_newer_than_current_version` to `is_newer_version`.

- Added logic in `create_artifact_metadata_map` to ensure values with
  `testonly` set to `True` aren't part of the automatic resolution
  process.
@WojciechMazur
Copy link
Contributor

Nice improvements!
I've thought about 1 additional thing that might be included as I was trying to use this script to make currently broken dangerous_test_thirdparty_version.sh work that is to allow updating/boostrapping single version of Scala using argument.

Below there's a patch to old repo version that would handle that, you might include it now or I can create a follow-up pr later.
It include 2 additional small fixes:

  • running script from the root directory (previously it was requiring to run it from ./scritps)
  • printing cs resolve stderr outputs only when command fails
diff --git a/scripts/create_repository.py b/scripts/create_repository.py
index a0bb4df..aaa0339 100644
--- a/scripts/create_repository.py
+++ b/scripts/create_repository.py
@@ -11,6 +11,7 @@ import ast
 import copy
 import glob
 import os
+import argparse
 
 root_scala_versions = ["2.11.12", "2.12.20", "2.13.15", "3.1.3", "3.2.2", "3.3.4", "3.4.3", "3.5.2"]
 scala_test_version = "3.2.9"
@@ -105,7 +106,9 @@ def map_to_resolved_artifacts(output) -> List[ResolvedArtifact]:
 def resolve_artifacts_with_checksums_and_direct_dependencies(root_artifacts) -> List[ResolvedArtifact]:
   command = f'cs resolve {' '.join(root_artifacts)}'
   proc = subprocess.run(command, capture_output=True, text=True, shell=True)
-  print(proc.stderr)
+  if proc.returncode > 0:
+    print("Resolving artifacts failed for command: ", command)
+    print(proc.stderr)
   output = proc.stdout.splitlines()
   return map_to_resolved_artifacts(output)
 
@@ -159,8 +162,8 @@ def write_to_file(artifact_dict, version, file):
     data.write('\nartifacts = ')
     data.write(f'{get_with_trailing_commas(json.dumps(artifact_dict, indent=4).replace('true', 'True').replace('false', 'False'))}\n')
 
-def create_file(version):
-  path = os.getcwd().replace('/scripts', '/third_party/repositories')
+def create_or_update_repository_file(version):
+  path = os.getcwd().replace('/scripts', '') + '/third_party/repositories'
   file = Path(f'{path}/{'scala_' + "_".join(version.split(".")[:2]) + '.bzl'}')
 
   if not file.exists():
@@ -196,5 +199,18 @@ def create_file(version):
 
   write_to_file(original_artifact_dict, version, file)
 
-for version in root_scala_versions:
-  create_file(version)
\ No newline at end of file
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Setups repository configurations for given Scala version.")
+    parser.add_argument(
+        "--version", 
+        type=str, 
+        help="Scala version to bootstrap or update repositiry. If not provided, updates all default versions."
+    )
+    
+    args = parser.parse_args()
+    
+    if args.version is None:
+      for version in root_scala_versions:
+        create_or_update_repository_file(version)
+    else:
+      create_or_update_repository_file(args.version)

Preserves the existing labels for these artifacts.

Tested by running in a new branch that updates the direct dependencies
of core artifacts. Prior to this change, the `scalap` repo and some of
the `org.thesamet.scalapb` repos were duplicated with a new label. After
this change, that duplication disappears.
@mbland
Copy link
Contributor Author

mbland commented Oct 31, 2024

Nice improvements! I've thought about 1 additional thing that might be included as I was trying to use this script to make currently broken dangerous_test_thirdparty_version.sh work that is to allow updating/boostrapping single version of Scala using argument.

Thanks! 🙂

Below there's a patch to old repo version that would handle that, you might include it now or I can create a follow-up pr later. It include 2 additional small fixes:

  • running script from the root directory (previously it was requiring to run it from ./scritps)
  • printing cs resolve stderr outputs only when command fails

Heh, this branch already includes a change to allow it to run from the repo root. I'll see about patching in the stderr change, as well as the --version flag, right now.

@mbland
Copy link
Contributor Author

mbland commented Oct 31, 2024

@WojciechMazur OK, just pushed a new commit incorporating your suggestions. I added a new run_command wrapper around subprocess.run to make sure all the commands only emit output on failure.

Further embellishment of the suggestions from @WojciechMazur in bazel-contrib#1639.

This way we can see exactly which version updates encountered an error,
and how many, while continuing to attempt to update other versions.
Makes the logic slightly easier to follow.
I still like passing it as an argument to
`create_or_update_repository_file`.
Making the output directory configurable from the command line was the
next logical extension. This makes it possible to see what the script
will generate from scratch without having to erase the existing repo
files.

Figured it would be nice to see the default values in the `--help`
output.
Decided it might be better to literally start from scratch in
`--output_dir` than trying to copy a file from `OUTPUT_DIR`. One could
always copy files from `OUTPUT_DIR` before running the script if so
desired.

Also changed the `__main__` logic to exit immediately on
`CreateRepositoryError` rather than attempt to keep going.
Eliminated the check in `create_or_update_repository_file` in favor of
that in `copy_previous_version_or_create_new_file_if_missing`.
`get_label` now uses the `SPECIAL_CASE_GROUP_LABELS` dict instead of the
`LAST_GROUP_COMPONENT_GROUP` and `NEXT_TO_LAST_GROUP_COMPONENT_GROUP`
sets.

Also added `com.google.guava` to `ARTIFACT_LABEL_ONLY_GROUPS` to
generate the correct `io_bazel_rules_scala_guava` label. Added
`com.google.api.grpc` and `dev.dirs.directories` to
`SCALA_PROTO_RULES_GROUPS`.

Updated indentation of `ARTIFACT_LABEL_ONLY_GROUPS` and
`GROUP_AND_ARTIFACT_LABEL_GROUPS` elements.
Specifically, `org_scala_lang_modules_scala_collection_compat`.
Copy link
Collaborator

@simuons simuons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mbland! To be fair I didn't look very deeply into this PR (it's a bit big and this is fine, no need for many small PRs) I trust you and @WojciechMazur ;) Thank you both.

@liucijus liucijus merged commit 35353b7 into bazel-contrib:master Nov 4, 2024
2 checks passed
@mbland mbland deleted the bzlmod-refactor-create_repository branch November 4, 2024 15:27
mbland added a commit to mbland/rules_scala that referenced this pull request Nov 15, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this pull request Nov 27, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this pull request Nov 27, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this pull request Dec 8, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this pull request Dec 10, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
mbland added a commit to mbland/rules_scala that referenced this pull request Dec 16, 2024
Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of bazel-contrib#1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (bazel-contrib#1639, bazel-contrib#1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)
liucijus pushed a commit that referenced this pull request Jan 14, 2025
* Toolchainize all testing toolchains

Moves the Scalatest, JUnit, and Specs2 toolchains into
`@io_bazel_rules_scala_toolchains//testing`. Part of #1482.

Updates all `WORKSPACE` files to set the appropriate `scala_toolchains`
parameters and to remove the unnecessary repository import and toolchain
registration macros.

Adds a `fetch_sources_by_id` parameter to `repositories` from
`third_party/repositories/repositories.bzl`. This enables
`scala_toolchains` to build the `artifact_ids_to_fetch_sources` mapping
from artifact ID lists returned by new macros extracted from `WORKSPACE`
macros. The values assigned to each id match the original
`fetch_sources` settings in the corresponding original `WORKSPACE`
macros.

Updates `scala/scala_maven_import_external.bzl` to generate a `load`
line for `//scala:scala_import.bzl` based on the repo's canonical name,
not `@io_bazel_rules_scala`.

As usual, includes several other opportunistic removals of the
`@io_bazel_rules_scala` repo name prefix to avoid an internal dependency
on that name. This means Bzlmod users won't necessarily have to set the
`repo_name` parameter of `bazel_dep` when using `rules_scala`.

---

Introduces more macros to return a framework's Maven artifact
dependencies, rather than inlining them in a `repositories` call. These
inlined lists are replaced by macro invocations, and now the
`scala_toolchains` macro can invoke these macros to collect artifact IDs
to pass to `repositories`. This also allows for future changes to
introduce a `scala_version` parameter if necessary, similar to how
`scalafmt_artifact_ids` already works.

This is important to avoid collisions when creating repositories for
artifacts upon which more than one framework depends under Bzlmod.
`WORKSPACE` doesn't seem affected by these collisions, but Bzlmod will
produce errors like the following, where both `scala_proto` and
`twitter_scrooge` depend upon `io_bazel_rules_scala_guava`:

```txt
$ bazel build //src/...

ERROR: .../scala/scala_maven_import_external.bzl:299:24:
  Traceback (most recent call last):
    File ".../scala/extensions/deps.bzl",
      line 140, column 21, in _scala_deps_impl
        scala_toolchains(
    File ".../scala/private/macros/toolchains.bzl",
      line 140, column 17, in scala_toolchains
        _scrooge(
    File ".../twitter_scrooge/twitter_scrooge.bzl",
      line 96, column 17, in twitter_scrooge
        repositories(
    File ".../third_party/repositories/repositories.bzl",
      line 113, column 37, in repositories
        _scala_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 263, column 30, in scala_maven_import_external
        jvm_maven_import_external(
    File ".../scala/scala_maven_import_external.bzl",
      line 299, column 24, in jvm_maven_import_external
        jvm_import_external(jar_urls = jar_urls, srcjar_urls = srcjar_urls, coordinates = artifact, **kwargs)

Error in repository_rule: A repo named
  io_bazel_rules_scala_guava_2_13_15
  is already generated by this module extension at
  .../scala/scala_maven_import_external.bzl:299:24

ERROR: Analysis of target
  '//src/java/io/bazel/rulesscala/worker:worker_test'
  failed; build aborted:
  error evaluating module extension scala_deps in
  //scala/extensions:deps.bzl
```

Recent updates to `scripts/create_repository.py` (#1639, #1642) make it
easy to emit full direct dependency lists for artifacts included in
`third_party/repositories/scala_*.bzl`. This increases the likelihood of
collisions, since this expanded metadata forces the macros that
instantiate artifact repos to instantiate even more repos.

By fetching list of artifact IDs from these macros, `scala_toolchains`
can now consolidate them into dictionary keys. Then it passes these
unique keys to `repositories` directly, avoiding the problem of
instantiating the same repo multiple times in the same module extension.

This, in turn, also avoids the need to add parameters to the original
`WORKSPACE` macros that instantiate dependencies to avoid collisions
under Bzlmod. The `scala_toolchains` macro never needs to call these
original macros, under either `WORKSPACE` or Bzlmod.

Finally, it also reduces duplication between these artifact ID lists and
the `_*_DEPS` symbols originally from `testing/BUILD` (and now in
`testing/deps.bzl`). The dependency labels are now generated
programatically.

(Aside: As I mentioned, we may eventually need to pass a Scala version
argument to these macros. It will be possible to cross that bridge
without too much trouble if and when that day comes. Or I can try to
future proof it in a follow up pull request.)

* Move `scala_toolchains` to `scala/toolchains.bzl`

Removes this symbol from `scala/scala.bzl` as well as
`setup_scala_testing_toolchain`, and deletes
`scala/private/macros/toolchains.bzl`. Part of #1482 and #1652.

This is required for Bazel 8 and `rules_java` 8 compatibility, but is
also compatible with Bazel 6 and 7. In #1652, @hvadehra suggested
partitioning the `.bzl` files such that `WORKSPACE` doesn't `load` a
file that tries to `load` symbols from `rules_java`. I successfully did
so in a separate branch, and along with other minor changes, got
`rules_scala` to build with `rules_java` 8.5.1.

The other changes will come in separate pull requests, but it makes
sense to land this change now before adding any other toolchains to
`scala_toolchains`.

---

Arguably, we should remove all macros exported from `scala/scala.bzl`
that only instantiate toolchain dependencies and define toolchains. That
may be a breaking change for some users, but will ultimately be
necessary for these macros to remain compatible with Bazel 8.

* Extract versioned `_JUNIT_DEPS` in `test/BUILD`

Eliminates reliance on the default `@io_bazel_rules_scala_junit_junit`
artifact repository.

* Update `{junit,specs2_junit}_toolchain()`

These macros now mirror the implementation of `scalatest_toolchain()`.

However, I realized that the old pattern of calling
`scalatest_repositories()` followed by `scalatest_toolchain()` will no
longer work without first calling `scala_toolchains(scalatest = True)`.
This is because the `alias` targets in `testing/BUILD` that replace the
previous implementations all point to
`@io_bazel_rules_scala_toolchains`.

So if we want to keep these macros, it seems like we should maybe
restore the original toolchain targets in `testing/BUILD`. If we don't,
we can remove these macros, but we can document these as breaking
changes, and update other documentation accordingly.

* Move `scala/{private/macros/,}toolchains_repo.bzl`

Like "Move `scala_toolchains` to `scala/toolchains.bzl`",
removes the `scala_toolchains_repo` symbol from `scala/scala.bzl` and
makes it available from `scala/toolchains_repo.bzl`.

This avoids a future `test_scala_version 2.12.20` failure during Bazel 8
builds after adding `twitter_scrooge` toolchain support in the new
`test_version/version_specific_tests_dir/scrooge_repositories.bzl` file.
Otherwise, this new file would load `toolchains_repo` from
`scala/scala.bzl`. The `test_version/test_scala_version_.../WORKSPACE`
file generated from `test_version/WORKSPACE.template` would then
transitively load `.bzl` files with `rules_java` symbols, breaking the
test.

```txt
$ RULES_SCALA_TEST_ONLY="test_scala_version 2.12.20" ./test_version.sh

ERROR: Traceback (most recent call last):
  File ".../external/rules_scala/scala/private/common_attributes.bzl",
  line 18, column 28, in <toplevel>
    "deps": attr.label_list(

Error in label_list:
  Illegal argument: element in 'providers' is of unexpected type.
  Either all elements should be providers,
  or all elements should be lists of providers,
  but got list with an element of type NoneType.

ERROR: Error computing the main repository mapping:
  at test_version/test_scala_version_.../scrooge_repositories.bzl:1:6:
  at .../scala/scala.bzl:30:5:
  at .../scala/private/rules/scala_junit_test.bzl:5:5:
initialization of module 'scala/private/common_attributes.bzl' failed
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants