A Scala code parser and Gazelle plugin for Bazel build file generation.
This repo is actively used and maintained, with the ultimate goal of upstreaming it to a broader Bazel ruleset such as rules_scala or rules_jvm. Contributions and bug reports are welcome here with the understanding that they will be handled on a best-effort basis, with priority given to those which advance toward this goal.
- A maven install lockfile generated by rules_jvm_external, or some other JSON file providing artifact and package mappings that matches its format.
See Caveats below for additional compatibility information.
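For reference, here is a rough sketch of the mapping such a lockfile provides. It assumes the rules_jvm_external v2 lockfile shape (top-level `artifacts` and `packages` keys); the artifact coordinates below are invented for illustration:

```python
import json

# Minimal illustration of the lockfile fields relevant to dependency
# resolution: "packages" maps each maven artifact to the Java/Scala
# packages it provides. Coordinates here are made up for the example.
lockfile = json.loads("""
{
  "artifacts": {
    "com.example:widgets": {"version": "1.0.0"}
  },
  "packages": {
    "com.example:widgets": ["com.example.widgets", "com.example.widgets.util"]
  }
}
""")

def package_to_artifact(lock):
    """Invert the artifact -> packages mapping into package -> artifact,
    which is the direction needed when resolving an import."""
    mapping = {}
    for artifact, packages in lock.get("packages", {}).items():
        for pkg in packages:
            mapping[pkg] = artifact
    return mapping

print(package_to_artifact(lockfile)["com.example.widgets"])  # com.example:widgets
```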
Create new `gazelle_binary` and `gazelle` targets in your root `BUILD`/`BUILD.bazel` file (or add the Scala language plugin to an existing `gazelle_binary` if you wish):

```starlark
gazelle_binary(
    name = "gazelle_bin",
    languages = [
        "@scala_gazelle//scala",
    ],
)
```
```starlark
gazelle(
    name = "gazelle",
    args = [
        # "-scala_rules_scala_repo_name=io_bazel_rules_scala", # required with older versions of rules_scala
        # "-scala_parsing_cache_file=...", # beneficial for large repos; specify a .json or .json.gz file path
    ],
    gazelle = ":gazelle_bin",
)
```

TODO: scala_gazelle is not currently published to the Bazel Central Registry and requires an `archive_override` to consume via Bzlmod:
```starlark
bazel_dep(name = "scala_gazelle", version = "0.0.0")
archive_override(
    module_name = "scala_gazelle",
    integrity = "sha256-/vzREre9dQDEyOWap49Ki+tc1/eEVWRPSzY4q/kIh+g=",
    strip_prefix = "scala-gazelle-c8d5b376b65724ca3d473af7f171a114a7e19585",
    urls = ["https://github.com/foursquare/scala-gazelle/archive/c8d5b376b65724ca3d473af7f171a114a7e19585.zip"],
)
```

Note that building scala_gazelle requires either rules_go version 0.55.0 or later, or a patch to the go-tree-sitter module. If upgrading rules_go is not an option for you, you will additionally need the following override in your `MODULE.bazel` file:
```starlark
go_deps = use_extension("@bazel_gazelle//:extensions.bzl", "go_deps")
go_deps.module_override(
    patches = ["@scala_gazelle//:tree-sitter_cdeps.patch"],
    path = "github.com/smacker/go-tree-sitter",
)
```

If consuming via a legacy `WORKSPACE` file instead:

```starlark
http_archive(
    name = "scala_gazelle",
    sha256 = "fefcd112b7bd7500c4c8e59aa78f4a8beb5cd7f78455644f4b3638abf90887e8",
    strip_prefix = "scala-gazelle-c8d5b376b65724ca3d473af7f171a114a7e19585",
    url = "https://github.com/foursquare/scala-gazelle/archive/c8d5b376b65724ca3d473af7f171a114a7e19585.zip",
)

load("@scala_gazelle//:deps.bzl", "scala_gazelle_deps")

scala_gazelle_deps()
```

Note that since `WORKSPACE` files are order-dependent, if you get errors building the gazelle binary you may need to move `scala_gazelle_deps()` earlier in the file to ensure the proper dependency versioning, especially if you use other Gazelle language plugins.
If following the installation steps above, `bazel run //:gazelle` will run Gazelle with the Scala plugin active.
When specified, indicates which languages the Scala language plugin should attempt to cross-resolve imports for.
Accepted values are a comma-delimited list of strings.
When specified, symbol parsing will generate and update a JSON file on disk at the given location. Specify a `.gz` file extension to enable gzip compression of the cache file.
This is entirely optional, but as runtime is dominated by code parsing it can yield significant performance improvements for large repos. Typically this cache file would not be committed and would instead be `.gitignore`d.
Specifies the default rules_scala repo name used for kind imports. In older rules_scala versions this was required to be `io_bazel_rules_scala`, but this is no longer the case and the getting started docs now recommend `rules_scala`. See bazel-contrib/rules_scala#1696 for details.
Defaults to `rules_scala`.
In addition to the config directives recognized by Gazelle itself (documentation), the following directives are added for configuring the Scala plugin, some of which are taken from the Java Gazelle plugin:
Tells the resolver to disregard a given label, meaning it will never be considered for dependency mapping. Can be repeated.
This can be helpful for resolving split packages across maven artifacts, particularly if you configure `strict_visibility = True` in your maven install, as the plugin does not parse or query maven targets for their visibility status. In many cases the correct solution to a resolve conflict is simply to exclude one of the jars involved from ever being considered as a direct dependency.
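For example, in a root build file (the artifact label here is hypothetical):

```starlark
# Never consider this shaded jar as a direct dependency:
# gazelle:java_exclude_artifact @maven//:org_example_example_shaded
```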
Defaults to `@maven//:org_scala_lang_scala_library`.
Specifies the filesystem path to the maven install lockfile generated by rules_jvm_external to be used for dependency resolution of 3rdparty jars.
Defaults to `maven_install.json`.
Specifies the name of the maven install repository generated by rules_jvm_external.
Defaults to `maven`.
Provides a way to force additional labels to be added as deps whenever a particular label is added as a dep. It takes two arguments: the initial label and a comma-separated string of transitive dependency labels. Can be repeated.
This can be particularly useful with Scala code, where transitive dependencies may be required on the compile classpath without being referenced directly in code (see rules_scala docs): if you set `dependency_mode = "direct"` or `dependency_mode = "plus-one"` on your Scala toolchain, it is likely you will want to make use of this directive. It can also be used to work around jars with broken poms.
By default, the Scala language plugin generates one target per source directory, and will not aggregate source files from sub-directories. Setting `# gazelle:scala_infer_recursive_modules true` will have the plugin recurse into those sub-directories which don't have their own build files, which corresponds more closely with how Bazel thinks about package boundaries (see https://bazel.build/versions/8.2.0/concepts/build-ref for details).
This effectively allows Gradle-style module targets, where a single build file at the root of a source tree contains aggregate targets for the entire tree. Such patterns are generally discouraged under Bazel as they can result in significantly worse build performance, but may still be necessary or desired in some circumstances.
Defaults to `false`.
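For example, to opt a subtree into recursive module generation from its root build file (the directory path is illustrative):

```starlark
# src/jvm/com/example/app/BUILD
# Aggregate sources from sub-directories without their own build files:
# gazelle:scala_infer_recursive_modules true
```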
Indicates which files within a test directory are test classes vs. utility classes, based on their basename. It should be set to match the value used for the test rule's `suffixes` attribute if applicable, with the `.scala` file extensions added.
Accepted values are a comma-delimited list of strings.
Defaults to `Test.scala`.
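Conceptually, the classification is a simple basename suffix check, roughly like the following sketch (the function name is illustrative, not the plugin's actual API):

```python
def is_test_file(basename, suffixes=("Test.scala",)):
    """Return True if a file in a test directory should be treated as a
    test class rather than a utility class, based on its basename suffix."""
    return any(basename.endswith(suffix) for suffix in suffixes)

print(is_test_file("UserServiceTest.scala"))  # True
print(is_test_file("TestHelpers.scala"))      # False: "Test" appears as a prefix, not a suffix
```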
Indicates whether scalatest or junit test rules should be generated. Note that setting this to `junit` will cause the Scala plugin to set the test rule's `suffixes` attribute; if this is something you handle via a macro wrapper, you may wish to set this to `scalatest` and use `# gazelle:map_kind` to convert to the macro instead.
Accepted values are either `scalatest` or `junit`.
Defaults to `scalatest`.
If set to true, the Scala language plugin will output a warning when an existing non-test rule would contain source files matching the configured test file suffixes. This can help avoid human error when unit tests are accidentally added to a library rule, in which case the tests will silently never run. But this can also be noisy so you may wish to disable it.
Defaults to true.
This plugin was developed against a robust integration test suite in the form of two Bazel monorepos comprising ~2 million lines of Scala 2 code, both under active development, with the following characteristics:
- code written and built with Scala 2.12
- `dependency_mode = "transitive"` configured for the Scala toolchain (see rules_scala docs)
- junit used as the primary test framework
- repo layouts generally following the 1:1:1 rule, with some larger recursive modules sprinkled in
The closer your repo comes to these characteristics, the more likely this plugin will work for you out of the box with minimal fuss. But it aims to be flexible and will likely work with most Scala 2 repos, though your mileage may vary.
This plugin has not been tested on a Scala 3 codebase. While the underlying tree-sitter parsing library it uses does support Scala 3, it is likely the plugin itself will crash or produce erroneous output when run over a Scala 3 codebase. In such cases, it may be possible to work around bugs via `# gazelle:ignore` and `# gazelle:resolve` directives with some effort.
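In that situation, directives along these lines can paper over a mis-parse (the import path and label below are hypothetical, and the `resolve` argument form follows Gazelle's standard directive syntax):

```starlark
# In the affected directory's build file, skip generation entirely:
# gazelle:ignore

# Or, in a parent build file, pin a problematic import to a known target:
# gazelle:resolve scala com.example.macros //src/jvm/com/example/macros
```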
Full test coverage of the Scala parser and Gazelle plugin here is a work in progress. See the testing readme for details.
The plugin aims to be as flexible as possible, however some assumptions are necessary for the sake of reducing complexity.
- Source files live inside a single package and contain one or more `package` declarations.
- Packages are not split across directories, excepting test code, which may exist in a separate directory from the code it tests and share a package namespace. While the plugin is capable of functioning in the face of split packages, you will need to utilize `# gazelle:resolve` or `# gazelle:java_exclude_artifact` directives to manually map affected symbols to a providing build rule.
- Circular dependencies between packages do not exist. While the plugin will likely function with them present, it will happily generate dep lists containing the dependency cycle, which will be unable to build. If you have inter-package dependency cycles and cannot easily refactor to fix them, you will need to set `# gazelle:scala_infer_recursive_modules true` to generate recursive module-style targets (see its documentation above).
- Standard naming conventions are followed: package names are all lower-cased and class/object/trait names begin with an upper-cased letter.
- The plugin expects to generate a single library or test rule per directory, containing all Scala and Java sources in the directory and with a name matching the directory name, or, if generating a recursive module, at most one library and one test rule with names matching the directory name and `<name>-tests` respectively. Generally the plugin is smart enough to match existing rules even if they have different names and handle them appropriately. However, if the wrong number of Scala rules are present or if a different kind of rule exists with a conflicting name, this will need to be fixed manually.
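As an illustration, for a directory `util/` and the standard rules_scala kinds, the expected shape is roughly the following (file names invented; the test rule only appears when generating a recursive module):

```starlark
# util/BUILD — one library rule named after the directory
scala_library(
    name = "util",
    srcs = ["StringUtil.scala"],
)

# Only when generating a recursive module, at most one test rule as well:
scala_test(
    name = "util-tests",
    srcs = ["StringUtilTest.scala"],
)
```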
These are current shortcomings that ideally would be fixed or supported at some point.
- The plugin requires at least empty build files to be manually created in order to define a Bazel package structure in the codebase.
- The Scala code parser only handles imports at the top level of the source file, and will ignore inline imports contained within classes or objects.
- All imports are treated as absolute whether or not they are prefixed with `_root_`. Relative imports will either not resolve, or may mis-resolve to an incorrect dependency (`# gazelle:resolve` directives may help here).
- The plugin does not infer runtime dependencies (e.g. class loading via reflection).
- There is no support currently for generating binary rules when `main` methods are encountered.
- The plugin is not able to merge generated rules with existing rules containing `srcs` defined via `glob()`.
Adoption of the plugin in an existing repo can be a tedious process, though it tries to provide helpful error messages where it can. Roughly speaking, required changes boil down to:
- Remove globs in existing rules (setting source lists to `[]` is fine).
  - Something like `for file in $(git grep -l 'srcs = glob(\["\*\.scala"\]),'); do sed -i '' 's/srcs = glob(\["\*\.scala"\]),/srcs = [],/' $file; done` may be useful to automate this.
- Refactor to a single library and/or test rule per package (build file).
- Rename existing rules if they conflict with generation conventions.
  - If conflicting rules exist, the plugin should detect this and fail with a message asking you to rename the rule in question.
- Add resolve or exclude directives to fix duplicate package or symbol definitions.
  - This is most common with external jars or in-repo code with split packages (e.g. some Hadoop and Spark libraries). In such cases, the plugin will output an error message noting the symbol in question and what providing targets were found containing it.
  - Note the right fix for conflicts here often requires a judgement call. Sometimes the solution is to simply add a `# gazelle:java_exclude_artifact` directive for one of the providing targets if it should never be chosen as a direct dependency. Sometimes it will require knowing specifically what classes an external jar contains or when one jar should be chosen over another. Usually the fix is somewhat obvious, especially if you can compare against which providing target the existing build rules in your repo are using.
Note it may be helpful to utilize the `# gazelle:exclude` directive to allow a more iterative setup process, focusing on smaller or more self-contained sections of the repo first while utilizing `# gazelle:resolve` directives to map packages outside the managed scope. This may not be feasible in large repos with highly connected dependency graphs, however.
There is currently no CI coverage for pull requests; please make sure to run `bazel test //...` manually.
TL;DR:
- `go.mod` may be updated by hand, or preferably via `bazel run @io_bazel_rules_go//go get example.com/pkg@version`.
- Run `bazel run @io_bazel_rules_go//go -- mod tidy` to update indirect deps and `go.sum`.
- Run `bazel mod tidy` to ensure any changes are reflected in the `go_deps` `use_repo` declaration in `MODULE.bazel`.
- Run `bazel run //:gazelle_update_repos` to update the `scala_gazelle_deps` macro in `deps.bzl`.
- Run `bazel run //:gazelle` to update `BUILD` files as needed.
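The steps above can be sketched as a shell sequence (the module path in the first line is a placeholder):

```sh
bazel run @io_bazel_rules_go//go get example.com/pkg@version
bazel run @io_bazel_rules_go//go -- mod tidy
bazel mod tidy
bazel run //:gazelle_update_repos
bazel run //:gazelle
```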
For more details, see the upstream docs from rules_go here.
See `bazel run @io_bazel_rules_go//go help get` for the full documentation on `go get`.