scala-gazelle

A Scala code parser and Gazelle plugin for Bazel build file generation.

Status

This repo is actively used and maintained with the ultimate goal of upstreaming it to a broader Bazel ruleset such as rules_scala or rules_jvm. Contributions and bug reports are welcome here with the understanding that they will be handled on a best-effort basis, with priority given to those which advance towards this goal.

Requirements

  • A maven install lockfile generated by rules_jvm_external, or some other JSON file that matches its format and provides artifact and package mappings.

See Caveats below for additional compatibility information.

Installation

Create new gazelle_binary and gazelle targets in your root BUILD/BUILD.bazel file (or add the Scala language plugin to an existing gazelle_binary if you wish):

gazelle_binary(
    name = "gazelle_bin",
    languages = [
        "@scala_gazelle//scala",
    ],
)

gazelle(
    name = "gazelle",
    args = [
        # "-scala_rules_scala_repo_name=io_bazel_rules_scala", # required with older versions of rules_scala
        # "-scala_parsing_cache_file=...", # beneficial for large repos; specify a .json or .json.gz file path
    ],
    gazelle = ":gazelle_bin",
)

Bzlmod

TODO: scala_gazelle is not yet published to the Bazel Central Registry, so consuming it via Bzlmod currently requires an archive_override:

bazel_dep(name = "scala_gazelle", version = "0.0.0")
archive_override(
    module_name = "scala_gazelle",
    integrity = "sha256-/vzREre9dQDEyOWap49Ki+tc1/eEVWRPSzY4q/kIh+g=",
    strip_prefix = "scala-gazelle-c8d5b376b65724ca3d473af7f171a114a7e19585",
    urls = ["https://github.com/foursquare/scala-gazelle/archive/c8d5b376b65724ca3d473af7f171a114a7e19585.zip"],
)

Note that building scala_gazelle requires either rules_go version 0.55.0 or later, or a patch to the go-tree-sitter module. If upgrading rules_go is not an option for you, you will additionally need the following override in your MODULE.bazel file:

go_deps = use_extension("@bazel_gazelle//:extensions.bzl", "go_deps")
go_deps.module_override(
    patches = ["@scala_gazelle//:tree-sitter_cdeps.patch"],
    path = "github.com/smacker/go-tree-sitter",
)

WORKSPACE

http_archive(
    name = "scala_gazelle",
    sha256 = "fefcd112b7bd7500c4c8e59aa78f4a8beb5cd7f78455644f4b3638abf90887e8",
    strip_prefix = "scala-gazelle-c8d5b376b65724ca3d473af7f171a114a7e19585",
    url = "https://github.com/foursquare/scala-gazelle/archive/c8d5b376b65724ca3d473af7f171a114a7e19585.zip",
)

load("@scala_gazelle//:deps.bzl", "scala_gazelle_deps")

scala_gazelle_deps()

Note that because WORKSPACE is order-dependent, if you get errors building the gazelle binary you may need to move scala_gazelle_deps() earlier in the file so that the proper dependency versions are selected, especially if you use other Gazelle language plugins.

Usage

If following the installation steps above, bazel run //:gazelle will run Gazelle with the Scala plugin active.

Command line flags

--scala_cross_resolve_langs

When specified, indicates which languages the scala language plugin should attempt to CrossResolve imports for.

Accepted values are a comma-delimited list of strings.
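
As a sketch of how this flag might be passed, the gazelle target from the installation steps could carry it in args. The language names java and proto below are assumptions for illustration; substitute the names of the Gazelle language plugins actually present in your gazelle_binary.

```starlark
load("@bazel_gazelle//:def.bzl", "gazelle")

gazelle(
    name = "gazelle",
    args = [
        # Assumed example values: a comma-delimited list of language names.
        "-scala_cross_resolve_langs=java,proto",
    ],
    gazelle = ":gazelle_bin",
)
```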

--scala_parsing_cache_file

When specified, symbol parsing will generate and update a json file on disk at the given location. Specify a .gz file extension to enable gzipping of the json cache file.

This is entirely optional, but as runtime is dominated by code parsing it can result in significant performance improvements for large repos. Typically this cache file would not be committed and would instead be .gitignored.
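
A minimal sketch of enabling the cache through the gazelle target's args follows. The file name .scala_gazelle_cache.json.gz is a hypothetical choice, and how a relative path is resolved is not specified here, so an absolute path may be the safer option.

```starlark
load("@bazel_gazelle//:def.bzl", "gazelle")

gazelle(
    name = "gazelle",
    args = [
        # Hypothetical cache location; the .gz extension turns on gzipping.
        "-scala_parsing_cache_file=.scala_gazelle_cache.json.gz",
    ],
    gazelle = ":gazelle_bin",
)
```

If the cache lives inside the repo, a matching .gitignore entry for the chosen file name keeps it out of version control.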

--scala_rules_scala_repo_name

Specifies the default rules_scala repo name used for kind imports. In older rules_scala versions, this was required to be io_bazel_rules_scala, but this is no longer the case and the getting started docs now recommend rules_scala. See bazel-contrib/rules_scala#1696 for details.

Defaults to rules_scala.

Directives

In addition to the config directives recognized by Gazelle itself (documentation), the following directives are added for configuration of the Scala plugin, some of which are taken from the Java Gazelle plugin:

# gazelle:java_exclude_artifact <label>

Tells the resolver to disregard a given label, meaning it will never be considered for dependency mapping. Can be repeated.

This can be helpful for resolving split packages across maven artifacts, particularly if you configure strict_visibility = True in your maven install as the plugin does not parse or query maven targets for their visibility status. In many cases the correct solution to a resolve conflict is simply to exclude one of the jars involved from ever being considered as a direct dependency.

Defaults to @maven//:org_scala_lang_scala_library.
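
For illustration, a split-package conflict between two jars could be settled by excluding one of them in a build file. The artifact label below is hypothetical and only shows the shape of the directive.

```starlark
# Root BUILD.bazel (Gazelle directives generally apply to the directory
# they appear in and to its subdirectories).

# Never consider this (hypothetical) jar as a direct dependency.
# gazelle:java_exclude_artifact @maven//:com_example_legacy_shaded
```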

# gazelle:java_maven_install_file

Specifies the filesystem path to the maven install lockfile generated by rules_jvm_external to be used for dependency resolution of 3rdparty jars.

Defaults to maven_install.json.

# gazelle:java_maven_repository_name

Specifies the name of the maven install repository generated by rules_jvm_external.

Defaults to maven.
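
A short sketch of the two maven directives used together, for a repo whose lockfile and repository name differ from the defaults. Both the path third_party/maven_install.json and the repository name third_party_maven are assumptions made up for this example.

```starlark
# Root BUILD.bazel

# Hypothetical lockfile location (the default is maven_install.json).
# gazelle:java_maven_install_file third_party/maven_install.json

# Hypothetical repository name (the default is maven).
# gazelle:java_maven_repository_name third_party_maven
```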

# gazelle:scala_forced_transitive_deps

Provides a way to force additional labels to be added as deps whenever a particular label is added as a dep. It takes two arguments: the initial label and a comma separated string of transitive dependency labels. Can be repeated.

This can be particularly useful with Scala code where transitive dependencies may be required on the compile classpath without being referenced directly in code (see rules_scala docs): if you set dependency_mode = "direct" or dependency_mode = "plus-one" on your Scala toolchain it is likely you will want to make use of this directive. It can also be used to work around jars with broken poms.
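
A hedged sketch of the two-argument form described above: the first argument is the initial label, the second is the comma-separated list of labels to add alongside it. All labels are hypothetical, and separating the two arguments with whitespace is an assumption based on how Gazelle directives are usually written.

```starlark
# Whenever the first label is added as a dep, also add the two labels after it.
# gazelle:scala_forced_transitive_deps @maven//:com_example_client @maven//:com_example_client_api,@maven//:com_example_runtime

# The directive can be repeated for other initial labels.
# gazelle:scala_forced_transitive_deps @maven//:com_example_other @maven//:com_example_other_core
```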

# gazelle:scala_infer_recursive_modules

By default, the scala language plugin generates one target per source directory, and will not aggregate source files from sub-directories. Setting # gazelle:scala_infer_recursive_modules true will have the plugin recurse into those sub-directories which don't have their own build files, which corresponds more closely with how Bazel thinks about package boundaries (see https://bazel.build/versions/8.2.0/concepts/build-ref for details).

This effectively allows Gradle-style module targets, where a single build file at the root of a source tree contains aggregate targets for the entire tree. Such patterns are generally discouraged under Bazel as they can result in significantly worse build performance, but may still be necessary or desired in some circumstances.

Defaults to false.
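
As an illustration of the recursion rule described above, consider the hypothetical layout sketched in the comments below. With the directive set, sources under impl/ would be aggregated into the enclosing package, while tools/ stays separate because it has its own build file.

```starlark
# com/example/BUILD.bazel  (hypothetical location)
#
# com/example/
#   BUILD.bazel        <- this file
#   Api.scala
#   impl/              <- no build file: sources are aggregated here
#     ApiImpl.scala
#   tools/
#     BUILD.bazel      <- has its own build file: remains its own package
#     Cli.scala

# gazelle:scala_infer_recursive_modules true
```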

# gazelle:scala_test_file_suffixes

Indicates within a test directory which files are test classes vs utility classes, based on their basename. It should be set up to match the value used for the test rule's suffixes attribute if applicable, with the '.scala' file extensions added.

Accepted values are a comma-delimited list of strings.

Defaults to Test.scala.
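
For example, a repo that names its test classes with either a Test or a Spec suffix might configure the following. Spec.scala is an assumed extra suffix, chosen only to show the comma-delimited form.

```starlark
# Files such as FooTest.scala and FooSpec.scala count as test classes;
# anything else in a test directory is treated as a utility class.
# gazelle:scala_test_file_suffixes Test.scala,Spec.scala
```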

# gazelle:scala_test_framework

Indicates whether scalatest or junit test rules should be generated. Note that setting this to "junit" will cause the Scala plugin to set the test rule's 'suffixes' attribute; if this is something you handle via a macro wrapper, you may wish to set this to "scalatest" and use '# gazelle:map_kind' to convert to the macro instead.

Accepted values are either scalatest or junit.

Defaults to scalatest.
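
The macro-wrapper approach mentioned above could look roughly like this. The generated kind name scala_test, the macro name my_junit_test, and its load path are all assumptions for this sketch; check the kinds the plugin actually emits in your repo before copying it.

```starlark
# Keep the plugin from setting the suffixes attribute itself...
# gazelle:scala_test_framework scalatest

# ...and map the generated kind (assumed here to be scala_test) to a
# hypothetical in-repo macro that sets suffixes on its own.
# gazelle:map_kind scala_test my_junit_test //tools/build:scala.bzl
```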

# gazelle:scala_warn_test_rule_mismatch

If set to true, the Scala language plugin will output a warning when an existing non-test rule would contain source files matching the configured test file suffixes. This can help avoid human error when unit tests are accidentally added to a library rule, in which case the tests will silently never run. But this can also be noisy so you may wish to disable it.

Defaults to true.
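
If the warning is too noisy for some part of the tree, it can be switched off with the directive in that directory's build file. The directory named in the comment is hypothetical.

```starlark
# testing/fixtures/BUILD.bazel  (hypothetical directory of shared test helpers)

# gazelle:scala_warn_test_rule_mismatch false
```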

Caveats

Compatibility and test coverage

This plugin was developed against a robust integration test suite in the form of two Bazel monorepos comprising ~2 million lines of Scala 2 code, both under active development, with the following characteristics:

  • code written and built with Scala 2.12
  • dependency_mode = "transitive" configured for the Scala toolchain (see rules_scala docs)
  • junit used as the primary test framework
  • repo layouts generally following the 1:1:1 rule, with some larger recursive modules sprinkled in

The closer your repo comes to these characteristics, the more likely this plugin will work for you out of the box with minimal fuss. But it aims to be flexible and will likely work with most Scala 2 repos, though your mileage may vary.

This plugin has not been tested on a Scala 3 codebase. While the underlying tree-sitter parsing library it uses does support Scala 3, the plugin itself is very likely to crash or produce erroneous output when run over a Scala 3 codebase. In such cases, it may be possible to work around bugs via # gazelle:ignore and # gazelle:resolve directives with some effort.

Full test coverage of the Scala parser and Gazelle plugin here is a work in progress. See the testing readme for details.

Restrictions

The plugin aims to be as flexible as possible, however some assumptions are necessary for the sake of reducing complexity.

  1. Source files live inside a single package and contain one or more package declarations.

  2. Packages are not split across directories, excepting test code which may exist in a separate directory from the code it tests and share a package namespace. While the plugin is capable of functioning in the face of split packages, you will need to utilize # gazelle:resolve or # gazelle:java_exclude_artifact directives to manually map affected symbols to a providing build rule.

  3. Circular dependencies between packages do not exist. While the plugin will likely function with them present, it will happily generate dep lists containing the dependency cycle which will be unable to build. If you have inter-package dependency cycles and cannot easily refactor to fix them, you will need to set # gazelle:scala_infer_recursive_modules true to generate recursive module-style targets (see its documentation above).

  4. Standard naming convention is followed: package names are all lower-cased and class/object/trait names begin with an upper-cased letter.

  5. The plugin expects to generate a single library or test rule per directory containing all Scala and Java sources in the directory and with a name matching the directory name, or if generating a recursive module, at most one library and one test rule with names matching the directory name and <name>-tests respectively.

Generally the plugin is smart enough to match existing rules even if they have different names and handle them appropriately. However, if the wrong number of Scala rules are present or if a different kind of rule exists with a conflicting name, this will need to be fixed manually.
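
For the split-package case in restriction 2, a manual mapping with Gazelle's standard resolve directive might look like the sketch below. The language name scala, the import string, and the target label are assumptions for illustration; the exact import string to use depends on what the plugin reports for the affected symbol.

```starlark
# Map a symbol from a split package to the rule that provides it.
# Form: gazelle:resolve <lang> <import-string> <label>
# gazelle:resolve scala com.example.shared.Helper //common/shared:shared
```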

Limitations

These are current shortcomings that ideally would be fixed or supported at some point.

  1. The plugin requires at least empty build files to be manually created in order to define a Bazel package structure in the codebase.

  2. The Scala code parser only handles imports at the top level of the source file, and will ignore inline imports contained within classes or objects.

  3. All imports are treated as absolute whether or not they are prefixed with _root_. Relative imports will either not resolve, or may mis-resolve to an incorrect dependency (# gazelle:resolve directives may help here).

  4. The plugin does not infer runtime dependencies (e.g. class loading via reflection).

  5. There is no support currently for generating binary rules when main methods are encountered.

  6. The plugin is not able to merge generated rules with existing rules containing srcs defined via glob().

Adopting scala-gazelle in an existing repo

Adoption of the plugin in an existing repo can be a tedious process, though it tries to provide helpful error messages where it can. Roughly speaking, required changes boil down to:

  • Remove globs in existing rules (setting source lists to [] is fine)
    • Something like for file in $(git grep -l 'srcs = glob(\["\*\.scala"\]),'); do sed -i '' 's/srcs = glob(\["\*\.scala"\]),/srcs = [],/' "$file"; done may be useful to automate this (sed -i '' is the BSD/macOS form; with GNU sed, use plain sed -i).
  • Refactor to a single library and/or test rule per package (build file).
  • Rename existing rules, if they conflict with generation conventions.
    • If conflicting rules exist, the plugin should detect this and fail with a message asking you to rename the rule in question.
  • Add resolve or exclude directives to fix duplicate package or symbol definitions.
    • This is most common with external jars or in-repo code with split packages (e.g. some Hadoop and Spark libraries). In such cases, the plugin will output an error message noting the symbol in question and what providing targets were found containing it.
    • Note the right fix for conflicts here often requires a judgement call. Sometimes the solution is to simply add a # gazelle:java_exclude_artifact directive for one of the providing targets if it should never be chosen as a direct dependency. Sometimes it will require knowing specifically what classes an external jar contains or when one jar should be chosen over another. Usually the fix is somewhat obvious, especially if you can compare against which providing target the existing build rules in your repo are using.
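
The first step in the list above can be pictured as the following before/after for a single rule. The rule name util is hypothetical, and the load path assumes a rules_scala version that uses the rules_scala repo name.

```starlark
load("@rules_scala//scala:scala.bzl", "scala_library")

# Before adoption (the plugin cannot merge into a glob-based srcs):
# scala_library(
#     name = "util",
#     srcs = glob(["*.scala"]),
# )

# After: an empty list that Gazelle can fill in with explicit file names.
scala_library(
    name = "util",
    srcs = [],
)
```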

Note it may be helpful to utilize the # gazelle:exclude directive to allow a more iterative setup process, focusing on smaller or more self-contained sections of the repo first while utilizing # gazelle:resolve directives to map packages outside the managed scope. This may not be feasible in large repos with highly connected dependency graphs, however.
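
A rough sketch of such an incremental rollout, with every path and package name hypothetical: one directory is kept out of Gazelle's scope for now, and a package it provides is mapped by hand to its existing rule. The language name scala in the resolve directive is an assumption.

```starlark
# Root BUILD.bazel during an incremental rollout.

# Leave this subtree unmanaged for now.
# gazelle:exclude legacy

# Point imports of a package that lives in the excluded subtree at its
# existing, hand-maintained rule.
# gazelle:resolve scala com.example.legacy.billing //legacy/billing
```

As sections are brought under management, the corresponding exclude and resolve lines can be removed.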

Maintainer notes

There is currently no CI coverage for pull requests, so please make sure to run bazel test //... manually.

Managing go dependencies

TL;DR:

  1. go.mod may be updated by hand, or preferably via bazel run @io_bazel_rules_go//go get example.com/pkg@version.
  2. Run bazel run @io_bazel_rules_go//go -- mod tidy to update indirect deps and go.sum.
  3. Run bazel mod tidy to ensure any changes are reflected in the go_deps use_repo declaration in MODULE.bazel.
  4. Run bazel run //:gazelle_update_repos to update the scala_gazelle_deps macro in deps.bzl.
  5. Run bazel run //:gazelle to update BUILD files as needed.

For more details, see the upstream docs from rules_go here.

See bazel run @io_bazel_rules_go//go help get for the full documentation on go get.
