Replies: 6 comments
-
I'm not against implementing some kind of functionality like this in Bazel. Bazel has two invariants that I know of that must be upheld:
In addition, we already have an API that reads files ( Also, echoing the philosophy of Buck2, I'd much rather not encourage people to put expensive computation (like parsing complicated output formats) into Bazel.
-
+1. I'm not against the ability to read files in general, but I'm worried that providing too flexible an API (e.g. My view (happy to be proven wrong) is that the contents of a file are relevant to Blaze only insofar as we can map them to concepts that Blaze understands (e.g. artifacts) and can further use in other APIs (e.g. If the problem we're trying to address here is dynamically generating a subgraph, then perhaps we only need an API that is sufficient to encode information regarding the nodes/edges to construct. We could enforce that Blaze only reads a list of filepaths from the file, that Blaze can map to actual artifacts that For example (and I'm handwaving a bunch here):
-
Thanks for the comments, @lberki and @zhengwei143. I took the morning to throw together a proof of concept demo of adding The demoed user experience: https://github.com/reutermj/bazel/tree/master/examples/map_directory_read_artifacts
At least for cases like ML-style languages and their build ordering constraints, this is certainly true enough. I'm not familiar enough with other things The big open question I have is about the manifest format. For the demo, I just used the full exec path of the input files. Is there another format that would be better suited for this? Fair warning: I've never touched actual Bazel internals and the implementation of
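A minimal sketch of the manifest idea discussed above, in plain Python rather than Bazel internals. It assumes the manifest is a newline-separated list of exec paths and that every path must match an input the build already knows about. `Artifact` and `resolve_manifest` are hypothetical names for illustration, not Bazel APIs, and this is not the implementation in the linked demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Stand-in for a build artifact; hypothetical, not a real Bazel type."""
    exec_path: str


def resolve_manifest(manifest_text, known_artifacts):
    """Map each exec path listed in a manifest to an already-declared artifact.

    Paths that do not match a declared input are rejected, so the manifest
    can only select among artifacts the build already knows about.
    """
    by_path = {artifact.exec_path: artifact for artifact in known_artifacts}
    resolved = []
    for line in manifest_text.splitlines():
        path = line.strip()
        if not path:
            continue  # tolerate blank lines
        if path not in by_path:
            raise ValueError(f"manifest references undeclared input: {path}")
        resolved.append(by_path[path])
    return resolved
```

Restricting the file contents to paths that resolve to known artifacts keeps the read narrow: nothing in the manifest is interpreted beyond selecting nodes for the subgraph.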
-
I glossed over path resolution previously, but yeah, exec path seems reasonable. cc @pzembrod @fmeum @rrbutani who might be interested -- would like to gather more data points first.
-
It's great to see progress towards dynamic dependencies in Bazel!!!
For Buck2 reference on API design, please take a look at the new As another reference point, I've spoken about how we've used Buck2's API for Haskell (another ML family language) at BazelCon 2025. As you'll see there, the new Buck2 API goes a bit further and makes it possible to share dynamically resolved dependency information efficiently across targets. This was important for achieving module-granular incremental builds across Haskell package (target) boundaries without having to reconstruct the full cross-package module dependency graph at each target, which would be quadratic at the project level.
-
I posted this on the Slack thread but figured I'd add it here too. We're going to be heavily using map_directory for a dynamic flow for cpu/gpu/soc builds at NVIDIA, but we'd like dynamic dependencies to be more of a first-class feature. We're following the pattern in adventures_in_map_directory for the DAG traversal, using a mechanism similar to the marker files there, but it's a bit hacky (though it does seem to work well).
-
Bazel 9 adds map_directory, which allows some analysis to be deferred to execution time. This turns out to be powerful enough to encode dynamic dependency resolution, albeit in a hacky manner. I'm starting this discussion to work toward a more robust API for such cases.
https://bazel.build/rules/lib/builtins/actions#map_directory
Compiling Lean: An Example
I came across this problem while working on a ruleset for Lean. Lean modules must be compiled in a topological order determined by the import statements in the files: every module that a Lean module imports must be compiled before the module itself. Typically, the Lean build system, Lake, handles this by compiling in 2 phases:
This presents an issue in Bazel, where the action graph is typically fixed at analysis time, before the source files can be read.
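The ordering constraint can be sketched in plain Python. This is only an illustration of the problem, not how Lake or Bazel implement it. It assumes one module per `import` line and ignores imports that are not part of the given source set. `scan_imports` and `compile_order` are hypothetical helper names.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumption: each import statement names exactly one module.
IMPORT_RE = re.compile(r"^\s*import\s+([A-Za-z_][\w.]*)", re.MULTILINE)


def scan_imports(source):
    """Return the module names imported by one Lean source text."""
    return IMPORT_RE.findall(source)


def compile_order(sources):
    """Order modules so that every module comes after the modules it imports.

    `sources` maps module name -> source text. Imports that refer to
    modules outside `sources` (e.g. the standard library) are ignored.
    """
    graph = {
        module: {dep for dep in scan_imports(text) if dep in sources}
        for module, text in sources.items()
    }
    # TopologicalSorter expects node -> predecessors and yields predecessors
    # first; it raises graphlib.CycleError on an import cycle.
    return list(TopologicalSorter(graph).static_order())
```

The point for Bazel is that `scan_imports` needs the file contents, which are not available when the action graph is normally constructed.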
map_directory
With the introduction of map_directory, I was able to hack around the typical analysis-time restrictions by encoding the dependency graph in file names and then, inside map_directory, using those encoded file names to produce the dynamic action graph with correct dependencies. See full explanation: https://github.com/reutermj/adventures_in_map_directory
Prior Art for API
Buck2 introduces dynamic_output for such cases: https://buck2.build/docs/rule_authors/dynamic_dependencies/
dynamic_output can read a select set of artifacts listed in the dynamic argument at analysis time. This provides two options for handling the Lean case:
dynamic_output takes the json as a dynamic argument, takes as inputs all the Lean source files, and produces the action graph with dependencies from the json
or
dynamic argument and handle dependency resolution inside the dynamic_output in starlark, then
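A rough sketch of the first option, written in plain Python rather than Starlark and not using the Buck2 API. It assumes the json maps each module name to the list of modules it imports directly, and that compiling `M.lean` yields `M.olean`. `plan_actions` and the record layout are hypothetical.

```python
import json


def plan_actions(dep_graph_json):
    """Turn a JSON dependency graph into one compile action per module.

    Each action lists the compiled outputs of its direct dependencies as
    inputs, so the build system can derive a valid execution order from
    the input/output edges alone. A real ruleset would likely also need
    the transitive dependencies; that is left out to keep the sketch short.
    """
    graph = json.loads(dep_graph_json)
    actions = []
    for module in sorted(graph):  # sorted only for deterministic output
        deps = graph[module]
        actions.append({
            "module": module,
            "inputs": [module + ".lean"] + [dep + ".olean" for dep in deps],
            "output": module + ".olean",
        })
    return actions
```

No explicit topological sort appears here: once each action names its dependencies' outputs as inputs, ordering falls out of the action graph itself.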