[Experiment] Introduce DependencyFile#precedence to control graph generation #12816

brrygrdn · 2025-08-11T14:13:02Z

What are you trying to accomplish?

The Dependabot updater needs to know about quite a lot of files to do its job, often needing information from lockfiles about dependencies in order to affect changes in manifests.

When it comes to graphing the dependencies used by a project, we generally only need to examine lockfiles for the relevant information as they will know things manifest do not provide like:

the exact versions used
any transitive dependencies that are pulled in by direct dependencies
the relationships between dependencies as its ancestors decedents

Example

If we look at the snapshots being generated for a Ruby project by Dependency Graph's existing source-based parser for the Sinatra example in our tests, we see 28 dependencies, all of which are attributed to the Gemfile.lock:

If we take the submission payload generated by our experiment so far we see 32 dependencies, with 4 duplicates attributed to the Gemfile:

Which is the right approach?

The Dependency Submission API is mainly used by projects which do not have exhaustive lockfiles to fill in blanks our source-based parsing cannot cover - as a result Bundler is in a funny spot where we accept snapshots for it but until now there wasn't much real-world incentive build them so as our source-based parsing was good enough and used bundler internals in much the same way Dependabot does.

Bundler is also something of a bystander in this experiment as we're less interested in making it work fully than other ecosystems, but all things being equal it makes most sense to align with the current source-based parsing in terms of which files are submitted.

Anything you want to highlight for special attention from reviewers?

Rather than bubble Bundler-specific knowledge up into the submission components, I've introduced a simple precedence flag on Dependabot::DependencyFile. When we encounter a lockfile with 1 or more dependencies, we set it to precedence of 1 so it supersedes any manifests.

This is an effort to have a simple but generic way for us to limit the files used for graphing as required within the ecosystem's file parsers.

Dependabot currently fails fast when attempting to parse Bundler projects without a Gemfile ( or gemspec ), so I haven't added tests for these scenarios and just updated our existing tests to validate that only the lockfiles make it into the submission.

How will you know you've accomplished your goal?

When we merge this, the payloads we submit will only show 28 dependencies instead of 32, omitting the Gemfile records.

Checklist

I have run the complete test suite to ensure all tests and linters pass.
I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
I have written clear and descriptive commit messages.
I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
I have ensured that the code is well-documented and easy to understand.

bundler/lib/dependabot/bundler/file_parser.rb

phillmv · 2025-08-13T14:31:03Z

bundler/lib/dependabot/bundler/file_parser.rb

@@ -217,6 +217,9 @@ def lockfile_dependencies
          dependencies << dep
        end

+        # If the lockfile has at least one dependency, then it should precedent over any other files
+        # for its directory when building a graph.
+        T.must(lockfile).precedence = 1 if T.must(lockfile).dependencies.any?


thoughts:

as we discussed over zoom, precedence is a bit of a hack / proxy for lockfile but without having to use that name which in this context is confusing. that said it's still what it means – i feel like this ought to be set when the DepFile is initialized, because we should know at init whether a file is a lockfile or not. In this file we can set it in the lockfile method i think.

this is straightforward for bundler and maybe not genericizable to other ecosystems

also, thoughts on maybe calling it priority? cos then we could be like,

def priority? dependencies.any? && self.priority end

(Obv this method still works if we call it precendence)

I've addressed this in d4e3195, but it has caused quite a lot of collateral test failures on first pass

i feel like this ought to be set when the DepFile is initialized

This ends up being the complexity - the DependencyFile is actually initialised in a separate process and then results of it are serialized into the job_definition and then deserialised back into an object in the Updater process.

Relaying the priority via this serde roundtrip is totally doable and sort of feels like the 'right' way, but I'm nervous about making this deep change. This is something we only really care about parse time¹ so we can just assign it via a setter and punt on dealing with the side effects of the 'right' way.

In this file we can set it in the lockfile method i think

This is probably the better option, it's slightly less precise in that the priority remains unset from fetch-time to parse-time but other than it feeling weird to set an attribute via a setter vs init, it's the just-in-time for us using it 🤷🏻

def priority? dependencies.any? && self.priority end

This method turned out to be tricky to implement due to the serde roundtrip where we need to pass the @priority / self.priority carefully since dependencies.any? is false until the parsing is actually done.

This wrinkle does make me think that 'promoting' the lockfile to priority 1 once we know it is a/ there and b/ has dependencies is kind of useful semantically but, this is offset by the advantages of just filtering out any file without dependencies assigned to it in my follow-up commit: fc41afd

This ends up giving us vendor/support file filtering 'for free' without the grapher having to start to learn about those concepts - if it doesn't have dependencies assigned, we don't see it.

Footnotes

It's a sidebar / deep cut, but the separate fetch/update processes isn't necessary anymore and could be gotten rid of so it's also slightly throwaway work. ↩

omg okay yeah way too much work to do it this way. maybe split the baby in half for now? i.e.

don't modify the initializer because that's a big annoying change

but maybe still set the priority when the @lockfile is first fetched & cached? that way we're not setting it within the dependency parser method

either way i approve

brrygrdn requested a review from a team as a code owner August 11, 2025 14:13

github-actions bot added the L: ruby:bundler RubyGems via bundler label Aug 11, 2025

brrygrdn force-pushed the brrygrdn/dg-7449-ignore-gemfiles branch 2 times, most recently from edffa0a to d8ee1c9 Compare August 11, 2025 14:20

brrygrdn commented Aug 11, 2025

View reviewed changes

bundler/lib/dependabot/bundler/file_parser.rb Outdated Show resolved Hide resolved

brrygrdn force-pushed the brrygrdn/dg-7449-ignore-gemfiles branch from d56da06 to f3d7a8f Compare August 11, 2025 15:15

brrygrdn mentioned this pull request Aug 11, 2025

[Experiment] Populate child dependencies for Bundler graphs #12818

Draft

5 tasks

phillmv reviewed Aug 13, 2025

View reviewed changes

Introduce DependencyFile#precedence to control graph generation

f41a8cc

brrygrdn force-pushed the brrygrdn/dg-7449-ignore-gemfiles branch from f3d7a8f to f41a8cc Compare August 13, 2025 15:24

Switch to priority, Set at file fetch time

d4e3195

brrygrdn force-pushed the brrygrdn/dg-7449-ignore-gemfiles branch from 4bc309d to d4e3195 Compare August 13, 2025 16:19

Add test cases around empty files, vendored files, etc

fc41afd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Experiment] Introduce DependencyFile#precedence to control graph generation #12816

[Experiment] Introduce DependencyFile#precedence to control graph generation #12816

brrygrdn commented Aug 11, 2025 •

edited

Loading

Uh oh!

Uh oh!

phillmv Aug 13, 2025

Uh oh!

brrygrdn Aug 13, 2025

Uh oh!

phillmv Aug 13, 2025

Uh oh!

Uh oh!

[Experiment] Introduce DependencyFile#precedence to control graph generation #12816

Are you sure you want to change the base?

[Experiment] Introduce DependencyFile#precedence to control graph generation #12816

Conversation

brrygrdn commented Aug 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What are you trying to accomplish?

Example

Which is the right approach?

Anything you want to highlight for special attention from reviewers?

How will you know you've accomplished your goal?

Checklist

Uh oh!

Uh oh!

phillmv Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

brrygrdn Aug 13, 2025

Choose a reason for hiding this comment

Footnotes

Uh oh!

phillmv Aug 13, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

brrygrdn commented Aug 11, 2025 •

edited

Loading