Skip to content

lib: merge testutils crate directly into jj_lib::testutils #3878

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

thoughtpolice
Copy link
Member

While reading the code while working on Buck-style builds, I realized that there is a strange cyclic dependency between jj-lib and testutils:

  • jj-lib has a dev-dep on testutils
  • testutils has a normal dep on jj_lib

Cargo apparently has (some?) kind of logic to topologize this sort of graph, but it's awkward to understand or otherwise express, and has knock-on issues with rust-analyzer too because certain code is only built under the proper testing configuration, e.g. with #[cfg(feature = "testing")]

This instead just merges all of the testutils code as a single jj_lib::testutils module, which breaks the cycle and also simplifies a few of the mod imports as well. To fix the remaining compile errors we have to explicitly use jj_lib::testutils first, which minimizes the overall diff. Some call sites are simple enough to just inline though.

I do think that some of this can be pulled back out into a jj-utils package that jj-lib unconditionally depends on and contains other various functionality, too.

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

@thoughtpolice thoughtpolice force-pushed the aseipp/push-vrxtoymnuvpv branch from 95410b1 to 73ef8a5 Compare June 13, 2024 23:30
@thoughtpolice
Copy link
Member Author

thoughtpolice commented Jun 14, 2024

I forgot to mention:

  • This unfortunately brings unsafe back into jj-lib so I had to remove forbid(unsafe_code) :(
  • This unfortunately changes warn(missing_docs) to allow(missing_docs) because a lot of the testutils code is undocumented and gives too many warnings. :(

I can commit myself to fixing both of these, for example by splitting out the code into a new jj-utils crate. But I wanted to get this out there first; I'm also happy to include those changes in this stack as well, if asked.

@martinvonz
Copy link
Member

martinvonz commented Jun 14, 2024

I forgot to mention:

  • This unfortunately brings unsafe back into jj-lib so I had to remove forbid(unsafe_code) :(
  • This unfortunately changes warn(missing_docs) to allow(missing_docs) because a lot of the testutils code is undocumented and gives too many warnings. :(

Is it not possible to override the checks to be more permissive only at the testutils level?

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Jun 14, 2024

I tried that but I think that there can only be one unique set of top-level allow/deny rules for a single crate, ones that apply to entire modules, typically located in lib.rs. So even though testutils is a module within the crate you can't apply rules specifically to the individual module. Maybe I can attach them to individual items to suppress the warnings? I can try again and see.

@ilyagr
Copy link
Contributor

ilyagr commented Jun 14, 2024

Thank you for working on this. The rust-analyzer issues that stem from the cycle are quite annoying in VS Code (lots of bogus and cryptic type errors with {unknown} types in some tests).

Update: Actually, it seems like recent rust-analyzer is better at filtering them out from the Problems pane, maybe, right now for me, but that might break if there's a breeze or the phase of the moon shifts. It still complains about cyclic deps in the Output pane:

2024-06-14T03:37:17.959163Z ERROR project_model::workspace: cyclic deps: jj_cli(Idx::<CrateData>(11)) -> jj_cli(Idx::<CrateData>(11)), alternative path: jj_cli(Idx::<CrateData>(11))
2024-06-14T03:37:17.959199Z ERROR project_model::workspace: cyclic deps: jj_lib(Idx::<CrateData>(24)) -> testutils(Idx::<CrateData>(29)), alternative path: testutils(Idx::<CrateData>(29)) -> jj_lib(Idx::<CrateData>(24))
2024-06-14T03:37:18.295974Z ERROR project_model::workspace: cyclic deps: jj_cli(Idx::<CrateData>(11)) -> jj_cli(Idx::<CrateData>(11)), alternative path: jj_cli(Idx::<CrateData>(11))
2024-06-14T03:37:18.296005Z ERROR project_model::workspace: cyclic deps: jj_lib(Idx::<CrateData>(24)) -> testutils(Idx::<CrateData>(29)), alternative path: testutils(Idx::<CrateData>(29)) -> jj_lib(Idx::<CrateData>(24))
2024-06-14T03:37:19.065040Z ERROR project_model::workspace: cyclic deps: jj_cli(Idx::<CrateData>(11)) -> jj_cli(Idx::<CrateData>(11)), alternative path: jj_cli(Idx::<CrateData>(11))
2024-06-14T03:37:19.065068Z ERROR project_model::workspace: cyclic deps: jj_lib(Idx::<CrateData>(24)) -> testutils(Idx::<CrateData>(29)), alternative path: testutils(Idx::<CrateData>(29)) -> jj_lib(Idx::<CrateData>(24))

Copy link
Contributor

@yuja yuja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the right way forward. At some point, we might want to extract some library code to separate crate (to build proc-macro on top of it for example.) Then, there will be some project-level dependency cycle as we have right now: {libA/tests, libB/tests} (-> testutils) -> {libA, libB}.

Update: Actually, it seems like recent rust-analyzer is better at filtering them out from the Problems pane, maybe, right now for me, but that might break if there's a breeze or the phase of the moon shifts.

There's some progress on this. I think rust-analyzer will (or can already?) handle fake jj-lib/tests -> testutils -> jj-lib dependency cycle.

rust-lang/rust-analyzer#14167 (comment)

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Jun 14, 2024

I'm not sure if this is the right way forward. At some point, we might want to extract some library code to separate crate (to build proc-macro on top of it for example.) Then, there will be some project-level dependency cycle as we have right now

My plan was actually to look into this as a next step, something like a jj-utils or jj-gazebo (this is what Meta calls their "catch all Rust utilities library" fwiw, I think it's a cute little name) which jj-lib would depend on, but why do you think cycles are inevitable? I imagine the testutils module would be moved into this new crate and would use normal #[test] directives for testing things, e.g. jj-utils might contain things like small data structures or utilities that don't need a whole separate crate to test. In general, I don't think cycles are necessary (even these "special ones"), and the less conditional compilation the better. At some point we'll have to cut the knot so to speak, I guess. I see such a crate as being purely a "root crate" (i.e. it has no other deps on jj-* crates) so it would not be subject to this.

I can of course abandon this change and start over with that in mind instead and simply move testutils out into such a crate at the start, but just moving everything into lib and then moving pieces out incrementally seemed like a better starting point. I'm happy to do the work to untangle this if you have ideas on what you'd like to see.

@thoughtpolice
Copy link
Member Author

After thinking more and taking a walk I think I see what you mean; the testutils code also needs to be refactored in order to pull apart the implicit dependencies on jj-lib before anything "above" jj-lib can use it. But, I think maybe consolidating everything first, so we can start gradually teasing that stuff out, might still be the right way to go. And crates above jj-lib are currently theoretical, we might not know what to move out until they exist. I'm Interested in other peoples thoughts here.

@ilyagr
Copy link
Contributor

ilyagr commented Jun 14, 2024

There's some progress on this. I think rust-analyzer will (or can already?) handle fake jj-lib/tests -> testutils -> jj-lib dependency cycle.

That's great news! 🎉

I don't have an opinion on whether this refactor is good in the long term. After reading Yuya's comment, my mind also (like Austin's) went to the idea of making a minimal subset of jj-lib crate that could be merged with testutils and that everything else could depend on, but I'm unsure how practical or worthwhile that is.

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Jun 14, 2024

So, I went and spent some time more closely reading the testutils code to see what could be split out and my basic takeaway was: none of it, this code is all too specific to jj in general to be part of a crate higher in the dependency chain (jj-utils) and it very much belongs in jj-lib. Maybe the only real things we could move out would be, what, some of the first few functions in testutils/lib.rs? But it's hardly worth making a whole crate just for testutils::new_temp_dir or whatever. Concepts like TestRepo or SecretBackend or commit_transaction are simply Jujutsu concepts by definition, so it makes sense they would be here and not elsewhere in the tree.

If we came to a point where we wanted to split out utils code from lib into another crate in the chain — I don't think any of this code would qualify for that privilege anyway, i.e. it would still just remain in what is now the testutils package, and anything we did move to a utils crate (as Yuya said maybe for usage between lib and proc-macros), I think testutils would not change much at all, it would be all the other stuff in jj-lib that we'd have to look at. In my mind, such a utils package also needs to be understood as a place for future code too, a polite dumping ground, so we keep it in mind as we write things in the future, for example it may be a place where we could put code that replaces some dependencies we have, ultimately shrinking our footprint. jj-utils should be generic stuff: one-off data structures, utilities, cross-OS wrappers for basic stuff, replacing a big dependency with the small bit of code we actually need, etc etc. For stuff like this #[test] stanzas are perfectly sufficient and mean that no cycles are needed.

So, I think the real question isn't about whether the code here will end up in a different crate that's higher up in the crate dep chain. I don't think that's going to happen. So then it becomes: should this module be in jj-lib itself, or in a crate used as a cyclic dev-dep like it is now. Personally, the whole Cargo feature here seems useful for some obvious cases, but it also comes across as a bit gross, it makes understanding the build graph harder in general (and the downstream cost on things like rust-analyzer is a good example of that IMO) and ultimately seems like it just made our IDEs buggier with not much clear conceptual benefit.

@thoughtpolice thoughtpolice force-pushed the aseipp/push-vrxtoymnuvpv branch from 73ef8a5 to 50d5d69 Compare June 14, 2024 20:38
@thoughtpolice
Copy link
Member Author

The latest version of the patch transitions jj-lib to use deny(unsafe_code, missing_docs) because deny allows it to be overridden (forbidden does not allow overrides, hence my confusion) and the change to missing_docs is because we would block a PR on missing docs anyway due to warnings, so we might as well just go ahead and deny them outright, which can then be overridden only for jj_lib::testutils instead

@yuja
Copy link
Contributor

yuja commented Jun 15, 2024

After thinking more and taking a walk I think I see what you mean; the testutils code also needs to be refactored in order to pull apart the implicit dependencies on jj-lib before anything "above" jj-lib can use it.

My point is that the testutils (or crate containing testutils) will depend on lib crates, but these lib creates will also have integration tests that depend on the entire lib environment.

Let's say jj-utils consolidates jj-repo and jj-git-backend for example. The current testutils would be contained in jj-utils. Integration tests in jj-git-backend would probably need testutils to set up test repo.

  • jj-utils -> {jj-repo, jj-git-backend}
    • jj-repo
    • jj-git-backend -> jj-repo ?
      • jj-git-backend (tests) -> jj-utils -> {jj-repo, jj-git-backend}

The last one isn't a real cycle, but old rust-analyzer was confused with that.

We can of course split testutils further (or move integration tests to jj-utils), but I think doing that would make things messier.

The following post explains valid use cases.
rust-lang/rust-analyzer#14167 (comment)

@yuja
Copy link
Contributor

yuja commented Jun 15, 2024

FYI, this is minimal change to remove scary dependency cycle within jj-lib unit tests.
#3888

While reading the code while working on Buck-style builds, I realized that there
is a strange cyclic dependency between `jj-lib` and `testutils`:

- `jj-lib` has a dev-dep on `testutils`
- `testutils` has a normal dep on `jj_lib`

Cargo apparently has (some?) kind of logic to topologize this sort of graph,
but it's awkward to understand or otherwise express, and has knock-on issues
with `rust-analyzer` too because certain code is only built under the proper
testing configuration, e.g. with `#[cfg(feature = "testing")]`

This instead just merges all of the `testutils` code as a single
`jj_lib::testutils` module, which breaks the cycle and also simplifies a few
of the mod imports as well. To fix the remaining compile errors we have to
explicitly use `jj_lib::testutils` first, which minimizes the overall diff. Some
call sites are simple enough to just inline though.

I do think that some of this can be pulled back out into a `jj-utils`
package that `jj-lib` unconditionally depends on and contains other various
functionality, too.

Signed-off-by: Austin Seipp <[email protected]>
@thoughtpolice thoughtpolice force-pushed the aseipp/push-vrxtoymnuvpv branch from 50d5d69 to b29f4c5 Compare June 30, 2024 21:59
@emilazy
Copy link
Contributor

emilazy commented Oct 10, 2024

FWIW, rust-analyzer team member @davidbarsky’s statements at various points on the Discord:

(we, on rust-analyzer, have a really difficult time handling cycles. i would not expect additional fixes, as I think most of the low-hanging fruit have been plucked)

and yeah, crate cycles are super dicey. it might work briefly, but think of rust-analyzer like bob ross: it just wants a bunch of nice trees everywhere

that PR should be merged regardless independent of this issue; cycles are no good

I think we shouldn’t expect rust-analyzer to get better at handling this, and there’s some vague indication that this might be contributing to issues getting rust-analyzer to work in Zed (though that’s not certain). Circular crate dependencies might be useful in some cases, but they also break Cargo’s model, so we can’t expect a good experience with them, and it seems like we’re likely to be playing whack‐a‐mole with individual problematic circular references if we continue to have the cycle. I think it makes most sense to go with this approach for now and worry about better factoring when crates are split.

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Oct 10, 2024

Sorry for not replying sooner to this @yuja, life gets in the way...

Aside from the rust-analyzer stuff, I would like to respond to this:

My point is that the testutils (or crate containing testutils) will depend on lib crates, but these lib creates will also have integration tests that depend on the entire lib environment.

Let's say jj-utils consolidates jj-repo and jj-git-backend for example. The current testutils would be contained in jj-utils. Integration tests in jj-git-backend would probably need testutils to set up test repo.

I think this is the main sticking point is: what are you splitting up the crates for? I don't think splitting up things this way makes sense without a motivation. Actually, on that note, in my mind, I think of crate splitting in our case as having one major selling point:

  • More crates helps expose build parallelism. This is one of the primary reasons to split, in my mind, once a project gets large.
  • The idea is basically to write traits and types in their own crates, and then write a bunch of abstract crates that work in terms of these types and traits, and so on and so forth.
  • This is not so bad for us right now. We probably don't need to worry about splitting things for a while still, I think.

Why is this the main selling point? Doesn't it help if users can only depend on a subset of crates? That's nice for them. Maybe it is, but I think it's an illusion of choice (more on that later). But here is my second point that can help expose my thought process:

  • From the users POV, jj-lib is where our crate users should go. For example, gg is a tool built on jj-lib, and it should stay that way. And that is what we should expose to users as the API, and what we should test, and what we should support.
  • Corollary: There may be many jj-* crates that collectively are used inside up jj-lib, but jj-lib is what we support for users and what we expect people to interact with. So even if jj-lib has 20 other jj-foo crates behind it, the user doesn't know or care. (Note that gix is similar to this.)

Now, let's apply this thought process to the SecretBackend from the testutils package:

  • Example: Let's say we had jj-backend that had the Backend trait.
  • Question: Where should impl SecretBackend for Backend go?

The first answer is "in jj-backend, of course!" because but that runs afoul of what you said, since then it might need to depend on other types exposed from jj-lib. But if you rephrase the problem a little, my answer is:

  • Answer: in jj-lib, because that's where jj-backend is integrated and tested and that's what users use!

Under this interpretation, jj-lib is the lowest-level, supported, public interface that we have. So that's where all the full integration tests belong, because it is the integrated component we ship to users! That SecretBackend code doesn't have any use anywhere else. Yes, this does mean that maybe jj-backend is only a 200 line crate and the tests exist elsewhere. But that's not so bad, and the tests being in jj-lib means you don't "miss the forest for the trees"; you have to think about changes in the larger context of where they are used.

Note that people would probably not even depend on jj-backend anyway because they want WorkingCopy, and Repository, etc when they are integrating things. When crates are split up this way, most clients actually want a lot of crates; things like orphan trait impls are occasionally useful, but a user can just as easily depend on jj-lib instead of jj-backend and there's little harm to it.

In short, if we split up crates, I suspect it would mostly be a hierarchy of types and traits that are appropriately abstracted. But integration testing still happens in jj-lib because that's what we are actually exposing to users. So that's where all this test junk needs to go.


Note that there are also downsides to having a lot of crates. One I really dislike is having to maintain "equi-consistent" versions of many published crates. If we have 10 crates and a bugfix only touches 1 crate (and 2 transitively dependent crates), how do you cut a release? Do you do a full version bump and cut a new release for all 10 crates, or only bump each package's minor version and then release 1 new crate? Or 3 new crates if the bugfix requires a lower-bound? One advantage of jj-lib being the "entry point" for all users is that we can solve problems like this by simply having fixed version constraints for all crates; e.g. every release causes all 10 crates to be bumped and released, with hard constraints that jj-cli 0.25.1 requires exactly jj-lib 0.25.1 transitively.

This isn't related to this ticket but I'm pointing this out because, often users will get themselves into weird situations — let's say you forget a lower bound bump, so a cargo update doesn't update transitive dependencies, causing a build failure — and having a bunch of split up crates is often an illusion of choice, because it's a false choice; you almost always do want the components that were tested together at release time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants