Move to one-file-per-query for sqlx-cli prepare #570
It will for schema changes where the queries were unaffected. Then you'll have potentially dozens of files to resolve merge conflicts on rather than one. In general, I think we agreed that a single file was better because it could easily be skimmed over in diffs, and if a merge conflict did arise it would be a single "accept ours"/"accept theirs" resolution.
Having worked with this for a bit, it's always annoying to have to go back when CI fails because of a stale cache. I would be very happy if the separate
I think what I would do after resolving the rest of the merge conflicts (if any) is just delete
If the files were contained in a standard directory then it would probably be possible to set default resolution rules with

I'm wary of the macros automatically creating such a directory outside of

As for the directory name, maybe
The reason the macros don't directly write to one file is that they might potentially be run in parallel (not currently, I don't think, but in the foreseeable future), so if they all tried to rewrite it they would probably clobber each other. However, if the

Although the more that I think about it, this would end up just being an inferior implementation of writing everything to a directory anyway.
Sounds like changing to one file per query is realistic then? Would this be something that might end up in 0.4, or would you rather release 0.4 with
If that's all it does, that seems like a pointless abstraction though. In that case the documentation could just mention that users can create
Actually, I don't think we have to do anything special to arrive at a file format that's good for appends. We could just atomically append JSON objects to a file, each object essentially being the same as the current
Then a user could have a pre-commit hook that runs

@mehcode any objections?
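A minimal sketch of the append-only format proposed above, assuming newline-delimited JSON (the file name `sqlx-data.jsonl` and the object fields are hypothetical; the real per-query metadata carries more fields, and a real implementation would use a proper JSON serializer instead of `format!`):

```rust
use std::fs::OpenOptions;
use std::io::Write;

// Append one JSON object per line; each object would mirror the
// per-query metadata (just a hash and the query text in this sketch).
fn append_query_metadata(path: &str, hash: &str, query: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    // Naive formatting: a real implementation must JSON-escape `query`.
    writeln!(file, r#"{{"hash":"{}","query":"{}"}}"#, hash, query)
}

fn main() -> std::io::Result<()> {
    let path = "sqlx-data.jsonl"; // hypothetical file name
    append_query_metadata(path, "abc123", "SELECT 1")?;
    append_query_metadata(path, "def456", "SELECT 2")?;
    assert_eq!(std::fs::read_to_string(path)?.lines().count(), 2);
    std::fs::remove_file(path)
}
```

As the next comment notes, unsynchronized appends from multiple writers are not guaranteed to be atomic, which is what ultimately sank this variant.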
Hmm, on further research it seems like we may not be able to guarantee atomicity of writes to files anyway. It's very OS- and filesystem-specific and not at all specified. Multiple files it is then.
As per the discussion on Discord, we still have the issue of concurrent writes with multiple files if the filenames are the hashes of the queries (which is necessary for the macros to look up the query data). We could make the filenames the spans of the macro invocations, which should be unique, but that has other issues:
Instead, I propose we encourage users to configure Git to run
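To make the hash-as-filename scheme concrete, here's a sketch of deriving a per-query file name from the query text. Everything here is illustrative: sqlx would use a cryptographic hash of the query, while std's `DefaultHasher` stands in only to keep the example dependency-free, and the `query-*.json` naming is made up.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Map a query string to a deterministic file name. Identical queries
// map to the same file, which is exactly why two macro invocations
// containing the same SQL could race on one file.
fn query_file_name(query: &str) -> String {
    let mut hasher = DefaultHasher::new();
    query.hash(&mut hasher);
    format!("query-{:016x}.json", hasher.finish())
}

fn main() {
    let a = query_file_name("SELECT * FROM users WHERE id = $1");
    let b = query_file_name("SELECT * FROM users WHERE id = $1");
    // Same SQL, same file name.
    assert_eq!(a, b);
    println!("{a}");
}
```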
@abonander: What about first writing to a temporary file and then moving that to the actual location? I think I've seen that used to work around concurrent write issues before.

Personally I'm pretty sure I won't use a pre-commit hook for something that takes as long as
There's one more advantage to this that hasn't been mentioned: it should improve performance quite a bit for projects with lots of queries, since every macro call only has to parse (possibly with debug-compiled parsing code) its own JSON metadata, not a JSON file with all of the queries for the entire application.
I assume with one file per query, always rebuilding the entire crate also becomes unnecessary, right? I'm not entirely sure, but I think it has a decent impact on both perf and cache disk usage.

@abonander @mehcode Would you be interested in a PR that implements this, using the aforementioned strategy of writing to a temporary file and then moving it to the actual location to avoid issues with concurrent writes to the same file?
I'm personally for this, yes. I imagine a
This is in scope for 0.4 too, right? I would be happy to also then rebase it on top of 0.5, but if possible would like to see another 0.4.x release that has this.

Edit: Actually, it would probably be too confusing to have the
Since 0.5 is actually not going to take long, I'm now working on this and am already positive I'll have a PR ready later 🙂
That's true, but replacing a file is atomic. To do that:

1. Write the new contents to a temporary file on the same filesystem.
2. Rename the temporary file over the original.

This is guaranteed to result in one of the following:

- The original file, untouched (if the process dies before the rename).
- The new file, complete.

None of these results involve the original file being only partially overwritten. Hence, replacing the file is atomic.
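A sketch of that replace-by-rename pattern in Rust. The file names are hypothetical, a real implementation would pick a unique temporary name per writer to avoid collisions, and the atomicity of the rename is a POSIX guarantee; Windows semantics differ.

```rust
use std::fs;

// Write to a sibling temporary file, then rename it over the target.
// On POSIX filesystems, rename() within one filesystem is atomic, so
// readers see either the old contents or the new, never a mix.
fn replace_atomically(path: &str, contents: &str) -> std::io::Result<()> {
    let tmp = format!("{path}.tmp"); // real code: unique name per writer
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    let path = "query-abc.json"; // hypothetical per-query file
    replace_atomically(path, r#"{"query":"SELECT 1"}"#)?;
    replace_atomically(path, r#"{"query":"SELECT 2"}"#)?;
    // The last write wins wholesale; no partial overwrite is possible.
    assert_eq!(fs::read_to_string(path)?, r#"{"query":"SELECT 2"}"#);
    fs::remove_file(path)
}
```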
I haven't seen the Discord discussion, but if you mean the hash could collide, that shouldn't be an issue if you use a cryptographic hash like SHA-256. It's incredibly hard to find collisions in such a hash even intentionally, let alone accidentally.
We ultimately came to the conclusion that the concurrent write issue is rather moot since macros aren't expanded in parallel and probably won't be for some time.
I'm not worried about hash collisions, which are certainly unlikely with SHA-256, but actually about the user copy-pasting the exact same query into multiple invocations, which is orders of magnitude more likely.
That shouldn't be a problem, as long as the query data files are replaced atomically (as described above).
This seems to have a lot of advantages and I can't remember why we didn't do this originally.
- `git merge` won't get in the way
- `cargo build` can transparently always do `cargo sqlx prepare` and leave the latter as a "make sure" / verify step