@aryairani aryairani commented Dec 19, 2025

Overview

The transcript.inplace runner seems to open the codebase for migrations, then open it a second time to run the actual transcript. There seems to be a race condition: releasing the lock the first time doesn't block until the lock is actually released, and the lock doesn't seem to be released in time for the second acquisition, so we were seeing crashes in CI. cc @dolio

Not sure why this locking issue just started manifesting recently.

  • What does this change accomplish and why?

    • i.e. How does it change the user experience?
    • i.e. What was the old behavior/API and what is the new behavior/API?
  • Include "before and after" examples if appropriate. (You can copy/paste screenshots directly into this editor.)

  • List any GitHub issues that this PR closes, in closing-issues-using-keywords format.

n/a

Implementation approach and notes

Adds a new locking type that, when it can't acquire the lock immediately, prints a message and blocks until it can. The transcript runner uses this new locking type when it opens the codebase the second time, after running migrations and before running the transcript.
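For readers who want the shape of the "try, then print and block" behavior without reading the Haskell change, here is a minimal sketch in Python. It is an illustration only, not the code in this PR: `acquire_blocking`, `release`, and the lock-file path are hypothetical names, and it assumes a Unix system where advisory `fcntl.flock` locks are available.

```python
import fcntl
import os


def acquire_blocking(lock_path, log=print):
    """Take an exclusive advisory lock on lock_path (hypothetical helper).

    First tries without blocking. If another holder has the lock, logs one
    message and then waits until the lock becomes available.
    Returns the file descriptor that owns the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Fast path: succeed immediately if nobody holds the lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Slow path: tell the user why we are waiting, then block.
            log("Waiting for the codebase lock to be released...")
            fcntl.flock(fd, fcntl.LOCK_EX)
    except BaseException:
        # Don't leak the descriptor if locking fails or is interrupted.
        os.close(fd)
        raise
    return fd


def release(fd):
    """Release the lock and close the descriptor (hypothetical helper)."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

In this sketch, a caller that opens the codebase a second time would call `acquire_blocking` instead of failing when the first lock has not been released yet; the message is printed only on the slow path.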

Interesting/controversial decisions

There were a few other places where it looked like it might be fine and good (though not necessary) to also block for the lock, but I left them out in order to have a more surgical PR.

Test coverage

  • Have you included tests (which could be a transcript) for this change, or is it somehow covered by existing tests?

CI has started passing again
https://github.com/unisonweb/unison/actions/runs/20403439799/job/58629670481?pr=6075
vs
https://github.com/unisonweb/unison/actions/runs/20391436262/job/58601486802?pr=6075

  • Would you recommend improving the test coverage (either as part of this PR or as a separate issue) or do you think it’s adequate?

not needed

  • If you only tested by hand, because that's all that's practical to do for this change, mention that. Include screenshots.

Loose ends

Not exactly loose ends because they're not related to this issue, but see the remaining open conversations.

Also, like with any PR I'm reminded that CI is too slow, and could maybe use mOrE cAcHiNg

Final checklist

  • Choose your PR title well: Your pull request title is what's used to create release notes, so please make it descriptive of the change itself, which may be different from the initial motivation to make the change.
  • Update your PR description if the specifics of the PR have changed over time.
  • Include transcripts or screenshots that demonstrate the changed behavior.
  • If you changed .cabal files, make sure the package.yaml files are up-to-date instead.

@aryairani aryairani force-pushed the arya/tweak-cache-key branch 2 times, most recently from 9046950 to bf0caff, December 20, 2025 08:26

  - name: verify stack ghci startup
-   if: runner.os == 'macOS' && steps.cache-ucm-binaries.outputs.cache-hit != 'true'
+   if: runner.arch == 'x64' && runner.os == 'Linux' && steps.cache-ucm-binaries.outputs.cache-hit != 'true'

move this to a different PR?

  with:
    path: ${{ env.runtime_tests_codebase }}
-   key: runtime-tests-codebase-${{env.runtime_tests_causalhash}}
+   key: runtime-tests-codebase-${{ matrix.os }}-${{env.runtime_tests_causalhash}}

not sure whether we need this

if: steps.cache-interpreter-test-results.outputs.cache-hit != 'true'
run: |
echo "passing=true" >> "${{env.interpreter_test_results}}"
- name: setup tmate session

could leave this in for convenience?


  if [ -z "$1" ]; then
-   stack build
+   stack build --fast

@aryairani aryairani Dec 21, 2025

@dolio any objections to leaving this in? If incurring a rebuild with optimizations disabled is a concern, you can also pass your already-built unison to this script as $1.

@aryairani aryairani force-pushed the arya/tweak-cache-key branch from 4e0c5c2 to f34fe5f, December 21, 2025 02:31
@aryairani aryairani changed the title from "tweak runtime tests codebase cache key" to "block for lock during transcript.inplace" Dec 21, 2025
@aryairani aryairani changed the title block for lock during transcript.inplace block for lock during transcript.inplace Dec 21, 2025
the transcript.inplace runner seems to open the codebase for migrations,
then opens it a second time to run the actual transcript. there seems to
be a race condition in which releasing the lock the first time doesn't
actually block for the lock to be released; and in fact it doesn't seem
to be released in time to acquire the lock the second time, and we were
seeing crashes in CI as a result.

there were a few other places where it looked like it might be fine and
good (though not necessary) to also block for the lock, but i left them
out in order to have a more surgical PR.
@aryairani aryairani force-pushed the arya/tweak-cache-key branch from 141f0d1 to d2ed60f, December 21, 2025 02:43
@aryairani aryairani marked this pull request as ready for review December 21, 2025 03:28
@aryairani aryairani merged commit ee18126 into trunk Dec 21, 2025
31 checks passed
@aryairani aryairani deleted the arya/tweak-cache-key branch December 21, 2025 04:03

2 participants