Running pub global run concurrently #3165
Comments
Thanks! cc @cskau-g The PR that started using this: flutter/flutter#87231 |
I just tried a test: I ran a short-running command: However, if I did this instead:
Then it fails: all of the executions crash, probably because on the first run after install, 72 cores are busy trying to build the package executable, and something isn't locked quite right. (I used the process_runner app from https://pub.dev/packages/process_runner to do the concurrent runs; it creates a task queue that keeps 72 processes running at the same time.) |
Here's some of the stderr output:
(I eliminated duplicate lines) |
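For anyone collecting the same kind of stderr log, here is a minimal sketch of one way to collapse duplicate lines and see which errors dominate. The function name `summarize_log` is hypothetical and only for illustration; it relies on nothing beyond the standard `sort` and `uniq` tools.

```shell
#!/bin/bash
# summarize_log FILE  (hypothetical helper name, for illustration only)
# Prints each distinct line of FILE once, prefixed by how many times it
# occurred, with the most frequent lines first.
summarize_log() {
  local log_file="$1"
  # uniq only merges adjacent duplicates, so the file is sorted first.
  sort "$log_file" | uniq -c | sort -rn
}
```

With the log path used in the scripts in this thread, that would be `summarize_log /tmp/pub_race.log`.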
That is exactly what we were seeing on the flutter analyze step on the HHH bot. Thanks @gspencergoog! |
In fact, all of those errors you list, Greg, are what we've been seeing. That is great because it ties it all together and points to the same underlying issue. @sigurdm, I'm surprised by Greg's findings above as I thought for sure we'd established that |
I'm also a bit puzzled. @gspencergoog can you share the script/setup you used? I tried something like:
But that did not seem to provoke any failures. |
It would be interesting to bisect this, and see if it has anything to do with the support for incremental compilation: #3074 |
This fails consistently for me after about 10-15 seconds (when the first failures start returning).
rm -rf ~/.pub-cache
rm -f /tmp/pub_race.log
pub global activate process_runner
pub global activate snippets
cat <<EOF > do_pub.sh
#!/bin/bash
pub global run snippets --help 2>> /tmp/pub_race.log
EOF
chmod +x do_pub.sh
yes "./do_pub.sh" | head -1000 | pub global run process_runner --report --source=-
I assume you have the same monster machine as I do, so it should fail for you too. |
Oh, and just to point out that if you have run it once, and don't clean the pub cache, and just run:
yes "do_pub.sh" | head -1000 | pub global run process_runner --report --source=-
then all the runs succeed. It has something to do with the simultaneous building of the snapshots. |
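Based only on the observation above (runs succeed once a first run has completed), a possible workaround is to run the command once on its own before starting the concurrent runs. This is a rough sketch, not something pub provides: `warm_then_fan_out` is a hypothetical helper, and it uses `xargs -P` (available in GNU and BSD xargs, but not required by POSIX) in place of process_runner.

```shell
#!/bin/bash
# warm_then_fan_out CMD COUNT JOBS  (hypothetical helper, not part of pub)
# Runs CMD once by itself, then runs it COUNT more times with at most
# JOBS copies running at the same time.
warm_then_fan_out() {
  local cmd="$1" count="$2" jobs="$3"
  # Warm-up: one run alone, so any first-run work finishes without
  # other processes competing for the same files.
  sh -c "$cmd" || return 1
  # Fan out: each input line triggers one run of CMD; the text of the
  # line itself is not used.
  seq "$count" | xargs -P "$jobs" -I{} sh -c "$cmd"
}
```

For the setup in this thread, that would look like `warm_then_fan_out "./do_pub.sh" 1000 72`. It hides the race rather than fixing it, so it is only useful for keeping a bot green while the underlying problem is investigated.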
My computer seemingly is not beefy enough to reproduce (16 cores) :( @dcharkes I heard rumors that you have a mean machine, could you try running the script above? (I had to change the last line to @jakemac53 Do you think this could be related to the incremental compilation change? My intuition says that after |
I do not get a reproduction on my beefy machines (nor MacBook) either. Script used:
rm -rf ~/.pub-cache
rm -f /tmp/pub_race.log
dart pub global activate process_runner
dart pub global activate snippets
cat <<EOF > do_pub.sh
#!/bin/bash
dart pub global run snippets --help 2>> /tmp/pub_race.log
EOF
chmod +x do_pub.sh
yes "./do_pub.sh" | head -1000 | dart pub global run process_runner --report --source=-
Dart version used:
|
Sounds like a perfect excuse for an upgrade! :-)
Huh, OK, I wonder what the difference is. I get a repro pretty much every time I run it. I'll try and bisect it today. |
OK, I'm really confused now. I can't get it to reproduce now, despite having run it four or five times in a row yesterday and had it happen every time. |
Could it have something to do with package extraction? After rolling back to yesterday's Flutter, I just started to be able to reproduce the failure again, and it looks like this:
And the errors were:
(and so on)
The interesting thing to me is the warnings during the activation: they seem to indicate that a check for the executables is happening before the package is done being unpacked, and it doesn't do the normal "Building package executables" stage (presumably because it doesn't think they exist). I don't remember seeing that yesterday, but I suppose I could have missed it. This is the Dart version (from yesterday's Flutter) that I can reproduce this in:
The corresponding Flutter hash is 1b73a35fba3d5a4dfbeae42107d9a7227bac1615 for the Dart version I was using, which corresponds to the Flutter Engine hash 74fdd30dc25e91d355ff18fdf52de27b7713f11a. I was able to reproduce this multiple (at least 20) times with that version of Dart, and it reproduced every time. But not with a more current version on Flutter's HEAD from today (
The script I was able to reproduce it with on yesterday's Dart was:
#!/bin/bash
DART="/usr/local/google/home/gspencer/code/flutter/bin/cache/dart-sdk/bin/dart"
TMPDIR=$(mktemp -d /tmp/pub_test.XXXXX)
echo "Tmpdir is $TMPDIR"
DO_PUB="$TMPDIR/do_pub.sh"
LOG_FILE="$TMPDIR/pub_race.log"
CMD_FILE="$TMPDIR/pub_cmd.txt"
rm -f "$LOG_FILE"
cat <<EOF > "$DO_PUB"
#!/bin/bash
"$DART" pub global run snippets --help 2>> "$LOG_FILE"
EOF
chmod +x "$DO_PUB"
rm -rf ~/.pub-cache
"$DART" pub global activate process_runner 4.1.2
"$DART" pub global activate snippets 0.2.5
yes "$DO_PUB" | head -1000 > "$CMD_FILE"
"$DART" pub global run process_runner --report --source="$CMD_FILE"
Interestingly, the following does NOT fail at all with that same revision:
#!/bin/bash
DART="/usr/local/google/home/gspencer/code/flutter/bin/cache/dart-sdk/bin/dart"
TMPDIR=$(mktemp -d /tmp/pub_test.XXXXX)
PUB_CACHE=$TMPDIR/pub-cache
export PUB_CACHE
echo "Pub cache is $PUB_CACHE"
DO_PUB="$TMPDIR/do_pub.sh"
LOG_FILE="$TMPDIR/pub_race.log"
CMD_FILE="$TMPDIR/pub_cmd.txt"
rm -f "$LOG_FILE"
cat <<EOF > "$DO_PUB"
#!/bin/bash
export PUB_CACHE
PUB_CACHE="$PUB_CACHE"
"$DART" pub global run snippets --help 2>> "$LOG_FILE"
EOF
chmod +x "$DO_PUB"
"$DART" pub global activate process_runner 4.1.2
"$DART" pub global activate snippets 0.2.5
yes "$DO_PUB" | head -1000 > "$CMD_FILE"
"$DART" pub global run process_runner --report --source="$CMD_FILE"
I checked to make sure there weren't any other |
So it seems like some change may have fixed it between |
One possible reason for the different outcomes of the two scripts is that one is writing to a memory filesystem (/tmp is a "tmpfs" memdisk on my machine), and one is writing to a physical (SSD) filesystem. So it could just be that changing that means one wins the race and the other doesn't. EDIT: I tested this, and the second script still works even if I change the TMPDIR to a non-/tmp directory, so that's not it. Anecdotally, I did find that
|
I'm curious if there are any updates; this is still indirectly causing flakes daily on the Flutter HHH bots. |
I stopped collecting data because I was no longer able to reproduce the problem. |
I just ran the scripts above again, and I still can't repro the problem with Dart version |
thanks for looking into this @sigurdm ! |
I am getting a 100% repro of this bug if I attempt to After manually activating a custom version of dart, I then commented out the following lines in
Then simply running The weird thing is that no matter how many times I run the following command:
It always insists on building the snippets binary. This may be the root cause of this bug.
|
If the snapshot is missing we do deleteBinstubs before recreating them; this opens a window for a race condition.
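To illustrate the kind of window a delete-then-recreate sequence leaves open, and one common way to close it, here is a minimal shell sketch of the general "write to a temporary file, then rename over the target" technique. It is not pub's implementation (pub is written in Dart), and `replace_atomically` is a hypothetical name chosen for this example.

```shell
#!/bin/bash
# replace_atomically TARGET CONTENT  (hypothetical helper, illustration only)
# Writes CONTENT to a temporary file next to TARGET, then renames it over
# TARGET. A concurrent reader sees either the old file or the new one,
# never a missing or half-written file.
replace_atomically() {
  local target="$1" content="$2" tmp
  # The temp file lives in the same directory so the rename stays on
  # one filesystem, which is what makes it a single atomic step.
  tmp="$(mktemp "$(dirname "$target")/.tmp.XXXXXX")" || return 1
  printf '%s\n' "$content" > "$tmp" || { rm -f "$tmp"; return 1; }
  chmod 755 "$tmp"
  # mv within one filesystem uses rename(2), which replaces TARGET
  # without a moment at which TARGET does not exist.
  mv -f "$tmp" "$target"
}
```

By contrast, `rm -f "$target"` followed by writing a new file leaves a period in which other processes find nothing at the path, which is the sort of gap that concurrent runs can fall into.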