-
Notifications
You must be signed in to change notification settings - Fork 81
possible race conditions on intel MacOS #15730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I am observing a crash on my intel macbook Sequoia. Maybe it is linked -
|
Was looking for solution, found this: https://stackoverflow.com/a/74249047/28484223
We have both |
Here's a thread about this: I see a few solutions to this:
|
Meanwhile, I've pushed a fix attempt here: But let's see just in case, I'm waitin for @Khushboo-dev-cpp to test it. If the fix doesn't help, I think we should go with 2.33 as is. It will take quite some time to fix this, and it's not some simple change. Perhaps we could try to increase some delays, as it helped previously, but it might not improve it. UPDThanks @Khushboo-dev-cpp for testing the hotfix. It doesn't work, as expected 🙂
|
Damn, and we won't be able to do (1), because we can't import |
Thinking of potential solutions.
NOTE:
|
@igor-sirotin : are you sure this panic is because of multiple version of go libs ? |
if you could add a threadpool the same way we have in status-desktop to this repo https://github.com/siddarthkay/status-desktop-intel-crash-reproducer and if we can reproduce the same error. Then we can be really sure. |
@siddarthkay
I'm not 🙂 I'm currently reading the thread about this. |
This can be reproduced on an M1 Mac as well when compiling the app for x86_64 and running with rosetta. |
Thanks @alexjba, that'd be very helpful. I'll try it out. |
One mention would be that status-go doesn't cross-compile anymore with |
Maybe a dumb question, but what about using plugins? Isn't it an option? Was looking at the golang issue attached here and people suggest this as the preferred way of integrating multiple go libs. |
I was able to reproduce on M1 with x86_64 qt build. Thanks @alexjba! |
I think I reproduced this with a simple C++ app that imports #include "status-go/build/bin/libstatus.h"
#include "status-keycard-go/build/libkeycard/libkeycard.h"
#include <string>
int main() {
// status-go
char* jsonRequest = "{'dataDir': '/Users/igorsirotin/Repositories/Status/_sandbox/test-double-go-runtime/accounts'}";
InitializeApplication(jsonRequest);
// status-keycard-go
InitializeLibrary();
return 0;
}
Simple example
It's also reproducible with 2 simplest Go libs, that force
Code
Works with delayWhen adding a 50ms sleep between the 2 initializer functions, it works:
TODOTODO for myself:
|
Some more testing. Here's the code: int main() {
InitializeFoo();
delay();
InitializeBar();
return 0;
}
Basically, a 50ms delay helps in 100% cases. No clue why. I'll try to introduce this delay in some very early stages of |
I've tried to force GC as the first thing in status-desktop/src/nim_status_client.nim Lines 154 to 157 in 4eae12b
This didn't help. The initialization functions pass fine, but the app still crashes at the next call to
TODO:
|
@igor-sirotin : I would suggest trying to remove the -03 optimisation flag on both keycard and status-go in Makefile and verify if turning off these performance optimisations fix the issue or not.
ps: I have no evidence that this will fix the issue, but worth a try. |
❌ Try to disable Go optimizationsDid not change anything. ❌ pluginsI looked at plugins, which are mentioned in golang/go#65050 indeed. Theoretically, plugins would work, because they don't contain the go runtime, they expect the application (that loads the plugin) to have the Go runtime. Thus, there will be only a single Go runtime. Problem is that Go plugins are:
❌ Try to run more operations on the libraries in the C++ minimal exampleIn my example app, as soon as I add more Go calls without a delay in between, it crashes. e.g. int main() {
InitializeStatusGo();
delay();
InitializeStatusKeycardGo();
delay();
StatusGoFoo();
// delay(); <--- no delay
StatusKeycardFoo(); // <--- panic here
return 0;
} It seems that it's enough to make sure that 2 libraries never work in parallel. Current conclusionI will make PoC with running |
Bug Report
Description
It was observed that status-desktop would often crash on onboarding stage on intel MacOS.
After trying to figure out why that happens here #15134
I found out that this issue was often fixed by doing nothing.
All the signs point to a possible race condition in the thread pool setup where a
nim
service calls some business logic instatus-go
orstatus-keycard-go
.I also discovered that for
nim
interop ofgo
we free the memory by calling ago
method which is called via anim
interface.Exhibit A :
Lets take a look at how we consume
keycardInitFlow
which lives invendor/status-keycard-go/shared/main.go
Source of
keycardInitFlow
is :This function is wrapped around another wrapper in
vendor/nim-keycard-go/keycard_go.nim
and the source looks like this :It is also important to look at the source of
go_shim
:The source lives here :
vendor/nim-keycard-go/keycard_go/impl.nim
Finally this code is being consumed in
service.nim
like this :service.nim
lives here :src/app_service/service/keycard/service.nim
I tried to create a minimal reproduction repo here but I was unable to reproduce the crash :
https://github.com/siddarthkay/status-desktop-intel-crash-reproducer
Although my efforts did not include a thread pool and that could be the key to reproducing the race condition.
Another key factor in discovering this race condition was upgrading
go
to1.21
.go 1.21
has brought significant changes to its garbage collector and the crash we would see would often link to the code related to garbage collection.error message :
reference in go source :
https://github.com/golang/go/blob/8f5c6904b616fd97dde4a0ba2f5c71114e588afd/src/runtime/mcache.go#L325
At the moment this issue is mitigated by introducing some sleep time in this PR : #15194
However this is not a proper solution and we may run into race conditions elsewhere in the future.
Steps to reproduce
Expected behaviour
Actual behaviour
The text was updated successfully, but these errors were encountered: