Running code in an isolate group rather than isolate #4379
Comments
cc @aam @mkustermann
My main issue with "code crashes if you touch non-shared static variables" is that it's an invisible and uncontrollable limit on what you can do inside such a not-real-isolate. Unless the code you run is completely authored by yourself, which it never is because you will use platform libraries, there is no way to predict whether code will crash.

If you choose to rely on code you haven't written, your code can break in the very next update if that third-party code decides to access a non-shared static variable. Accessing static variables is not considered a breaking change, so nothing will prevent others from doing it. And accessing a new static variable must not be considered a breaking change; that's far too onerous a restriction. (I'll personally be happy to close any issue asking core packages not to access static variables with a "No thanks".)

That is: the vast majority of code, and all code written so far, has no incentive to make itself compatible with shared isolates. The code you can use in a shared isolate will be very limited. That doesn't worry me as much as the fragility: you don't even know that the code you have tested will keep working. You might be incentivized to pin yourself to specific versions of the things you depend on, because those are known to work, and a patch-level increment of a package might stop working.
How is the restriction on non-shared static field access in isolategroup-shared callbacks different from, for example, the restriction on what can be sent in a message to an isolate? Some class from some third-party package didn't use to have non-sendable state in it, and now it does, so your code that used to send an instance of that class will now throw and has to be changed. That seems to be acceptable.

Basically, isolategroup-shared callbacks have limitations, but they are very useful for integration with native libraries (they remove the need to write and build native code). If the errors are clear and actionable (send a message to an isolate that would run the code that accesses the non-shared static field), why rob users of this ability?
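The message-sending analogy can be made concrete with a small sketch. `ConfigV1` and `ConfigV2` are hypothetical stand-ins for two releases of a third-party class, and the sketch assumes the native VM, where `SendPort.send` rejects an object graph containing a `ReceivePort` by throwing an `ArgumentError`.

```dart
import 'dart:isolate';

/// Hypothetical third-party class, first release: plain data only.
class ConfigV1 {
  final String name;
  ConfigV1(this.name);
}

/// Hypothetical later release of the same class: it now holds a
/// ReceivePort, which is not allowed inside an isolate message.
class ConfigV2 {
  final String name;
  final ReceivePort events = ReceivePort();
  ConfigV2(this.name);
}

/// Returns true if [message] was accepted by send(), false if the
/// VM rejected it as unsendable.
bool canSend(SendPort port, Object message) {
  try {
    port.send(message);
    return true;
  } on ArgumentError {
    return false;
  }
}

void main() {
  final inbox = ReceivePort();
  final v2 = ConfigV2('b');

  print('v1 sendable: ${canSend(inbox.sendPort, ConfigV1('a'))}');
  print('v2 sendable: ${canSend(inbox.sendPort, v2)}');

  // Close the ports so the program can exit.
  v2.events.close();
  inbox.close();
}
```

The calling code is identical in both cases; only the internals of the dependency changed, which is the same shape of breakage discussed above for static field access.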
@mraleph I think I could get comfortable with this if it was made more explicit that this is not intended to be a way to run arbitrary Dart code. So something like:
@sigmundch This feels connected to things you've been exploring around limiting what code dynamic modules can access.
Here's some of my thoughts: We want to use shared memory multithreading support for callbacks from C, especially for synchronous callbacks where C calls Dart and expects a synchronous return value. This may require the Dart code to do arbitrary work - sometimes it may do very little work, sometimes it may do much more work. I'd argue that those callbacks should be able to use If we created a fresh static field state for each callback invocation, then
For high-frequency C->Dart calls this is just way too much overhead. E.g., imagine C calls Dart and passes a simple protobuf message describing some state, and Dart should return a boolean saying whether to proceed or not. Now the overhead we'd be adding is enormous: we'd maybe allocate xx KB of memory for static field state and default-initialize it, lazily initialize various static protobuf metadata fields on first access, and finally decode a few bytes of proto message and return a boolean.

=> IMHO we should have a solution where the callbacks perform only the work they actually need to perform - no

So to make this more performant I'm convinced that we want any global fields (corelib and user-defined static fields) that are on the hot path to be initialized only once and ready to use for all callback invocations. We have the new shared fields for that. Whatever those fields on the hot path contain has to be shareable (and we should have a discussion about whether that can include mutable Dart objects as well - possibly via opting in individual classes, ...).

Once we have this "run a Dart closure with zero extra overhead" concurrently with normal Dart isolates, we have a powerful mechanism. We're going to use that mechanism to parallelize programs as well: create N threads, do some work in parallel and join those N threads (similar to using lightweight isolates today -- but now avoiding the extra

Coming back to the original issue: I can see the point that users will have a hard time predicting what packages, libraries or core library features they can use in C callbacks - or whether they may get exceptions -- and how they may silently get broken with a

What I could imagine is that we don't throw on normal global field access but
Currently the VM represents the static fields of a program as a large array indexed by a field id - this is very fast to access. Creating an isolate involves copying that large array with default values and sentinels. One could, though, make shared isolates use a different representation that grows as the number of accessed fields grows (e.g. a hashmap). That would make us pay only for what's actually used (plus a little code size cost and a little access time cost for normal isolates accessing normal static fields - an "am I a shared isolate" bit test and branch).

All the state we currently initialize by embedder on isolate creation (e.g.

That would make us end up in a place where the overhead of invoking a callback is purely based on what that callback needs (no extra embedder setup, no setup for fields that aren't used, ...) while still not throwing when accessing normal static fields. When profiling, one will observe slowness due to using non-shared field initializers and start migrating them to be shared fields.
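To illustrate the trade-off between the two representations, here is a toy model in plain Dart. It is not VM code: `DenseFieldTable`, `SparseFieldTable`, the sentinel, and the field ids are all hypothetical names chosen for this sketch. The dense table pays for every field at creation; the sparse one pays only for fields that are actually read.

```dart
/// Lazy initializer for one static field (hypothetical model).
typedef Initializer = Object? Function();

/// Marker for "not initialized yet" in the dense table.
const Object _uninitialized = Object();

/// Dense layout: one slot per field in the program, allocated up front.
class DenseFieldTable {
  final List<Initializer> _initializers;
  final List<Object?> _slots;

  DenseFieldTable(List<Initializer> initializers)
      : _initializers = initializers,
        _slots = List<Object?>.filled(initializers.length, _uninitialized);

  /// Slots paid for at creation, whether or not they are ever read.
  int get allocatedSlots => _slots.length;

  Object? read(int fieldId) {
    var value = _slots[fieldId];
    if (identical(value, _uninitialized)) {
      value = _initializers[fieldId]();
      _slots[fieldId] = value;
    }
    return value;
  }
}

/// Sparse layout: a map that grows only when a field is first touched.
class SparseFieldTable {
  final List<Initializer> _initializers;
  final Map<int, Object?> _slots = {};

  SparseFieldTable(this._initializers);

  int get allocatedSlots => _slots.length;

  Object? read(int fieldId) =>
      _slots.putIfAbsent(fieldId, () => _initializers[fieldId]());
}

void main() {
  // 10,000 hypothetical static fields; field i lazily evaluates to i * 2.
  final initializers =
      List<Initializer>.generate(10000, (i) => () => i * 2);

  final dense = DenseFieldTable(initializers);
  final sparse = SparseFieldTable(initializers);

  // A callback that touches only two fields.
  dense.read(7);
  dense.read(42);
  sparse.read(7);
  sparse.read(42);

  print('dense slots: ${dense.allocatedSlots}');
  print('sparse slots: ${sparse.allocatedSlots}');
}
```

The model leaves out the cost mentioned above for normal isolates (the extra bit test and branch on each access); it only shows the "pay for what you use" side of the trade.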
The problem is then that some static variables are mutable and not shareable. What if we had a special "initialize-once" shared variable, which does have per-isolate state, but it is guaranteed to be able to share the initialization value, so if that requires complicated computation, it'll still only happen once. Then the resulting value is cached and the next initialization is just "read value from isolate-shared cache". If we used a const _uriParserTable = IsolateVariable.late(_createParserTables); which would invoke that function the first time it's read, store the value in a real shared variable (or a global authority array of initialization values), then initialize the isolate variable with that value. You will need O(#sharedVariables) shared space, O(#isolateVariables) per-isolate state initialized, which can be copied from a global authority array for each shared isolate creation, or initialize to
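A single-process toy model may help show the "initialize-once" idea. `IsolateVariable.late` above is a proposal, not an existing API; the `InitOnceVariable` and `FakeIsolate` classes below are hypothetical stand-ins that only mimic the described behavior: the initializer runs once, its result is cached in one shared place, and each "isolate" then gets its own slot seeded from that cache.

```dart
/// Stand-in for an isolate: holds its own per-variable slots.
class FakeIsolate {
  final Map<Object, Object?> slots = Map<Object, Object?>.identity();
}

/// Hypothetical model of an initialize-once variable.
class InitOnceVariable<T> {
  final T Function() _initializer;

  // Shared across all fake isolates: the cached initial value.
  bool _computed = false;
  late final T _sharedInitialValue;

  /// How many times the (possibly expensive) initializer actually ran.
  int initializerRuns = 0;

  InitOnceVariable(this._initializer);

  T _initialValue() {
    if (!_computed) {
      _sharedInitialValue = _initializer();
      initializerRuns++;
      _computed = true;
    }
    return _sharedInitialValue;
  }

  /// First read in an isolate seeds its slot from the shared cache.
  T read(FakeIsolate isolate) =>
      isolate.slots.putIfAbsent(this, _initialValue) as T;

  /// Writes affect only the given isolate's slot.
  void write(FakeIsolate isolate, T value) {
    isolate.slots[this] = value;
  }
}

void main() {
  // The cached value must itself be shareable, so it is unmodifiable here.
  final parserTable = InitOnceVariable<List<int>>(
      () => List<int>.unmodifiable(List<int>.generate(256, (i) => i ^ 0x5a)));

  final a = FakeIsolate();
  final b = FakeIsolate();

  parserTable.read(a);
  parserTable.read(b);
  print('initializer runs: ${parserTable.initializerRuns}');

  parserTable.write(b, const [1, 2, 3]);
  print('a and b still share a value: '
      '${identical(parserTable.read(a), parserTable.read(b))}');
}
```

In a real implementation the shared cache would need to be safe under concurrent first reads, which this single-threaded sketch does not address.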
And it won't even get that far, since it tries to read the (And for the record: |
What if there is no I would really like to avoid unsafe as a prefix. It does not communicate what is actually unsafe about the function - so it does not add any value except looking dangerous. We already have |
Looking dangerous is the value. Seriously.
Something like this is probably reasonable. In general I think that going through a native interface is a good signal that you're holding something sharp. I still don't understand the "shared" terminology.
Perhaps it should? But Another possible approach might be to just add an optional parameter to the existing |
So far the name for the ffi callback that we've been using is |
But that state is shared with all of the other isolates as well. That is, the distinguishing feature of the isolate you get from calling this API is not that it has access to shared state - all of the isolates in the group have access to that state. The distinguishing feature is that it does not have access to any other state of its own.
During the recent meeting we discussed the progress of the shared native memory multithreading prototype. One of the topics of contention was the behavior of Dart code which executes within an isolate group but outside any isolate.
As currently described in the proposal and implemented in the prototype, an attempt to access static state not marked as shared from code which is running outside of an isolate throws an exception. For example:
The language team raised a concern about this behavior - we will use this issue to discuss arguments for and against it.
This behavior was chosen in the original proposal because in our (the implementors') opinion it makes the behavior of callbacks entering Dart from native code on an arbitrary thread cleaner. The only other viable option is to create a temporary isolate which would exist for the duration of the execution and be destroyed after the call.
If users want to use code that requires isolated static state - they can always manage that explicitly, e.g. something along the line of
Essentially we are giving developers a way to write code which they currently need to write in C/C++ using the embedding API.
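One way to picture "managing isolated static state explicitly" is with today's `dart:isolate` API: a long-lived helper isolate owns the ordinary static state, and other code reaches it only through messages. This is a sketch under that assumption, not the snippet from the proposal; `_requestCount`, `_worker`, and `bump` are hypothetical names, and the main isolate stands in for the code that would run outside any isolate.

```dart
import 'dart:async';
import 'dart:isolate';

/// Ordinary (non-shared) static state. Each isolate has its own copy.
int _requestCount = 0;

/// Entry point of the helper isolate that owns its [_requestCount].
void _worker(SendPort toMain) {
  final requests = ReceivePort();
  toMain.send(requests.sendPort);
  requests.listen((message) {
    if (message == null) {
      requests.close(); // Shutdown signal: lets the helper exit.
      return;
    }
    final replyTo = message as SendPort;
    _requestCount++; // Touches the helper's isolate-local static state.
    replyTo.send(_requestCount);
  });
}

/// Asks the helper to bump its counter and returns the new value.
Future<int> bump(SendPort worker) async {
  final reply = ReceivePort();
  worker.send(reply.sendPort);
  // Taking the first event cancels the subscription, closing the port.
  return await reply.first as int;
}

Future<void> main() async {
  final handshake = ReceivePort();
  await Isolate.spawn(_worker, handshake.sendPort);
  final worker = await handshake.first as SendPort;

  print(await bump(worker));
  print(await bump(worker));
  // The caller's own copy of the static field is never touched.
  print(_requestCount);

  worker.send(null);
}
```

The round trip through ports is asynchronous, so this pattern does not by itself serve a synchronous native callback; it only shows where the isolated state would live.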
cc @lrhn @leafpetersen