postMessage requires synchronous possibly-cross-process access #3691
I find the way this issue is phrased to be confusing, as a window's Realm is immutable, so we could just move step 2 to be before step 7.4 and get the exact same behavior without tripping up any "synchronous access to other processes' objects" alarms. Is there a deeper issue here?
Well, the sync access to Window is a problem too, fundamentally... What you have available sync is a WindowProxy; that's really it.
Or more precisely... There may be a Window available sync (to make the WebIDL branding checks work, though that stuff is actually pretty underdefined/broken with WindowProxy), but it's not the same Window as the place where the event will get dispatched; it just forwards things along to the right place. And it either does not have a Realm or has a different one...
I guess steps 1 and 2 could both move into the queued task. (Modified, I guess, so that we save a reference to the WindowProxy outside the queued task, and then get its inner Window inside the queued task.) But it sounds like you are hinting at a more general problem where every one of the CrossOriginProperties needs to have its algorithms rewritten to make it more obvious that they're not going to cause problems across process boundaries. I'm not sure how we'd do that exactly.
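For illustration, here is a minimal Rust sketch of that reordering; all types (WindowProxy, Window, the task queue) are invented stand-ins, not spec or engine API:

```rust
use std::sync::mpsc::Sender;

// Stand-in types; the real objects live in the spec/engine.
struct Window {
    realm_id: u64,
}
struct WindowProxy;

impl WindowProxy {
    // Resolved when the task runs, so it reflects the then-current Window.
    fn current_inner_window(&self) -> Window {
        Window { realm_id: 0 }
    }
}

type Task = Box<dyn FnOnce() + Send>;

// Former steps 1 and 2 move inside the queued task: only the WindowProxy
// (the one thing available synchronously) is captured at call time.
fn queue_post_message(proxy: WindowProxy, queue: &Sender<Task>) {
    let _ = queue.send(Box::new(move || {
        let target = proxy.current_inner_window();
        let _realm = target.realm_id;
        // ...dispatch the message event on `target` here.
    }));
}
```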
Yeah, I'm not quite sure what the right way to do it is, either. Not least because I've just started seriously thinking about it on the implementation side. It's worth talking to engineers from browsers which have cross-process things in the same unit of related browsing contexts (which Gecko does not yet) and seeing how they view this stuff.
To really solve this there are many things that would need to be changed. We'd have to actually allocate an event loop per agent properly, allocate multiple…
(I mentioned that elsewhere quite a while ago and also mentioned that to the Google folks, who didn't think cross-process…)
If I understand it correctly, some "data", like the Realm of the target window, is immutable and hence could be "easily" shared across processes. And "queuing a task" on the event-loop corresponding to the similar-origin window agent of the target window in theory doesn't require sync access, since queuing a task is an async operation?
If the document of the "old window" is not fully active due to a navigation that occurred after the message was posted but before the task was queued, the task could still execute later if the document were to become fully active again (which would make the task runnable). So I would say the spec could indeed take into account that "queuing a task" across processes could involve some delay; however, I think the spec in general doesn't really depend on the exact timing of when an enqueued task is actually processed by the event-loop when the enqueuing happens from parallel steps (otherwise any parallel step enqueuing a task would have to first "stop the event-loop", enqueue the task, and then let it continue). From that perspective, the task enqueued via cross-origin messaging is not that different from, say, the task that fetch enqueues to an event-loop from parallel steps.
In Servo there is a DissimilarOriginWindow, which owns an ipc-channel to a central component (the constellation), which can re-route the message to the appropriate event-loop (via another ipc-channel).
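As a rough Rust sketch of that shape, with std::sync::mpsc standing in for Servo's ipc-channel and all names being simplified approximations rather than Servo's actual code:

```rust
use std::sync::mpsc::Sender;

// Hypothetical message type: a posted message, routed by browsing-context id.
enum ConstellationMsg {
    PostMessage { target_bc: u64, data: Vec<u8> },
}

// Like the described DissimilarOriginWindow: it holds no target Window at
// all, only a channel to the constellation, which re-routes the message to
// the right event loop via another channel.
struct DissimilarOriginWindow {
    target_bc: u64,
    constellation: Sender<ConstellationMsg>,
}

impl DissimilarOriginWindow {
    // Posting never touches the target Window synchronously; it only
    // enqueues a routing request.
    fn post_message(&self, data: Vec<u8>) {
        let _ = self.constellation.send(ConstellationMsg::PostMessage {
            target_bc: self.target_bc,
            data,
        });
    }
}
```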
I really wish I could link to snapshots of the spec, because the spec here has definitely changed since I filed the bug, and now I have to try to recover what the problems were, and whether they're still present...
The Realm identity is immutable, but the Realm itself contains and points to mutable state, right? So you can't share the Realm itself across anything, just some sort of identifier.
That's an interesting question. Tasks queued to the same task queue are required, by the current spec, to run in the order they are queued. That means that queueing must be at least order-preserving in various ways, if not outright sync. Running a queued task is an async operation, of course. It's possible that there is not an actual problem here, by the way. It's just not obvious, and there's a great deal of leeway for implementors to do things differently from each other when the spec conceptual model doesn't match what's really going on....
From browsing https://html.spec.whatwg.org/commit-snapshots/ the snapshot you want (for May 14; there's none for May 15) is https://html.spec.whatwg.org/commit-snapshots/90a60b2a0dc740b8b0093b07ca0a41e70ba8d83a/ I think. You might also enjoy https://github.com/whatwg/html/blob/master/README.md#blame to find the commit hash you're looking for if you don't know the date (although…)
Agree. I'm actually not very familiar with the "Realm" concept yet, but the spec seems to avoid manipulating its state, and seems to use it in this algorithm as a kind of identifier (…)
I agree that from a given set of parallel or event-loop steps, the ordering is assumed to be preserved: for example, the various tasks enqueued to convey the lifetime of a fetch response should indeed arrive at an event-loop in the order they were queued by a specific parallel fetch. Ordering should also be preserved if a task on the event-loop ends up queuing several tasks. I also think that, from the principle of serialisability-of-script-execution, an event-loop doesn't have a concept of "when a task was enqueued by parallel steps (or another event-loop)", only that, for a given set of parallel steps, any tasks enqueued should retain their ordering.

I might be biased in the light of Servo's implementation, but I think this fits nicely with the concept of an unbounded "multiple-producer-single-consumer" channel with non-blocking send operations, where ordering of messages is preserved per producer, but not across producers, and where the consumer, an event-loop, is not concerned with synchronizing between producers, but still wants to see messages per producer in the order in which the producer sent them.

I've tried to write things down with regard to task-sources and queues and how they might be defined more precisely, although I can't say it's very actionable: #4615

For example, if we were to define the DOM manipulation task-queue as local to a specific event-loop, one could indeed reason about an absolute ordering of all tasks, since they're all enqueued from sequential steps running on the event-loop. But for the networking task-queue, which is used (I think) almost exclusively to enqueue tasks back on an event-loop from parallel fetch steps, I don't think one can reason about the relative ordering of tasks between different parallel fetches, and it also doesn't matter, since it's really just about having a consistent ordering of tasks per parallel fetch; the event-loop doesn't care about the order of tasks across independent fetches.

If we needed one such "shared/ordered parallel queue" with consistent ordering of queuing across parallel algorithms, I think it would require a special definition, probably involving a blocking "enqueue" operation where the parallel step(s) would block until the task had been dequeued by the consumer. (Compare the current parallel-queue, which also seems to imply an unbounded mpsc type of queue, and incidentally is only used for the shared-worker-manager, of which there is only one per user-agent; there doesn't seem to be a form of synchronization assumed between various event-loops enqueuing steps on the queue, although ordering is probably assumed to be preserved from the point of view of one event-loop enqueuing several algorithms from the same "thread", say through repeated calls to …)
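To illustrate the channel model described above (a generic Rust demo, not Servo code): two producers stand in for two parallel fetches sending numbered tasks to one consumer standing in for the event loop.

```rust
use std::sync::mpsc::channel;
use std::thread;

fn main() {
    let (tx, rx) = channel::<(&'static str, u32)>();
    for name in ["fetch-a", "fetch-b"] {
        let tx = tx.clone();
        thread::spawn(move || {
            for seq in 0..3 {
                // Non-blocking send into an unbounded mpsc queue.
                tx.send((name, seq)).unwrap();
            }
        });
    }
    drop(tx);
    // The consumer sees 0, 1, 2 in order for each producer, but the
    // interleaving between "fetch-a" and "fetch-b" is unspecified.
    for (producer, seq) in rx {
        println!("{producer}: task {seq}");
    }
}
```

This matches the per-task-source guarantee sketched above: a total order per producer, with no global order across producers.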
Finally, on a more concrete note, we could perhaps define a cross-origin window not as an actual window with a…
This approach would, I think, be similar to workers, which…
(For dedicated workers, see https://html.spec.whatwg.org/multipage/workers.html#dedicated-workers-and-the-worker-interface:messageport) Perhaps an implicit MessagePort on cross-origin window (wrappers) would be a more consistent way of clarifying the spec, @bzbarsky?
What would be the actual behavior of…
We could do something along the lines of:
I think at this point a browsing context should always have a port, and if it has a window associated with it whose document is fully active, that port should be entangled with the window's port (by then enabled). In theory you should then have a messaging mechanism that works for both same- and cross-origin use cases. You could still lose messages due to a navigation, and that would be dealt with explicitly as part of document unloading.
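As a minimal sketch of that port-per-browsing-context idea, assuming invented names and a plain channel in place of real port entanglement:

```rust
use std::sync::mpsc::Sender;

// Every browsing context owns a port; it is entangled with the current
// window's port only while that window's document is fully active.
struct BrowsingContextPort {
    entangled: Option<Sender<Vec<u8>>>,
}

impl BrowsingContextPort {
    fn post(&self, msg: Vec<u8>) {
        if let Some(tx) = &self.entangled {
            let _ = tx.send(msg);
        }
        // Not entangled (no fully active document): the message is lost,
        // which the spec would then account for explicitly at unload.
    }
}
```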
Ok, so I think there might be a few actionable items here.

First of all, we can move step 8.2, "Let origin be the serialization of incumbentSettings's origin.", to a step prior to queuing the task (which happens at step 8). I think that's important, since the origin of the incumbent settings object from where postMessage was called could otherwise appear to have changed by the time the task runs. Currently, the spec effectively says to take the origin of the incumbent when the task runs. That seems like an opportunity for some sort of cross-site attack, since you could queue the task, then navigate to another page, and the queued task would masquerade as coming from the origin that is the result of the navigation. I'm pretty sure the UA must serialize the origin prior to step 8, and then "enclose" it within the queued task to use it at step 8.2. So we can move step 8.2 to before step 8 to make that explicit (this is similar to #1371).

On the other question of the "realm" (it's a bit hard to remember what exactly is referred to above; the step numbering changed): I think one can queue a task to a specific realm, even across processes, since that only requires having a reference to something allowing you to queue a task, not the actual realm. In the algorithm, the realm is also only used at step 8.4 (inside the queued task), to do the StructuredDeserialize. So in theory step 1 doesn't require an actual handle to the realm, rather some sort of identifier, in a similar vein to …

In terms of the target window navigating away while the task has been queued, I think from the point of unload the task has no chance of running anymore (unless one would argue that the new document would have replaced the old one in the same BC, and therefore should receive the message). If that is too late, we could explicitly cancel all tasks on the "posted message queue" as part of aborting the document, like is currently done with tasks related to fetch (see step 2 of https://html.spec.whatwg.org/multipage/#abort-a-document).

So I don't think the spec requires cross-process synchronous access, since the variables used could in practice be replaced with identifiers as opposed to actual references to objects, and later, when a variable is used to actually perform an action on an object, it seems to be from a context in which that object would indeed be readily available.

(The use of "Let source be the WindowProxy object corresponding to incumbentSettings's global object (a Window object)." at step 8.3, so within the task, could also be seen as using an identifier passed along in the task, to then locally instantiate some sort of proxy, perhaps a cross-process one, to that window proxy. That is indeed a bit hairy and, I think, discussed elsewhere as part of the "proxy to a windowproxy" discussion; see for example #3727 (comment).)
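For illustration, a minimal Rust sketch of the proposed hoisting, with all types as hypothetical stand-ins: the origin serialization is snapshotted before the task is queued, and the task carries the snapshot instead of re-reading the sender's settings object when it runs.

```rust
use std::sync::mpsc::Sender;

// Minimal stand-ins; all names here are hypothetical.
struct Origin(String);
impl Origin {
    fn serialize(&self) -> String {
        self.0.clone()
    }
}

type Task = Box<dyn FnOnce() + Send>;

fn post_message(incumbent_origin: &Origin, queue: &Sender<Task>, data: String) {
    // Step 8.2 hoisted before step 8: snapshot the serialization now...
    let origin = incumbent_origin.serialize();
    // ...and move it into the queued task, so the task never needs the
    // sender's settings object (which may live in another process), and a
    // later navigation cannot change what the event reports as its source.
    let _ = queue.send(Box::new(move || {
        println!("message from {origin}: {data}");
    }));
}
```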
I don't believe it's possible for an environment settings object's origin to change. Certainly not as a result of navigation.
I was about to say the same: the origin of a settings object can't change.
Sure, if you have an identifier for it. But when you are calling StructuredDeserialize you need the actual Realm, not just an identifier.

If the intent is to "deserialize into the Realm that was in the browsing context at the point in time when the postMessage call happened", then you have a sync-access problem no matter what, because the "which identifier is in there" update isn't sync either, right? (Shared-memory tricks aside, and if the specification wants to require those, we need to be very explicit about it.)

I should note that if we presuppose that…
Why not? Nothing I see in the spec precludes it from doing so. The fundamental question we need to ask ourselves is why the target selection is the way it is, and whether the goals of that can be accomplished sanely in a multiprocess world. Because an alternative behavior would be to immediately jump to talking about browsing contexts and posting a task to a browsing context, then selecting whatever Window is current when the task runs. But it's not clear that the spec has a good concept of tasks attached to browsing contexts, not Window instances.
Thanks for pointing that out. Can the settings object, on the other hand, go away if the window does? It could still make sense to do the origin serialization before queuing the task, and then use the serialized origin inside it, instead of requiring access to the settings object of the sender from within the task.
Ah ok, I see what you mean. So what actually happens in Servo is that the identifier used to route the message is a double key of the identifiers for a BC and a document. And the "cross agent-cluster proxy to a window" is instantiated for a specific BC/document combo; it's not something that is synced with the currently active document of that BC. For example, in Servo we check twice whether the document/BC combo can receive the message.
The spec doesn't seem to require giving feedback to the caller of postMessage…
You're right, unloading in itself doesn't clear the task-queues. It could be specified to clear out the "posted message queue", or that could be done earlier when the document is aborted? When a navigation response is actually handled, then the tasks for the old document cannot run anymore, right? I'm referring to https://html.spec.whatwg.org/multipage/#navigate-html |
At what point in time? When the postMessage call happens? Dedicated workers can post messages back to their window, but we're specifically worrying about Window's postMessage.
Nope. The "proxy" has a lifetime spanning multiple windows. It does have something that can identify the current window it's proxying, obviously.
Apart from initial about:blank, there is only one document per window, and documents never really enter into this picture: all the work here happens on windows, without reference to documents.
Define "still running"?
Note that documents can be in the session history (e.g. in the non-discarded but navigated-away-from state) and not subject to receiving messages (because navigated away from).
That's correct.
That's presumably buggy with initial about:blank?
Conceptually, in the spec, there is only one WindowProxy per BC.
That is an interesting question, with the spec and implementations being all over the place on the details....
Ok, so actually what I wrote is incorrect: in Servo, the "cross agent-cluster proxy to a windowproxy" is linked to a specific browsing context. Then, in the "constellation", which is like a unique router in the UA, the state of the "currently active document in the session history of that BC" is stored. So when the "proxy to a proxy" does a postMessage, the message is routed based on that state.

So basically the only "identifier" stored in the "proxy to a windowproxy" is that of a BC; the routing to a given window/document is then done based on the state of the session history for that BC at the point of routing, not at the point of sending. Then, when a message is received in the target agent-cluster, we check again in case the window has been closed already (which can happen due to messages ahead in the queue versus the one routing the postMessage).

Note that the constellation manages navigation and session history, so the outcome of the question "to which window is this message going to get routed" is decided there, as is any navigation/changing of the session history. So essentially the "constellation" in Servo acts as a parallel queue with regard to those workflows, which allows for synchronization across agent-clusters (not without some hairiness, I admit). As you might have noticed already, the…

So I think if we spec instantiating a "proxy to a window proxy" precisely, and link those to an actual window proxy in another agent-cluster via an identifier, then we would need to serialize the routing of…
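A rough Rust sketch of that routing shape (invented names, std channels in place of IPC; not Servo's actual types): the proxy carries only a BC id, and the constellation resolves it to the currently active pipeline at routing time, not at sending time.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;

// Hypothetical shape of the constellation's routing table.
struct Constellation {
    // browsing context id -> currently active pipeline (window/document pair)
    active_pipeline: HashMap<u64, u64>,
    // pipeline id -> channel into that pipeline's event loop
    event_loops: HashMap<u64, Sender<Vec<u8>>>,
}

impl Constellation {
    fn route_post_message(&self, target_bc: u64, data: Vec<u8>) {
        // Resolve the active document for this BC *now*; a navigation that
        // has already been processed here changes the outcome.
        if let Some(pipeline) = self.active_pipeline.get(&target_bc) {
            if let Some(tx) = self.event_loops.get(pipeline) {
                // The receiving event loop checks again on dequeue, since
                // the window may be closed by messages ahead in its queue.
                let _ = tx.send(data);
            }
        }
    }
}
```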
OK, but does that match what is currently specced? If it doesn't determine the target Window at the time of sending, I don't think it does. Which is precisely what this issue is about.
Yes, I think you're right: the spec says "target window", while in practice, in Servo at least, it's rather "target window proxy" (via a proxy to it). But by the way, does "window" not usually mean "window proxy"?
By the way, the spec does have an interesting note: …
(I think it really depends on who you talk to what "window" (lowercase) means. Window typically refers to the global object, though, and not the global this object (i.e., WindowProxy).)
This closes #5352, by making event loops and agents 1:1. It also adds an explicit explanation that event loops/agents and implementation threads are not necessarily 1:1, apart from the restrictions imposed by JavaScript's forward progress guarantee. This also closes #4213, as worklet event loops work fine in this architecture. This also closes #4674, as browsing contexts changing event loops is not a problem in this architecture, since they change agents at the same time. Other problems remain around communication between agents (see e.g. #3691), and some of that communication is event-loop mediated, but the specific note discussed there is no longer relevant.
Step 2 of https://html.spec.whatwg.org/multipage/web-messaging.html#dom-window-postmessage requires synchronous possibly-cross-process access, as far as I can see.
In practice, it's not clear to me what UAs do here. For example, if a navigation completes while the postMessage task is queued (probably testable by doing postMessage from unload, at least in the same-process case), does the message event fire on the old Window, the new Window, or neither? It looks to me like the spec requires it to fire on the old Window; not sure whether that firing would be observable (and hence whether this is black-box distinguishable from "neither").
@annevk @domenic