
VideoFrame.createImageBitmap returns an ImageBitmap with coded resolution rather than display resolution #119


Closed
jamespearce2006 opened this issue Dec 22, 2020 · 25 comments


@jamespearce2006

It seems the createImageBitmap method is returning an ImageBitmap with slightly incorrect dimensions.
For example, in my case, I'm getting video frames that have the following properties:

codedHeight: 368
codedWidth: 640
cropHeight: 360
cropLeft: 0
cropTop: 0
cropWidth: 640
displayHeight: 360
displayWidth: 640
duration: null
format: null
planes: null
timestamp: 1

I'd expect the resulting ImageBitmap to honor the crop data and be the display resolution - 640x360. In reality, I'm getting an ImageBitmap that is 640x368, and the last 8 lines aren't valid data.
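To make the mismatch concrete, here is a minimal sketch (plain JavaScript; `cropInfo` is a hypothetical helper, not part of the WebCodecs API) that derives the crop rectangle and the padding a coded-size ImageBitmap would include, using the frame properties above:

```javascript
// Sketch: given VideoFrame-style metadata, compute the crop rectangle and
// how many coded rows/columns are padding outside the visible region.
function cropInfo(frame) {
  return {
    cropRect: {
      x: frame.cropLeft,
      y: frame.cropTop,
      width: frame.cropWidth,
      height: frame.cropHeight,
    },
    paddingRows: frame.codedHeight - (frame.cropTop + frame.cropHeight),
    paddingCols: frame.codedWidth - (frame.cropLeft + frame.cropWidth),
  };
}

const exampleFrame = {
  codedWidth: 640, codedHeight: 368,
  cropLeft: 0, cropTop: 0, cropWidth: 640, cropHeight: 360,
};
cropInfo(exampleFrame); // paddingRows: 8 — the invalid rows described above
```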

@jchen10

jchen10 commented Dec 23, 2020

This is something to be clarified. Currently, for hardware-decoded video frames, Chrome uses the coded size for the created ImageBitmap, while it uses the crop size for software-decoded frames.

@jamespearce2006
Author

Presumably, that's a temporary thing? Having an API behavior vary depending on the implementation of the underlying decoder seems like a bad thing.
It seems to me that if an app developer is calling createImageBitmap, that means they intend to render the output. In that case, I imagine they'd always want the intended display size, not the coded size. If it's a performance concern, perhaps a parameter could be added to createImageBitmap so the caller can indicate whether they want the cropped or coded size.

In the meantime, what's the best way of cropping the ImageBitmap? I'm rendering using WebGL, and texSubImage2D doesn't seem to work (the call doesn't include crop dimensions, but fails with an ImageBitmap anyway).

@jchen10

jchen10 commented Dec 24, 2020

The WebCodecs spec should further clarify this. A parameter indicating whether callers want the cropped or coded size may be a good choice. @sandersdan @chcunningham

For the texSubImage2D problem, probably you need to use WebGL2 context to crop an ImageBitmap. Alternatively you can simply use texImage2D, and set the texture coordinates cropped.
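The texture-coordinate approach can be sketched as follows (`cropUV` is a hypothetical helper: upload the full coded-size image with texImage2D, then draw your quad with these UVs instead of the usual 0..1 range so the padding rows are never sampled):

```javascript
// Sketch: compute the UV sub-rectangle that covers only the cropped region
// of a coded-size texture.
function cropUV(codedWidth, codedHeight, cropWidth, cropHeight, cropLeft = 0, cropTop = 0) {
  return {
    uMin: cropLeft / codedWidth,
    vMin: cropTop / codedHeight,
    uMax: (cropLeft + cropWidth) / codedWidth,
    vMax: (cropTop + cropHeight) / codedHeight,
  };
}

// For the 640x368 coded / 640x360 cropped frame from this issue:
const uv = cropUV(640, 368, 640, 360);
// Use uv.{uMin..vMax} as the quad's texture coordinates so the 8 padding
// rows at the bottom of the coded image are excluded.
```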

For WebGL, there is another option: WebGL_webcodecs_video_frame (and its sample). You can skip the intermediate ImageBitmap step with it. It's still at an early stage, supporting only hardware video frames for now; software video frames will be added soon. Try it if you want to be an early bird. :)

@jamespearce2006
Author

Thanks for the info. I'd be strongly in favor of having some way of extracting the cropped frame as an imagebitmap. I would imagine >80% of other API consumers would too.

Regarding skipping the intermediate ImageBitmap: unfortunately, my use case requires me to buffer a second or more of decoded frames. It seems only a handful of undestroyed video frames are allowed before the decoder stalls, which isn't surprising. Obtaining the ImageBitmap seems the best approach for me.

@sandersdan
Contributor

We last discussed this in December and at that time the preference was to be as user-friendly as possible without sacrificing substantial performance. I believe that means:

  • Always apply the crop
  • Do not scale to display size
  • Do not apply color space conversion (not yet exposed by WebCodecs VideoFrame)
  • Do not apply rotation (not yet exposed by WebCodecs VideoFrame)

This does mean that an ImageBitmap created in this way can't be used reliably without reference to the VideoFrame's display size, color space, and rotation metadata. I would predict that many developers would assume 1:1 PAR, sRGB, and 0deg, at least one of which will usually be wrong.
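To illustrate the 1:1-PAR pitfall: under this proposal the consumer would still need to derive the pixel aspect ratio from the frame metadata themselves. A sketch, with `pixelAspectRatio` as a hypothetical helper:

```javascript
// Sketch: the pixel (sample) aspect ratio implied by crop size vs.
// display size — the scaling a crop-only ImageBitmap would NOT apply.
function pixelAspectRatio(cropWidth, cropHeight, displayWidth, displayHeight) {
  // Horizontal scale factor relative to vertical scale factor.
  return (displayWidth / cropWidth) / (displayHeight / cropHeight);
}

// Square pixels (the common case): 640x360 crop displayed at 640x360.
pixelAspectRatio(640, 360, 640, 360); // 1
// Anamorphic DVD-style content: 720x480 crop displayed at 640x480.
pixelAspectRatio(720, 480, 640, 480); // 8/9 — assuming 1:1 PAR would be wrong
```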

Any feedback on these points would be appreciated!

@chcunningham
Collaborator

chcunningham commented Jan 6, 2021

Having an API behavior vary depending on the implementation of the underlying decoder seems like a bad thing.

Definitely!

It seems to me that if an app developer is calling createImageBitmap, that means they intend to render the output. In that case, I imagine they'd always want the intended display size, not the coded size.

I like this reasoning.

I believe that means:

  • Always apply the crop

Sounds like everyone is in agreement to at least apply the crop. We'll fix Chrome's behavior w.r.t. GPU-backed frames, and I'm happy to also clarify the spec. For now, I lean toward not making it user-configurable and just doing the intuitive thing. LMK if I'm overlooking a use case for the alternative.

@jamespearce2006
Author

Sounds good to me!

@sandersdan
Contributor

We discussed this again today and there was more support for an easy-to-use API now that we have a separate WebGL path. In summary:

  • Always apply the crop.
  • Always scale to display size (this is a no-op for most VideoFrames anyway).
  • Rotation should be applied.
  • Color space is unclear. BT.709 -> sRGB would have to happen for most frames, but it would reduce confusion about black levels.

@dustinkerstein

Quick comment regarding:

  • Always scale to display size (this is a no-op for most VideoFrames anyway).

This is desirable most of the time, but I believe it may cause issues for WebGL rendering on mobile devices, which have a relatively low limit on the maximum texture size. For example, for equirectangular 360 video, which is 2:1 DAR, a 4K video's display width ends up being 4320, which is beyond the limits of most mobile devices. Would it be possible to consider making this scaling to display size a user-configurable option?
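A sketch of the fallback this concern implies (`chooseBitmapSize` is a hypothetical helper; 4096 is a typical mobile limit, queryable in a real app via gl.getParameter(gl.MAX_TEXTURE_SIZE)):

```javascript
// Sketch: pick a target size for the ImageBitmap, falling back from the
// display size to the crop size when the display size would exceed the
// device's maximum texture size.
function chooseBitmapSize(frame, maxTextureSize) {
  const fitsDisplay =
    frame.displayWidth <= maxTextureSize && frame.displayHeight <= maxTextureSize;
  return fitsDisplay
    ? { width: frame.displayWidth, height: frame.displayHeight }
    : { width: frame.cropWidth, height: frame.cropHeight };
}

// 2:1-DAR equirectangular content: a 3840x2160 coded frame displays at
// 4320x2160, which exceeds a 4096 MAX_TEXTURE_SIZE.
const equirectFrame = {
  cropWidth: 3840, cropHeight: 2160,
  displayWidth: 4320, displayHeight: 2160,
};
chooseBitmapSize(equirectFrame, 4096); // falls back to 3840x2160
```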

@sandersdan
Contributor

For equirectangular 360 video which is 2:1 DAR

Hmm, I forgot that that was a relatively common thing to do. Is there any reason to suspect that such uses won't prefer using WEBGL_webcodecs_video_frame anyway though?

Another alternative is to just lie in the VideoDecoderConfig and claim that the display size is exactly the crop size. That seems like it might be a good enough workaround to not need its own API?
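A sketch of this workaround (`squarePixelConfig` is hypothetical, and the field names are assumptions modeled on the VideoFrame attributes quoted earlier in this issue; early-draft VideoDecoderConfig dictionaries may differ):

```javascript
// Sketch: when configuring the decoder, claim the display size equals the
// crop size so no scaling to display size is implied downstream.
function squarePixelConfig(config) {
  return {
    ...config,
    displayWidth: config.cropWidth ?? config.codedWidth,
    displayHeight: config.cropHeight ?? config.codedHeight,
  };
}

const cfg = squarePixelConfig({
  codec: 'avc1.42E01E', // hypothetical example codec string
  codedWidth: 640, codedHeight: 368,
  cropWidth: 640, cropHeight: 360,
});
// cfg.displayWidth === 640, cfg.displayHeight === 360
```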

@dustinkerstein

Hmm, I forgot that that was a relatively common thing to do. Is there any reason to suspect that such uses won't prefer using WEBGL_webcodecs_video_frame anyway though?

Another alternative is to just lie in the VideoDecoderConfig and claim that the display size is exactly the crop size. That seems like it might be a good enough workaround to not need its own API?

Ok cool. WEBGL_webcodecs_video_frame does sound like it would handle my use-case as it provides direct access to the texture, and your proposed workaround seems like a good solution for folks that want the encoded texture size via createImageBitmap. Thanks!

@jamespearce2006
Author

I actually have an intermediate use case...
I don't want to use the video frame, as I need to do quite a lot of buffering, and typically need to hold onto more than a second's worth of frames.
However, I don't really want to use createImageBitmap either, as I already have shaders to do the color space conversion, etc.

Will there be a way of copying the YUV buffers from video frame without calling createImageBitmap? I think the API spec mentions this, but I didn't notice any mechanism currently implemented.

@sandersdan
Contributor

sandersdan commented Jan 8, 2021

The VideoFrame.planes mechanism is implemented in Chrome only for frames that are backed by mappable memory. We may be able to implement readback of texture-backed frames using the same mechanism, but readback can be complex and there is a risk that we could get the pixel format, plane offsets, etc. incorrect if we try to guess them before attempting the readback operation. I expect that over time Chrome will have fewer and fewer cases that are texture-backed, so it may not be prioritized, or we may propose a separate API for it.

As a workaround, it should be possible to use the WEBGL_webcodecs_video_frame extension to read back the planes as well, the same as for other textures.

as I need to do quite a lot of buffering, and typically need to hold onto more than a second's worth of frames.

FYI it's not actually guaranteed that images created by createImageBitmap() are separate from the underlying frames, it may be necessary to close() them to release the underlying frame. This isn't finalized yet but is required for a zero-copy implementation.
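Under that lifetime model, any buffering layer should close() whatever it evicts. A minimal sketch (`BitmapRing` is hypothetical and works for anything exposing a close() method):

```javascript
// Sketch: a fixed-capacity FIFO that close()es whatever it evicts, so a
// zero-copy ImageBitmap releases its underlying VideoFrame promptly.
class BitmapRing {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }
  push(bitmap) {
    if (this.items.length === this.capacity) {
      this.items.shift().close(); // release the evicted bitmap's resources
    }
    this.items.push(bitmap);
  }
  clear() {
    for (const b of this.items) b.close();
    this.items = [];
  }
}
```

Without the close() calls, a zero-copy implementation could leave many decoder frames pinned, which is exactly the stall described earlier in this thread.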

@jamespearce2006
Author

Thanks Dan. I have to confess, though, I'm not sure I fully understand the implications of your comment on createImageBitmap().

My use case requires me to buffer about 60 frames (although the resolution is typically quite low). My current implementation uses a wasm decoder and generates YUV planes as ArrayBuffers. These are buffered, and when/if needed will be rendered via WebGL, using a fragment shader to do colour space conversion.

I've successfully got this working with web-codecs. My initial thought was to just buffer the video frames that are output from the decoder. This didn't work though, as the decoder stalls after about 10 undestroyed video frames. I've switched to using createImageBitmap(), and now buffer the ImageBitmaps. This works, but I worry it's introducing unnecessary overhead by eagerly doing the colour space conversion, when there's no guarantee that a frame will end up being rendered. I'd rather keep the YUV and do the conversion in a shader when needed.

Is there any advice you can offer in this case, or is my approach already the best strategy?

@sandersdan
Contributor

Your approach may or may not work, depending on the details of the particular WebCodecs implementation you are using. In Chrome's current implementation, it varies by platform; in some cases a zero-copy path may be used, and the ImageBitmaps will hold a clone() of the VideoFrame until they are closed.

We could specify that createImageBitmap() always copies, but then we can't have a performant (zero-copy) implementation.

There isn't really a way out of copying; in most cases decoded frames are owned by the decoder and the only way to buffer many of them is to make a copy.

WebCodecs does not currently have an explicit copy or readback API. I am investigating and may propose one soon. The priority is relatively low given that there are workarounds available.

On the point of color space conversions, I've just noticed that CreateImageBitmapOptions already includes a colorSpaceConversion field, we should probably just make the behavior configurable using that.
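For reference, ImageBitmapOptions does define a colorSpaceConversion member taking "default" or "none"; whether the VideoFrame path honors it is exactly what's unsettled here. A sketch of how a caller doing YUV-to-RGB in their own shader might use it:

```javascript
// ImageBitmapOptions.colorSpaceConversion is part of the HTML spec; whether
// VideoFrame.createImageBitmap() respects it is an open question in this
// thread, so treat this as a sketch, not settled behavior.
const noConvert = { colorSpaceConversion: 'none' };   // keep pixels unconverted
const convert = { colorSpaceConversion: 'default' };  // implementation-defined conversion

// Hypothetical usage once the behavior is specified:
// const bitmap = await videoFrame.createImageBitmap(noConvert);
```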

@jamespearce2006
Author

So, should I be calling VideoFrame.clone() and then just hold onto the cloned video frames? The spec says the cloned video frames have a separate lifetime and are deep copies.
If not, what are the workarounds you mention?

@sandersdan
Contributor

sandersdan commented Jan 8, 2021

@chcunningham: This is not the expected behavior of clone(); I think the wording in the spec is incorrect. The Clone Frame algorithm does have a note that describes the expected behavior, but the steps are not clear and the destroy() steps don't match up.

Cloned frames do have their own lifetimes in the sense that destroy() on one does not destroy the other, but it is required to destroy() all clones.

Chrome's implementation of clone() never copies the underlying data.

@sandersdan
Contributor

sandersdan commented Jan 8, 2021

@jamespearce2006: The workarounds are using ImageBitmap or WebGL_webcodecs_video_frame to get the data into WebGL, then reading back the image data.

@jamespearce2006
Author

Would it be out of the question to add an option to createImageBitmap() to perform a copy, and guarantee it's not a clone of the VideoFrame?

@sandersdan
Contributor

sandersdan commented Jan 8, 2021

Would it be out of the question to add an option to createImageBitmap() to perform a copy, and guarantee it's not a clone of the VideoFrame?

I'm unsure. If it's on VideoFrame then that seems fine, but the original expectation was to deprecate that method and have the existing window.createImageBitmap() API take over the responsibility. It was simply easier to keep the spec and implementation self-contained to start.

If we implement an explicit copy API we resolve the core problem, but copy() + createImageBitmap() could be two copies. I'll bring this up in our next meeting.

@jamespearce2006
Author

Thanks. I can think of a lot of cases where getting access to the underlying buffer and/or holding the data for a period of time will be needed, so I'd like to lobby to increase the priority of the copy/readback API. ImageBitmaps are working for me currently, but I'm slightly concerned that the workaround is quite involved, if that approach doesn't work on other platforms.

@chcunningham
Collaborator

@chcunningham: This is not the expected behavior of clone(); I think that I think the wording in the spec is incorrect. The Clone Frame algorithm does have a note that describes the expected behavior, but the steps are not clear and the destroy() steps don't match up.

Agree, I mixed this up. I'll send a PR to sort it out (clone() isn't necessarily a deep copy, but it always guarantees a new lifetime).

@jamespearce2006
Author

Sorry to resurrect this thread again. I just wanted to make a few additional points after talking with colleagues.

Firstly, @sandersdan responded to one of my implementation questions above with "Your approach may or may not work, depending on...". I think it's important that the API is predictable. Either it should behave the same way regardless of the underlying decoder implementation, or it should allow me to query (or control) what the behavior will be. If I have a need to hold onto decoded frames for longer than a hardware decoder might allow, or readback the decoded data, not knowing whether I'll get a hardware or software decoder means I don't know whether my code will work. At the very least, there should be a way of querying this. Even better, I should be able to ask to enable this so that my code is guaranteed to work. (Quite possibly this results in a software decoder being used, but as long as I get consistency, that's fine).

Secondly, reading between the lines, it seems that linear playback workflows are being prioritized initially for development of the API. I base this on the fact that the readback API is currently a lower priority. I would argue that linear playback is already well served by existing APIs (MSE, HTMLVideoElement, etc.). Personally, I think it'd be better to prioritize the more exotic workflows that are not currently achievable (or that require WebAssembly).

Finally, in the above discussion, the proposed workaround for reading back a decoded frame is to render it, via WebGL, then read back the image data. Unfortunately, this isn't really a viable solution from a web-worker. It's not possible to dynamically create a canvas in a worker - one needs to be passed in from the main thread.

@sandersdan
Contributor

I agree that the behavior should be predictable. I'm personally on the side that ImageBitmaps shouldn't be able to stall a decoder, but it does have performance implications so it's being considered thoroughly.

linear playback workflows are being prioritized initially for development of the API

The priority right now is to ship a working MVP API. We'll cut features if they are not required to demonstrate the core goals of WebCodecs.

Don't worry too much about this specific issue though, ImageBitmap integration is a core feature.

It's not possible to dynamically create a canvas in a worker

You can create an OffscreenCanvas in a worker.

@chcunningham
Collaborator

@jamespearce2006 earlier wrote

Will there be a way of copying the YUV buffers from video frame without calling createImageBitmap? I think the API spec mentions this, but I didn't notice any mechanism currently implemented.

The latest thinking on this is to offer an API like:
let deepCopiedFrame = videoFrame.copy()

This is mentioned alongside other API changes for VideoFrame readback in #157. We are working now to get them implemented.

I've also created a separate #159 to hash out createImageBitmap semantics. Dan has taken the excellent feedback in this issue and made several proposals. Note that I think everyone in this thread is actually better suited to using the newer drawImage or texImage methods for painting (better performance and memory management). The current plan (in #159) is for createImageBitmap() to be a friendly API (it does conversions and rotations, and probably makes copies) for rendering the occasional frame. Please see the rationale in that bug.

With that, I think we've probably split out all the remaining work in this issue. Please re-open if I've missed anything. Thank you for the feedback!
