
Simulcast support #44


Open
murillo128 opened this issue Feb 24, 2020 · 8 comments
Labels
extension Interface changes that extend without breaking.

Comments

@murillo128

what is the intention regarding simulcast support?

We have the following options:

  • Not support it, and make the app create N encoder instances and handle simulcast itself. Not very likely, as there is currently no support for accessing the raw image data, so the app would have to draw the media stream into N different canvases, downscale it, create a capture stream for each one, and pass each one to an encoder instance.

  • Support it as a configuration parameter of an encoder. This would allow a single input for the simulcast encoder, but would make it much more difficult to provide a good API compatible with non-simulcast encoders.

  • Provide a helper/adapter that can be created with a WebRTC encoding-like parameter object and will internally create N encoders, exposing each one so that each simulcast layer can be controlled individually. This would allow a single input, and the simulcast adapter would internally downscale/forward it to each of the encoders.

I think that the last one is the current approach in the libwebrtc internal code, and it is also my preferred option.

@aboba
Collaborator

aboba commented May 20, 2020

@murillo128 The last option has the advantage of maintaining parity with WebRTC. The only potential disadvantage is that it would probably require resetting all simulcast streams on receipt of an FIR for one of them (the current Chrome behavior).

@chcunningham
Collaborator

The API has changed a lot since this issue was filed.

Not support it, and make the app create N encoder instances and handle simulcast itself. Not very likely, as there is currently no support for accessing the raw image data, so the app would have to draw the media stream into N different canvases, downscale it, create a capture stream for each one, and pass each one to an encoder instance.

This is now much better. You can create N encoders, configure them each with a different target encoding size, and they will internally resize your input VideoFrames as needed. You now also have raw access to the image data, but that's not required for resizing to happen internally.
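As a concrete sketch of that first option, here is a hypothetical helper (not part of WebCodecs) that derives one config per simulcast layer from a base config, halving the resolution at each step; in a browser the app would pass each resulting config to its own VideoEncoder, and the UA would scale the shared input VideoFrames internally. The halving scheme and bitrate rule of thumb are assumptions, not anything the API prescribes.

```javascript
// Hypothetical helper: derive per-layer VideoEncoderConfig dictionaries from
// a base config. Layer i has resolution base / 2^i; bitrate is scaled with
// pixel count (a common rule of thumb, assumed here for illustration).
function simulcastConfigs(base, layers) {
  return Array.from({ length: layers }, (_, i) => ({
    ...base,
    width: Math.round(base.width / 2 ** i),
    height: Math.round(base.height / 2 ** i),
    bitrate: Math.round(base.bitrate / 4 ** i),
  }));
}
```

For example, a 1280×720 base at 2.5 Mbps with 3 layers would yield 1280×720, 640×360, and 320×180 configs, each ready to hand to a separate encoder instance.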

Support it as a configuration parameter of an encoder. This would allow a single input for the simulcast encoder, but would make it much more difficult to provide a good API compatible with non-simulcast encoders.

We have no plan to support simulcast as a parameter, but we do plan to support parameters to configure SVC (dependent on codec and UA support). Tracked in #40.

Provide a helper/adapter that can be created with a WebRTC encoding-like parameter object and will internally create N encoders, exposing each one so that each simulcast layer can be controlled individually. This would allow a single input, and the simulcast adapter would internally downscale/forward it to each of the encoders.

We don't plan to make this part of the API, but such a helper could be implemented in JavaScript on top of the first option.
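A minimal sketch of such a JavaScript adapter, assuming the N-encoder approach above: the encoder factory is injected (in a browser it would wrap `new VideoEncoder(init)`) so the fan-out logic is shown on its own; the class and method names are illustrative, not a proposed API. One design choice worth noting: the adapter takes ownership of the input frame and closes every handle itself, relying on encode() making its own internal copy.

```javascript
// Sketch of a simulcast adapter over N independent encoders.
// createEncoder is an injected factory returning a VideoEncoder-like object
// with configure() and encode(); configs is one config per simulcast layer.
class SimulcastAdapter {
  constructor(configs, createEncoder) {
    this.encoders = configs.map((cfg) => {
      const enc = createEncoder();
      enc.configure(cfg);
      return enc;
    });
  }

  // Single input: clone the frame for every layer but the first, hand each
  // per-layer encoder its own handle, then close all handles (encode() is
  // assumed to copy or ref the frame data internally, as VideoEncoder does).
  encode(frame) {
    this.encoders.forEach((enc, i) => {
      const f = i === 0 ? frame : frame.clone();
      enc.encode(f);
      f.close();
    });
  }

  // Expose each encoder so one simulcast layer can be controlled individually.
  layer(i) {
    return this.encoders[i];
  }
}
```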

Hopefully this addresses the issue and breaks out our follow-up work. Please re-open if there's more to discuss :)

@aboba
Collaborator

aboba commented Feb 19, 2021

IMHO, this is really more of an implementation than an API issue. For example, can the WebCodecs implementation recognize that there are N encoders utilizing the same input, and optimize performance?

In WebRTC it is also possible to create N encoders, but simulcast performs better without having to clone the input MediaStreamTrack N times, as well as behaving more logically on a congested network (e.g. simulcast streams do not converge in resolution and framerate, as may happen with N distinct encodings from the same MediaStreamTrack).

Congestion control behavior is out of scope for the WebCodecs API, so it needs to be handled by the application. This is definitely non-trivial, so in practice it is best solved within a library written by someone who understands congestion control. For the algorithm to perform properly, it is necessary to obtain loss/throughput estimates for each of the simulcast streams. This might be handled purely at the application level (e.g. if the application is running RTP/RTCP over WebTransport), or there might be some assistance provided by the transport (e.g. WebTransport-stats, which are in the process of being revised to reflect Http3Transport).
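To make the application's side of this concrete, here is a deliberately simplified sketch of one piece of that work: choosing which simulcast layers to keep active given a total throughput estimate from the app's own congestion controller. WebCodecs provides none of this; the greedy lowest-bitrate-first policy and the numbers are assumptions for illustration only — a real controller would also react to loss and adapt per-layer bitrates.

```javascript
// Illustrative layer selection: given the configured bitrate of each simulcast
// layer and an estimated available throughput (bits per second), greedily
// admit layers from cheapest to most expensive until the budget runs out.
function selectLayers(layerBitrates, availableBps) {
  const active = [];
  let budget = availableBps;
  // Sort (index, bitrate) pairs by bitrate ascending, keeping original indices.
  for (const [i, bps] of [...layerBitrates.entries()].sort((a, b) => a[1] - b[1])) {
    if (bps <= budget) {
      active.push(i);
      budget -= bps;
    }
  }
  return active.sort((a, b) => a - b);
}
```

With layers at 2.5 Mbps, 600 kbps, and 150 kbps and an 800 kbps estimate, this keeps only the two lower-bitrate layers — the kind of per-stream decision the comment above says must come from the application or a library.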

@padenot
Collaborator

padenot commented Apr 28, 2021

We don't plan to make this part of the API, but such a helper could be implemented in javascript on top of the first option.

IMHO, this is really more of an implementation than an API issue. For example, can the WebCodecs implementation recognize that there are N encoders utilizing the same input, and optimize performance?

@chcunningham how would you recommend doing this, considering #104 and #129 (which are not yet merged but help somewhat)? This is one of those rather important use-cases that we need to get right, because the performance implications of having authors do it "manually" are significant (lots of copies, lots of wasted encoding time).

It seems like ref-counting semantics would maybe allow this? Consider the following situation:

  • Three video encoders are created, with a set of resolutions r1, r2, r3
  • A frame needs to be encoded; it's cloned twice (there are now three frames, but they point to the same underlying data)
  • encode() is called in the same task on the three encoders
  • The UA starts processing the encoding jobs and notices that the same input frame is going to three encoders (because the calls were queued from the same task and the UA dispatches the actual encode jobs from a stable state or a microtask; I think both work here)
  • Bonus point: the UA notices the three encoders are in fact the same codec at different resolutions and optimizes further
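The app-side half of that scenario can be sketched as follows. The `encoders` argument stands in for three already-configured VideoEncoder-like objects; the function name is illustrative. The key points are that the extra handles come from clone() (shared underlying data), all encode() calls land in the same task, and the caller closes its handles afterwards (VideoEncoder holds its own reference to the frame data during encoding).

```javascript
// Clone-then-encode pattern: one input frame, N encoders, same task.
function encodeSimulcast(frame, encoders) {
  // One clone per extra encoder: three handles, one underlying buffer.
  const frames = [frame, ...encoders.slice(1).map(() => frame.clone())];
  // All encode() calls are issued in the same task, which is what would let
  // the UA recognize the shared input when it dequeues the jobs.
  encoders.forEach((enc, i) => enc.encode(frames[i]));
  // The caller still owns its handles and closes them after encode().
  frames.forEach((f) => f.close());
}
```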

@padenot
Collaborator

padenot commented Apr 28, 2021

Hrm, I probably should have posted this in other issues instead of resurrecting this old one.

@chcunningham chcunningham reopened this Apr 28, 2021
@chcunningham
Collaborator

Discussed on the last editors' call. My recollection:

  • Having 3 encoders recognize a common input and/or process that input in the same task seems technically possible, but really tough to implement.
  • A better option may be to make this behavior explicit using 1 encoder.
    • For instance, we could add a simulcast key to the VideoEncoderConfig that accepts a sequence of other config objects describing the additional resolutions. The output callback would then be called with chunks of different resolutions in a rotating fashion.
  • Demand for simulcast may not be huge (so far none that I'm aware of). We anticipate that SVC will become more important/common, particularly with AV1.
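For illustration, the proposed (and not adopted) `simulcast` knob might look something like the following — the ordinary VideoEncoderConfig members plus a sequence of configs for the additional resolutions. This shape is an assumption based on the description above; no UA implements it and it is not part of the spec.

```javascript
// Hypothetical VideoEncoderConfig with a `simulcast` member, per the
// proposal sketched in the comment above. Illustrative only.
const config = {
  codec: "vp8",
  width: 1280,
  height: 720,
  bitrate: 2_500_000,
  simulcast: [
    { width: 640, height: 360, bitrate: 600_000 },
    { width: 320, height: 180, bitrate: 150_000 },
  ],
};
```

Under this proposal a single encoder would accept the whole dictionary, and its output callback would emit chunks for the 720p, 360p, and 180p layers in rotation.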

@chcunningham
Collaborator

Triage note: marking 'extension', as the latest proposal (if any action is taken) is to simply add knobs to the config dict.

@chcunningham chcunningham added the extension Interface changes that extend without breaking. label May 12, 2021
@bradisbell

demand for simulcast may not be huge (so far none that I'm aware of)

One major use case would be to support the creation of multi-bitrate streams for HLS and DASH. Allowing the creation of these streams client-side would reduce the need for transcoding server-side.
