Create a DecoderSourceNode that integrates with WebCodecs decoders #2651

@bendmorris

Description

The WebCodecs standard exposes browser-internal decoding functionality to JavaScript. Currently, playing audio with these userspace decoders requires operations across multiple threads:

  • A decoder instance is created on the control thread. The instance is fed data, and a decode is requested.
  • Decoding happens asynchronously, likely on a dedicated thread.
  • When decoding finishes, the decoder must notify the control thread, which can then create and connect an AudioBufferSourceNode.
  • This sends a message to the Web Audio rendering thread, which can then play the buffer.

(Playing sound decoded with decodeAudioData is similar.)

Since the control thread may be busy with other work, this round trip can add significant latency before encoded audio starts playing, and it may also produce gaps if the delay exceeds the amount of decoded audio already buffered to play. Queueing buffers back-to-back this way is also subject to a race condition that can result in audible artifacts.
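For reference, here is a minimal sketch of the status quo, assuming an Opus stream and eliding demuxing (getEncodedChunks is a placeholder for however the app obtains EncodedAudioChunks). The back-to-back scheduling via nextStartTime at the end is where the latency and race issues above show up:

const ctx = new AudioContext();
let nextStartTime = 0;

const decoder = new AudioDecoder({
  output(data: AudioData) {
    // Control thread: copy the decoded samples into an AudioBuffer.
    const buffer = ctx.createBuffer(
      data.numberOfChannels, data.numberOfFrames, data.sampleRate);
    for (let ch = 0; ch < data.numberOfChannels; ch++) {
      const plane = new Float32Array(data.numberOfFrames);
      data.copyTo(plane, { planeIndex: ch, format: "f32-planar" });
      buffer.copyToChannel(plane, ch);
    }
    data.close();

    // Schedule this chunk immediately after the previous one. If this
    // callback runs late, nextStartTime has already passed and a gap results.
    const source = new AudioBufferSourceNode(ctx, { buffer });
    source.connect(ctx.destination);
    nextStartTime = Math.max(nextStartTime, ctx.currentTime);
    source.start(nextStartTime);
    nextStartTime += buffer.duration;
  },
  error(e: DOMException) { console.error(e); },
});

decoder.configure({ codec: "opus", sampleRate: 48000, numberOfChannels: 2 });
for (const chunk of getEncodedChunks()) {
  decoder.decode(chunk); // decoded output arrives asynchronously via `output`
}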

A hypothetical DecoderSourceNode could work around this:

declare class DecoderSourceNode extends AudioScheduledSourceNode {
    constructor(context: BaseAudioContext);

    // The WebCodecs decoder this node pulls decoded audio from.
    decoder: AudioDecoder;
}
  • The DecoderSourceNode takes a WebCodecs AudioDecoder as a property. Users can freely queue encoded audio buffers of any size to the decoder and will not need to explicitly decode them. (A usage sketch follows this list.)
  • When a DecoderSourceNode is actively playing and holds a decoder instance, the Web Audio implementation controls when and how much to decode as data is needed to render. Decoding could be done just-in-time in the rendering thread, or ahead of time on a dedicated thread and buffered, etc., as long as enough data is available to render as needed.
  • Attempting to manually decode from an AudioDecoder attached to a DecoderSourceNode, or to attach one decoder to multiple nodes simultaneously, results in an error.
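To make the model concrete, here is a hypothetical usage sketch. None of this API exists today: DecoderSourceNode is the proposed node, getEncodedChunks is the same placeholder as above, and the sketch assumes that enqueueing still happens through decoder.decode() (with the decode work itself deferred to the renderer) and that the decoder's output callback simply goes unused:

const ctx = new AudioContext();

// Open question: how an AudioDecoder is constructed for this use, given that
// the node consumes decoded data internally; a no-op output is assumed here.
const decoder = new AudioDecoder({
  output() { /* unused under this proposal */ },
  error(e: DOMException) { console.error(e); },
});
decoder.configure({ codec: "vorbis", sampleRate: 44100, numberOfChannels: 2 });

const source = new DecoderSourceNode(ctx); // hypothetical node
source.decoder = decoder;
source.connect(ctx.destination);
source.start();

// The control thread only enqueues encoded data as it arrives; when and how
// much of it gets decoded is up to the audio renderer.
for (const chunk of getEncodedChunks()) {
  decoder.decode(chunk);
}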

With this model, the control thread is only responsible for creating the DecoderSourceNode and periodically feeding it data. The audio renderer is responsible for decoding an appropriate amount as needed.

We're using a version of this in our internal Web Audio implementation, and in my opinion it simplifies a very common use case and makes it more efficient: playing a sound effect stored in a compressed format like Vorbis. Today, users typically either decode the entire sound ahead of time, incurring extra latency and memory overhead, or implement their own decode-and-buffer-in-chunks scheme, which adds complexity and risks audible gaps.

Labels

  • Needs Discussion: the issue needs more discussion before it can be fixed.
  • category: new feature: substantive changes that add new functionality. https://www.w3.org/policies/process/#class-4
