The RNNoise, Neural Speech Enhancement, and the Browser talk by @jmvalin -- which, by the way, has superb audio quality in its recording :) -- explains that the complexity of RNNoise (for a 48 kHz mono input signal) is around 40 MFLOPS, with the following top three contributors:
- DNN (matrix-vector multiply): 17.5 MFLOPS
- FFT/IFFT: 7.5 MFLOPS
- Pitch search (convolution): 10 MFLOPS
@jmvalin concludes:
So, if we wanna optimize RNNoise, then these are the things we need to look at.
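To put the DNN number in perspective, here is a back-of-the-envelope sketch of how per-frame matrix-vector multiplies accumulate into MFLOPS. The layer sizes below are placeholders for illustration, not RNNoise's actual topology:

```javascript
// Rough FLOP budget for per-frame matrix-vector multiplies.
// NOTE: the layer sizes are hypothetical, not RNNoise's real ones.
const SAMPLE_RATE = 48000;
const FRAME_SIZE = 480;                            // 10 ms frames at 48 kHz
const FRAMES_PER_SEC = SAMPLE_RATE / FRAME_SIZE;   // 100 frames per second

// One matrix-vector multiply costs ~2 * rows * cols FLOPs (multiply + add).
function matVecFlops(rows, cols) {
  return 2 * rows * cols;
}

// A GRU layer runs three gate matrices over both the input and hidden state.
function gruFlops(inputSize, hiddenSize) {
  return 3 * (matVecFlops(hiddenSize, inputSize) +
              matVecFlops(hiddenSize, hiddenSize));
}

// Hypothetical stack: two GRU layers plus a dense output, once per frame.
const perFrame = gruFlops(42, 96) + gruFlops(96, 256) + matVecFlops(22, 256);
const mflops = (perFrame * FRAMES_PER_SEC) / 1e6;
```

Even with these made-up sizes the matrix-vector work lands in the tens of MFLOPS, consistent with the talk's breakdown.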
The WebNN API recently added the Gated Recurrent Unit (GRU) and corresponding operators (webmachinelearning/webnn#83) to fill the operator gaps and enable hardware acceleration of models that make use of GRUs, such as RNNoise.
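For reference, a single GRU step can be sketched in plain JavaScript; this is a minimal illustration of where the matrix-vector multiplies come from, using the common z/r/candidate formulation (gate ordering and the state-blend convention vary between frameworks):

```javascript
// Minimal GRU cell step in plain JS. W*: hidden x input, U*: hidden x hidden,
// b*: hidden-sized bias vectors; all plain arrays (of arrays).
const sigmoid = x => 1 / (1 + Math.exp(-x));
const matVec = (M, v) => M.map(row => row.reduce((s, w, j) => s + w * v[j], 0));
const addV = (a, b) => a.map((x, i) => x + b[i]);
const mulV = (a, b) => a.map((x, i) => x * b[i]);

function gruStep({ Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh }, x, h) {
  // Update gate z and reset gate r: two matrix-vector multiplies each.
  const z = addV(addV(matVec(Wz, x), matVec(Uz, h)), bz).map(sigmoid);
  const r = addV(addV(matVec(Wr, x), matVec(Ur, h)), br).map(sigmoid);
  // Candidate state, with the reset gate applied to the hidden state.
  const hCand = addV(addV(matVec(Wh, x), matVec(Uh, mulV(r, h))), bh)
    .map(Math.tanh);
  // Blend old state and candidate; some frameworks swap z and (1 - z) here.
  return h.map((hi, i) => (1 - z[i]) * hi + z[i] * hCand[i]);
}
```

Six matrix-vector multiplies per step is exactly the cost that dominates the DNN line of the breakdown above, and what a hardware-accelerated GRU op would offload.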
In earlier related discussions @jmvalin noted:
Honestly, what I'd like to see at some point is a WebBLAS (plus FFT and convolution/correlation). That would probably cover most use cases -- including a big chunk of WebML.
The WebNN API also recently added the general matrix multiplication (GEMM) operation from the Basic Linear Algebra Subprograms (BLAS), specifically its Level 3.
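As a reference for what Level-3 GEMM computes, here is a naive, unoptimized sketch of C ← αAB + βC in plain JavaScript (real implementations block and vectorize this heavily):

```javascript
// Naive reference GEMM: returns alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n; all row-major arrays of arrays.
function gemm(alpha, A, B, beta, C) {
  const m = A.length, k = B.length, n = B[0].length;
  const out = Array.from({ length: m }, () => new Array(n).fill(0));
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let acc = 0;
      for (let p = 0; p < k; p++) acc += A[i][p] * B[p][j];
      out[i][j] = alpha * acc + beta * C[i][j];
    }
  }
  return out;
}
```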
A couple of questions and discussion points in the context of the workshop:
- What are the areas that need further focus on the web platform to ensure that future noise suppression models (DSP/DNN hybrids, or pure DNN models perhaps 100-1000x bigger) can keep performing?
- What is the state of real-time (or near real-time) analysis of waveforms in pure userland JavaScript with libraries such as DSP.js, compared with the Web Audio API primitives (e.g. AnalyserNode)? What gaps and issues need to be filled in with a JS library or hand-rolled code?
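To make the userland side of that comparison concrete, here is a naive DFT magnitude computation in plain JS -- the same kind of spectrum an AnalyserNode exposes natively (as dB values via getFloatFrequencyData). It is O(N²), so it is only an illustration; real userland code would use an FFT:

```javascript
// Naive DFT magnitude spectrum of a real-valued signal (first N/2 bins).
// O(N^2); illustrates what userland JS must compute that AnalyserNode
// provides for free.
function dftMagnitudes(samples) {
  const N = samples.length;
  const mags = new Array(N / 2);
  for (let k = 0; k < N / 2; k++) {
    let re = 0, im = 0;
    for (let n = 0; n < N; n++) {
      const phi = (-2 * Math.PI * k * n) / N;
      re += samples[n] * Math.cos(phi);
      im += samples[n] * Math.sin(phi);
    }
    mags[k] = Math.hypot(re, im);
  }
  return mags;
}
```

A pure sine at bin k shows up as a single spectral peak of magnitude N/2, which is a handy sanity check for hand-rolled analysis code.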
I suspect @teropa might have perspectives and input to this discussion, so looping him in.
@padenot for the Web Audio API expertise.
@huningxin for feedback on noise suppression hardware perspectives.