whisper-large-v3-turbo_timestamped has broken timestamps #1357

@jozefchutka

System Info

transformers.js version 3.6.1

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

Running whisper-large-v3-turbo_timestamped produces broken timestamps.

Reproduction

Run the following code (it takes ~30 seconds) and watch the console log:

<script type="module">
const { env, pipeline } = await import(`https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.6.1/dist/transformers.min.js`);
env.allowLocalModels = false;

const buffer = await (await fetch("tos.pcm")).arrayBuffer();
const audio = new Float32Array(buffer);

const pipe = await pipeline("automatic-speech-recognition",
	"onnx-community/whisper-large-v3-turbo_timestamped",
	{dtype:"fp16", device:"webgpu"});

const result = await pipe(audio, {
	chunk_length_s: 30,
	stride_length_s: 5,
	return_timestamps: "word",
	language: "en"});

console.log(result.chunks.map(chunk => `${chunk.timestamp[0]} -> ${chunk.timestamp[1]} ${chunk.text}`))
</script>
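For reference, the expected duration of the input can be derived from the file size. This is only a sketch: it assumes tos.pcm holds raw mono 32-bit float samples at 16 kHz (the report loads the file into a Float32Array but does not state the sample rate), and `pcmDurationSeconds` is a hypothetical helper, not part of transformers.js.

```javascript
// Assumption: raw mono PCM, 32-bit float samples, 16 kHz sample rate.
// The report does not state the sample rate; 16000 is a guess.
const ASSUMED_SAMPLE_RATE = 16000;

// Hypothetical helper: converts a raw PCM byte length to seconds.
function pcmDurationSeconds(byteLength, sampleRate = ASSUMED_SAMPLE_RATE) {
  const sampleCount = byteLength / Float32Array.BYTES_PER_ELEMENT; // 4 bytes per sample
  return sampleCount / sampleRate;
}

// A 3,840,000-byte buffer holds 960,000 samples, i.e. 60 seconds at 16 kHz.
console.log(pcmDurationSeconds(3840000)); // 60
```

In the reproduction above, the same value could be obtained with `pcmDurationSeconds(buffer.byteLength)` and compared against the largest timestamp the pipeline returns.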

The input is a 60-second .pcm file. The console prints:

...
66: "40.82 -> 40.98  by"
67: "40.98 -> 41.1  these"
68: "41.1 -> 41.44  giant"
69: "41.44 -> 42.02  robotic"
70: "42.02 -> 69.98  claws--"
71: "69.98 -> 69.98  Oh,"
72: "69.98 -> 69.98  whatever,"
73: "69.98 -> 69.98  we're"
74: "69.98 -> 69.98  done!"
75: "69.98 -> 69.98  We're"
76: "69.98 -> 69.98  done!"
77: "69.98 -> 69.98  Robot's"
78: "69.98 -> 69.98  memory"
79: "69.98 -> 69.98  synced"
80: "69.98 -> 69.98  and"
81: "69.98 -> 69.98  locked."

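There are two problems here:

  1. Timestamps are not precise: "claws--" spans 42.02 -> 69.98, and every word after it collapses to 69.98 -> 69.98.
  2. Timestamps go beyond the source duration: they reach 69.98 although the input is only 60 seconds long.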
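Both symptoms can be detected programmatically. The following is a minimal sketch, not part of transformers.js: `findTimestampProblems` is a hypothetical helper, and it assumes the chunks have the `{ text, timestamp: [start, end] }` shape used in the reproduction, with numeric start and end values.

```javascript
// Hypothetical helper (not part of transformers.js): flags word chunks whose
// timestamps look wrong. Assumes each chunk is { text, timestamp: [start, end] }
// with numeric start/end, as printed in the report.
function findTimestampProblems(chunks, durationSeconds) {
  const problems = [];
  let previousEnd = 0;
  chunks.forEach((chunk, index) => {
    const [start, end] = chunk.timestamp;
    const reasons = [];
    if (start > durationSeconds || end > durationSeconds) {
      reasons.push("exceeds source duration");
    }
    if (end === start) {
      reasons.push("zero-length word");
    }
    if (end < start || start < previousEnd) {
      reasons.push("not monotonic");
    }
    if (reasons.length > 0) {
      problems.push({ index, text: chunk.text, reasons });
    }
    previousEnd = Math.max(previousEnd, end);
  });
  return problems;
}

// Sample values copied from the console output in this report (60-second input).
const sample = [
  { text: " robotic", timestamp: [41.44, 42.02] },
  { text: " claws--", timestamp: [42.02, 69.98] },
  { text: " Oh,", timestamp: [69.98, 69.98] },
];

// Flags " claws--" (exceeds duration) and " Oh," (exceeds duration, zero-length).
console.log(findTimestampProblems(sample, 60));
```

Running the helper over `result.chunks` with the real input duration would list every affected word, which may help narrow down at which chunk boundary the timestamps start to break.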

The tos.pcm file is attached as a .zip:

tos.zip
