Whisper Large-v3 Release

github.com

cross-posted to:
fosai@lemmy.world

Even_Adder@lemmy.dbzer0.com to AI@lemmy.ml · English · 1 year ago

`large-v3` release · openai/whisper · Discussion #1762

We're pleased to announce the latest iteration of Whisper, called large-v3. Whisper-v3 has the same architecture as the previous large models except the following minor differences: The input uses ...

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
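For anyone who wants to try the three tasks mentioned above, here is a rough sketch using the `openai-whisper` Python package. It assumes the package and `ffmpeg` are installed and that `"large-v3"` is available as a model name in your installed version. The helper names `build_options` and `run_whisper` are made up for illustration and are not part of the library.

```python
# Sketch only: build_options and run_whisper are hypothetical helpers,
# not part of the openai-whisper package.
VALID_TASKS = {"transcribe", "translate"}


def build_options(task="transcribe", language=None):
    """Build keyword arguments for model.transcribe().

    task="transcribe" keeps the spoken language; task="translate"
    asks Whisper to output English text instead.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}")
    options = {"task": task}
    if language is not None:
        # Setting a language skips automatic language detection.
        options["language"] = language
    return options


def run_whisper(audio_path, task="transcribe", language=None,
                model_name="large-v3"):
    """Return (language, text) for an audio file.

    Assumes `pip install openai-whisper` and ffmpeg on PATH; the import
    is deferred so the helper above can be used without the model.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_options(task, language))
    # When no language is given, result["language"] is the detected one.
    return result["language"], result["text"]
```

Calling `run_whisper("clip.mp3")` (with `clip.mp3` standing in for your own file) would transcribe in the detected language, while `run_whisper("clip.mp3", task="translate")` would request an English translation. The large models need a lot of memory, so a smaller model name may be a better first test.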

The large-v3 model shows improved performance across a wide variety of languages. The plot in the linked announcement includes all languages where Whisper large-v3 achieves an error rate below 60% on Common Voice 15 and Fleurs, and shows a 10% to 20% reduction in errors compared to large-v2.
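To make the "10% to 20% reduction" figure concrete, here is a small sketch of how a word error rate and a relative reduction could be computed. It assumes the reduction is relative to the large-v2 error rate, which the post does not spell out, and the numbers in the usage note are invented for illustration, not taken from the announcement. All function names are my own.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edits divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def relative_error_reduction(old_rate, new_rate):
    """Fraction of the old error rate removed by the new model."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate
```

With made-up numbers, a language going from a 20% error rate to 17% gives `relative_error_reduction(0.20, 0.17)`, which is about 0.15, i.e. a 15% relative reduction even though the absolute drop is only 3 points.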
