{"id":20998,"date":"2024-10-03T15:46:51","date_gmt":"2024-10-03T07:46:51","guid":{"rendered":"https:\/\/www.1ai.net\/?p=20998"},"modified":"2024-10-03T15:46:51","modified_gmt":"2024-10-03T07:46:51","slug":"openai-%e5%8d%87%e7%ba%a7-whisper-%e8%af%ad%e9%9f%b3%e8%bd%ac%e5%bd%95-ai%e6%a8%a1%e5%9e%8b%ef%bc%8c%e4%b8%8d%e7%89%ba%e7%89%b2%e8%b4%a8%e9%87%8f%e9%80%9f%e5%ba%a6%e5%bf%ab-8-%e5%80%8d","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/20998.html","title":{"rendered":"OpenAI Upgrades Whisper Speech Transcription AI Model to Be 8x Faster Without Sacrificing Quality"},"content":{"rendered":"<p>At its DevDay event on October 1, <a href=\"https:\/\/www.1ai.net\/en\/tag\/openai\" title=\"View articles tagged with OpenAI\" target=\"_blank\" >OpenAI<\/a> announced the launch of the <a href=\"https:\/\/www.1ai.net\/en\/tag\/whisper\" title=\"View articles tagged with Whisper\" target=\"_blank\" >Whisper<\/a> large-v3-turbo <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%af%ad%e9%9f%b3%e8%bd%ac%e5%bd%95%e6%a8%a1%e5%9e%8b\" title=\"View articles tagged with speech transcription model\" target=\"_blank\" >speech transcription model<\/a>, which has a total of 809 million parameters and is <strong>8x faster than large-v3<\/strong> with almost no loss in quality.<\/p>\n<p>The Whisper large-v3-turbo speech transcription model is an optimized version of large-v3 with only 4 decoder layers, compared to large-v3's 32.<\/p>\n<p>Its 809 million parameters make it slightly larger than the medium model (769 million parameters) but much smaller than the large model (1.55 billion parameters).<\/p>\n<p><strong>OpenAI says Whisper large-v3-turbo is 8x faster than the large model<\/strong>: the turbo model requires 6GB of VRAM, while the large model requires 10GB.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-20999\" 
title=\"aa6c27f3j00skrs6j000jd000o6006rm\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/10\/aa6c27f3j00skrs6j000jd000o6006rm.jpg\" alt=\"aa6c27f3j00skrs6j000jd000o6006rm\" width=\"870\" height=\"243\" \/><\/p>\n<p>The Whisper large-v3-turbo speech transcription model is 1.6GB in size, and OpenAI continues to make Whisper (including code and model weights) available under the MIT license.<\/p>\n<p>GitHub: https:\/\/github.com\/openai\/whisper\/discussions\/2363<\/p>\n<p>Model Download: https:\/\/huggingface.co\/openai\/whisper-large-v3-turbo<\/p>\n<p>Online experience: https:\/\/huggingface.co\/spaces\/hf-audio\/whisper-large-v3-turbo<\/p>","protected":false},"excerpt":{"rendered":"<p>At its DevDay event on October 1, OpenAI announced the release of the Whisper large-v3-turbo speech transcription model, which has 809 million parameters and is up to 8 times faster than large-v3 with virtually no loss in quality. The Whisper large-v3-turbo speech transcription model is an optimized version of large-v3 and has only 4 Decoder Layers, compared to large-v3's 32 layers. 
The Whisper large-v3-turbo speech transcription model has 809 million parameters, slightly more than the medium model's 769 million parameters.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[190,871,4535],"collection":[],"class_list":["post-20998","post","type-post","status-publish","format-standard","hentry","category-news","tag-openai","tag-whisper","tag-4535"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/20998","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=20998"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/20998\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=20998"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=20998"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=20998"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=20998"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}