{"id":22504,"date":"2024-11-04T00:17:40","date_gmt":"2024-11-03T16:17:40","guid":{"rendered":"https:\/\/www.1ai.net\/?p=22504"},"modified":"2024-11-03T20:24:35","modified_gmt":"2024-11-03T12:24:35","slug":"%e6%94%af%e6%8c%81%e4%b8%ad%e8%8b%b1%e5%8f%8c%e8%af%ad%e5%8f%8a-40-%e7%a7%8d%e6%96%b9%e8%a8%80%e4%bb%bb%e6%84%8f%e6%b7%b7%e8%af%b4%ef%bc%8c%e4%b8%ad%e5%9b%bd%e7%94%b5%e4%bf%a1-teleai-%e6%98%9f","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/22504.html","title":{"rendered":"China Telecom TeleAI Star Voice Model Upgraded to Support Bilingual Chinese, English and 40 Dialects"},"content":{"rendered":"<p>November 3 news: In May of this year, the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e4%b8%ad%e5%9b%bd%e7%94%b5%e4%bf%a1\" title=\"[View articles tagged with [China Telecom]]\" target=\"_blank\" >China Telecom<\/a> Artificial Intelligence Research Institute (<a href=\"https:\/\/www.1ai.net\/en\/tag\/teleai\" title=\"[View articles tagged with [TeleAI]]\" target=\"_blank\" >TeleAI<\/a>) released the Star Super Multi-Dialect Speech Recognition Large Model, the industry's first <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%af%ad%e9%9f%b3%e8%af%86%e5%88%ab%e5%a4%a7%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [speech recognition macromodel]]\" target=\"_blank\" >speech recognition large model<\/a> to support free mixing of 30 dialects.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-22505\" title=\"f2ce0502j00smdjre00f2d000rs00fmp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/11\/f2ce0502j00smdjre00f2d000rs00fmp.jpg\" alt=\"f2ce0502j00smdjre00f2d000rs00fmp\" width=\"1000\" height=\"562\" \/><\/p>\n<p>Less than half a year later, the multi-dialect capability of the TeleAI Star speech large model has been upgraded again to cover dialects such as Zhanjiang, Yibin, Luoyang and Yantai, <strong>raising the number of supported dialects from 30 to 40 and introducing the recognition of 
English<\/strong>.<\/p>\n<p>In contrast to traditional training methods that rely entirely on labeled data, TeleAI first pre-trains the speech recognition model on massive amounts of unlabeled data and then fine-tunes it with a small amount of labeled data.<\/p>\n<p>Because dialect speech data generally includes far more unlabeled than labeled recordings, this \"<strong>pre-training + fine-tuning<\/strong>\" scheme is well suited to the needs of dialect scenarios.<\/p>\n<p>TeleAI has also innovated in model structure and cost optimization, reducing the amount of manually annotated data required by a factor of about 50 while keeping results comparable to those of dialect models trained with full supervision.<\/p>\n<p>GitHub open source address: https:\/\/github.com\/Tele-AI\/TeleSpeech-ASR<\/p>","protected":false},"excerpt":{"rendered":"<p>November 3 news: China Telecom's Artificial Intelligence Research Institute (TeleAI) released the industry's first multi-dialect speech recognition model in May this year, which supports the free mixing of 30 dialects. Less than half a year later, TeleAI's Star speech recognition model has upgraded its multi-dialect capability again to cover dialects such as Zhanjiang, Yibin, Luoyang and Yantai, raising the number of dialects from 30 to 40 and introducing the recognition of English. In contrast to traditional training methods that rely entirely on labeled data, TeleAI first pre-trains the speech recognition model on massive amounts of unlabeled data and then fine-tunes it with a small amount of labeled data. 
Because dialect speech data generally includes far more unlabeled than labeled recordings, this approach is well suited to dialect scenarios.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[4827,3419,4828],"collection":[],"class_list":["post-22504","post","type-post","status-publish","format-standard","hentry","category-news","tag-teleai","tag-3419","tag-4828"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/22504","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=22504"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/22504\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=22504"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=22504"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=22504"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=22504"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}