{"id":25059,"date":"2024-12-13T09:05:27","date_gmt":"2024-12-13T01:05:27","guid":{"rendered":"https:\/\/www.1ai.net\/?p=25059"},"modified":"2024-12-13T09:05:27","modified_gmt":"2024-12-13T01:05:27","slug":"%e8%b0%b7%e6%ad%8c%e5%8f%91%e5%b8%83%e5%a4%9a%e6%a8%a1%e6%80%81%e7%9b%b4%e6%92%ad-api%ef%bc%9a%e8%a7%a3%e9%94%81%e7%9c%8b%e3%80%81%e5%90%ac%e3%80%81%e8%af%b4%ef%bc%8c%e5%bc%80%e5%90%af-ai-%e9%9f%b3","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/25059.html","title":{"rendered":"Google Releases Multimodal Live API: Unlocking Seeing, Hearing, and Speaking for a New AI Audio and Video Interaction Experience"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e8%b0%b7%e6%ad%8c\" title=\"[View articles tagged with [Google]]\" target=\"_blank\" >Google<\/a> released Gemini 2.0 yesterday and, alongside it, the new <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%9a%e6%a8%a1%e6%80%81\" title=\"[View articles tagged with [multimodal]]\" target=\"_blank\" >Multimodal<\/a> Live <a href=\"https:\/\/www.1ai.net\/en\/tag\/api\" title=\"_OTHER ORGANISER\" target=\"_blank\" >API<\/a>, <strong>which helps developers create applications with real-time audio and video streaming capabilities.<\/strong><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-25060\" title=\"187e0144j00soer04006zd000v900nbp\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/12\/187e0144j00soer04006zd000v900nbp.jpg\" alt=\"187e0144j00soer04006zd000v900nbp\" width=\"1125\" height=\"839\" \/><\/p>\n<p>The API enables low-latency, bidirectional interaction over text, audio, and video, with output as audio and text, for a more natural, fluid, human-like conversation. 
Users can interrupt the model at any time and share camera input or a screen recording to ask questions about what it shows.<\/p>\n<p>The model's video comprehension extends how users can communicate with it: they can point a camera at a scene or share their desktop in real time and ask related questions. The API is now available to developers, and a demo application of the multimodal real-time assistant is also available to users.<\/p>","protected":false},"excerpt":{"rendered":"<p>Alongside the release of Gemini 2.0 yesterday, Google launched a new Multimodal Live API to help developers build applications with real-time audio and video streaming capabilities. The API enables low-latency, two-way text, audio, and video interaction, with output as audio and text, for a more natural, fluid experience that resembles human conversation. Users can interrupt the model at any time and interact with it through shared camera input or screen recording, asking questions about the content. The model's video comprehension expands the ways of communicating with it, allowing users to point a camera at a scene or share their desktop in real time and ask related questions. 
The API is already open to developers and provides users with a multimodal real-time assistant<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1033,592,281],"collection":[],"class_list":["post-25059","post","type-post","status-publish","format-standard","hentry","category-news","tag-api","tag-592","tag-281"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/25059","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=25059"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/25059\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=25059"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=25059"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=25059"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=25059"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}