{"id":17398,"date":"2024-08-07T09:25:34","date_gmt":"2024-08-07T01:25:34","guid":{"rendered":"https:\/\/www.1ai.net\/?p=17398"},"modified":"2024-08-27T10:21:03","modified_gmt":"2024-08-27T02:21:03","slug":"%e7%b4%a2%e8%b5%94500%e4%b8%87%e7%be%8e%e5%85%83%ef%bc%81youtube%e5%8d%9a%e4%b8%bb%e8%b5%b7%e8%af%89openai%ef%bc%8c%e6%8c%87%e6%8e%a7%e6%9c%aa%e7%bb%8f%e8%ae%b8%e5%8f%af%e4%bd%bf%e7%94%a8%e8%a7%86","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/17398.html","title":{"rendered":"YouTuber sues OpenAI for $5 million for using video transcriptions without permission"},"content":{"rendered":"<div class=\"pgc-img\" data-pm-slice=\"0 0 []\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-17399\" title=\"get-192\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/08\/get-192.jpg\" alt=\"get-192\" width=\"600\" height=\"401\" \/><\/div>\n<p>A <a href=\"https:\/\/www.1ai.net\/en\/tag\/youtube\" title=\"_Other Organiser\" target=\"_blank\" >YouTube<\/a> creator filed a class action lawsuit in the U.S. District Court for the Northern District of California last Friday, alleging that <a href=\"https:\/\/www.1ai.net\/en\/tag\/openai\" title=\"[View articles tagged with [OpenAI]]\" target=\"_blank\" >OpenAI<\/a> scraped transcripts of millions of <a href=\"https:\/\/www.1ai.net\/en\/tag\/youtube%e8%a7%86%e9%a2%91\" title=\"_Other Organiser\" target=\"_blank\" >YouTube videos<\/a> to train generative AI models without notifying or compensating the video owners.<\/p>\n<p>The creator, David Millette of Massachusetts, accuses OpenAI of scraping his videos and those of other creators to train AI models. 
The products involved include ChatGPT and Sora.<\/p>\n<p>The class action lawsuit alleges that OpenAI collected the data and received \u201cgenerous rewards,\u201d but that this practice violated copyright law and YouTube\u2019s terms of service.<\/p>\n<p>Millette has retained the law firm Bursor &amp; Fisher to pursue the class action. The plaintiff requests a jury trial and demands more than $5 million (currently approximately RMB 35.683 million) in compensation on behalf of all YouTube users and creators whose data may have been used in OpenAI training.<\/p>\n<p>Generative AI models have no real intelligence. They learn patterns and likelihoods by processing large numbers of data samples (such as films, recordings, and papers). The training data for many models comes from public websites and datasets on the Internet. Although companies claim that their data crawling qualifies as &quot;fair use&quot;, many copyright holders disagree and have filed lawsuits to stop the practice.<\/p>\n<p>Video transcripts have become an important source of training data, especially as other data sources are exhausted. According to Originality.AI, more than 35% of the world&#039;s top websites have blocked OpenAI&#039;s web crawlers. In addition, research from MIT&#039;s Data Provenance Initiative shows that about 25% of high-quality data sources have been restricted, making training data for AI models scarcer.<\/p>\n<p>Notably, OpenAI&#039;s Whisper model has been used to transcribe audio from videos in order to collect more training data. According to The New York Times, the OpenAI team transcribed more than one million hours of YouTube videos and used the resulting text to train its GPT-4 model, which prompted internal discussion about whether doing so might violate YouTube&#039;s rules.<\/p>","protected":false},"excerpt":{"rendered":"<p>A YouTube creator filed a class action lawsuit in the U.S. 
District Court for the Northern District of California on Friday, alleging that OpenAI Inc. scraped transcripts of millions of YouTube videos to train generative AI models without informing or compensating the video owners. The creator, David Millette of Massachusetts, alleges that OpenAI scraped his videos and those of other creators to train AI models for products such as ChatGPT and Sora. The class action lawsuit argues that OpenAI collects this data and is \"handsomely rewarded\" for doing so, but that this practice violates copyright law and YouTube\u2019s terms of service.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[190,423,3900],"collection":[],"class_list":["post-17398","post","type-post","status-publish","format-standard","hentry","category-news","tag-openai","tag-youtube"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17398","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=17398"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/17398\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=17398"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=17398"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=17398"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=17398"}],"curies":[{"name":"w
p","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}