{"id":15541,"date":"2024-07-14T09:30:49","date_gmt":"2024-07-14T01:30:49","guid":{"rendered":"https:\/\/www.1ai.net\/?p=15541"},"modified":"2024-07-14T09:30:49","modified_gmt":"2024-07-14T01:30:49","slug":"%e6%a8%a1%e5%9e%8b%e8%ae%ad%e7%bb%83%e6%88%90%e6%9c%ac%e5%b9%b3%e6%b0%91%e5%8c%96%ef%bc%8c%e5%89%8d%e7%89%b9%e6%96%af%e6%8b%89-ai-%e6%80%bb%e7%9b%91-24-%e5%b0%8f%e6%97%b6%e4%bb%85","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/15541.html","title":{"rendered":"Model training costs are becoming more affordable. Former Tesla AI director spent only $672 in 24 hours to \u201creproduce\u201d GPT-2"},"content":{"rendered":"<p><a href=\"https:\/\/www.1ai.net\/en\/tag\/gpt-2\" title=\"[SEE ARTICLES WITH [GPT-2] LABELS]\" target=\"_blank\" >GPT-2<\/a> yes <a href=\"https:\/\/www.1ai.net\/en\/tag\/openai\" title=\"[View articles tagged with [OpenAI]]\" target=\"_blank\" >OpenAI<\/a> Launched in 2019, the model once cost $256 per hour to train, so five years later in the GPT-4 era, will advances in hardware, software, and data mean that the time and cost required to train the same model will subsequently decrease? The answer is yes.<\/p>\n<p>According to Tom's Hardware today, the former director of Tesla AI, co-founder of OpenAI, and project developer Andrej Karpathy used llm.c to \u201cre-emerge\u201d GPT-2, at a cost of only $28 per hour (currently approximately RMB 204), and nearly 90% in just five years\u3002<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15542\" title=\"947df92d-377c-44b1-af57-58d7fe5bc9bb\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/07\/947df92d-377c-44b1-af57-58d7fe5bc9bb.jpg\" alt=\"947df92d-377c-44b1-af57-58d7fe5bc9bb\" width=\"640\" height=\"453\" \/><\/p>\n<p>Image source: Pixabay<br \/>\nThe main factor in cost reduction is the use of a single 8XH100 node for training. In addition, Andrej Karpathy says that llm.c implements GPT training directly. 
\"Since llm.c is a direct implementation of GPT training in C \/ CUDA, its requirements are very low - no conda environment, Python interpreter, pip installation, etc. You just need to start a cloud GPU. You just start a cloud GPU node, optionally install NVIDIA cuDNN, NCCL \/ MPI, download the .bin data slice, compile and run, and you're up and running in minutes.\"<\/p>\n<p>He added: \"Then wait 24 hours (28*24=672) to generate a sample about 'English-speaking unicorns in the Andes'.\"<\/p>\n<p>The llm.c project reportedly started out as part of an educational video, but quickly turned into a project that Karpathy built from scratch after running into some PyTorch issues.<\/p>\n<p>However, the report argues that advances in hardware, software, and training data don't mean that the cost of cutting-edge AI training is going down. Anthropic CEO Dario Amodei, for example, recently said that AI models currently in development can cost $1 billion to train, with higher-cost models expected to reach $100 billion by 2025.<\/p>\n<p>Increased hardware performance also comes with increased costs. For example, NVIDIA's H100 chip costs $40,000 per unit, while the next-generation Blackwell AI chip is expected to sell for $70,000 per unit. But even so, the CEO of Google Deepmind has said that the IQ level of the current model is still only equivalent to a cat.<\/p>","protected":false},"excerpt":{"rendered":"<p>GPT-2 is an OpenAI model that was launched in 2019 at a cost of US$ 256 per hour, so does five years after the GPT-4 era, progress in software and hardware and data mean that the time and cost of training the same model will be reduced? The answer is yes. According to Tom's Hardware today, the former director of Tesla AI, co-founder of OpenAI, and project developer Andrej Karpathy used llm.c to \u201cre-emerge\u201d GPT-2, at a cost of only $28 per hour (currently approximately RMB 204), and nearly 90% in just five years. 
Image source: Pixabay<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[3484,190,1374],"collection":[],"class_list":["post-15541","post","type-post","status-publish","format-standard","hentry","category-news","tag-gpt-2","tag-openai","tag-1374"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/15541","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=15541"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/15541\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=15541"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=15541"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=15541"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=15541"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}