{"id":15727,"date":"2024-07-17T09:07:11","date_gmt":"2024-07-17T01:07:11","guid":{"rendered":"https:\/\/www.1ai.net\/?p=15727"},"modified":"2024-07-17T09:07:11","modified_gmt":"2024-07-17T01:07:11","slug":"%e5%be%ae%e8%bd%af-cto-%e5%9d%9a%e4%bf%a1%e5%a4%a7%e5%9e%8b%e8%af%ad%e8%a8%80%e6%a8%a1%e5%9e%8b%e7%9a%84%e8%a7%84%e6%a8%a1%e5%ae%9a%e5%be%8b%e4%be%9d%e7%84%b6%e5%a5%8f%e6%95%88","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/15727.html","title":{"rendered":"Microsoft CTO believes that the &quot;law of scale&quot; of large language models still works and there is a lot to look forward to in the future"},"content":{"rendered":"<p data-vmark=\"fc1d\"><a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%be%ae%e8%bd%af\" title=\"[View articles tagged with [Microsoft]]\" target=\"_blank\" >Microsoft<\/a>Chief Technology Officer<a href=\"https:\/\/www.1ai.net\/en\/tag\/cto\" title=\"_OTHER ORGANISER\" target=\"_blank\" >CTO<\/a>Kevin Scott said in an interview with Sequoia Capital\u2019s podcast last week:<strong>He reiterated his belief<a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%a4%a7%e5%9e%8b%e8%af%ad%e8%a8%80%e6%a8%a1%e5%9e%8b\" title=\"[View articles tagged with [large-scale language model]]\" target=\"_blank\" >Large Language Models<\/a> (<a href=\"https:\/\/www.1ai.net\/en\/tag\/llm\" title=\"[SEE ARTICLES WITH [LLM] LABELS]\" target=\"_blank\" >LLM<\/a>)\u2019s view that the \u201claw of scale\u201d will continue to drive progress in AI<\/strong>, even as some in the field suspect that progress has stalled. 
Scott played a key role in pushing Microsoft to reach its $13 billion technology-sharing agreement with OpenAI.<\/p>\n<p data-vmark=\"2415\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-15728\" title=\"b51736f0-2a82-42e0-a75c-4078b4a74fa7\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/07\/b51736f0-2a82-42e0-a75c-4078b4a74fa7.png\" alt=\"b51736f0-2a82-42e0-a75c-4078b4a74fa7\" width=\"694\" height=\"385\" \/><\/p>\n<p data-vmark=\"815d\">\u201cOthers may disagree, but I don\u2019t think we\u2019ve reached a point of diminishing returns with scale,\u201d Scott said. \u201cI want people to understand that there is an exponential process here, and unfortunately, <strong>you only see it every few years, because it takes time to build supercomputers and then train models on them<\/strong>.\u201d<\/p>\n<p data-vmark=\"69f7\">In 2020, OpenAI researchers described the \u201cscaling law\u201d of LLMs, which states that <strong>language model performance tends to improve predictably as models grow larger (more parameters) and are trained on more data with more computing power<\/strong>. This law implies that simply scaling up model size and training data can significantly improve AI capabilities without fundamental algorithmic breakthroughs.<\/p>\n<p data-vmark=\"cdaa\">However, other researchers have since questioned the long-term validity of the &quot;scaling law,&quot; though the concept remains a cornerstone of OpenAI&#039;s research and development philosophy. Scott&#039;s optimism contrasts with the views of some critics in the field, who believe that progress in large language models has stagnated around the level of GPT-4. This view is based primarily on informal observations and benchmark results for the latest models, such as Google&#039;s Gemini 1.5 Pro, Anthropic&#039;s Claude Opus, and OpenAI&#039;s GPT-4o.
Some believe these models have not made the leaps and bounds of previous generations, and that <strong>the development of large language models may be approaching a stage of \u201cdiminishing marginal returns\u201d<\/strong>.<\/p>\n<p data-vmark=\"ee0a\">Gary Marcus, a prominent critic of artificial intelligence, wrote in April: \u201cGPT-3 is clearly better than GPT-2, and GPT-4 (released 13 months ago) is clearly better than GPT-3. But what happens next?\u201d<\/p>\n<p data-vmark=\"77b9\">Scott&#039;s stance suggests that tech giants like Microsoft still consider it reasonable to invest in large AI models, betting on continued breakthroughs. Given Microsoft&#039;s investment in OpenAI and its heavy marketing of its own AI collaboration tool, Microsoft Copilot, the company has a strong incentive to maintain the public perception of continued progress in AI, even if the technology itself may hit a wall.<\/p>\n<p data-vmark=\"129e\">Ed Zitron, another prominent AI critic, recently wrote on his blog that one argument some people make for continued investment in generative AI is that \u201cOpenAI has some technology we don\u2019t know about, a powerful and mysterious technology that will completely crush all skeptics.\u201d He wrote, \u201cBut that\u2019s not the case.\u201d<\/p>\n<p data-vmark=\"9639\">The public perception that the capabilities of large language models have slowed, along with some benchmark results, may stem partly from the fact that AI only recently entered the public eye, while large language models have in fact been in development for many years. OpenAI continued developing large language models for three years after the release of GPT-3 in 2020, until GPT-4 arrived in 2023.
Many people may not have realized the power of models like GPT-3 until the launch of ChatGPT, a chatbot built on GPT-3.5, in late 2022, and therefore perceived a great leap in capability when GPT-4 was released in 2023.<\/p>\n<p data-vmark=\"ae5a\">In the interview, Scott rejected the view that progress in artificial intelligence has stagnated, but he acknowledged that data points in the field arrive slowly, because new models often take years to develop. Nevertheless, Scott remains confident that future versions will improve, especially in areas where current models perform poorly.<\/p>\n<p data-vmark=\"6977\">&quot;<strong>The next breakthrough is coming, and I can&#039;t predict exactly when it will occur.<\/strong> We don&#039;t know how much progress it will bring, but it will almost certainly improve the aspects that are imperfect today, such as models being too expensive or too fragile to use with confidence,&quot; Scott said in the interview. &quot;All of these aspects will improve: costs will come down and models will become more stable. At that point, we will be able to do much more complex things. That is exactly what each generation of large language models has delivered through scale.&quot;<\/p>","protected":false},"excerpt":{"rendered":"<p>In an interview with Sequoia Capital's podcast last week, Microsoft CTO Kevin Scott reiterated his belief that the \"scaling law\" of large language models (LLM) will continue to drive advances in artificial intelligence, despite the suspicion of some in the field that progress has stalled. Scott played a key role in pushing Microsoft into a $13 billion technology-sharing agreement with OpenAI. \"Others may take a different view, but I don't think scaling has reached the point of diminishing marginal returns,\" Scott said.
\"I want people to understand that there is an exponential process here, and unfortunately you only see it every few years, because it takes time to build supercomputers and then train models on them.\"<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[3356,473,371,280],"collection":[],"class_list":["post-15727","post","type-post","status-publish","format-standard","hentry","category-news","tag-cto","tag-llm","tag-371","tag-280"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/15727","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=15727"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/15727\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=15727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=15727"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=15727"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=15727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}