{"id":28956,"date":"2025-02-18T10:59:49","date_gmt":"2025-02-18T02:59:49","guid":{"rendered":"https:\/\/www.1ai.net\/?p=28956"},"modified":"2025-02-18T10:59:49","modified_gmt":"2025-02-18T02:59:49","slug":"%e6%9c%88%e4%b9%8b%e6%9a%97%e9%9d%a2%ef%bc%9a%e4%b8%80%e5%b9%b4%e5%89%8d%e5%b0%b1%e9%aa%8c%e8%af%81%e8%bf%87%e9%95%bf%e6%80%9d%e7%bb%b4%e9%93%be%ef%bc%8c%e5%9b%a0%e6%88%90%e6%9c%ac%e9%ab%98%e5%85%88","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/28956.html","title":{"rendered":"Dark Side of the Moon: verified long chains of thought a year ago, but prioritized long text due to high costs"},"content":{"rendered":"<p>February 18, 2025, morning news: <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%9c%88%e4%b9%8b%e6%9a%97%e9%9d%a2\" title=\"[View articles tagged with Dark Side of the Moon]\" target=\"_blank\" >Dark Side of the Moon<\/a> researcher Flood Sung recently shared the complete thought process behind the k1.5 model and revealed that <strong>the shock of OpenAI o1's release on September 12, 2024 plunged him into reflection on the effectiveness of Long-CoT<\/strong>. 
In fact, more than a year ago Dark Side of the Moon <a href=\"https:\/\/www.1ai.net\/en\/tag\/kimi\" title=\"[View articles tagged with Kimi]\" target=\"_blank\" >Kimi<\/a> co-founder Zhou Xinyu (Tim) had already verified the effectiveness of Long-CoT: by training a very small model to perform addition, subtraction, multiplication, and division on numbers dozens of digits long, and synthesizing the fine-grained calculation steps into very long CoT data for SFT, the model achieved very good results.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-28957\" title=\"13a4153aj00sruyye0032d000fa00fap\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2025\/02\/13a4153aj00sruyye0032d000fa00fap.jpg\" alt=\"13a4153aj00sruyye0032d000fa00fap\" width=\"550\" height=\"550\" \/><\/p>\n<p>\"I still remember the shock of seeing that effect,\" Flood Sung said. The company recognized the importance of Long Context early on, so it chose to make the input long first. <strong>Long-CoT, however, was not given enough weight, mainly for cost reasons.<\/strong> \"Long Context mostly handles long text input; with prefill processing and the Mooncake optimizations, cost and speed stay manageable. Long-CoT, by contrast, means long text output, which is far more expensive and far slower. Under those conditions, making the output long was not a high-priority option.\"<\/p>\n<p>Flood Sung reflected: \"But is anything more important than performance? Cost and speed will keep coming down with Moore's Law, so as long as we push performance up, the rest is not a major problem. So we have to do Long-CoT, to do o1.\" \"All in all, we are trying to train models to think the way we do, to think freely,\" says Flood Sung. 
<\/p>\n<p>On the official website of Dark Side of the Moon Kimi, Flood Sung published a 10,000-word article detailing the company's effort to crack o1, signaling its focus on o1 and the start of substantive work to advance related research.<\/p>","protected":false},"excerpt":{"rendered":"<p>February 18 morning news: Dark Side of the Moon researcher Flood Sung recently shared the complete thinking process behind the k1.5 model, revealing that the shock of OpenAI o1's release on September 12, 2024 plunged him into reflection on the effectiveness of Long-CoT. In fact, more than a year ago Dark Side of the Moon Kimi co-founder Zhou Xinyu (Tim) had already verified the effectiveness of Long-CoT: by training a very small model to perform addition, subtraction, multiplication, and division on numbers dozens of digits long, and synthesizing the fine-grained calculation steps into very long CoT data for SFT, the model achieved very good results. 
\"I still remember the shock of seeing that effect.\" F<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1814,1168],"collection":[],"class_list":["post-28956","post","type-post","status-publish","format-standard","hentry","category-news","tag-kimi","tag-1168"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28956","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=28956"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/28956\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=28956"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=28956"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=28956"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=28956"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}