{"id":6607,"date":"2024-03-29T10:32:14","date_gmt":"2024-03-29T02:32:14","guid":{"rendered":"https:\/\/www.1ai.net\/?p=6607"},"modified":"2024-03-29T10:32:14","modified_gmt":"2024-03-29T02:32:14","slug":"%e5%bc%80%e6%ba%90%e5%a4%a7%e6%a8%a1%e5%9e%8bdbrx%ef%bc%9a1320%e4%ba%bf%e5%8f%82%e6%95%b0%ef%bc%8c%e6%af%94llama2-70b%e5%bf%ab1%e5%80%8d","status":"publish","type":"post","link":"https:\/\/www.1ai.net\/en\/6607.html","title":{"rendered":"Open source large model DBRX: 132 billion parameters, 1x faster than Llama2-70B"},"content":{"rendered":"<p>Databricks, a big data company, recently released a new <a href=\"https:\/\/www.1ai.net\/en\/tag\/dbrx\" title=\"[See articles with the DBRX tag]\" target=\"_blank\" >DBRX<\/a> MoE large <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e6%a8%a1%e5%9e%8b\" title=\"[See articles with the model tag]\" target=\"_blank\" >model<\/a>, which sparked a buzz in the <a href=\"https:\/\/www.1ai.net\/en\/tag\/%e5%bc%80%e6%ba%90%e7%a4%be%e5%8c%ba\" title=\"[See articles with the open source community tag]\" target=\"_blank\" >open source community<\/a>. DBRX beat open source models such as Grok-1 and Mixtral in benchmark tests, becoming the new open source <span class=\"spamTxt\">king<\/span>. The model has 132 billion total parameters, but only 36 billion are active per inference, and its generation speed is 1x faster than Llama2-70B.<\/p>\n<p class=\"article-content__img\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-6608\" src=\"https:\/\/www.1ai.net\/wp-content\/uploads\/2024\/03\/6384723707232841619123272.png\" alt=\"\" width=\"923\" height=\"598\" \/><\/p>\n<p>DBRX is composed of 16 expert models, with 4 experts active per inference and a context length of 32K. To train DBRX, the Databricks team rented 3,072 H100 GPUs from cloud vendors and trained for two months. 
After internal discussions, the team decided to adopt a curriculum learning approach, using high-quality data to improve DBRX&#039;s capabilities on specific tasks. The decision paid off: DBRX reached SOTA levels in language understanding, programming, mathematics, and logic, and beat GPT-3.5 on most benchmarks.<\/p>\n<p>Databricks released two versions of DBRX: DBRX Base and DBRX Instruct. The former is a pre-trained base model; the latter is fine-tuned with instructions. Chief Scientist Jonathan Frankle revealed that the team plans to study the model further and to explore how DBRX acquired additional skills in the &quot;last week&quot; of training.<\/p>\n<p>Although DBRX has been welcomed by the open source community, some question its \u201copen source\u201d nature: under the license published by Databricks, products built on DBRX must submit a separate application to Databricks once their monthly active users exceed 700 million.<\/p>","protected":false},"excerpt":{"rendered":"<p>Big data company Databricks has sparked a buzz in the open source community with its recent release of a MoE large model called DBRX, which beat open source models such as Grok-1 and Mixtral in benchmarks to become the new open source king. The model has 132 billion total parameters, but only 36 billion parameters are active per inference, and its generation speed is 1x faster than Llama2-70B. DBRX is composed of 16 expert models with 4 experts active per inference and a context length of 32K. To train DBRX, the Databricks team rented 3,072 H100s from a cloud vendor for two months. 
After internal discussions, the team decided to use a curriculum learning approach with high<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[146],"tags":[1954,391,1955,1489],"collection":[],"class_list":["post-6607","post","type-post","status-publish","format-standard","hentry","category-news","tag-dbrx","tag-391","tag-1955","tag-1489"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/6607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/comments?post=6607"}],"version-history":[{"count":0,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/posts\/6607\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/media?parent=6607"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/categories?post=6607"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/tags?post=6607"},{"taxonomy":"collection","embeddable":true,"href":"https:\/\/www.1ai.net\/en\/wp-json\/wp\/v2\/collection?post=6607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}