June 16 news. MiniMax officially announced yesterday that it has open-sourced the model weights of its multimodal flagship model, MiniMax M3, and simultaneously published the MSA (MiniMax Sparse Attention) technical paper.

MiniMax M3 has 428B total parameters and 23B activated parameters, and is officially positioned as the first open-source model to introduce multimodal hybrid training from step 0.
The training phase deeply integrated text, images, and interleaved multimodal data, with the aim of building a unified cross-modal semantic space at the pre-training stage; the introduction of the MSA architecture significantly reduces computational cost in long-context scenarios.
💻 GitHub: github.com/MiniMax-AI/MiniMax-M3
Hugging Face: huggingface.co/MiniMaxi/MiniMax-M3