DeepSeek-Prover-V2 Takes the Math Crown! 671B Model Dominates Mathematical Reasoning

DeepSeek-Prover-V2 comes in 671B and 7B versions. It combines recursive proof search with reinforcement learning to strengthen mathematical reasoning, and sets several new records. It uses DeepSeek-V3 to decompose theorems into subgoals and the GRPO algorithm for optimization, and its cold-start training unifies informal and formal reasoning. It also performs strongly on undergraduate-level tests, and the 7B model shows a distinctive ability to handle cardinality problems.
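
To make the GRPO step mentioned above more concrete, here is a minimal sketch of the group-relative advantage that GRPO is generally described as using: several candidate proofs are sampled for one theorem, each receives a reward, and each reward is normalized against the group's mean and standard deviation. The function name `group_relative_advantages`, the `eps` term, the binary reward values, and the choice of population standard deviation are assumptions for illustration, not details taken from DeepSeek-Prover-V2's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    rewards: one scalar reward per sampled proof attempt for the same theorem.
    eps: small constant (an assumption) that avoids division by zero when
         every attempt in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of four proof attempts: 1.0 if the proof verified, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Verified attempts end up with positive advantages and failed ones with negative advantages, so no separate value network is needed to produce a baseline. When every attempt in a group gets the same reward, all advantages are zero and that group contributes no learning signal.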
