DeepSeek releases Prover-V2 model with 671 billion parameters

Golden Finance | Apr 30, 2025 10:38
According to Golden Finance, DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the AI open-source community Hugging Face. DeepSeek-Prover-V2-671B reportedly uses the more efficient safetensors file format and supports multiple computational precisions, allowing the model to be trained and deployed faster and with fewer resources. With 671 billion parameters, it is an upgraded version of the Prover-V1.5 mathematical model released last year. Architecturally, the model is built on the DeepSeek-V3 architecture and adopts an MoE (Mixture of Experts) design, with 61 Transformer layers and a hidden dimension of 7168. It also supports ultra-long contexts, with maximum position embeddings of 163,840, enabling it to handle complex mathematical proofs, and it applies FP8 quantization to reduce model size and improve inference efficiency. (Golden Ten)
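Since the weights are published on Hugging Face in safetensors format, the architecture details cited above could in principle be checked with the `transformers` library. The sketch below is illustrative only: the repo ID `deepseek-ai/DeepSeek-Prover-V2-671B` and the config field values are assumptions based on this report and the DeepSeek-V3 architecture it references, not verified against the release.

```python
# Minimal sketch (assumptions noted): inspect the reported architecture
# details and load the safetensors checkpoint via Hugging Face transformers.
import torch
from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

repo_id = "deepseek-ai/DeepSeek-Prover-V2-671B"  # assumed repo ID

# Read only the config to check the figures mentioned in the article,
# without downloading the full 671B-parameter checkpoint.
config = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
print(config.num_hidden_layers)        # expected: 61 Transformer layers
print(config.hidden_size)              # expected: 7168 hidden dimension
print(config.max_position_embeddings)  # expected: ~160k-token context

# Loading the full MoE model requires a multi-GPU cluster; device_map="auto"
# shards the safetensors weights across available devices.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # FP8 inference would need kernel support
    device_map="auto",
    trust_remote_code=True,
)
```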