NVIDIA announces new collaboration milestones: accelerated Mistral open-source models deliver improved efficiency and accuracy at any scale

Wallstreetcn
2025.12.02 20:03

Through optimization techniques tailored to advanced large mixture-of-experts (MoE) models, Mistral Large 3 achieves best-in-class performance on the NVIDIA GB200 NVL72 system: a 10-fold performance improvement over the previous-generation H200 chip, processing more than 5 million tokens per second per megawatt of power consumed. The smaller models in the Mistral 3 series reach inference speeds of up to 385 tokens per second on the NVIDIA RTX 5090 GPU.
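The tokens-per-second-per-megawatt figure is a throughput-per-power metric. A minimal sketch of how such a number is derived, using assumed illustrative figures (not NVIDIA's published measurements):

```python
def tokens_per_second_per_mw(tokens_per_second: float, power_watts: float) -> float:
    """Normalize inference throughput by power draw expressed in megawatts."""
    return tokens_per_second / (power_watts / 1_000_000)

# Assumed example: a system sustaining 600,000 tokens/s while drawing 120 kW
# works out to 5 million tokens/s per MW, the order of magnitude cited above.
rate = tokens_per_second_per_mw(600_000, 120_000)
print(f"{rate:,.0f} tokens/s per MW")
```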