Chinese AI developer DeepSeek has disclosed the cost of training its R1 model: a total of $294,000. The figure appeared in a peer-reviewed Nature publication and is far lower than the sums previously reported by Western competitors. The disclosure is the first public estimate of the model's training cost from the Hangzhou-based company.
In January, DeepSeek's release of what it described as lower-cost AI systems shook expectations that giants like Nvidia would dominate the market. Since then, the company and its founder, Liang Wenfeng, have largely retreated from public view, apart from a few product updates.
In 2023, OpenAI CEO Sam Altman said that training foundation models cost "well over" $100 million, although his company has not provided detailed figures for any of its releases.
Cost context and market reaction
Training costs for large language models chiefly reflect the expense of running a cluster of powerful chips for weeks or months to process vast amounts of text and code.
The Nature article, which lists Liang as a co-author, states that the DeepSeek-R1 model cost $294,000 to train and used 512 Nvidia H800 chips. An earlier January version of the paper did not include this information.
Some of DeepSeek's statements about its development costs and the technology it used have been questioned by U.S. companies and officials. Nvidia designed the H800 for the Chinese market after the United States restricted exports of its more powerful H100 and A100 chips to China in 2022.
U.S. officials told Reuters in June that DeepSeek had access to large volumes of H100 chips acquired after export controls took effect. Nvidia told Reuters that DeepSeek used lawfully purchased H800 chips, not H100s.
In a supplementary information document accompanying the Nature paper, the company acknowledged for the first time that it owns A100 chips and used them in the early stages of development.
"Regarding our DeepSeek-R1 research, we used A100 GPUs to prepare experiments with a smaller model," the company wrote.
After this initial phase, R1 was trained for a total of 80 hours on a cluster of 512 H800 chips.
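Taken together, those figures allow a rough back-of-the-envelope check. Below is a minimal sketch assuming the reported numbers are complete and that cost scales linearly with GPU-hours; the derived per-GPU-hour rate is an inference, not a figure DeepSeek has stated.

```python
# Back-of-the-envelope check of the reported R1 training cost.
# The three input figures come from the Nature paper as reported;
# the per-GPU-hour rate is derived here for illustration only.

total_cost_usd = 294_000   # reported training cost
num_gpus = 512             # Nvidia H800 chips in the cluster
training_hours = 80        # reported training duration

gpu_hours = num_gpus * training_hours        # 40,960 GPU-hours
implied_rate = total_cost_usd / gpu_hours    # ~$7.18 per GPU-hour

print(f"Total GPU-hours: {gpu_hours:,}")
print(f"Implied cost per H800 GPU-hour: ${implied_rate:.2f}")
```

By this reading, the reported total implies roughly $7 per H800 GPU-hour. Whether that sum covers only compute rental or also items such as electricity and hardware depreciation is not specified in the figures above.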
OpenAI did not respond to a request for comment.
The disclosure signals a shift toward greater transparency about the cost of training large language models and underscores how access to advanced hardware and export regulations shape global AI development. At the same time, it highlights how opaque such cost claims remain, and the intensity of competition between Chinese players and global market leaders.