
ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

by Violet, Mar 14, 2025

OpenAI has voiced concerns that China's low-cost DeepSeek AI models may have been developed using OpenAI's data. The allegation follows the market shock caused by DeepSeek, whose emergence Donald Trump called a wake-up call for the U.S. tech industry. Nvidia, a major maker of the GPUs that AI depends on, suffered a significant stock drop following DeepSeek's emergence, triggering a broader sell-off in AI-related stocks. Microsoft, Meta, Alphabet, and Dell also saw their shares decline.

DeepSeek's R1 model, built upon the open-source DeepSeek-V3, is marketed as a significantly cheaper alternative to Western AI models, reportedly trained for a mere $6 million. While this cost figure has been disputed, the very existence of DeepSeek has raised questions about the massive investments being made by American tech companies in AI, unsettling investors. DeepSeek's popularity, evidenced by its top ranking on U.S. app download charts, further highlights its impact.

Bloomberg reported that OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to train its own models on the outputs of OpenAI's models, a process known as distillation, which violates OpenAI's terms of service. OpenAI confirmed that it is aware of attempts by Chinese and other companies to distill leading U.S. AI models, and said it is committed to protecting its intellectual property and working with the U.S. government to safeguard its technology.

David Sacks, President Trump's AI czar, suggested evidence points to DeepSeek's use of OpenAI models through distillation. He anticipates countermeasures from leading AI companies in the coming months to prevent such practices.

DeepSeek is accused of using OpenAI’s model to train its competitor using distillation. Image credit: Andrey Rudakov/Bloomberg via Getty Images.

The situation is ironic, given OpenAI's own history. Critics have pointed out OpenAI's reliance on copyrighted internet data in creating ChatGPT, a fact OpenAI acknowledged in a submission to the UK's House of Lords, where it argued that training AI models without copyrighted material is currently impossible. That practice is the subject of lawsuits from the New York Times and 17 authors alleging copyright infringement. OpenAI maintains that its training practices constitute "fair use." Adding another layer of complexity, a 2018 U.S. Copyright Office ruling stated that AI-generated art cannot be copyrighted, highlighting the ongoing debate surrounding copyright in the age of generative AI.
