[AI]Chinese AI startup Moonshot outperforms GPT-5 and Claude Sonnet 4.5: What you need to know

ChatGPT · November 11, 2025

A Chinese AI startup, Moonshot, has disrupted expectations in artificial intelligence development after its Kimi K2 Thinking model surpassed OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5 across multiple performance benchmarks, sparking renewed debate about whether America’s AI dominance is being challenged by cost-efficient Chinese innovation.

Beijing-based Moonshot AI, valued at US$3.3 billion and backed by tech giants Alibaba Group Holding and Tencent Holdings, released the open-source Kimi K2 Thinking model on November 6, achieving what industry observers are calling another “DeepSeek moment” – a reference to the Hangzhou-based startup’s earlier disruption of AI cost assumptions.


256K context window

Built…


— Kimi.ai (@Kimi_Moonshot)


Performance metrics challenge US models

According to the company’s GitHub blog, Kimi K2 Thinking scored 44.9% on Humanity’s Last Exam, a large language model benchmark consisting of 2,500 questions across a broad range of subjects, exceeding GPT-5’s 41.7%.

The model also achieved 60.2% on the BrowseComp benchmark, which evaluates web browsing proficiency and information-seeking persistence of large language model agents, and scored 56.3% to lead in the Seal-0 benchmark designed to challenge search-augmented models on real-world research queries.

VentureBeat reported that the fully open-weight release meeting or exceeding GPT-5’s scores marks a turning point where the gap between closed frontier systems and publicly available models has effectively collapsed for high-end reasoning and coding.

Kimi K2 Thinking is the new leading open weights model: it demonstrates particular strength in agentic contexts but is very verbose, generating the most tokens of any model in completing our Intelligence Index evals

[Moonshot AI]’s Kimi K2 Thinking achieves a 67 in the…


— Artificial Analysis (@ArtificialAnlys)


Cost efficiency raises questions

The popularity of the model grew after CNBC reported its training cost was merely US$4.6 million, though Moonshot AI did not comment on the cost. According to one set of calculations, Kimi K2 Thinking’s application programming interface (API) was six to 10 times cheaper than those of OpenAI’s and Anthropic’s models.

The model uses a Mixture-of-Experts architecture with one trillion total parameters, of which 32 billion are activated per inference, and was trained using INT4 quantisation to achieve a roughly twofold improvement in generation speed while maintaining state-of-the-art performance.
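To put those figures in proportion, here is a rough back-of-the-envelope sketch in Python. The parameter counts come from the article; the 16-bit comparison baseline and the simplification that every weight is stored in exactly 4 bits are assumptions made for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope figures for a sparse Mixture-of-Experts model.
# Parameter counts are taken from the article; the rest are assumptions.

TOTAL_PARAMS = 1_000_000_000_000   # one trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion activated per inference
BITS_PER_WEIGHT_INT4 = 4           # INT4 stores each weight in 4 bits
BITS_PER_WEIGHT_BF16 = 16          # assumed 16-bit baseline, for comparison only


def active_fraction(total: int, active: int) -> float:
    """Share of the parameters that take part in a single inference step."""
    return active / total


def weight_storage_gb(params: int, bits_per_weight: int) -> float:
    """Raw weight storage in gigabytes (10**9 bytes).

    Ignores quantisation scales, activations and the KV cache, so it is a
    lower bound on real memory use rather than a deployment estimate.
    """
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    print(f"Active share: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.1%}")
    print(f"INT4 weights: {weight_storage_gb(TOTAL_PARAMS, BITS_PER_WEIGHT_INT4):.0f} GB")
    print(f"16-bit weights: {weight_storage_gb(TOTAL_PARAMS, BITS_PER_WEIGHT_BF16):.0f} GB")
```

Under these assumptions only about 3.2% of the parameters are active at any step, and 4-bit storage needs a quarter of the space of a 16-bit copy of the same weights.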

Thomas Wolf, co-founder of Hugging Face, wrote on X that Kimi K2 Thinking was another case of an open-source model passing a closed-source model, asking, “Is this another DeepSeek moment? Should we expect [one] every couple of months now?”

Technical capabilities and limitations

Moonshot AI researchers said Kimi K2 Thinking set “new records across benchmarks that assess reasoning, coding and agent capabilities”. The model can execute up to 200-300 sequential tool calls without human intervention, reasoning coherently across hundreds of steps to solve complex problems.

Independent testing by consultancy Artificial Analysis placed Kimi K2 on top of its Tau-2 Bench Telecom agentic benchmark with 93% accuracy, which it described as the highest score it has independently measured.

However, Nathan Lambert, a researcher at the Allen Institute for AI, suggested there is still a time lag of approximately four to six months in raw performance between the best closed and open models, though he noted that Chinese labs are closing in and performing very strongly on key benchmarks.

Market implications and competitive pressure

Zhang Ruiwang, a Beijing-based information technology system architect, said the trend was for Chinese companies to keep costs down, explaining, “The overall performance of Chinese models still lags behind top US models, so they have to compete in the realms of cost-effectiveness to have a way out”.

Zhang Yi, chief analyst at consultancy iiMedia, said the training costs of Chinese AI models were seeing a “cliff-like drop” driven by innovation in model architecture and training technique, and input of quality training data, marking a shift away from the heaping of computing resources in the early days.

The model was released under a Modified MIT License that grants full commercial and derivative rights, with one restriction: deployers serving over 100 million monthly active users or generating over US$20 million per month in revenue must prominently display “Kimi K2” on the product’s user interface.
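The attribution condition amounts to a simple either-or threshold test. The sketch below encodes it as described in this article only; the function and constant names are hypothetical, and reading “over” as strictly greater than is an assumption. The licence text itself is the authority on how the thresholds are measured.

```python
# Thresholds as reported in the article; names are illustrative only.
MAU_THRESHOLD = 100_000_000                 # monthly active users
MONTHLY_REVENUE_THRESHOLD_USD = 20_000_000  # US dollars per month


def must_display_kimi_k2(monthly_active_users: int, monthly_revenue_usd: float) -> bool:
    """Return True if either reported threshold is exceeded.

    Assumption: "over" means strictly greater than the threshold.
    """
    return (
        monthly_active_users > MAU_THRESHOLD
        or monthly_revenue_usd > MONTHLY_REVENUE_THRESHOLD_USD
    )


if __name__ == "__main__":
    # A mid-sized deployment falls below both thresholds.
    print(must_display_kimi_k2(5_000_000, 750_000))
    # Exceeding the revenue threshold alone is enough to trigger the clause.
    print(must_display_kimi_k2(5_000_000, 25_000_000))
```

Because the two conditions are joined by “or”, exceeding either one is enough to trigger the display requirement.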

Industry response and future


Deedy Das, a partner at early-stage venture capital firm Menlo Ventures, wrote in a post on X that “Today is a turning point in AI. A Chinese open-source model is #1. Seminal moment in AI”.


Today is a turning point in AI. A Chinese open source model is #1.

Kimi K2 Thinking scored 51% in Humanity's Last Exam, higher than GPT-5 and every other model. $0.6/M in, $2.5/M output.

The best at writing, and does 15tps on two Mac M3 Ultras!

Seminal moment in AI.

Try it…


— Deedy (@deedydas)
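The per-token prices quoted in the post above ($0.6 per million input tokens and $2.5 per million output tokens) can be turned into a rough per-request estimate. This is a minimal sketch that takes those two figures at face value; the function name and the example token counts are hypothetical, and actual billing may differ.

```python
# Prices as quoted in the post above, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 0.6
OUTPUT_USD_PER_MILLION = 2.5


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    in_price: float = INPUT_USD_PER_MILLION,
    out_price: float = OUTPUT_USD_PER_MILLION,
) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 50,000 output tokens.
    print(f"${request_cost_usd(200_000, 50_000):.3f}")
```

Output tokens cost roughly four times as much as input tokens at these rates, so a model that generates long responses will see its bill dominated by the output side.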


Nathan Lambert wrote in a Substack article that the success of Chinese open-source AI developers, including Moonshot AI and DeepSeek, showed how they “made the closed labs sweat,” adding “There’s serious pricing pressure and expectations that [the US developers] need to manage”.

The release positions Moonshot AI alongside other Chinese AI companies like DeepSeek, Qwen, and Baichuan that are increasingly challenging the narrative of American AI supremacy through cost-efficient innovation and open-source development strategies.

Whether this represents a sustainable competitive advantage or a temporary convergence in capabilities remains to be seen as both US and Chinese companies continue advancing their models.

