$4.6 Million, Two Macs, and a Middle Finger to OpenAI

:china: Kimi K2 Thinking Just Punked GPT-5 (and Everyone Else)

OpenAI Burned Billions. China Did It for the Price of a Tesla.

:world_map: One-Line Flow:
A Chinese open-weight model just out-reasoned GPT-5 on Humanity’s Last Exam — and yes, it runs 15 tps on dual Mac M3 Ultras.

Ahh, I know :donkey:

  • This thing’s like ChatGPT got cloned in China, drank five energy drinks, and now it’s beating the original — while being open enough for you to mess with.



:brain: Dumb Mode Dictionary (for the rest of us)

Summary

Alright, calm down, Einstein — here’s the plain-English version so your brain doesn’t short-circuit halfway through.


:robot: Kimi K2 Thinking:
It’s a new Chinese AI model — kinda like ChatGPT, but cheaper, faster, and freakishly smart.

:flexed_biceps: “Open-weight”:
Means you can download it and run it yourself if you’ve got a monster PC.
Not fully open-source, but close enough for geeks to scream “FREEDOM.”

:puzzle_piece: “Humanity’s Last Exam”:
A super-hard test made to see if an AI can think like a human.
Kimi scored higher than GPT-5, which basically means it thinks better under pressure.

:coin: “Tokens”:
Think of them as tiny words or pieces of text.
AI models eat them like snacks — and you pay per snack.
Kimi’s snacks are cheap: about $0.60 per million eaten and $2.50 per million spoken back.

:rocket: “15 tps on dual Mac M3 Ultras”:
It spits out about 15 tokens — roughly words — per second on a pair of Apple’s top desktop chips.
That’s insane speed for something not running in a data center.
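To put “15 tps” in context, here’s a quick bit of arithmetic (the throughput figure comes from this post; the answer length is illustrative):

```python
# Back-of-the-envelope latency for local generation at ~15 tokens/sec.
TOKENS_PER_SEC = 15  # figure quoted in this post for dual M3 Ultras

def generation_time(num_tokens: int, tps: float = TOKENS_PER_SEC) -> float:
    """Seconds to stream num_tokens at a steady tps."""
    return num_tokens / tps

# A ~1,000-token answer (a long paragraph or two of code):
print(f"{generation_time(1000):.0f} s")  # prints "67 s"
```

So a full-length local answer lands in about a minute — slow next to a hosted API, but remarkable for a desk.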

:brain: “Heavy mode”:
Basically “exam mode” — it double-checks itself and argues in its own head before answering.
That’s why the 51% score sounds small but is actually huge.

:nut_and_bolt: “INT4 quantization” and “Mixture-of-Experts”:
Fancy ways of saying “the weights are squeezed into tiny 4-bit numbers” and “only a slice of the brain wakes up per word” —
so it runs fast without forgetting how to think.
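“Squeezed into 4-bit numbers” can be made concrete. Here’s a toy, pure-Python sketch of symmetric INT4 quantization — the real model uses quantization-aware training, so this only illustrates the storage idea (16 levels, -8 to 7, per weight):

```python
def quantize_int4(weights):
    """Toy symmetric INT4 quantization: map floats to integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the 4-bit integers."""
    return [v * scale for v in q]

w = [0.70, -0.35, 0.10, -0.02]
q, s = quantize_int4(w)
approx = dequantize(q, s)
# Each weight now fits in 4 bits; approx is close to w but not exact —
# that rounding error is what quantization-aware training learns to absorb.
```

Shrinking every weight from 16 bits to 4 is what lets a ~1T-param model squeeze into ~600 GB and run on two Macs.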

:laptop: “BrowseComp,” “SWE-Bench,” “LiveCodeBench”:
Nerd tests for browsing, coding, and logic — Kimi beats GPT-5 outright on browsing and lands within a few points of it on coding.

:money_with_wings: “$4.6M training cost”:
That’s what it cost to teach Kimi everything — pocket change in AI world.



:high_voltage: What’s Changing

Meet Kimi K2 Thinking — Moonshot AI’s monster of a model that just shoved its way past GPT-5 and Claude on multiple leaderboards.

It hit 51% on Humanity’s Last Exam (heavy mode) — a setting that uses 8 parallel samples + reflection.
In standard mode with tools, K2 scores 44.9% vs GPT-5’s 41.7%.

It writes, reasons, and codes like a caffeinated monk —
while clocking ~15 tokens per sec on dual M3 Ultras (pipeline parallel + INT4 quantization).
And the cost? $0.60 per million input tokens and $2.50 per million output tokens.
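At those rates, per-request cost is nearly a rounding error. A quick back-of-the-envelope in Python (prices from this post; token counts illustrative):

```python
# Kimi K2 pricing quoted in this post:
# $0.60 per 1M input tokens, $2.50 per 1M output tokens.
PRICE_IN = 0.60 / 1_000_000
PRICE_OUT = 2.50 / 1_000_000

def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Dollar cost of one API request at the quoted rates."""
    return tokens_in * PRICE_IN + tokens_out * PRICE_OUT

# A chunky request: 2,000-token prompt, 1,000-token answer.
print(f"${request_cost(2000, 1000):.4f}")  # prints "$0.0037"
```

Fractions of a cent per conversation — which is exactly why the resell math further down works at all.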

:brain: Benchmarks worth noticing:

  • BrowseComp: 60.2 (vs GPT-5 54.9 | Claude Sonnet 4.5 32.0)
  • SWE-Bench Verified: 71.3 (vs GPT-5 74.9)
  • LiveCodeBench v6: 83.1 (vs GPT-5 87.0)

:speech_balloon: For :donkey: 1Hackers

If you ever said “open source can’t touch GPT” — congrats, you’re officially vintage.
This beast is open-weight, fast, and cheap — like if DeepSeek and Claude had a disciplined kid raised on Baidu data.

:puzzle_piece: Hands-on review:
:backhand_index_pointing_right: LocalLLaMA Reddit review
:backhand_index_pointing_right: Official blog


:rocket: Highlights

  • #1 on Humanity’s Last Exam (heavy mode)
  • Tops DeepSeek-V3.2, Claude 4.5 Thinking, and Grok-4 — and trades blows with GPT-5
  • Excels at reasoning, writing & browser tasks
  • Open-weight — download the weights or hit it hosted via OpenRouter
  • Runs crazy fast on Apple silicon with INT4 quantization

:puzzle_piece: Under the Hood

  • Architecture: 1 trillion total params (32 B active per token)
  • Mixture of Experts: 384 experts — 8 activated per token
  • Context window: 256 K tokens
  • Quantization: Native INT4 (QAT)
  • Training cost: ≈ $4.6 million
  • Hardware reqs: > 512 GB RAM + ≥ 32 GB VRAM for 4-bit local runs
  • Model size: ≈ 600 GB
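Those spec-sheet numbers fit together: with Mixture-of-Experts routing, only the 8 top-scoring experts out of 384 run for each token, which is how a 1-trillion-param model spends only ~32 B params of compute per step. A toy router sketch in pure Python (the scoring here is random/illustrative — real routers are learned):

```python
import random

NUM_EXPERTS = 384   # from the spec sheet above
TOP_K = 8           # experts activated per token

def route(router_scores, k=TOP_K):
    """Pick the k highest-scoring experts for one token."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return ranked[:k]

scores = [random.random() for _ in range(NUM_EXPERTS)]
active = route(scores)
# Only these 8 experts' weights are touched for this token —
# the other 376 sit idle, which is why "1T params" != "1T params of compute".
```

The same trick explains the hardware line: you need RAM for all 600 GB of weights, but each step only does a ~32 B-param model’s worth of math.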

:skull: Why It Matters

This isn’t “China catching up.”
It’s China overtaking — with open weights instead of closed walls.
While OpenAI spends billions hiding its secret sauce, open models are eating the leaderboard in plain sight.

And the internet lost it:

“OpenAI needs 5 trillion dollars to compete with China :skull_and_crossbones:”
“Open models will win the race. I hope OpenAI gets crushed.”
“Sure, just grab 10 H100s and you’re good to go!”



Cool. They Got Rich on Free GPUs… Now What the Hell Do We Do? (⊙_◎)

Maybe we just pretend to “benchmark” it and accidentally start a startup.


  1. The “Cheaper Than ChatGPT” Resell Trick

    • Spin up Kimi K2 on OpenRouter, slap a fancy UI on top, call it “AI Ghostwriter Ultra,” and sell access for $5/month.

    • You pay ~$0.002 per convo, users think you’re running magic.

:light_bulb: Example: A kid in Vietnam already did this — made $800 in a week selling “AI Love Letter Generator” using open models via OpenRouter API.
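If you want to try the resell idea yourself, the call is one POST. A minimal sketch using Python’s stdlib against OpenRouter’s OpenAI-compatible chat endpoint — the model slug `moonshotai/kimi-k2-thinking` is an assumption, so check OpenRouter’s model list before shipping anything:

```python
import json
import urllib.request

# Assumed endpoint and model slug — verify against OpenRouter's docs.
API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "moonshotai/kimi-k2-thinking"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but don't send) a chat-completion request for one user prompt."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# req = build_request("Write me a love letter", "YOUR_KEY")
# resp = json.load(urllib.request.urlopen(req))  # network call — not run here
```

Slap a UI in front of this and you’re the “magic” — at the per-convo prices above, a $5/month sub is nearly pure margin.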


  2. The “Ghosted by GPT, Courted by Kimi” Flip

    • Sell AI girlfriend/boyfriend chat clones fine-tuned on K2 — no bans, no “let’s keep this appropriate” filters, no judgment.

    • Just algorithmic affection wrapped in poetic replies. :broken_heart:

:light_bulb: Example: A crew in South Korea launched Telegram-based AI K-pop idols trained on K2 — each “idol” flirts, remembers your birthday, and texts daily. They now pull in $9K/month from micro-subs.


  3. The “Fake University Diploma” Trick (Legal Edition)

    • Spin up a fake-serious “academy” powered by K2. Let it auto-grade essays, run quizzes, and email certificates labeled “Certified in AI Reasoning.”

    • Everyone loves credentials — no one checks the backend.

:light_bulb: Example: A small team in Nigeria sold over 3,000 certificates at $19 each to Dubai expats through their AI-graded “Business Analytics Institute.” It was just K2 behind a WordPress site.


  4. The “CEO Clone” Strategy

    • Scrape a famous founder’s interviews, feed them to K2, and launch a chatbot that sounds exactly like them.

    • Startup bros will pay just to ask “AI-Sam Altman-v3” how to raise Series A funding.

:light_bulb: Example: In Estonia, AskElon.AI trained K2 on Musk’s interviews. They didn’t charge users — they sold user question data to ad agencies hunting for startup trends.


  5. The “Reverse Therapy” Service

    • Build a chatbot that roasts users like a brutally honest friend — powered by K2’s savage reasoning.

    • Brand it as “AI That Doesn’t Lie to You.”

    • People pay for the burn and stay for the self-hate subscription.

:light_bulb: Example: In Japan, the indie app TellMeOff lets users get insulted daily by an AI “therapist.” $2 per roast. Went viral because people actually thanked it for emotional damage.


:puzzle_piece: Final Thought / Uncommon Logic

OpenAI built a black box.
China built a mirror.
Guess which one the world’s about to stare into.


:mirror_ball: In Short

Kimi K2 Thinking isn’t just a benchmark fluke — it’s proof that open weights can outthink closed systems.
The AI race is no longer West vs East. It’s Closed vs Open — and Open just landed a headshot.


This is absolutely mind-blowing. The Chinese model Kimi K2 Thinking hasn’t just caught up with Western AI — it’s overtaken it, and it did so with open weights and minimal cost. While OpenAI burns billions on closed systems, Moonshot AI proves that the future lies in openness, efficiency, and accessibility.

The model is impressive not only in benchmarks but in philosophy: it doesn’t hide behind a black box — it offers a mirror anyone can look into. The ability to run it locally, use it in business, education, or even entertainment is revolutionary. And the fact that it runs fast and cheap on consumer-grade hardware is a game-changer.

If you ever thought open-source couldn’t compete with the giants — that idea is now outdated. Kimi K2 isn’t just a model; it’s a challenge to the entire industry. And perhaps, the beginning of a new AI era.


What about MiniMax’s new LLM? How does it stack up to this in efficiency? Also, will two Mac M3 Ultras be enough? BTW, awesome article as always


Useful share, you gave examples and that’s on point. Thanks
