Moonshot's open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per tok
"Moonshot AI's new model threatens to disrupt the coding landscape with its affordable pricing. Can it deliver on performance?"
Moonshot AI released Kimi K2.7 Code this week, a new AI model built specifically for programming tasks and agent-based coding workflows.
The model is designed to outperform its predecessor, Kimi K2.6, on long-running, complex software engineering tasks. According to Moonshot AI, Kimi K2.7 Code is available as an open-weights version on Hugging Face, making it accessible to a wide range of developers and researchers. For general tasks outside of coding, the company still recommends K2.6.
Kimi K2.7 Code improves on K2.6 across several benchmarks, including Moonshot's in-house Kimi Code Bench v2, Program Bench, and MLS Bench Lite. On Kimi Code Bench v2, the score rises from 50.9 to 62.0, while on Program Bench it climbs from 48.3 to 53.6. The model also improves on agentic benchmarks, reaching 76.0 on MCP Atlas and 81.1 on MCPMark Verified.
However, in a head-to-head comparison with GPT-5.5 and Claude Opus 4.8, Kimi K2.7 Code trails on most coding benchmarks. GPT-5.5 scores 69.1 on Program Bench versus 53.6 for Kimi K2.7 Code, and 69.0 versus 62.0 on Kimi Code Bench v2. One notable exception is the MCPMark Verified benchmark, where Kimi K2.7 Code beats Claude Opus 4.8 by 81.1 to 76.4.
The model uses a Mixture-of-Experts (MoE) architecture with one trillion total parameters, with only 32 billion active per token. The architecture is identical to K2.5 and K2.6, allowing existing deployment configs to be reused directly. One key improvement is more efficient reasoning, with Kimi K2.7 Code using about 30 percent fewer thinking tokens than K2.6.
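To put those architecture figures in proportion, here is a minimal Python sketch of the arithmetic. The parameter counts and the 30 percent figure come from the article; the 10,000-token baseline is a hypothetical workload chosen only for illustration, and the helper names are not part of any Moonshot tooling.

```python
TOTAL_PARAMS = 1_000_000_000_000   # one trillion total parameters (from the article)
ACTIVE_PARAMS = 32_000_000_000     # 32 billion active per token (from the article)
THINKING_REDUCTION = 0.30          # "about 30 percent fewer thinking tokens" than K2.6


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters that are used for a single token."""
    return active / total


def reduced_thinking_tokens(baseline_tokens: int,
                            reduction: float = THINKING_REDUCTION) -> int:
    """Approximate thinking-token count after applying the stated reduction."""
    return round(baseline_tokens * (1 - reduction))


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active share per token: {frac:.1%}")
    # Hypothetical baseline: a task on which K2.6 spends 10,000 thinking tokens.
    print(f"Thinking tokens for K2.7 Code: {reduced_thinking_tokens(10_000)}")
```

Under these assumptions, about 3.2 percent of the weights are active for any given token, and a task that cost K2.6 10,000 thinking tokens would cost roughly 7,000. Actual savings will vary by task, since the 30 percent figure is an average.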
Moonshot AI has also announced a "6x High-Speed Mode" that it says is coming soon and that promises faster generation. The model can be accessed through the Kimi API, Kimi Code CLI, and inference engines like vLLM and SGLang. A native INT4 quantization is also available, making it possible to run the model on less powerful or cheaper hardware.
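A rough back-of-the-envelope calculation shows why INT4 matters for hardware requirements. The Python sketch below assumes every one of the one trillion parameters is stored at the given bit width. It ignores quantization scales, any layers kept at higher precision, activations, and the KV cache, so real footprints will be higher. The 16-bit baseline is an assumption for comparison, not a figure from Moonshot.

```python
TOTAL_PARAMS = 1_000_000_000_000  # one trillion total parameters (from the article)


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # The 16-bit row is an assumed baseline; only INT4 is mentioned in the article.
    for label, bits in (("16-bit (assumed baseline)", 16), ("INT4", 4)):
        print(f"{label}: ~{weight_memory_gb(TOTAL_PARAMS, bits):,.0f} GB of weights")
```

By this estimate, INT4 cuts raw weight storage from about 2,000 GB to about 500 GB. Note that all one trillion parameters must still be held in memory even though only 32 billion are active per token.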
Kimi K2.7 Code costs $0.95 per million input tokens and $4.00 per million output tokens, with cache hits dropping the input price to $0.19 per million tokens. GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens, and Claude Opus 4.8 costs $5.00 per million input tokens and $25.00 per million output tokens. At these list prices, Kimi K2.7 Code is about 5.3 times cheaper on input and between 6.25 and 7.5 times cheaper on output.
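The price gap is easier to judge against a concrete workload. The Python sketch below uses only the list prices quoted above. The workload of 100,000 input tokens and 20,000 output tokens is a hypothetical example, the function names are illustrative, and cache discounts are left out for simplicity.

```python
# List prices in USD per million tokens: (input, output), as quoted in the article.
PRICES = {
    "Kimi K2.7 Code": (0.95, 4.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at list prices, without cache discounts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def price_ratios(competitor: str, baseline: str = "Kimi K2.7 Code") -> tuple[float, float]:
    """How many times more the competitor charges for (input, output) tokens."""
    base_in, base_out = PRICES[baseline]
    comp_in, comp_out = PRICES[competitor]
    return comp_in / base_in, comp_out / base_out


if __name__ == "__main__":
    # Hypothetical agentic coding request: 100k tokens in, 20k tokens out.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 100_000, 20_000):.3f}")
    for model in ("GPT-5.5", "Claude Opus 4.8"):
        in_ratio, out_ratio = price_ratios(model)
        print(f"{model} vs Kimi: {in_ratio:.2f}x input, {out_ratio:.2f}x output")
```

For this hypothetical request, the totals are about $0.18 for Kimi K2.7 Code, $1.10 for GPT-5.5, and $1.00 for Claude Opus 4.8. The effective gap depends on the mix of input and output tokens, and on how often cached input applies.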
The cost per token is becoming an increasingly important factor in the AI market, and Kimi K2.7 Code's affordable pricing could make it an attractive option for developers and businesses looking to integrate AI into their workflows. While it may not be the best model overall, its price point and performance make it a viable alternative to more expensive models like GPT-5.5 and Claude Opus 4.8.
Whether Kimi K2.7 Code is "good enough" for a particular task depends on the specific use case and requirements. With its competitive pricing and improved performance, it is worth considering for businesses and developers who use AI for coding and software development tasks. It remains to be seen how Kimi K2.7 Code and similar models will shape the future of software development.

