Industry Insight
10 min read
AI Observer

Kimi K2.5 Officially Released: Comprehensive Evolution of Native Vision and Agent Swarm

New Heights of Native Multimodality

Moonshot AI officially released Kimi K2.5 today. Moonshot positions it as more than a version iteration: a major step towards Artificial General Intelligence (AGI). Building on Kimi K2, K2.5 underwent continued pre-training on approximately 15 trillion (15T) mixed vision and text tokens, resulting in a natively multimodal architecture.

Figure: Comparison of Kimi K2.5 and Claude Opus 4.5 on core capabilities, showing K2.5's strength in multimodal and reasoning tasks.

This architecture gives K2.5 a much stronger perception of the physical world and drives major upgrades in three areas: Coding with Vision, Agent Swarm, and Office Productivity.

1. Coding with Vision: What You See Is What You Code

Moonshot AI describes Kimi K2.5 as the "strongest open-source coding model to date," with particular strength in frontend development.

  • Visual Interaction to Code: K2.5 can turn a simple conversation directly into a complete frontend interface, implementing interactive layouts and rich animation effects (such as scroll triggers).
  • Video as Code: Going beyond static images, K2.5 can reconstruct websites by reasoning over video content. For example, it can watch a recording of someone interacting with a website and then reconstruct the underlying code logic and styling.
  • Large-Scale Vision-Text Joint Pre-training: These capabilities stem from large-scale joint pre-training, which improves visual understanding and text-based coding together and avoids the disconnect between vision and logic found in traditional models.

In internal evaluations, K2.5 solved a complex maze pathfinding problem: it found the shortest path through a 4.5-megapixel maze using breadth-first search (BFS) and generated a visualization of the solution process, demonstrating its visual reasoning capabilities.
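For readers unfamiliar with the algorithm mentioned above, here is a minimal sketch of BFS shortest-path search on a small text maze. It is not the code K2.5 generated; the grid encoding ("#" for walls), the function names `shortest_path` and `render`, and the sample maze are all assumptions made for illustration. In an image-based maze, each pixel or cell would play the role of a grid position.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Return the shortest list of (row, col) cells from start to goal, or None.

    BFS explores cells in order of distance from the start, so the first
    time the goal is dequeued, the recorded path is a shortest one.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk parent links back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None                      # goal is unreachable


def render(grid, path):
    """Return the maze as text with interior path cells marked by '*'."""
    marked = [list(row) for row in grid]
    for r, c in path[1:-1]:
        marked[r][c] = "*"
    return "\n".join("".join(row) for row in marked)


# Hypothetical sample maze: S = start, G = goal, # = wall.
MAZE = [
    "S.#.....",
    ".##.###.",
    "....#...",
    ".##.#.#.",
    "...#..#G",
]

if __name__ == "__main__":
    path = shortest_path(MAZE, (0, 0), (4, 7))
    print(f"steps: {len(path) - 1}")
    print(render(MAZE, path))
```

BFS runs in time proportional to the number of cells, so a maze of a few million cells remains tractable; the `parent` dictionary is what allows the solved route to be drawn back onto the maze afterwards.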

2. Agent Swarm: Hive Mind of Agents (Research Preview)

This is the most futuristic feature of the update. Kimi K2.5 ships Agent Swarm as a research preview, marking a shift in AI from a "lone soldier" working alone to a "legion" working in coordination.

  • Self-Commanding Swarm: K2.5 can autonomously command up to 100 Sub-agents.
  • Massive Concurrent Execution: When handling complex tasks, it can orchestrate up to 1,500 coordination steps.
  • Efficiency Multiplication: Compared to single-agent mode, Swarm mode cuts end-to-end execution time by a factor of 4.5.
  • PARL Technology: The underlying technique is Parallel-Agent Reinforcement Learning (PARL), in which an orchestrator decomposes a task into sub-tasks that run in parallel.

For example, in a task to find the top creators across 100 niche fields, K2.5 Swarm can automatically create 100 researcher sub-agents that search in parallel, then aggregate their results into a structured spreadsheet containing 300 profiles.
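The fan-out and aggregation pattern in this example can be sketched with Python's `asyncio`. This is only an illustration of an orchestrator dispatching parallel sub-tasks and merging their results; it is not PARL (which concerns how the model is trained) and not Moonshot AI's implementation. The names `research_niche`, `orchestrate`, and `run_swarm`, the simulated latency, and the choice of three profiles per niche (picked so that 100 sub-agents yield 300 rows) are all assumptions.

```python
import asyncio


async def research_niche(niche, per_niche):
    """Hypothetical sub-agent: stands in for a real search-and-summarize loop."""
    await asyncio.sleep(0.01)  # simulated network / tool latency
    return [
        {"niche": niche, "rank": i + 1, "creator": f"{niche}-creator-{i + 1}"}
        for i in range(per_niche)
    ]


async def orchestrate(niches, per_niche=3, max_parallel=100):
    """Fan out one sub-agent per niche, then flatten results into table rows."""
    limit = asyncio.Semaphore(max_parallel)  # cap on concurrent sub-agents

    async def run(niche):
        async with limit:
            return await research_niche(niche, per_niche)

    # gather() runs the sub-agents concurrently and preserves input order.
    batches = await asyncio.gather(*(run(n) for n in niches))
    return [row for batch in batches for row in batch]


def run_swarm(niches, per_niche=3, max_parallel=100):
    """Synchronous entry point for the sketch."""
    return asyncio.run(orchestrate(niches, per_niche, max_parallel))


if __name__ == "__main__":
    rows = run_swarm([f"niche-{i}" for i in range(100)])
    print(len(rows), "profiles collected")
```

Because the sub-agents spend most of their time waiting on I/O, running them concurrently shortens wall-clock time roughly in proportion to the degree of parallelism, which is the intuition behind the speedup claimed for Swarm mode.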

3. Ultimate Office Productivity

K2.5 brings agent capabilities into real knowledge-work scenarios and can handle dense, large-scale office inputs.

  • Versatile Output: Directly generates professional documents, spreadsheets, PDFs, and presentation slides.
  • Ultra-Long Context Processing: Handles documents of more than 100 pages and can write papers of more than 10,000 words.
  • Complex Operations: Supports adding comments in Word, building pivot tables in Excel, and writing LaTeX formulas in PDFs.

On Moonshot AI's internal AI Office Benchmark, K2.5 improved by 59.3% over the previous-generation thinking model (K2 Thinking), marking the leap from "toy" to "tool."

Benchmark Performance: Competitive Across the Board

Across a range of widely used benchmarks, K2.5 rivals or even surpasses top closed-source models with "thinking modes" (including Gemini 3 Pro, GPT-5.2, and Claude Opus 4.5):

| Benchmark | Domain | Performance Highlights |
|---|---|---|
| HLE-Full | Reasoning | Stronger than DeepSeek-V3.2 |
| SWE-Bench Verified | Programming | 80.9% resolution rate, surpassing open-source limits |
| MMMU Pro | Vision | Top-tier visual multimodal understanding capability, close to Claude Opus 4.5 level |
| BrowseComp | Search | Significant performance improvement in Agent Swarm mode |

How to Experience

Kimi K2.5 is currently available on the following platforms, in four modes (Instant, Thinking, Agent, and Agent Swarm):

  1. Kimi.com Web Version
  2. Kimi Smart Assistant App (Kimi 智能助手)
  3. Kimi Open Platform API (Kimi 开放平台)
  4. Kimi Code: A new terminal-based coding tool that integrates with editors such as VS Code and Cursor.

Note: Agent Swarm mode is currently in beta and offers free trials to premium users.

This release moves AI competition beyond simple "text dialogue" towards "visual action" and "swarm intelligence." For developers and professional users, Kimi K2.5 offers not just a stronger model, but a new set of tools for solving complex problems.
