New Heights of Native Multimodality

Moonshot AI officially released Kimi K2.5 today. More than a routine version update, the release is presented as a major step toward Artificial General Intelligence (AGI). Building on Kimi K2, K2.5 underwent continued pre-training on approximately 15 trillion (15T) mixed vision and text tokens, resulting in a natively multimodal architecture.

Kimi K2.5 vs Claude Opus 4.5 Performance Comparison

Figure: Comparison of Kimi K2.5 and Claude Opus 4.5 on core capabilities, highlighting K2.5's results on multimodal and reasoning tasks.

This architecture gives K2.5 a much stronger visual perception of the world and drives major upgrades in three areas: Coding with Vision, Agent Swarm, and Office Productivity.

1. Coding with Vision: What You See Is What You Code

Moonshot AI describes Kimi K2.5 as the "strongest open-source coding model to date," with particular strength in frontend development.

Visual Interaction to Code: K2.5 can turn a simple conversation into a complete frontend interface, implementing interactive layouts and rich animation effects (such as scroll-triggered animations).
Video as Code: Going beyond static images, K2.5 can reconstruct a website by reasoning over video. For example, it can watch a recording of someone interacting with a website and then reproduce the underlying code logic and styling.
Large-Scale Vision-Text Joint Pre-training: These capabilities stem from large-scale joint pre-training, in which visual understanding and code generation improve together, closing the gap between vision and logic seen in traditional models.

In internal evaluations, K2.5 solved complex maze pathfinding problems: it found the shortest path through a 4.5-megapixel maze using breadth-first search (BFS) and generated a visualization of the solution process, demonstrating its visual reasoning capabilities.
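To make the BFS step concrete, here is a minimal sketch of breadth-first search on a small grid maze. It is not the code K2.5 produced. The grid encoding (0 for open, 1 for wall), the function name `bfs_shortest_path`, and the toy `MAZE` are all assumptions made for illustration. A real image-based maze would first need to be converted from pixels into such a grid.

```python
from collections import deque


def bfs_shortest_path(grid, start, goal):
    """Return the shortest path from start to goal as a list of (row, col)
    cells, or None if the goal is unreachable. Cells equal to 0 are open,
    cells equal to 1 are walls (an encoding assumed for this sketch)."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical toy maze, standing in for a maze extracted from an image.
MAZE = [
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

path = bfs_shortest_path(MAZE, (0, 0), (3, 3))
```

Because BFS explores cells in order of distance from the start, the first time it reaches the goal it has found a shortest path on an unweighted grid.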

2. Agent Swarm: Hive Mind of Agents (Research Preview)

This is the most futuristic feature of the update. With K2.5, Moonshot AI released a research preview of Agent Swarm, marking a shift in AI from a "lone soldier" working alone to a "legion" working in coordination.

Self-Commanding Swarm: K2.5 can autonomously direct up to 100 sub-agents.
Massive Concurrent Execution: When handling complex tasks, it can orchestrate up to 1,500 coordination steps.
Efficiency Multiplication: Compared with single-agent mode, Swarm mode cuts end-to-end execution time by a factor of 4.5.
PARL Technology: The core technique is Parallel-Agent Reinforcement Learning (PARL), in which an orchestrator decomposes a task into sub-tasks that run in parallel.

For example, given a task to "find top creators in 100 niche fields," K2.5 Swarm can automatically create 100 researcher sub-agents that search in parallel, then aggregate the results into a structured spreadsheet containing 300 profiles.
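The fan-out and aggregation pattern described here can be sketched with Python's `asyncio`. This is only an illustration of an orchestrator dispatching parallel sub-tasks and merging their results. It does not reproduce PARL, which is a training method, and it is not Moonshot AI's implementation. The names `research_subagent`, `orchestrate`, and `run_swarm` are hypothetical. The sub-agent returns placeholder data rather than real search results, and three profiles per field is an assumption chosen only so that 100 fields yield the 300 rows quoted above.

```python
import asyncio


async def research_subagent(field: str) -> list[dict]:
    """Hypothetical sub-agent. A real one would search the web; this one
    returns three placeholder profiles for the given field."""
    await asyncio.sleep(0)  # yield to the event loop, standing in for I/O
    return [{"field": field, "creator": f"{field}-creator-{i}"} for i in range(1, 4)]


async def orchestrate(fields: list[str], max_parallel: int = 100) -> list[dict]:
    """Fan out one sub-agent per field, then flatten the results into rows."""
    semaphore = asyncio.Semaphore(max_parallel)  # cap on concurrent sub-agents

    async def run(field: str) -> list[dict]:
        async with semaphore:
            return await research_subagent(field)

    # gather() preserves input order, so rows stay grouped by field.
    batches = await asyncio.gather(*(run(field) for field in fields))
    return [profile for batch in batches for profile in batch]


def run_swarm(fields: list[str]) -> list[dict]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(orchestrate(fields))


rows = run_swarm([f"field-{n}" for n in range(100)])
```

Each row is a flat dictionary, so the result could be written out as a spreadsheet with the standard `csv.DictWriter`.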

3. Ultimate Office Productivity

K2.5 brings agent capabilities to real knowledge-work scenarios and can handle dense, large-scale office inputs.

Versatile Output: Directly generates professional documents, spreadsheets, PDFs, and presentation slides.
Ultra-Long Context Processing: Handles documents of more than 100 pages and drafts papers of more than 10,000 words.
Complex Operations: Supports adding comments in Word, building pivot tables in Excel, and writing LaTeX formulas in PDFs.

On Moonshot AI's internal AI Office Benchmark, K2.5 improved by 59.3% over the previous-generation thinking model (K2 Thinking), marking the leap from "toy" to "tool."

Benchmark Performance: Leading Across the Board

Across a range of widely used benchmarks, K2.5 rivals or even surpasses top closed-source models running in "thinking" mode (including Gemini 3 Pro, GPT-5.2, and Claude Opus 4.5):

Benchmark	Domain	Performance Highlights
HLE-Full	Reasoning	Stronger than DeepSeek-V3.2
SWE-Bench Verified	Programming	80.9% resolution rate, a new high for open-source models
MMMU Pro	Vision	Top-tier multimodal visual understanding, close to the level of Claude Opus 4.5
BrowseComp	Search	Significant performance improvement in Agent Swarm mode

How to Experience

Kimi K2.5 is currently available on the following platforms, in four modes (Instant, Thinking, Agent, and Agent Swarm):

Kimi.com Web Version
Kimi Smart Assistant App (Kimi 智能助手)
Kimi Open Platform API (Kimi 开放平台)
Kimi Code: A brand-new terminal-based coding tool that integrates with VSCode, Cursor, and other editors.

Note: Agent Swarm mode is currently in beta, with free trials available to premium users.

This update shifts AI competition from simple "text dialogue" toward "visual action" and "swarm intelligence." For developers and professional users, Kimi K2.5 offers not just a stronger model but a new toolkit for solving complex problems.

Kimi K2.5 Officially Released: Comprehensive Evolution of Native Vision and Agent Swarm
