
Kimi CLI Technical Deep Dive

This article is a deep dive into Moonshot AI's open-source project Kimi CLI, covering its architecture, core implementation, and distinctive features, so that developers can understand how this AI command-line tool works under the hood.

Table of Contents

  1. Introduction - What is Kimi CLI?
  2. Architecture Overview - Four Core Systems
  3. Agent System - Flexible Configuration and Loading
  4. KimiSoul Engine - The Smart Execution Brain
  5. Tool System - An Extensible Capability Hub
  6. ACP Protocol - The Bridge for IDE Integration
  7. Core Design Principles

Introduction - What is Kimi CLI?

Kimi CLI, developed by Moonshot AI, is an AI-powered command-line intelligent assistant. It's not just a simple wrapper around a command-line interface, but a complete AI-native development tool ecosystem. It helps developers:

  • Perform complex tasks like file operations, code analysis, and web searches directly from the terminal
  • Complete software development workflows through natural language interaction
  • Support multiple LLM providers (Moonshot AI, OpenAI, Claude, Gemini)
  • Integrate deeply with mainstream IDEs like Zed

Unlike traditional command-line tools, Kimi CLI's standout feature is its Agentic AI Architecture—it organically combines AI models, tool systems, and execution engines into a complete autonomous agent that can plan, execute, and verify tasks independently.

Version: 0.58 (Technical Preview) · Tech stack: Python 3.13+, asynchronous architecture, modular design


Architecture Overview - Four Core Systems

Before diving into the details, let's understand the overall architecture of Kimi CLI at a macro level.

High-Level Architecture

mermaid
graph TB
    A[CLI Entry: cli.py] --> B[Application Layer: KimiCLI];
    B --> C[Agent System];
    B --> D[KimiSoul Execution Engine];
    D --> E[Tool System];
    B --> F[UI Layer];

    subgraph "UI Layer (4 Modes)"
        F --> G[Shell Mode: Interactive CLI];
        F --> H[Print Mode: Batch Processing];
        F --> I[ACP Mode: IDE Integration];
        F --> J[Wire Mode: Experimental Protocol];
    end

    subgraph "Agent System"
        C --> K[Load Agent Config];
        C --> L[System Prompt Management];
        C --> M[Tool Registration];
        C --> N[Sub-agent Management];
    end

    subgraph "KimiSoul Engine"
        D --> O[Execution Loop];
        D --> P[Context Management];
        D --> Q[Error Handling];
        D --> R[Checkpoint Mechanism];
    end

    subgraph "Tool System"
        E --> S[File Operations];
        E --> T[Shell Commands];
        E --> U[Web Search];
        E --> V[MCP Integration];
        E --> W[Aggregation System];
    end

Core Data Flow

mermaid
flowchart LR
    A[User Input] --> B[Parse and Route];
    B --> C{Choose Execution Mode};

    C -->|Shell/Print| D[Direct Execution];
    C -->|ACP| E[Start ACP Server];

    D --> F[Create Session];
    E --> F;

    F --> G[Load Agent];
    G --> H[Initialize Runtime];
    H --> I[Inject Dependencies];
    I --> J[Create KimiSoul];

    J --> K[Main Execution Loop];

    K --> L{Any Tool Calls?};
    L -->|Yes| M[Execute Tools];
    M --> N[Get Results];
    N --> O[Update Context];
    O --> P{Done?};

    L -->|No| P;
    P -->|No| K;
    P -->|Yes| Q[Return Result];

    M --> R[Parallel Processing];
    R --> M;

The core philosophy of this architecture is Layering and Decoupling:

  • Agent System handles configuration and initialization
  • KimiSoul is a pure execution engine
  • Tool System provides pluggable capabilities
  • UI Layer is completely separated from business logic

This design allows Kimi CLI to support various usage scenarios—from simple CLI interactions to complex IDE integrations—while maintaining clean and maintainable code.


Agent System - Flexible Configuration and Loading

What is the Agent System?

In Kimi CLI, an Agent is a complete intelligent agent configuration that includes:

  • System prompt
  • Available tools list
  • Sub-agent definitions
  • Runtime parameters

By making Agents configurable, Kimi CLI can switch between different "AI personalities":

  • Coder Agent: Focused on code writing and refactoring
  • Debug Agent: Specialized in bug triage and fixing
  • Custom Agent: User-defined agents

Configuration File Structure

yaml
# agents/default/agent.yaml
version: 1
agent:
  name: "Kimi CLI"                    # Agent name
  system_prompt_path: ./system.md     # System prompt file
  system_prompt_args:                 # Prompt arguments
    ROLE_ADDITIONAL: ""
  tools:                              # Available tools
    - "kimi_cli.tools.multiagent:Task"
    - "kimi_cli.tools.todo:SetTodoList"
    - "kimi_cli.tools.shell:Shell"
    - "kimi_cli.tools.file:ReadFile"
    - "kimi_cli.tools.file:WriteFile"
    - "kimi_cli.tools.web:SearchWeb"
    - "kimi_cli.tools.web:FetchURL"
  subagents:                          # Sub-agents
    coder:
      path: ./sub.yaml
      description: "Specialized in general software engineering tasks"

Agent Loading Flow (Sequence Diagram)

mermaid
sequenceDiagram
    participant CLI as KimiCLI
    participant Loader as load_agent()
    participant Spec as load_agent_spec()
    participant SubLoader as Load Sub-agent
    participant ToolLoader as Load Tools
    participant MCP as MCP Tools

    CLI->>Loader: agent_file, runtime, mcp_configs
    Loader->>Spec: Parse YAML config
    Spec-->>Loader: ResolvedAgentSpec

    Note over Loader: Load system prompt
    Loader->>Loader: _load_system_prompt()

    Note over Loader: Recursively load sub-agents (fixed sub-agents)
    loop For each subagent
        Loader->>SubLoader: load_agent(subagent.path)
        SubLoader->>Loader: Agent instance
        Loader->>Runtime.labor_market: add_fixed_subagent()
    end

    Note over Loader: Load tools (dependency injection)
    Loader->>ToolLoader: tool_paths, dependencies

    loop For each tool
        ToolLoader->>ToolLoader: importlib.import_module()
        ToolLoader->>ToolLoader: Reflect and create instance
        ToolLoader->>ToolLoader: Auto-inject dependencies
        ToolLoader->>ToolLoader: toolset.add(tool)
    end

    opt If MCP config exists
        Loader->>MCP: Connect to MCP server
        MCP->>Loader: Get tool list
        Loader->>Loader: Add to toolset
    end

    Loader-->>CLI: Agent instance (with all tools)

Dependency Injection Mechanism

Kimi CLI's tool system uses automatic dependency injection, one of the most elegant aspects of the Agent system:

python
def _load_tool(tool_path: str, dependencies: dict) -> ToolType | None:
    """Load tool and auto-inject dependencies"""
    module_name, class_name = tool_path.rsplit(":", 1)
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)

    args = []
    for param in inspect.signature(cls).parameters.values():
        # Parameters whose type annotation matches a registered dependency are injected
        if param.annotation in dependencies:
            args.append(dependencies[param.annotation])

    return cls(*args)  # Auto-inject dependencies

Dependency container includes:

  • Runtime: Runtime context
  • Config: Configuration information
  • Approval: Approval system
  • Session: Session data
  • DenwaRenji: D-Mail system
  • LaborMarket: Sub-agent management

Tool definition example:

python
class Shell(CallableTool2[Params]):
    def __init__(self, approval: Approval, **kwargs):
        # approval parameter auto-injected from Runtime
        self._approval = approval

    async def __call__(self, params: Params) -> ToolReturnType:
        # Use approval to request user confirmation
        if not await self._approval.request(...):
            return ToolRejectedError()
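
To see the injection pattern in isolation, here is a self-contained toy version of the same idea; the classes below are stand-ins, not the real kimi_cli types:

python
# Toy demonstration of annotation-based injection -- stand-in classes only.
import inspect


class Approval:
    """Stand-in for the real approval system."""
    async def request(self, description: str) -> bool:
        return True


class Runtime:
    """Stand-in for the runtime context."""


class Shell:
    def __init__(self, approval: Approval):
        self._approval = approval      # injected because the annotation matches a dependency


def build(cls, dependencies: dict) -> object:
    args = [
        dependencies[param.annotation]
        for param in inspect.signature(cls).parameters.values()
        if param.annotation in dependencies
    ]
    return cls(*args)


deps = {Approval: Approval(), Runtime: Runtime()}
tool = build(Shell, deps)              # Shell receives the shared Approval instance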

LaborMarket: The Sub-agent "Labor Market"

LaborMarket is an innovative design that manages all available sub-agents:

mermaid
graph TB
    A[User calls Task tool] --> B[Task execution];
    B --> C{Find subagent};
    C -->|Found| D[Get fixed sub-agent];
    C -->|Not found| E[Create dynamic sub-agent];

    D --> F[Load sub-agent];
    F --> G[Create independent Context];
    G --> H["run_soul() execution"];

    subgraph "Sub-agent Types"
        I[Fixed sub-agent] --> J[Shared config];
        I --> K[Independent DenwaRenji];
        K --> L[Independent LaborMarket];

        E --> M[Dynamic sub-agent];
        M --> N[Clone Runtime];
        N --> O[Share main LaborMarket];
    end

Why sub-agents?

  1. Task decomposition: Complex tasks can be delegated to specialized agents
  2. Context isolation: Sub-agents have independent history, avoiding main context interruption
  3. Single responsibility: Each agent focuses on a specific domain

KimiSoul Engine - The Smart Execution Brain

KimiSoul is the most important component in the entire system. It's the "soul" of the AI agent, responsible for all reasoning, tool calls, and context management.

Core Responsibilities

python
class KimiSoul(Soul):
    """The soul of Kimi CLI."""

    # 1. Manage execution loop
    async def run(self, user_input: str):
        await self._checkpoint()
        await self._context.append_message(user_message)
        await self._agent_loop()  # Main loop

    # 2. Handle each reasoning step
    async def _step(self) -> bool:
        result = await kosong.step(
            self._runtime.llm.chat_provider,
            self._agent.system_prompt,
            self._agent.toolset,
            self._context.history
        )
        # Process tool calls, results, context updates

    # 3. Manage context lifecycle
    async def _grow_context(self, result, tool_results):
        await self._context.append_message(result.message)
        await self._context.append_message(tool_messages)

    # 4. Compact context
    async def compact_context(self):
        # Compress context when it gets too long
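
The loop that drives _step() is not shown above. A hedged sketch of what it could look like, assuming _step() returns True when the run is finished; the real loop also interleaves compaction, checkpointing, and the D-Mail rewind described below:

python
    # Sketch of the main loop (simplified continuation of the class above).
    async def _agent_loop(self) -> None:
        step_no = 0
        while True:
            step_no += 1                  # reported via StepBegin(n=step_no) events
            done = await self._step()     # one LLM call plus any tool executions
            if done:                      # no more tool calls, or a tool was rejected
                return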

Execution Loop Deep Dive (Sequence Diagram)

mermaid
sequenceDiagram
    participant User as User
    participant Soul as KimiSoul
    participant LLM as LLM Provider
    participant Context as Context Storage
    participant Tools as Tool Collection
    participant Wire as Event Bus

    User->>Soul: Input "List files in current directory"

    Soul->>Context: checkpoint() Create checkpoint
    Soul->>Context: append_message(user_message)

    loop Agent Loop (Step 1..N)
        Soul->>Soul: _step()

        opt Context too long
            Soul->>Wire: CompactionBegin
            Soul->>Soul: compact_context()
            Soul->>Wire: CompactionEnd
        end

        Soul->>Context: checkpoint()
        Soul->>Wire: StepBegin(n=step_no)

        Soul->>LLM: kosong.step() call
        LLM-->>Soul: StepResult

        loop Concurrent tool execution
            Soul->>Tools: handle(tool_call)
            Tools-->>Soul: ToolResult
            Soul->>Wire: Send ToolResult event
        end

        Soul->>Soul: _grow_context()
        Soul->>Context: append_message(result)
        Soul->>Context: append_message(tool_results)

        opt DenwaRenji has D-Mail
            Soul->>Soul: Throw BackToTheFuture
            Soul->>Context: revert_to(checkpoint_id)
            Soul->>Context: append_message(dmail)
        end

        opt Tool rejected
            Soul->>Wire: Send rejection event
            Soul->>Soul: return True (end loop)
        end

        opt No more tool calls
            Soul->>Soul: return True (complete)
        end
    end

    Soul-->>User: Return final answer

Checkpoint and "Time Travel" Mechanism

One of KimiSoul's most innovative designs is the Checkpoint mechanism, which allows the system to "go back in time."

How it works:

python
# 1. Create checkpoint
async def checkpoint(self, add_user_message: bool):
    """Create checkpoint before each step"""
    checkpoint_id = self._next_checkpoint_id
    self._next_checkpoint_id += 1

    # Append a marker line to the on-disk history file (f)
    await f.write(json.dumps({"role": "_checkpoint", "id": checkpoint_id}) + "\n")

    if add_user_message:
        await self.append_message(
            Message(role="user", content=[system(f"CHECKPOINT {checkpoint_id}")])
        )

Use Case: D-Mail

mermaid
graph LR
    A[SendDMail tool] --> B[Send D-Mail to the past];
    B --> C[Specify checkpoint_id];
    C --> D[KimiSoul captures];
    D --> E[Throw BackToTheFuture];
    E --> F["revert_to(checkpoint_id)"];
    F --> G[Remove subsequent content];
    G --> H[Re-execute];

Scenario:

  1. User asks: "Help me refactor this function"
  2. AI starts executing, but at step 3 realizes: "Wait, need to backup first"
  3. AI sends D-Mail back to checkpoint 1
  4. System returns to checkpoint 1, this time backing up before refactoring

Just like the D-Mail in the sci-fi anime "Steins;Gate", AI can send messages to its past self!
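
Since checkpoint() writes _checkpoint marker lines into the on-disk history (see the excerpt above), reverting presumably means truncating everything recorded after the requested marker. A self-contained sketch of that idea, not the actual kimi_cli implementation:

python
import json
from pathlib import Path


def revert_to(history_file: Path, checkpoint_id: int) -> None:
    """Keep history up to (and including) the requested checkpoint marker."""
    kept: list[str] = []
    for line in history_file.read_text().splitlines():
        kept.append(line)
        record = json.loads(line)
        if record.get("role") == "_checkpoint" and record.get("id") == checkpoint_id:
            break                          # everything after this marker is discarded
    history_file.write_text("\n".join(kept) + "\n")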

Error Handling and Retry

KimiSoul has robust error handling:

python
@tenacity.retry(
    retry=retry_if_exception(_is_retryable_error),
    wait=wait_exponential_jitter(initial=0.3, max=5, jitter=0.5),
    stop=stop_after_attempt(max_retries),
    reraise=True
)
async def _kosong_step_with_retry() -> StepResult:
    """Auto-retry LLM calls"""
    return await kosong.step(...)

Retryable errors (a classification sketch follows these lists):

  • API connection errors
  • Timeout errors
  • 503 Service Unavailable
  • Rate limiting (429)

Non-retryable errors:

  • Invalid API Key
  • Unsupported model
  • Context overflow
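
A sketch of the classification predicate consumed by retry_if_exception above, based on these two lists; which concrete exception types the real code inspects depends on the underlying LLM SDK, so httpx-style errors are assumed here:

python
# Sketch only -- the real check is an implementation detail of the LLM client library.
import httpx


def _is_retryable_error(exc: BaseException) -> bool:
    if isinstance(exc, (httpx.ConnectError, httpx.TimeoutException)):
        return True                                        # API connection / timeout errors
    if isinstance(exc, httpx.HTTPStatusError):
        return exc.response.status_code in (429, 503)      # rate limiting, service unavailable
    return False                                           # invalid key, unknown model, context overflow, ...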

Tool System - An Extensible Capability Hub

Tool System Architecture

The philosophy of the tool system is: Everything is a tool, and all tools are pluggable.

mermaid
graph TB
    A[Tool Base Class] --> B[CallableTool];
    A --> C[CallableTool2[Params]];

    B --> D[Shell];
    B --> E[ReadFile];
    C --> F[Task];
    C --> G[SearchWeb];

    subgraph "Tool Registry"
        H[KimiToolset] --> I[_inner: SimpleToolset];
        I --> J["add(tool)"];
    end

    subgraph "Dependency Injection"
        K[Runtime] --> L[Tool Dependencies];
        M[Approval] --> L;
        N[BuiltinSystemPromptArgs] --> L;
    end

    subgraph "MCP Integration"
        O[MCP Client] --> P[MCPTool];
        P --> H;
    end

    style A fill:#f96
    style H fill:#bbf
    style L fill:#bfb

Tool Categories

1. File Operations

python
# Read file
ReadFile(path="/absolute/path/to/file.py", line_offset=1, n_lines=100)

# Write file
WriteFile(path="/absolute/path", file_text="content", line_count_hint=1)

# Find files
Glob(pattern="src/**/*.py")

# Search content ("-n": true adds line numbers)
Grep(pattern="TODO|FIXME", path="/workspace")

# String replacement
StrReplaceFile(path="/absolute/path", old_str="", new_str="")

Security Features (a validation sketch follows this list):

  • Must use absolute paths (prevents path traversal)
  • File size limit (100KB)
  • Line limit (1000 lines)
  • Per-line length limit (2000 characters)
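
A minimal sketch of how these limits might be enforced before a read; the constants mirror the list above, but the actual checks live inside the file tools and may differ:

python
from pathlib import Path

MAX_FILE_SIZE = 100 * 1024   # 100KB
MAX_LINES = 1000
MAX_LINE_LENGTH = 2000


def validate_path(path: str) -> Path:
    """Reject relative paths and oversized files before reading."""
    p = Path(path)
    if not p.is_absolute():
        raise ValueError("path must be absolute")        # blocks path traversal tricks
    if p.stat().st_size > MAX_FILE_SIZE:
        raise ValueError("file exceeds the 100KB limit")
    return p


def clip(text: str) -> list[str]:
    """Apply the line-count and per-line length limits."""
    return [line[:MAX_LINE_LENGTH] for line in text.splitlines()[:MAX_LINES]]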

2. Shell Commands

python
Shell(command="git status", timeout=60)

Security Features (a streaming/timeout sketch follows this list):

  • Requires user approval (except in yolo mode)
  • Timeout control (1-300 seconds)
  • Streaming output (real-time stdout/stderr)
  • Maximum timeout: 5 minutes
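
A self-contained sketch of the streaming-with-timeout pattern described above; the real Shell tool additionally routes everything through the approval system and the wire:

python
import asyncio


async def run_shell(command: str, timeout: int = 60) -> int:
    """Run a command, streaming output line by line, with a hard timeout."""
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,    # merge stderr into the same stream
    )

    async def stream() -> None:
        assert proc.stdout is not None
        async for line in proc.stdout:       # emit output as it arrives
            print(line.decode(errors="replace"), end="")

    try:
        await asyncio.wait_for(stream(), timeout=min(timeout, 300))  # 5-minute ceiling
    except TimeoutError:
        proc.kill()
    return await proc.wait()

# asyncio.run(run_shell("git status"))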

3. Web Tools

python
# Web search
SearchWeb(query="Python 3.13 new features")

# Fetch URL content
FetchURL(url="https://github.com/MoonshotAI/kimi-cli")

4. Task Management

python
# Set todo list
SetTodoList(todos=[
    {"content": "Analyze code structure", "status": "completed"},
    {"content": "Write unit tests", "status": "in_progress"}
])

5. Sub-agent Tool

python
# Delegate task to sub-agent
Task(
    description="Analyze codebase structure",  # Brief description
    subagent_name="coder",                      # Sub-agent name
    prompt="Analyze src/ directory structure in detail, summarize responsibilities of each module"
)

Tool Call Flow Example (Shell)

mermaid
sequenceDiagram
    participant Soul as KimiSoul
    participant ToolSet as KimiToolset
    participant Shell as Shell Tool
    participant Approval as Approval System
    participant Process as Subprocess
    participant Wire as Event Bus

    Soul->>ToolSet: toolset.handle(tool_call)
    ToolSet->>Shell: current_tool_call.set()

    Shell->>Shell: ToolResultBuilder()

    Shell->>Approval: request("Shell", "run shell command", description)

    alt YOLO Mode
        Approval-->>Shell: True (Auto-approve)
    else Normal Mode
        Approval->>Wire: ApprovalRequest
        Wire-->>Approval: ApprovalResponse
        Approval-->>Shell: True/False
    end

    alt Rejected
        Shell-->>ToolSet: ToolRejectedError()
    else Approved
        Shell->>Process: asyncio.create_subprocess_shell()

        par Stream Reading
            Process-->>Shell: stdout (line by line)
            Shell->>Shell: builder.write(line)
            Shell->>Wire: Send output

            Process-->>Shell: stderr (line by line)
            Shell->>Shell: builder.write(line)
            Shell->>Wire: Send output
        end

        Process-->>Shell: exitcode

        alt exitcode == 0
            Shell-->>ToolSet: builder.ok("Success")
        else exitcode != 0
            Shell-->>ToolSet: builder.error(f"Failed: {exitcode}")
        end
    end

    ToolSet->>ToolSet: current_tool_call.reset()
    ToolSet-->>Soul: HandleResult
    Soul->>Wire: Send ToolResult

MCP (Model Context Protocol) Integration

MCP is an open protocol from Anthropic that standardizes connections between AI models and tools.

json
// Configure MCP servers (e.g. mcp.json)
{
  "mcpServers": {
    "context7": {
      "url": "https://mcp.context7.com/mcp",
      "headers": {
        "CONTEXT7_API_KEY": "YOUR_API_KEY"
      }
    },
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    }
  }
}

bash
# Load at startup
kimi --mcp-config-file /path/to/mcp.json

MCP Integration Flow:

mermaid
graph TB
    A[Configure MCP Server] --> B[Load MCP Config];
    B --> C[Connect MCP Client];
    C --> D[Get Available Tools];
    D --> E[Create MCPTool Wrapper];
    E --> F[Add to toolset];

    subgraph "Enhanced Capabilities"
        F --> G[Chrome DevTools];
        F --> H[Context7 Documentation];
        F --> I[GitHub API];
        F --> J[Database Connections];
    end

    A -- "Unified Tool Interface" --> K[All Tools Homogenized];
    style A fill:#f96
    style K fill:#bbf

MCP integration makes Kimi CLI infinitely extensible. Any tool conforming to the MCP protocol can be seamlessly integrated (a client-side sketch follows this list), including:

  • Database query tools
  • API calling tools
  • Browser automation tools
  • Documentation search tools
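
Conceptually, the integration amounts to: connect to the server, list its tools, and wrap each one into the toolset. A hedged client-side sketch using the official mcp Python SDK; the wrapping into Kimi CLI's MCPTool is omitted:

python
# Client-side sketch only; error handling and the MCPTool wrapper are omitted.
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def list_mcp_tools(command: str, args: list[str]) -> list[str]:
    """Connect to a stdio MCP server and return the names of the tools it exposes."""
    params = StdioServerParameters(command=command, args=args)
    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]

# e.g. await list_mcp_tools("npx", ["-y", "chrome-devtools-mcp@latest"])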

ACP Protocol - The Bridge for IDE Integration

Agent Client Protocol (ACP) is one of Kimi CLI's most important innovations. Like how LSP (Language Server Protocol) standardizes communication between editors and language servers, ACP standardizes communication between editors and AI agents.

ACP Positioning: An "LSP" Between Editor and Agent

mermaid
graph TB
    subgraph "LSP Analogy"
        A[Editor] -->|LSP| B[Language Server];
        B --> A;
    end

    subgraph "ACP Definition"
        C[Editor/IDE] -->|ACP| D[AI Agent];
        D --> C;
    end

    style A fill:#f9f
    style C fill:#bbf

ACP Core Features:

  • JSON-RPC 2.0: Messages are exchanged as JSON-RPC 2.0 requests, responses, and notifications
  • StdIO Transport: Communication via standard input/output
  • Streaming Events: Supports real-time streaming responses
  • Tool Integration: Standardized tool call display
  • Approval Control: User confirmation mechanism
  • Session Management: Stateful conversations

ACP Protocol Stack

mermaid
graph TB
    subgraph "Application Layer"
        A[Zed Editor] --> B[User Interaction];
    end

    subgraph "Protocol Layer (ACP v1)"
        C[Initialize] --> D[Session Management];
        D --> E[Prompt Execution];
        E --> F[Streaming Updates];
        F --> G[Tool Calls];
        G --> H[Approval Requests];
    end

    subgraph "Transport Layer (JSON-RPC)"
        I[Requests] --> J[Responses];
        I --> K[Notifications];
    end

    subgraph "Physical Layer"
        L[Stdin] --> M[Stdout];
    end

    subgraph "Kimi CLI"
        N[ACPServer] --> O[ACPAgent];
        O --> P[Event Translation];
        P --> Q[KimiSoul];
    end

    style A fill:#f96
    style N fill:#bbf
    style Q fill:#bfb

Zed Integration Example

Configuration:

json
// ~/.config/zed/settings.json
{
  "agent_servers": {
    "Kimi CLI": {
      "command": "kimi",
      "args": ["--acp"],
      "env": {}
    }
  }
}

Workflow:

mermaid
sequenceDiagram
    participant User as User
    participant Zed as Zed Editor
    participant ACP as ACP Client
    participant Kimi as Kimi CLI
    participant Soul as KimiSoul
    participant LLM as Moonshot AI

    User->>Zed: Open Agent Panel
    Zed->>ACP: Start kimi --acp process
    ACP->>Kimi: initialize() request
    Kimi-->>ACP: InitializeResponse
    ACP->>Kimi: session/new request
    Kimi-->>ACP: NewSessionResponse(sessionId)

    User->>Zed: Input question "Explain this code logic"
    Zed->>ACP: session/prompt request
    ACP->>Kimi: Forward message
    Kimi->>Soul: run_soul(prompt)

    Soul->>LLM: Send request
    LLM-->>Soul: Streaming response

    loop Real-time streaming output
        Soul-->>Kimi: TextPart/ThinkPart
        Kimi-->>ACP: AgentMessageChunk/AgentThoughtChunk
        ACP->>Zed: Display text
    end

    opt Tool call needed
        Soul-->>Kimi: ToolCall
        Kimi-->>ACP: ToolCallStart
        ACP->>Zed: Display tool call
        Soul->>Kimi: Execute tool
        Soul-->>Kimi: ToolResult
        Kimi-->>ACP: ToolCallUpdate
        ACP->>Zed: Display result
    end

    Soul-->>Kimi: Complete
    Kimi-->>ACP: PromptResponse
    ACP->>Zed: Display final answer

ACP Event Translation Deep Dive

The most complex part of ACP is translating Kimi CLI's internal events to ACP standard events.

Internal Wire Events → ACP Protocol Events:

Internal Event     ACP Event                  Description
TextPart           AgentMessageChunk          AI output text
ThinkPart          AgentThoughtChunk          AI thinking process
ToolCall           ToolCallStart              Tool call started
ToolCallPart       ToolCallProgress           Parameter streaming update
ToolResult         ToolCallUpdate             Tool call completed
ApprovalRequest    RequestPermissionRequest   User approval required
python
# Key translation logic example
async def _send_tool_call(self, tool_call: ToolCall):
    # Create tool call state
    state = _ToolCallState(tool_call)
    self.run_state.tool_calls[tool_call.id] = state

    # Send to ACP client
    await self.connection.sessionUpdate(
        acp.SessionNotification(
            sessionId=self.session_id,
            update=acp.schema.ToolCallStart(
                toolCallId=state.acp_tool_call_id,  # UUID
                title=state.get_title(),  # "Shell: ls -la"
                status="in_progress",
                content=[...]
            )
        )
    )

_ToolCallState: Intelligent State Management

python
class _ToolCallState:
    def __init__(self, tool_call: ToolCall):
        # Generate unique ACP tool call ID
        self.acp_tool_call_id = str(uuid.uuid4())

        # Parse tool call arguments
        self.tool_call = tool_call
        self.args = tool_call.function.arguments or ""
        self.lexer = streamingjson.Lexer()

    def get_title(self) -> str:
        """Dynamically generate title"""
        tool_name = self.tool_call.function.name
        subtitle = extract_key_argument(self.lexer, tool_name)
        # Example: "Shell: git status" or "ReadFile: src/main.py"
        return f"{tool_name}: {subtitle}"

ACP Approval Flow

mermaid
sequenceDiagram
    participant Soul as KimiSoul
    participant Tool as Shell Tool
    participant Approval as Approval System
    participant Wire as Wire
    participant ACP as ACPAgent
    participant ACPClient as ACP Client
    participant Editor as IDE

    Soul->>Tool: __call__()
    Tool->>Approval: request("Shell", "run shell command", "ls -la")

    Approval->>Wire: ApprovalRequest
    Wire-->>ACP: wire.receive()

    ACP->>ACPClient: requestPermission(toolCallId, options)
    Note over ACPClient: options: approve ("Approve once", allow_once),<br/>approve_for_session ("Approve for session", allow_always),<br/>reject ("Reject", reject)

    ACPClient->>Editor: Show approval dialog
    Editor-->>ACPClient: User choice
    ACPClient-->>ACP: RequestPermissionResponse

    alt User approves
        ACP->>Wire: ApprovalResponse.APPROVE
        Wire-->>Approval: True
        Approval-->>Tool: True
        Tool->>Tool: Execute command
    else User rejects
        ACP->>Wire: ApprovalResponse.REJECT
        Wire-->>Approval: False
        Approval-->>Tool: False
        Tool-->>Tool: ToolRejectedError()
    end

This approval mechanism provides fine-grained control, ensuring AI doesn't execute dangerous operations without user authorization.
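
The approval system at the center of this flow is essentially a request/response bridge over the wire. A hedged sketch of that idea, leaving out YOLO mode and per-session approvals; the message shape and helper names are assumptions:

python
# Conceptual bridge between tools (which await a decision) and the UI (which supplies it).
import asyncio


class Approval:
    def __init__(self, wire_send):
        self._wire_send = wire_send        # pushes events to whichever UI is attached

    async def request(self, tool: str, action: str, description: str) -> bool:
        reply: asyncio.Future[bool] = asyncio.get_running_loop().create_future()
        # The UI side resolves `reply` once the user chooses: the Shell UI from a
        # terminal prompt, the ACP server from the editor's RequestPermissionResponse.
        self._wire_send({"type": "approval_request", "tool": tool,
                         "action": action, "description": description,
                         "reply": reply})
        return await reply                 # True = approved, False = rejected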


Core Design Principles

After thoroughly analyzing Kimi CLI's source code, here are the core design principles:

1. Layering and Decoupling

mermaid
graph TB
    A[CLI Entry] --> B[KimiCLI App Layer];
    B --> C[Agent System];
    B --> D[KimiSoul Engine];
    D --> E[Tool System];
    B --> F[UI Layer];

    subgraph "Fully Decoupled"
        C
        D
        E
        F
    end

    style A fill:#f96
    style F fill:#bbf
    style E fill:#bfb

Layering Benefits:

  • Testability: Each layer can be tested independently
  • Extensibility: Adding/removing UI modes doesn't affect core logic
  • Maintainability: Clear responsibility boundaries

2. Dependency Injection and Auto-wiring

python
# Tools declare dependencies via type annotations
class ReadFile(CallableTool2[Params]):
    def __init__(self, builtin_args: BuiltinSystemPromptArgs):
        self._work_dir = builtin_args.KIMI_WORK_DIR

# Agent system auto-discovers and injects dependencies
def _load_tool(tool_path: str, dependencies: dict):
    module_name, class_name = tool_path.rsplit(":", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    args = []
    for param in inspect.signature(cls).parameters.values():
        if param.annotation in dependencies:
            args.append(dependencies[param.annotation])
    return cls(*args)

Benefits:

  • Reduces boilerplate code
  • Improves testability (easy to mock)
  • Flexible tool composition

3. Time Travel (Checkpoint)

python
# Create checkpoint before each step
await self._checkpoint()  # checkpoint_id: 0
# ... execute ...
await self._checkpoint()  # checkpoint_id: 1
# ... find issue ...
# D-Mail back in time
await self._context.revert_to(1)

Innovation:

  • Provides safety net
  • Implements "undo"
  • Supports sub-agent task management

4. Wire Communication Abstraction

python
def wire_send(msg: WireMessage) -> None:
    """Decouple Soul from UI"""
    wire = get_wire_or_none()
    if wire is not None:          # no-op when no UI wire is attached
        wire.soul_side.send(msg)

# Shell UI handles directly
msg = await wire.ui_side.receive()

# ACP UI translates before sending to editor
await connection.sessionUpdate(convert_to_acp(msg))

Benefits (a toy version of the pattern follows this list):

  • Soul doesn't care about UI type
  • Supports multiple UI implementations
  • Event-driven architecture
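
A self-contained toy of the same pattern built from two asyncio queues; the real Wire carries typed WireMessage events and supports multiple UI frontends:

python
# Toy wire: one queue per direction, with a "side" object for each endpoint.
import asyncio
from dataclasses import dataclass


@dataclass
class _Side:
    send_q: asyncio.Queue
    recv_q: asyncio.Queue

    def send(self, msg) -> None:
        self.send_q.put_nowait(msg)

    async def receive(self):
        return await self.recv_q.get()


class Wire:
    def __init__(self) -> None:
        to_ui: asyncio.Queue = asyncio.Queue()
        to_soul: asyncio.Queue = asyncio.Queue()
        self.soul_side = _Side(send_q=to_ui, recv_q=to_soul)   # Soul sends events to the UI
        self.ui_side = _Side(send_q=to_soul, recv_q=to_ui)     # UI sends approvals back

# wire = Wire(); wire.soul_side.send(msg); msg = await wire.ui_side.receive()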

5. ACP: The LSP for AI Era

ACP standardizes editor-AI communication, just like LSP standardized editor-language server communication.

Core Value:

  • Ecosystem Interoperability: Any ACP editor can use Kimi CLI
  • Streaming Experience: Real-time AI thinking display
  • Security Control: User approval mechanism
  • Tool Visualization: Structured tool call display

6. LLM Provider Abstraction

Support for multiple LLM providers:

python
def create_llm(provider, model):
    match provider.type:
        case "kimi":
            return Kimi(model, base_url, api_key)
        case "openai_responses":
            return OpenAIResponses(model, base_url, api_key)
        case "anthropic":
            return Anthropic(model, base_url, api_key)
        case "google_genai":
            return GoogleGenAI(model, base_url, api_key)

Benefits:

  • Avoid vendor lock-in
  • Flexible model switching
  • Support for self-hosted models

Use Case Analysis

Best suited for:

  1. Terminal Development Workflow

    bash
    kimi
    > Help me analyze this error log and find the root cause
    > Run tests and fix failing cases
    > Optimize this code's performance
  2. IDE Intelligent Assistant

    json
    // After Zed configuration
    {
      "agent_servers": {
        "Kimi CLI": {
          "command": "kimi",
          "args": ["--acp"]
        }
      }
    }
  3. Batch Automation

    bash
    kimi --print -c "Review all Python files and fix PEP8 violations"
  4. Multi-tool Collaboration: the AI combines file operations, shell, search, approval, and undo to plan and execute complex tasks automatically

Less suitable for:

  1. Simple Q&A: a chat web interface is more convenient for one-off questions
  2. Simple one-off commands: traditional grep/ls are faster than going through an agent
  3. Ultra-high-performance scenarios: Python's async runtime adds overhead

Security Design

  1. Path Restrictions

    • File operations must use absolute paths
    • Prevents path traversal attacks
  2. Approval Mechanism

    • Shell commands require approval
    • File modifications require approval
    • Supports yolo mode (for scripting scenarios)
  3. Timeout Control

    • Shell commands max 5-minute timeout
    • Prevents long hangs
  4. Context Limits

    • Auto-compression when context approaches limit
    • Prevents token waste

Conclusion

Kimi CLI is not just an excellent tool from Moonshot AI; it is also an example of an elegantly architected, thoughtfully designed AI-native application.

From studying Kimi CLI, we can see:

  1. AI applications should be layered: Configuration, execution, tool, and UI layers should be clearly separated
  2. Dependency injection is key to flexibility: Auto-wired tools are easy to extend
  3. Checkpoint is time travel magic: Provides safety nets and supports complex tasks
  4. Standardized protocols are ecosystem foundations: ACP makes editor-AI communication possible


Kimi CLI points toward the next generation of development tools: not just tools, but intelligent partners that can understand, plan, and execute.


Authors: Claude Code + Kimi K2 Thinking