Overview

This page describes the top-level layered architecture, runtime modes, and technology stack of the system.

Use case: You need to quickly build a complete mental model from the AI Agent down to the SiYuan data boundary.


Four-Layer Architecture

The system consists of four layers from outside to inside:

┌─────────────────────────────────────────────────────────────┐
│  Layer 1: AI Agent / MCP Client                             │
│  - Claude Desktop / Kimi CLI / Cursor / other MCP clients   │
│  - Communicates via stdio or HTTP(S) using MCP protocol     │
├─────────────────────────────────────────────────────────────┤
│  Layer 2: MCP Server / CLI                                  │
│  ┌──────────────┐  ┌─────────────────────────────────────┐  │
│  │ Plugin MCP   │  │ Standalone CLI (siyuan-sisyphus)    │  │
│  │ Server       │  │ - Direct TOOL_REGISTRY calls        │  │
│  │ - stdio mode │  │ - Bypasses MCP protocol entirely    │  │
│  │ - HTTP mode  │  │                                     │  │
│  └──────────────┘  └─────────────────────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│  Layer 3: Plugin Runtime & Tool Layer                       │
│  - Tool Registry (11 aggregated tools)                      │
│  - Tool Lifecycle (analytics / telemetry / mascot)          │
│  - Permission Manager (notebook-level 4-tier permissions)   │
│  - Settings Panel & Mascot UI                               │
├─────────────────────────────────────────────────────────────┤
│  Layer 4: SiYuan HTTP API & Data Model                      │
│  - SiYuanClient (unified HTTP wrapper)                      │
│  - Notebook / Document / Block / Attribute View / File      │
│    / Search / Tag / System / Flashcard APIs                 │
│  - SQLite data storage                                      │
└─────────────────────────────────────────────────────────────┘

Key boundary: Layer 1 and Layer 2 communicate over the Model Context Protocol (MCP) standard. Layer 2 reaches Layer 3 through MCP request handlers (ListToolsRequestSchema / CallToolRequestSchema) in plugin mode, and through direct function calls in CLI mode. Layer 3 always reaches Layer 4 through SiYuan's HTTP APIs and never accesses the local filesystem directly, so the same code path stays safe when SiYuan runs on a remote machine.
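
The Layer 2 to Layer 3 boundary can be pictured as two thin entry points over one registry. The following is a minimal sketch under stated assumptions: the Tool interface, the toy notebook tool, and both handler functions are hypothetical stand-ins, not the project's actual TOOL_REGISTRY types. Only the tools/call method name and its name/arguments parameters come from the MCP standard.

```typescript
// Hypothetical shapes for illustration; the real TOOL_REGISTRY types may differ.
type ToolArgs = Record<string, unknown>;

interface Tool {
  description: string;
  // Synchronous for brevity; real tool calls go over HTTP and would be async.
  callTool(args: ToolArgs): string;
}

// Layer 3: one shared registry (a single toy tool here).
const registry: Record<string, Tool> = {
  notebook: {
    description: "Notebook operations",
    callTool: (args) => `notebook:${String(args["action"])}`,
  },
};

// Layer 2, plugin mode: an MCP tools/call request is unwrapped, then dispatched.
interface CallToolRequest {
  method: "tools/call";
  params: { name: string; arguments: ToolArgs };
}

function handleMcpCall(req: CallToolRequest): string {
  const tool = registry[req.params.name];
  if (!tool) throw new Error(`Unknown tool: ${req.params.name}`);
  return tool.callTool(req.params.arguments);
}

// Layer 2, CLI mode: same registry, no protocol envelope.
function handleCliCall(toolName: string, action: string): string {
  const tool = registry[toolName];
  if (!tool) throw new Error(`Unknown tool: ${toolName}`);
  return tool.callTool({ action });
}
```

Both paths end in the same callTool invocation, which is why the two modes can share behavior below Layer 2.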


Dual-Product Architecture

This repository produces two independently usable products:

| Dimension | SiYuan Plugin | Standalone CLI (siyuan-sisyphus) |
| --- | --- | --- |
| Entry file | src/index.ts | src/cli/index.ts |
| Build output | dist/index.js + dist/mcp-server.cjs | cli/dist/cli.cjs |
| Runtime mode | Long-running process, starts with SiYuan | Short-lived process, exits after one call |
| Transport | stdio (default) / HTTP (optional) | Direct function calls, no MCP transport |
| Config source | Plugin settings panel → SiYuan storage | CLI profile + plugin UI config from SiYuan storage |
| Tool toggles | User enables or disables each action via the UI | Same UI-controlled tool/action toggles |
| Mascot UI | Yes (Svelte component mounted to DOM) | No |
| Permission mgmt | Reads from SiYuan storage | Same PermissionManager; unconfigured notebooks default to r (read-only) |
| Use case | AI client integration, daily continuous use | Scripting, CI/CD, quick queries |

Both products share the same core: TOOL_REGISTRY, SiYuanClient, PermissionManager, tool-lifecycle, and all src/api/* wrappers. The differences are only in the outer packaging (MCP Server vs CLI argument parsing) and config persistence.
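
To make the shared permission rule concrete, here is a rough sketch of a notebook-level check where unconfigured notebooks fall back to r. Only the r (read-only) tier and the read-only default are taken from this page. The other three tier names, the ranking, and the PermissionSketch class are placeholders invented for illustration and may not match the real PermissionManager.

```typescript
// Only "r" (read-only) is named on this page; "none", "rw" and "rwd"
// are placeholder tier names invented for this sketch.
type Tier = "none" | "r" | "rw" | "rwd";

const RANK: Record<Tier, number> = { none: 0, r: 1, rw: 2, rwd: 3 };

class PermissionSketch {
  private readonly configured: Map<string, Tier>;

  constructor(configured: Record<string, Tier>) {
    this.configured = new Map(Object.entries(configured));
  }

  // Unconfigured notebooks default to read-only.
  tierFor(notebookId: string): Tier {
    return this.configured.get(notebookId) ?? "r";
  }

  // A call is allowed when the notebook's tier is at least the required one.
  allows(notebookId: string, required: Tier): boolean {
    return RANK[this.tierFor(notebookId)] >= RANK[required];
  }
}
```

With a default like this, a notebook that was never configured can still be read from either product, while writes need an explicit grant.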


Technology Stack

| Layer | Technology | Notes |
| --- | --- | --- |
| Build | Vite | Multi-entry compilation (renderer / server / cli), outputs CommonJS |
| Frontend | Svelte | Settings panel McpConfig, mascot ToolPuppy |
| Language | TypeScript | Source uses ESM ("type": "module"), output is CJS |
| MCP Protocol | @modelcontextprotocol/sdk ^1.26.0 | stdio / StreamableHTTP dual transport |
| Validation | Zod ^4.3.6 | Input parameter schemas for all tool actions (~913 lines) |
| Testing | Vitest | Unit + Integration + Smoke three-tier testing |
| Docs | VitePress | Bilingual site (English default + Simplified Chinese /zh/) |
| CLI Parsing | minimist | Two-pass parsing: global flags + schema-aware action flags |
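
The two-pass idea in the CLI Parsing row can be illustrated without any library. The sketch below is a hand-rolled stand-in, not minimist and not the project's parser. It assumes --json is a global boolean flag, and the notebook.list schema with a --limit flag is hypothetical. The point it shows is ordering: the action's flag schema is only known after the first pass has identified the tool and action.

```typescript
type FlagType = "string" | "number" | "boolean";
type Schema = Map<string, FlagType>;
type Parsed = Record<string, string | number | boolean>;

// --json appears on this page; treating it as a global boolean is an assumption.
const GLOBAL_FLAGS: Schema = new Map<string, FlagType>([["json", "boolean"]]);

// Hypothetical per-action schema, standing in for the real Zod-derived ones.
const ACTION_SCHEMAS = new Map<string, Schema>([
  ["notebook.list", new Map<string, FlagType>([["limit", "number"]])],
]);

// Consumes the flags that `schema` knows and passes everything else through.
function parseFlags(argv: string[], schema: Schema): { flags: Parsed; rest: string[] } {
  const flags: Parsed = {};
  const rest: string[] = [];
  const queue = [...argv];
  while (true) {
    const arg = queue.shift();
    if (arg === undefined) break;
    const name = arg.startsWith("--") ? arg.slice(2) : "";
    const type = name ? schema.get(name) : undefined;
    if (type === undefined) {
      rest.push(arg); // not known to this pass: leave it for the next one
      continue;
    }
    if (type === "boolean") {
      flags[name] = true;
      continue;
    }
    const raw = queue.shift();
    if (raw === undefined) throw new Error(`--${name} needs a value`);
    flags[name] = type === "number" ? Number(raw) : raw;
  }
  return { flags, rest };
}

function parseCli(argv: string[]) {
  // Pass 1: global flags only; tool, action and action flags fall into `rest`.
  const pass1 = parseFlags(argv, GLOBAL_FLAGS);
  const [tool = "", action = "", ...actionArgv] = pass1.rest;
  // Pass 2: flags typed by the schema of the action found in pass 1.
  const schema = ACTION_SCHEMAS.get(`${tool}.${action}`) ?? new Map<string, FlagType>();
  const pass2 = parseFlags(actionArgv, schema);
  return { tool, action, global: pass1.flags, args: pass2.flags };
}
```

For example, parseCli(["--json", "notebook", "list", "--limit", "5"]) yields the tool notebook, the action list, a global json flag, and a numeric limit of 5. Flags unknown to both passes are silently dropped in this sketch; a real parser would more likely report them.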

Runtime Mode Comparison

Plugin stdio Mode (Default)

AI Client (e.g. Claude Desktop)
    ↓ spawns child process
SiYuan.app → loads plugin → starts mcp-server.cjs (stdio)
    ↓ MCP stdio protocol
Calls TOOL_REGISTRY → SiYuanClient → SiYuan HTTP API
  • AI client spawns dist/mcp-server.cjs as a child process
  • Standard input/output serve as the MCP transport channel
  • Suitable for local desktop AI assistants (Claude Desktop, Kimi CLI, etc.)
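
On the wire, the MCP stdio transport carries newline-delimited JSON-RPC 2.0 messages. The sketch below shows that framing in isolation; it is not the SDK's transport code, and LineDecoder and encodeFrame are hypothetical names. The notebook tool name comes from this page, while the action argument in the usage note is an assumed shape.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// One message per line. JSON.stringify without an indent argument never
// emits a raw newline, so "\n" is a safe delimiter.
function encodeFrame(msg: JsonRpcRequest): string {
  return JSON.stringify(msg) + "\n";
}

// Reassembles messages from arbitrary chunks, as a stdin stream would deliver them.
class LineDecoder {
  private buffer = "";

  push(chunk: string): JsonRpcRequest[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece is an incomplete line (or empty); keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as JsonRpcRequest);
  }
}
```

A client calling a tool would write something like encodeFrame({ jsonrpc: "2.0", id: 1, method: "tools/call", params: { name: "notebook", arguments: { action: "list" } } }) to the child process's stdin and read the response line from its stdout.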

Plugin HTTP Mode

AI Client / Browser / Third-party service
    ↓ HTTP POST (Bearer Token)
SiYuan.app → loads plugin → starts embedded HTTP Server
    ↓ StreamableHTTP (MCP 2025-03-26 spec)
Calls TOOL_REGISTRY → SiYuanClient → SiYuan HTTP API
  • The plugin starts the HTTP server automatically in onload() when the user has enabled it
  • Supports session management, with requests routed by mcp-session-id
  • Supports TLS (custom certificates), token auth, and a parent watchdog (the server shuts itself down when the parent process exits)
  • Suitable for remote access, browser extensions, multi-client sharing
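
The token check and session routing listed above could look roughly like the following. This is a simplified sketch, not the plugin's server: SessionRouter and HeaderMap are hypothetical, session ids come from a counter rather than a secure random source, and any authenticated request without a session id is treated as an initialization. Returning 404 for an unknown session id follows the MCP Streamable HTTP specification.

```typescript
// Lowercased header names, as Node's HTTP server exposes them.
type HeaderMap = Record<string, string | undefined>;

interface RouteResult {
  status: number;
  sessionId?: string;
}

class SessionRouter {
  private readonly sessions = new Set<string>();
  private readonly token: string;
  private nextId = 1;

  constructor(token: string) {
    this.token = token;
  }

  route(headers: HeaderMap): RouteResult {
    // Reject anything without the expected Bearer token.
    if (headers["authorization"] !== `Bearer ${this.token}`) {
      return { status: 401 };
    }
    const sid = headers["mcp-session-id"];
    if (sid === undefined) {
      // Assumed: no session id means an initialize request, so mint one.
      // A real server should use an unguessable id (for example a UUID).
      const id = `s${this.nextId++}`;
      this.sessions.add(id);
      return { status: 200, sessionId: id };
    }
    // Known ids route to their session; unknown or expired ids get 404.
    return this.sessions.has(sid) ? { status: 200, sessionId: sid } : { status: 404 };
  }
}
```

Keeping the routing decision in a pure function like this makes it easy to test without opening a socket.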

CLI Direct Operation Mode

Terminal user
    ↓ shell command
siyuan-sisyphus notebook list
    ↓ direct import
src/cli/dispatch.ts → TOOL_REGISTRY[notebook].callTool()
    ↓ runToolCall (reuses lifecycle)
SiYuanClient → SiYuan HTTP API
    ↓ render
Terminal output (human-readable / --json)
  • Does not start any MCP server process
  • Directly imports TOOL_REGISTRY and runToolCall from plugin source
  • One call, one request, immediate exit
  • Supports interactive paging (Enter/n/p/q in TTY)
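
The interactive paging mentioned in the last bullet can be modeled as a small state transition. The sketch below is illustrative only: the key meanings (Enter or n for next page, p for previous, q for quit) are assumed from the key names, and Pager, step, and pageOf are hypothetical helpers. An empty string stands for a bare Enter press.

```typescript
interface Pager {
  page: number; // zero-based index of the page being shown
  done: boolean; // true once the user quits or runs past the last page
}

// Assumed key meanings: "" (Enter) or "n" = next, "p" = previous, "q" = quit.
function step(p: Pager, key: string, pageCount: number): Pager {
  switch (key) {
    case "q":
      return { page: p.page, done: true };
    case "p":
      return { page: Math.max(0, p.page - 1), done: false };
    case "":
    case "n": {
      const next = p.page + 1;
      // Advancing past the last page ends the session.
      return next >= pageCount ? { page: p.page, done: true } : { page: next, done: false };
    }
    default:
      return p; // ignore unknown keys
  }
}

// Slices out the rows belonging to one page.
function pageOf<T>(rows: T[], page: number, size: number): T[] {
  return rows.slice(page * size, (page + 1) * size);
}
```

When output is not a TTY, or when --json is set, a CLI would normally skip this loop and print everything at once; that behavior is an assumption here, not something this page states.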

Key Facts at a Glance

  1. Aggregated tool surface: 11 MCP tools (fs / notebook / document / block / av / file / search / tag / system / flashcard / mascot), rather than 100+ single-purpose tools.
  2. Config hot-reload: getToolConfig() in server.ts uses a 30-second TTL cache with in-flight deduplication, so changes from the settings panel take effect without a restart.
  3. Dangerous action marking: DANGEROUS_ACTIONS set marks 15 high-risk actions (delete / remove / find_replace, etc.). Warnings are auto-injected into tool descriptions and server instructions, but calls are not blocked (relies on LLM self-discipline).
  4. Analytics & Telemetry: Every tool call records an analytics event (JSONL format, 2MB auto-rotation). Telemetry aggregates and reports periodically per config. In CLI mode, analytics is synchronously flushed before exit.
  5. Puppy mascot: The mascot communicates with the MCP server by polling puppyEvents.json files, which keeps the two decoupled. Its state machines cover idle animations, drag, wage card, test mode, and more.
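
The caching pattern named in fact 2 (a TTL cache with in-flight deduplication) can be sketched by caching the promise itself. This is one possible way to get both properties and is not taken from server.ts: TtlCache, its injectable clock, and the choice to time the TTL from the start of the load are all assumptions made for the example.

```typescript
class TtlCache<T> {
  private entry: { promise: Promise<T>; startedAt: number } | undefined;
  private readonly load: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be tested without waiting.
  constructor(load: () => Promise<T>, ttlMs = 30000, now: () => number = () => Date.now()) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): Promise<T> {
    const cached = this.entry;
    // Fresh entry: callers share the same promise, whether it is still
    // pending (in-flight deduplication) or already settled (TTL cache).
    if (cached && this.now() - cached.startedAt < this.ttlMs) {
      return cached.promise;
    }
    const entry = { promise: this.load(), startedAt: this.now() };
    this.entry = entry;
    // Do not keep failures around: drop the entry so the next call retries.
    entry.promise.catch(() => {
      if (this.entry === entry) this.entry = undefined;
    });
    return entry.promise;
  }
}
```

With a 30-second TTL, a burst of tool calls triggers a single config read, and a settings change is picked up by the first call after the entry expires.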

Released under the MIT License.