Skip to content

MCP Integration Guide

Since: v0.9.0 (core), v0.10.0+ (F046/F047)
Protocol: Model Context Protocol (MCP)

This guide covers connecting MCP-compliant agents to CSTP decision intelligence. The MCP layer exposes the same capabilities as the JSON-RPC API through a standardized tool interface, with two high-level "primary" tools for streamlined workflows.


Architecture

The MCP server is a thin bridge over the existing CSTP dispatcher. Each MCP tool maps 1:1 to a CSTP service method — there is zero business logic duplication.

mermaid
graph TB
    subgraph Clients["MCP Clients"]
        CC["Claude Code"]
        CD["Claude Desktop"]
        OC["OpenClaw"]
        GEN["Generic MCP Client"]
    end

    subgraph Transport["Transport Layer"]
        HTTP["Streamable HTTP<br/>POST/GET :9991/mcp"]
        STDIO["stdio<br/>python -m a2a.mcp_server"]
    end

    subgraph MCP["MCP Server"]
        APP["mcp_app<br/>Server('cstp-decisions')"]
        SCHEMAS["Pydantic Schemas<br/>(mcp_schemas.py)"]
        TOOLS["11 Tool Handlers"]
    end

    subgraph Services["CSTP Services"]
        QS["query_service"]
        GS["guardrails_service"]
        DS["decision_service"]
        CS["calibration_service"]
    end

    subgraph Storage["Storage"]
        CHROMA["ChromaDB"]
        YAML["YAML Files"]
    end

    Clients -->|"Streamable HTTP"| HTTP
    Clients -->|"stdio"| STDIO
    HTTP --> APP
    STDIO --> APP
    APP --> SCHEMAS
    APP --> TOOLS
    TOOLS -->|"direct function calls"| Services
    Services --> Storage

Key design principles:

  • Bridge pattern — MCP tools validate inputs via Pydantic, then delegate to CSTP services
  • Shared server instance — Both stdio and Streamable HTTP use the same mcp_app and tool handlers
  • Zero duplication — No business logic in the MCP layer; all logic lives in CSTP services

Connection Methods

Streamable HTTP (Remote)

The CSTP FastAPI server mounts the MCP handler at /mcp on port 9991. The endpoint handles both POST (tool calls) and GET (event streams) on the same path.

http://your-server:9991/mcp

Requirements: The CSTP server must be running (Docker or local).

stdio (Local / Docker)

The MCP server can run as a standalone process communicating over stdin/stdout:

bash
# Direct (requires Python env with dependencies)
python -m a2a.mcp_server

# Via Docker (recommended)
docker exec -i cstp python -m a2a.mcp_server

Requirements: The container/environment must have CHROMA_URL, GEMINI_API_KEY, and DECISIONS_PATH configured.


Client Setup

Claude Code CLI

Add to your project's .mcp.json (or global ~/.claude.json):

json
{
  "mcpServers": {
    "decisions": {
      "command": "npx",
      "args": [
        "mcp-remote@latest",
        "http://your-server:9991/mcp",
        "--allow-http",
        "--header",
        "Authorization: Bearer YOUR_CSTP_TOKEN"
      ]
    }
  }
}

On Windows, use cmd as the command:

json
{
  "mcpServers": {
    "decisions": {
      "command": "cmd",
      "args": [
        "/c", "npx", "mcp-remote@latest",
        "http://your-server:9991/mcp",
        "--allow-http",
        "--header",
        "Authorization: Bearer YOUR_CSTP_TOKEN"
      ]
    }
  }
}

After adding, CSTP tools appear in Claude Code's tool list automatically.

Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows):

json
{
  "mcpServers": {
    "decisions": {
      "command": "npx",
      "args": [
        "mcp-remote@latest",
        "http://your-server:9991/mcp",
        "--allow-http",
        "--header",
        "Authorization: Bearer YOUR_CSTP_TOKEN"
      ]
    }
  }
}

OpenClaw

Add to your OpenClaw gateway config:

yaml
mcp:
  servers:
    cognition-engines:
      url: http://your-server:9991/mcp
      headers:
        Authorization: "Bearer your-token"

Generic MCP Client

Any client implementing the MCP specification can connect via either transport. The server advertises itself as cstp-decisions and exposes 11 tools via the standard tools/list method.


Available Tools

Primary Tools

These are the recommended entry points. Use these first; fall back to granular tools only when you need fine-grained control.

pre_action (PRIMARY)

All-in-one pre-action check: queries similar past decisions, evaluates guardrails, fetches calibration context, extracts confirmed patterns, and optionally records the decision. One call replaces query_decisions + check_action + log_decision.

Call this BEFORE any significant decision.

json
{
  "action": {
    "description": "Refactor auth module to use JWT",
    "category": "architecture",
    "stakes": "high",
    "confidence": 0.82
  },
  "reasons": [
    {"type": "analysis", "text": "Stateless auth scales better for microservices"},
    {"type": "pattern", "text": "Team successfully used JWT in 2 other services"}
  ],
  "tags": ["auth", "jwt", "refactor"],
  "pattern": "Stateless auth scales better than session-based",
  "options": {
    "auto_record": true,
    "query_limit": 5,
    "include_patterns": true
  }
}

Input Schema:

FieldTypeRequiredDescription
actionobjectAction details (see below)
reasonsarraySupporting reasons (aim for 2+ types)
tagsstring[]Tags for categorization
patternstringAbstract pattern this decision represents
optionsobjectauto_record (bool, default true), query_limit (int, default 5), include_patterns (bool, default true)

Action object:

FieldTypeRequiredDescription
descriptionstringWhat you plan to do
categorystringarchitecture, process, integration, tooling, security
stakesstringlow, medium, high, critical
confidencefloatYour confidence (0.0–1.0)

Output: allowed boolean, decision_id (if recorded), relevant past decisions, guardrail results, calibration context (accuracy, Brier, tendency for this category), confirmed patterns, and query time.


get_session_context (PRIMARY)

Full cognitive context for session start or task switch. Returns agent profile, relevant decisions, active guardrails, calibration by category, overdue reviews, and confirmed patterns.

Call at session start to prime decision-making context.

json
{
  "task_description": "Build authentication service for user API",
  "include": ["decisions", "guardrails", "calibration", "ready", "patterns"],
  "format": "markdown"
}

Input Schema:

FieldTypeRequiredDefaultDescription
task_descriptionstringWhat you're working on
includestring[]allSections to include: decisions, guardrails, calibration, ready, patterns, contradictions
decisions_limitint10Max relevant decisions
ready_limitint5Max ready queue items
format"json" | "markdown""markdown"Output format (markdown is ready for system prompt injection)

Output (JSON): Agent profile (total decisions, reviewed, accuracy, Brier, tendency, strongest/weakest category), relevant decisions, active guardrails, calibration by category, ready queue (overdue reviews, stale pending), confirmed patterns.

Output (Markdown): Pre-formatted block ready to inject into a system prompt or CLAUDE.md.


1. Session start    → get_session_context (load context into system prompt)
2. Decision point   → pre_action (gate + record + get relevant history)
3. During work      → record_thought (capture reasoning)
4. After work       → update_decision (finalize decision text and context)
5. Later            → review_outcome (record success/failure for calibration)

Granular Tools

Use these when you need fine-grained control beyond what pre_action and get_session_context provide.

query_decisions

Search similar past decisions using semantic search, keyword matching, or hybrid retrieval. Returns matching decisions with confidence scores, categories, and outcomes.

Use before making new decisions to learn from history.

json
{
  "query": "database migration strategy",
  "limit": 5,
  "retrieval_mode": "hybrid",
  "filters": {
    "category": "architecture",
    "project": "owner/repo"
  }
}

Input Schema:

FieldTypeRequiredDefaultDescription
querystringNatural language query (min 1 char)
limitint5Maximum results (1–50)
retrieval_mode"semantic" | "keyword" | "hybrid""hybrid"Search mode
bridge_side"structure" | "function" | "both""both"Search by bridge side
filtersobjectSee below

Filters:

FieldTypeDescription
categorystringarchitecture, process, integration, tooling, security
stakesstring[]low, medium, high, critical
projectstringProject in owner/repo format
has_outcomeboolFilter to decisions with/without recorded outcomes

Output: List of matching decisions with IDs, titles, confidence scores, categories, stakes, dates, and similarity distances.


check_action

Validate an intended action against safety guardrails and policies. Returns whether the action is allowed, any blocking violations, and warnings.

Always check before high-stakes actions.

json
{
  "description": "Deploy to production without tests",
  "stakes": "high",
  "confidence": 0.6
}

Input Schema:

FieldTypeRequiredDefaultDescription
descriptionstringAction you intend to take (min 1 char)
categorystringAction category
stakes"low" | "medium" | "high" | "critical""medium"Stakes level
confidencefloatYour confidence (0.0–1.0)

Output: allowed boolean, list of violations (blocking), list of warnings, count of evaluated guardrails.


log_decision

⚠️ Last resort. Prefer pre_action with auto_record: true, which queries similar decisions, checks guardrails, and records in one call. Use log_decision only when no prior context exists (legacy clients, spontaneous decisions).

Record a decision to the immutable decision log manually.

json
{
  "decision": "Use PostgreSQL for the new service",
  "confidence": 0.85,
  "category": "architecture",
  "stakes": "high",
  "context": "Evaluated PostgreSQL vs MongoDB for the analytics service",
  "reasons": [
    {"type": "analysis", "text": "Need ACID transactions for financial data"},
    {"type": "pattern", "text": "Team has strong PostgreSQL expertise"}
  ],
  "project": "owner/repo",
  "pr": 42,
  "bridge": {
    "structure": "PostgreSQL",
    "function": "persistence"
  }
}

Deliberation Auto-Capture: When log_decision is called, the server automatically attaches a "deliberation trace" if you previously called query_decisions or check_action within the same session. This captures the "chain of thought" leading to the decision without extra effort.

Input Schema:

FieldTypeRequiredDefaultDescription
decisionstringWhat you decided — state the choice
confidencefloatConfidence (0.0–1.0)
category"architecture" | "process" | "integration" | "tooling" | "security"Decision category
stakes"low" | "medium" | "high" | "critical""medium"Stakes level
contextstringSituation context
reasonsarraySupporting reasons (see below)
tagsstring[]Tags for categorization
projectstringProject in owner/repo format
featurestringFeature or epic name
printPull request number (≥ 1)
bridgeobjectBridge definition (structure/function)

Bridge object:

FieldTypeDescription
structurestringWhat it looks like (pattern)
functionstringWhat problem it solves (purpose)
tolerancestring[]Features that don't matter
enforcementstring[]Features that must be present
preventionstring[]Features that must be absent

Reason object:

FieldTypeDescription
type"authority" | "analogy" | "analysis" | "pattern" | "intuition"Reasoning type
textstringExplanation

Output: Decision ID, file path, indexed status, and timestamp.


review_outcome

Record the outcome of a past decision. Provide the decision ID, whether it succeeded or failed, what actually happened, and lessons learned.

Builds calibration data over time.

json
{
  "id": "a1b2c3d4",
  "outcome": "success",
  "actual_result": "PostgreSQL handled the load well, no issues",
  "lessons": "ACID transactions were indeed critical for data integrity"
}

Input Schema:

FieldTypeRequiredDescription
idstringDecision ID (8-char hex)
outcome"success" | "partial" | "failure" | "abandoned"Outcome of the decision
actual_resultstringWhat actually happened
lessonsstringLessons learned
notesstringAdditional review notes

Output: Review status, updated path, and reindex confirmation.


get_stats

Get calibration statistics: Brier score, accuracy, confidence distribution, and decision counts. Optionally filter by category, project, or time window.

Use to check decision-making quality.

json
{
  "category": "architecture",
  "window": "90d"
}

Input Schema:

FieldTypeRequiredDescription
categorystringFilter by category
projectstringFilter by project (owner/repo)
window"30d" | "60d" | "90d" | "all"Rolling time window

Output: Brier score, accuracy, confidence distribution, decision counts, and confidence variance metrics.


Error Handling

All MCP tool errors are returned as TextContent containing a JSON object:

json
{
  "error": "validation_error",
  "message": "1 validation error for QueryDecisionsInput\nquery\n  Field required [type=missing, ...]"
}
Error TypeCause
validation_errorInvalid input — Pydantic schema validation failed
internal_errorUnexpected server error — check server logs

The MCP server logs to stderr (stdout is reserved for the MCP protocol). Check container logs for debugging:

bash
docker logs cstp 2>&1 | grep "cstp-mcp"

Authentication

  • Streamable HTTP — Inherits authentication from the FastAPI bearer token middleware. The same CSTP_AUTH_TOKENS apply.
  • stdio — No bearer token authentication. Security relies on process-level isolation (e.g., Docker container, local access).

Local Development

bash
# 1. Install with MCP support
pip install -e ".[a2a,mcp,dev]"

# 2. Set required environment variables
export CHROMA_URL=http://localhost:8000
export GEMINI_API_KEY=your-key
export DECISIONS_PATH=./decisions

# 3. Run MCP server (stdio)
python -m a2a.mcp_server

# 4. Or start the full server (includes /mcp endpoint)
python -m uvicorn a2a.server:app --host 0.0.0.0 --port 9991

Tool-to-Method Mapping

MCP ToolCSTP JSON-RPC MethodService ModuleLevel
pre_actioncstp.preActiona2a/cstp/preaction_service.pyPrimary
get_session_contextcstp.getSessionContexta2a/cstp/session_context_service.pyPrimary
query_decisionscstp.queryDecisionsa2a/cstp/query_service.pyGranular
check_actioncstp.checkGuardrailsa2a/cstp/guardrails_service.pyGranular
log_decisioncstp.recordDecisiona2a/cstp/decision_service.pyGranular
review_outcomecstp.reviewDecisiona2a/cstp/decision_service.pyGranular
get_statscstp.getCalibrationa2a/cstp/calibration_service.pyGranular
get_decisioncstp.getDecisiona2a/cstp/decision_service.pyGranular
get_reason_statscstp.getReasonStatsa2a/cstp/reason_stats_service.pyGranular
update_decisioncstp.updateDecisiona2a/cstp/decision_service.pyGranular
record_thoughtcstp.recordThoughta2a/cstp/deliberation_tracker.pyGranular
readycstp.readya2a/cstp/ready_service.pyPrimary
link_decisionscstp.linkDecisionsa2a/cstp/graph_service.pyGraph
get_graphcstp.getGrapha2a/cstp/graph_service.pyGraph
get_neighborscstp.getNeighborsa2a/cstp/graph_service.pyGraph

Released under the Apache 2.0 License.