📘 User Guide¶
Welcome to the sktime-mcp User Guide. This manual explains how to install, configure, and work with the Model Context Protocol (MCP) server for time-series forecasting.
🚀 Getting Started¶
Prerequisites¶
Before you begin, ensure you have:
- Python 3.9+ installed.
- pip package manager.
- A compatible MCP Client (like Claude Desktop).
Installation¶
Install the package directly from the source. We recommend installing with all dependencies to unlock full functionality.
```bash
# Standard installation
pip install -e .

# Recommended: install with all optional extras (SQL, Forecasting, Files)
pip install -e ".[all]"
```
Running the Server¶
Start the MCP server to begin listening for connections, either from your MCP client's configuration or manually via Python. The exact launch command depends on how the package was installed.
Client Configuration
Ensure your MCP client (e.g., Claude Desktop) is configured to run this command. See the official VS Code guidelines for configuration examples.
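For illustration, a Claude Desktop `claude_desktop_config.json` entry might look like the fragment below. The `sktime_mcp` module name is an assumption here; substitute the actual launch command for your installation.

```json
{
  "mcpServers": {
    "sktime": {
      "command": "python",
      "args": ["-m", "sktime_mcp"]
    }
  }
}
```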
🛠️ Core Capabilities¶
The sktime-mcp server exposes a suite of tools designed for Large Language Models to interact with time-series data.
| Category | Tools | Description |
|---|---|---|
| Discovery | `list_estimators`, `search_estimators`, `describe_estimator` | Find the right model for your task (Forecasting, Classification, etc.). |
| Instantiation | `instantiate_estimator`, `instantiate_pipeline` | Create model instances or complex pipelines. |
| Execution | `fit_predict`, `fit`, `predict` | Train models and generate forecasts. |
| Data | `load_data_source`, `list_datasets` | Load data from Pandas, CSV/Parquet, or SQL. |
| Export | `export_code` | Generate Python code to reproduce your results. |
⚡ Workflows¶
1. The "Hello World" of Forecasting¶
Query - Run forecasting on a demo dataset.¶
Response - The standard workflow sktime-mcp follows when forecasting on a demo dataset.¶
Step 1: Discover Data
Step 2: Find a Forecaster
Step 3: Instantiate & Run
```json
{
  "tool": "fit_predict",
  "arguments": {
    "estimator_handle": "est_id_from_step_2",
    "dataset": "airline",
    "horizon": 12
  }
}
```
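Conceptually, a naive `fit_predict` call like the one above reduces to very little code. The sketch below is a plain-Python illustration of the "last value" strategy (what sktime's `NaiveForecaster(strategy="last")` does), not the server's actual implementation:

```python
def naive_fit_predict(history, horizon):
    """Forecast by repeating the last observed value (the naive 'last' strategy)."""
    if not history:
        raise ValueError("history must contain at least one observation")
    last = history[-1]
    return [last] * horizon

# A 12-step-ahead forecast for a toy series
forecast = naive_fit_predict([112.0, 118.0, 132.0, 129.0], horizon=12)
print(forecast[:3])  # → [129.0, 129.0, 129.0]
```

Real estimators replace the body of `naive_fit_predict` with actual model fitting, but the handle-in, forecast-out shape of the tool call is the same.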
2. Advanced Pipeline Composition¶
Query - Create sophisticated pipelines without writing complex code.¶
Response - The standard workflow sktime-mcp follows for pipeline composition.¶
Step 1: Validate Pipeline
Check that the components work together (e.g., Deseasonalizer -> Detrender -> ARIMA).
```json
{
  "tool": "validate_pipeline",
  "arguments": {"components": ["ConditionalDeseasonalizer", "Detrender", "ARIMA"]}
}
```
Step 2: Instantiate Pipeline
```json
{
  "tool": "instantiate_pipeline",
  "arguments": {
    "components": ["ConditionalDeseasonalizer", "Detrender", "ARIMA"],
    "params_list": [{}, {}, {"order": [1, 1, 1]}]
  }
}
```
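In essence, `validate_pipeline` checks that every component before the final one is a transformer and that the chain ends in a forecaster. A minimal sketch of that rule (the registry here is a hypothetical stand-in; the real server resolves names against sktime's estimator registry):

```python
# Hypothetical mini-registry mapping component names to their roles.
REGISTRY = {
    "ConditionalDeseasonalizer": "transformer",
    "Detrender": "transformer",
    "ARIMA": "forecaster",
}

def validate_pipeline(components):
    """Return (ok, message): transformers may only precede the final forecaster."""
    unknown = [c for c in components if c not in REGISTRY]
    if unknown:
        return False, f"unknown components: {unknown}"
    *head, last = components
    if REGISTRY[last] != "forecaster":
        return False, f"last component {last!r} must be a forecaster"
    bad = [c for c in head if REGISTRY[c] != "transformer"]
    if bad:
        return False, f"non-transformers before the forecaster: {bad}"
    return True, "ok"

print(validate_pipeline(["ConditionalDeseasonalizer", "Detrender", "ARIMA"]))  # → (True, 'ok')
```

Validating before instantiating saves a round trip: a misordered pipeline fails fast with a readable message instead of a stack trace at fit time.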
💾 Data Management¶
Bring your own data into the MCP server.
Supported Sources¶
- Local Files: CSV, Parquet, Excel
- SQL Databases: PostgreSQL, SQLite, etc.
Example: Loading a CSV File¶
```json
{
  "tool": "load_data_source",
  "arguments": {
    "config": {
      "type": "file",
      "path": "/absolute/path/to/your/data.csv",
      "time_column": "timestamp",
      "target_column": "value"
    }
  }
}
```
Absolute Paths Required
The server requires absolute file paths (e.g., /home/user/data.csv). Relative paths may fail depending on where the server was started.
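In spirit, the file adapter does something like the stdlib-only sketch below (the real server uses pandas and performs far more validation). The demo writes a tiny CSV to an absolute temp path, mirroring the absolute-path requirement above:

```python
import csv
import os
import tempfile

def load_csv_series(path, time_column, target_column):
    """Read (timestamp, value) pairs from a CSV file, keyed by column name."""
    times, values = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            times.append(row[time_column])
            values.append(float(row[target_column]))
    return times, values

# Self-contained demo: tempfile always yields an absolute path.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("timestamp,value\n2024-01-01,10.5\n2024-01-02,11.0\n")
    demo_path = tmp.name

times, values = load_csv_series(demo_path, "timestamp", "value")
os.unlink(demo_path)
print(values)  # → [10.5, 11.0]
```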
💡 Best Practices¶
- Resource Management: Explicitly release handles (`release_handle`, `release_data_handle`) when done to free up memory.
- Reproducibility: Always use `export_code` after a successful experiment to save your work.
- Data Hygiene: Use `auto_format_on_load` for messy real-world data to avoid frequent validation errors.
⚠️ Known Limitations¶
While sktime-mcp is a powerful tool for prototyping, please be aware of the current architectural limitations.
1. In-Memory "Amnesia" (No Persistence)¶
The server stores state in standard Python dictionaries.
Impact: If the server restarts or connection drops, all loaded data and trained models are lost. There is no disk-backed checkpointing.
2. Synchronous Execution (GIL Blocking)¶
Heavy operations (like AutoARIMA fitting) run on the main thread.
Impact: The server will not respond to other requests (like "hello") until the fitting operation completes.
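The queuing effect is easy to reproduce with any blocking call on a single thread; this sketch uses `time.sleep` as a stand-in for a heavy AutoARIMA fit:

```python
import time

def handle_request(name, work_seconds):
    """Simulate the single-threaded server: each request runs to completion."""
    start = time.perf_counter()
    time.sleep(work_seconds)  # stands in for a heavy fitting operation
    return name, time.perf_counter() - start

t0 = time.perf_counter()
handle_request("fit_predict", 0.2)   # the heavy call blocks the thread...
handle_request("hello", 0.0)         # ...so even a trivial request waits its turn
elapsed = time.perf_counter() - t0
print(round(elapsed, 1))  # ≈ 0.2: the cheap request paid for the heavy one
```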
3. "The Data Wall" (Memory Limits)¶
Data Adapters read the entire dataset into RAM.
Impact: Loading multi-gigabyte files may crash the server with an `OutOfMemory` error. Lazy loading is not yet supported.
4. Security¶
Instantiation allows arbitrary parameters within the registry.
Impact: While constrained to valid estimators, there is limited validation on parameter values, which could theoretically be misused.
5. Rigid Data Formatting¶
The auto_format logic is heuristic-based.
Impact: Complex time-series with irregular gaps or mixed frequencies might fail to auto-format correctly, requiring manual pre-processing outside the tool.
6. Local-Only Filesystem¶
Impact: The server cannot easily access files if running in an isolated Docker container unless volumes are mounted. It does not support "Upload over HTTP/MCP".
7. JSON Serialization Loss¶
Complex sktime types (Periods, Intervals) are converted to strings for LLM consumption.
Impact: Some rich metadata is lost during the conversion to JSON for the client response.
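The loss is easy to see with the stdlib `json` module: objects that are not natively JSON-serializable get stringified, and the round trip comes back as a plain string. (A plain `datetime.date` stands in here for sktime's `Period`/`Interval` types):

```python
import datetime
import json

payload = {"cutoff": datetime.date(2024, 1, 31)}

# Fall back to str() for non-JSON types, as a serializer for LLM clients might.
encoded = json.dumps(payload, default=str)
decoded = json.loads(encoded)

print(type(payload["cutoff"]).__name__)  # date
print(type(decoded["cutoff"]).__name__)  # str — the type (and any frequency metadata) is gone
```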
8. Code Export Limitations¶
`export_code` uses template-based generation.
Impact: Highly complex custom pipelines with lambda functions or specific edge-cases might generate code that requires minor manual fixes.
❓ Troubleshooting¶
| Issue | Solution |
|---|---|
| "Unknown estimator" | Use search_estimators to find the exact case-sensitive name. |
| "Missing dependencies" | Run pip install -e ".[all]" to ensure all extras are present. |
| Validation Failures | Enable auto_format_on_load or use format_time_series to clean your data. |
| Server Timeout | Heavy models take time. Be patient or try a simpler model (e.g., NaiveForecaster) first. |