📘 User Guide¶
Welcome to the sktime-mcp User Guide. This manual explains how to install, configure, and work with the Model Context Protocol (MCP) server for time-series forecasting.
🚀 Getting Started¶
Prerequisites¶
Before you begin, ensure you have:
- Python 3.9+ installed.
- pip package manager.
- A compatible MCP Client (like Claude Desktop).
Installation¶
Install the package directly from the source. We recommend installing with all dependencies to unlock full functionality.
```bash
# Standard installation
pip install -e .

# Recommended: install with all optional extras (SQL, Forecasting, Files)
pip install -e ".[all]"
```
Running the Server¶
Start the MCP server to begin listening for connections, either from your MCP client's configuration or manually via Python. The exact launch command depends on how the package was installed.
Client Configuration
Ensure your MCP client (e.g., Claude Desktop) is configured to run this command. See the official VS Code guidelines for configuration examples.
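For illustration, a Claude Desktop `claude_desktop_config.json` entry might look like the fragment below. The `sktime_mcp` module name is an assumption here; substitute the actual launch command for your installation.

```json
{
  "mcpServers": {
    "sktime": {
      "command": "python",
      "args": ["-m", "sktime_mcp"]
    }
  }
}
```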
🛠️ Core Capabilities¶
The sktime-mcp server exposes a suite of tools designed for Large Language Models to interact with time-series data.
| Category | Tools | Description |
|---|---|---|
| Discovery | `list_estimators`, `search_estimators`, `describe_estimator` | Find the right model for your task (Forecasting, Classification, etc.). |
| Instantiation | `instantiate_estimator`, `instantiate_pipeline` | Create model instances or complex pipelines. |
| Execution | `fit_predict`, `fit`, `predict` | Train models and generate forecasts. |
| Data | `load_data_source`, `list_datasets` | Load data from Pandas, CSV/Parquet, or SQL. |
| Export | `export_code` | Generate Python code to reproduce your results. |
⚡ Workflows¶
1. The "Hello World" of Forecasting¶
Query - Run forecasting on a demo dataset.¶
Response - The standard workflow sktime-mcp follows when forecasting on a demo dataset.¶
Step 1: Discover Data
Step 2: Find a Forecaster
Step 3: Instantiate & Run
```json
{
  "tool": "fit_predict",
  "arguments": {
    "estimator_handle": "est_id_from_step_2",
    "dataset": "airline",
    "horizon": 12
  }
}
```
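Conceptually, a naive `fit_predict` call like the one above reduces to very little code. The sketch below is a plain-Python illustration of the "last value" strategy (what sktime's `NaiveForecaster(strategy="last")` does), not the server's actual implementation:

```python
def naive_fit_predict(history, horizon):
    """Forecast by repeating the last observed value (the naive 'last' strategy)."""
    if not history:
        raise ValueError("history must contain at least one observation")
    last = history[-1]
    return [last] * horizon

# A 12-step-ahead forecast for a toy series
forecast = naive_fit_predict([112.0, 118.0, 132.0, 129.0], horizon=12)
print(forecast[:3])  # → [129.0, 129.0, 129.0]
```

Real estimators replace the body of `naive_fit_predict` with actual model fitting, but the handle-in, forecast-out shape of the tool call is the same.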
2. Advanced Pipeline Composition¶
Query - Create sophisticated pipelines without writing complex code.¶
Response - The standard workflow sktime-mcp follows for pipeline composition.¶
Step 1: Validate Pipeline
Check that the components work together (e.g., Deseasonalizer -> Detrender -> ARIMA).
```json
{
  "tool": "validate_pipeline",
  "arguments": {"components": ["ConditionalDeseasonalizer", "Detrender", "ARIMA"]}
}
```
Step 2: Instantiate Pipeline
```json
{
  "tool": "instantiate_pipeline",
  "arguments": {
    "components": ["ConditionalDeseasonalizer", "Detrender", "ARIMA"],
    "params_list": [{}, {}, {"order": [1, 1, 1]}]
  }
}
```
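In essence, `validate_pipeline` checks that every component before the final one is a transformer and that the chain ends in a forecaster. A minimal sketch of that rule (the registry here is a hypothetical stand-in; the real server resolves names against sktime's estimator registry):

```python
# Hypothetical mini-registry mapping component names to their roles.
REGISTRY = {
    "ConditionalDeseasonalizer": "transformer",
    "Detrender": "transformer",
    "ARIMA": "forecaster",
}

def validate_pipeline(components):
    """Return (ok, message): transformers may only precede the final forecaster."""
    unknown = [c for c in components if c not in REGISTRY]
    if unknown:
        return False, f"unknown components: {unknown}"
    *head, last = components
    if REGISTRY[last] != "forecaster":
        return False, f"last component {last!r} must be a forecaster"
    bad = [c for c in head if REGISTRY[c] != "transformer"]
    if bad:
        return False, f"non-transformers before the forecaster: {bad}"
    return True, "ok"

print(validate_pipeline(["ConditionalDeseasonalizer", "Detrender", "ARIMA"]))  # → (True, 'ok')
```

Validating before instantiating saves a round trip: a misordered pipeline fails fast with a readable message instead of a stack trace at fit time.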
💾 Data Management¶
Bring your own data into the MCP server.
Supported Sources¶
- Local Files: CSV, Parquet, Excel
- SQL Databases: PostgreSQL, SQLite, etc.
Example: Loading a CSV File¶
```json
{
  "tool": "load_data_source",
  "arguments": {
    "config": {
      "type": "file",
      "path": "/absolute/path/to/your/data.csv",
      "time_column": "timestamp",
      "target_column": "value"
    }
  }
}
```
Absolute Paths Required
The server requires absolute file paths (e.g., /home/user/data.csv). Relative paths may fail depending on where the server was started.
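In spirit, the file adapter does something like the stdlib-only sketch below (the real server uses pandas and performs far more validation). The demo writes a tiny CSV to an absolute temp path, mirroring the absolute-path requirement above:

```python
import csv
import os
import tempfile

def load_csv_series(path, time_column, target_column):
    """Read (timestamp, value) pairs from a CSV file, keyed by column name."""
    times, values = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            times.append(row[time_column])
            values.append(float(row[target_column]))
    return times, values

# Self-contained demo: tempfile always yields an absolute path.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("timestamp,value\n2024-01-01,10.5\n2024-01-02,11.0\n")
    demo_path = tmp.name

times, values = load_csv_series(demo_path, "timestamp", "value")
os.unlink(demo_path)
print(values)  # → [10.5, 11.0]
```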
💡 Best Practices¶
- Resource Management: Explicitly release handles (`release_handle`, `release_data_handle`) when done to free up memory.
- Reproducibility: Always use `export_code` after a successful experiment to save your work.
- Data Hygiene: Use `auto_format_on_load` for messy real-world data to avoid frequent validation errors.
⚠️ Known Limitations¶
While sktime-mcp is a powerful tool for prototyping, please be aware of the current architectural limitations.
1. In-Memory "Amnesia" (No Persistence)¶
The server stores state in standard Python dictionaries.
Impact: If the server restarts or connection drops, all loaded data and trained models are lost. There is no disk-backed checkpointing.
2. Synchronous Execution (GIL Blocking)¶
Heavy operations (like AutoARIMA fitting) run on the main thread.
Impact: The server will not respond to other requests (like "hello") until the fitting operation completes.
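The queuing effect is easy to reproduce with any blocking call on a single thread; this sketch uses `time.sleep` as a stand-in for a heavy AutoARIMA fit:

```python
import time

def handle_request(name, work_seconds):
    """Simulate the single-threaded server: each request runs to completion."""
    start = time.perf_counter()
    time.sleep(work_seconds)  # stands in for a heavy fitting operation
    return name, time.perf_counter() - start

t0 = time.perf_counter()
handle_request("fit_predict", 0.2)   # the heavy call blocks the thread...
handle_request("hello", 0.0)         # ...so even a trivial request waits its turn
elapsed = time.perf_counter() - t0
print(round(elapsed, 1))  # ≈ 0.2: the cheap request paid for the heavy one
```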
3. "The Data Wall" (Memory Limits)¶
Data Adapters read the entire dataset into RAM.
Impact: Loading multi-gigabyte files may crash the server with an `OutOfMemory` error. Lazy loading is not yet supported.
4. Security¶
Instantiation allows arbitrary parameters within the registry.
Impact: While constrained to valid estimators, there is limited validation on parameter values, which could theoretically be misused.
5. Rigid Data Formatting¶
The auto_format logic is heuristic-based.
Impact: Complex time-series with irregular gaps or mixed frequencies might fail to auto-format correctly, requiring manual pre-processing outside the tool.
6. Local-Only Filesystem¶
Impact: The server cannot easily access files if running in an isolated Docker container unless volumes are mounted. It does not support "Upload over HTTP/MCP".
7. JSON Serialization Loss¶
Complex sktime types (Periods, Intervals) are converted to strings for LLM consumption.
Impact: Some rich metadata is lost during the conversion to JSON for the client response.
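The loss is easy to see with the stdlib `json` module: objects that are not natively JSON-serializable get stringified, and the round trip comes back as a plain string. (A plain `datetime.date` stands in here for sktime's `Period`/`Interval` types):

```python
import datetime
import json

payload = {"cutoff": datetime.date(2024, 1, 31)}

# Fall back to str() for non-JSON types, as a serializer for LLM clients might.
encoded = json.dumps(payload, default=str)
decoded = json.loads(encoded)

print(type(payload["cutoff"]).__name__)  # date
print(type(decoded["cutoff"]).__name__)  # str — the type (and any frequency metadata) is gone
```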
8. Code Export Limitations¶
`export_code` uses template-based generation.
Impact: Highly complex custom pipelines with lambda functions or specific edge-cases might generate code that requires minor manual fixes.
❓ Troubleshooting¶
| Issue | Solution |
|---|---|
| "Unknown estimator" | Use search_estimators to find the exact case-sensitive name. |
| "Missing dependencies" | Run pip install -e ".[all]" to ensure all extras are present. |
| Validation Failures | Enable auto_format_on_load or use format_time_series to clean your data. |
| Server Timeout | Heavy models take time. Be patient or try a simpler model (e.g., NaiveForecaster) first. |