Category Rationale: This is the first econometrics and academic computing example in the collection, showcasing how to integrate specialized statistical software with LLMs via the Model Context Protocol. It demonstrates security-conscious tool design with guard systems, RAM monitoring, and hierarchical configuration. Essential for developers building MCP servers, academic computing tools, or LLM integrations with existing software ecosystems.
- Repository: SepineTam/stata-mcp
- CLAUDE.md: View Original
- License: AGPL-3.0
- Language: Python
- Stars: 80
- Topics: mcp-server, stata, llm-integration, statistical-computing, fastmcp
- Discovery Score: 59/100 points (promoted above threshold for unique domain)
This MCP server shows how to safely integrate LLMs with specialized academic software. It demonstrates security patterns for command execution, resource monitoring, and cross-platform executable discovery. It is the first example in the collection to bridge econometrics and statistics with LLM agents.
GuardValidator: Validates Stata dofiles against dangerous commands
- Blacklist of prohibited operations (shell, rm, ! del)
- Prevents destructive operations before execution
- Configurable via the IS_GUARD setting
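The blacklist idea described above can be sketched as follows. This is a minimal illustration, not the project's actual GuardValidator: the function names, the blacklist entries (modeled on the examples listed above, with a bare "!" standing in for "! del"), and the comment-skipping rule are all assumptions.

```python
from typing import Iterable, List

# Hypothetical blacklist, modeled on the examples named in the text.
BLACKLIST = ("shell", "rm", "!")


def _starts_with_command(line: str, command: str) -> bool:
    """True if `line` begins with `command` as a whole command word."""
    if not line.startswith(command):
        return False
    # Symbol commands such as "!" need no separator after them.
    if not command[-1].isalnum():
        return True
    rest = line[len(command):]
    # Word commands must end at a word boundary, so "rmdir_var" is not "rm".
    return rest == "" or not (rest[0].isalnum() or rest[0] == "_")


def find_violations(dofile_text: str,
                    blacklist: Iterable[str] = BLACKLIST) -> List[str]:
    """Return the do-file lines that start with a blacklisted command."""
    commands = tuple(blacklist)
    violations = []
    for raw_line in dofile_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and Stata comment lines (* or //).
        if not line or line.startswith("*") or line.startswith("//"):
            continue
        if any(_starts_with_command(line, c) for c in commands):
            violations.append(line)
    return violations


def is_safe(dofile_text: str, blacklist: Iterable[str] = BLACKLIST) -> bool:
    """A do-file is considered safe when no line violates the blacklist."""
    return not find_violations(dofile_text, blacklist)
```

A line-prefix check like this is deliberately simple: it would miss a blacklisted command hidden behind a prefix such as `quietly`, so a real guard likely needs stricter parsing.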
RAMMonitor: Tracks Stata process memory usage with psutil
- Automatic process termination when RAM exceeds limit
- Configurable via the IS_MONITOR and MAX_RAM_MB settings
- Extensible MonitorBase for custom monitors
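One way to picture the monitor pattern is the sketch below. The class shapes, method names, and the `enforce` helper are hypothetical and not taken from the project; only the names MonitorBase, RAMMonitor, and MAX_RAM_MB come from the text. To keep the example free of third-party imports, memory is read through an injected callable; with psutil that callable could be `lambda: psutil.Process(pid).memory_info().rss`.

```python
from abc import ABC, abstractmethod
from typing import Callable


class MonitorBase(ABC):
    """Hypothetical base class: a monitor decides if a process must stop."""

    @abstractmethod
    def should_terminate(self) -> bool:
        """Return True when the monitored process has exceeded its limit."""


class RAMMonitor(MonitorBase):
    """Compares resident memory against a limit expressed in megabytes."""

    def __init__(self, read_rss_bytes: Callable[[], int],
                 max_ram_mb: int = -1) -> None:
        self._read_rss_bytes = read_rss_bytes
        self.max_ram_mb = max_ram_mb

    def should_terminate(self) -> bool:
        # A negative limit is treated as "no limit".
        if self.max_ram_mb < 0:
            return False
        used_mb = self._read_rss_bytes() / (1024 * 1024)
        return used_mb > self.max_ram_mb


def enforce(monitor: MonitorBase, terminate: Callable[[], None]) -> bool:
    """Call `terminate` if the monitor trips; report whether it did."""
    if monitor.should_terminate():
        terminate()
        return True
    return False
```

In a real server, `enforce` would presumably run on a timer while Stata executes, with `terminate` bound to something like `psutil.Process(pid).terminate`. A custom monitor (for example a wall-clock timeout) only needs to subclass MonitorBase and implement `should_terminate`.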
- Priority: environment variables > config file > defaults
- TOML-based configuration (~/.statamcp/config.toml)
- Hot-reload support for configuration changes
- Environment variable prefix (STATA_MCP_)
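The precedence rule above (environment variables > config file > defaults) can be sketched as a small lookup function. Everything here is illustrative: the `resolve` function, the default values, and the assumption that an environment variable is named by prepending STATA_MCP_ to the setting key are not confirmed by the source. The `file_config` argument stands for the nested mapping obtained by parsing the TOML file.

```python
import os
from typing import Any, Mapping, Optional

ENV_PREFIX = "STATA_MCP_"

# Assumed defaults, for illustration only.
DEFAULTS = {"IS_GUARD": True, "IS_MONITOR": False, "MAX_RAM_MB": -1}


def _coerce(raw: str, like: Any) -> Any:
    """Convert an environment string to the type of the default value."""
    if isinstance(like, bool):  # check bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    if isinstance(like, int):
        return int(raw)
    return raw


def resolve(key: str,
            file_config: Mapping[str, Any],
            environ: Optional[Mapping[str, str]] = None,
            defaults: Mapping[str, Any] = DEFAULTS) -> Any:
    """Look up `key`: environment first, then config file, then defaults."""
    environ = os.environ if environ is None else environ
    env_name = ENV_PREFIX + key  # assumed naming scheme
    if env_name in environ:
        return _coerce(environ[env_name], defaults.get(key))
    # The parsed TOML is a mapping of sections, e.g. {"MONITOR": {...}}.
    for section in file_config.values():
        if isinstance(section, Mapping) and key in section:
            return section[key]
    return defaults[key]
```

For example, with `MAX_RAM_MB = 2048` in the file and `STATA_MCP_MAX_RAM_MB=512` in the environment, `resolve("MAX_RAM_MB", ...)` returns 512 under these assumptions.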
StataFinder: Locates Stata executable on macOS, Windows, Linux
- Platform-specific default paths
- Fallback to system PATH
- Clear error messages for unsupported platforms
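The fallback chain described above might look roughly like this. It is a sketch under stated assumptions, not the project's StataFinder: the function names are hypothetical, and the candidate paths and executable names are passed in by the caller rather than hard-coded, since the exact paths the project checks are not given here.

```python
import shutil
from pathlib import Path
from typing import Iterable, Mapping, Optional


def find_executable(candidates: Iterable[str],
                    names: Iterable[str]) -> Optional[str]:
    """Try known install paths first, then fall back to the system PATH."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    for name in names:
        found = shutil.which(name)  # searches the directories on PATH
        if found:
            return found
    return None


def find_stata(system: str,
               candidates_by_platform: Mapping[str, Iterable[str]],
               names: Iterable[str]) -> str:
    """Resolve the executable for `system` or fail with a clear message."""
    if system not in candidates_by_platform:
        raise RuntimeError(f"Unsupported platform: {system!r}")
    path = find_executable(candidates_by_platform[system], names)
    if path is None:
        raise FileNotFoundError(
            f"No Stata executable found for {system!r}; "
            "checked the default paths and the system PATH."
        )
    return path
```

A caller would typically pass `platform.system()` as `system`, which returns "Darwin", "Windows", or "Linux" on the three platforms named above.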
# GuardValidator checks dofiles against blacklist
validator = GuardValidator()
is_safe = validator.validate(dofile_path)
# Prevents shell commands, file deletions, etc.
Proactive security before executing user-provided code.
# ~/.statamcp/config.toml
[SECURITY]
IS_GUARD = true
[MONITOR]
IS_MONITOR = false
MAX_RAM_MB = -1 # -1 means no limit
Environment variables override config file values.
# StataFinder locates Stata across platforms
finder = StataFinder()
stata_path = finder.find() # macOS: /Applications/Stata/, Windows: Program Files, Linux: PATH
Automatic discovery with clear platform documentation.
<cwd>/stata-mcp-folder/
├── stata-mcp-log/ # Stata execution logs
├── stata-mcp-dofile/ # Generated do-files
├── stata-mcp-result/ # Analysis results
└── stata-mcp-tmp/ # Temporary files
Configurable via STATA_MCP_CWD environment variable.
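Creating this layout could be sketched as below. The helper name is hypothetical, and the sketch assumes that STATA_MCP_CWD replaces the `<cwd>` part of the tree, with `stata-mcp-folder` created beneath it; the project may resolve the variable differently.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

SUBDIRS = ("stata-mcp-log", "stata-mcp-dofile",
           "stata-mcp-result", "stata-mcp-tmp")


def ensure_workspace(environ: Optional[Mapping[str, str]] = None) -> Dict[str, Path]:
    """Create the working folders and return them keyed by folder name."""
    environ = os.environ if environ is None else environ
    # Assumption: STATA_MCP_CWD overrides the current working directory.
    base = Path(environ.get("STATA_MCP_CWD", os.getcwd())) / "stata-mcp-folder"
    paths = {}
    for name in SUBDIRS:
        folder = base / name
        # exist_ok makes the call safe to repeat on every server start.
        folder.mkdir(parents=True, exist_ok=True)
        paths[name] = folder
    return paths
```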
- help: Get Stata command documentation (macOS/Linux)
- stata_do: Execute Stata do-files with logging
- write_dofile: Create do-files from code snippets
- append_dofile: Append code to existing do-files
- get_data_info: Analyze CSV, DTA, XLSX files
- ado_package_install: Install Stata packages (SSC, GitHub, net)
- load_figure: Load Stata-generated graphs
- mk_dir: Create directories safely
- Security Guards for Code Execution: Implement blacklist-based validation before executing user-provided code. Use abstract base classes (MonitorBase) for extensible resource monitoring and automatic termination patterns.
- Hierarchical Configuration Systems: Design configuration with clear precedence (env vars > config file > defaults). Use TOML for human-readable config, provide example files, and document all environment variables with prefixes.
- Cross-Platform Executable Discovery: Build platform-aware finders with fallback chains. Document platform-specific paths clearly, provide graceful error messages, and support system PATH as final fallback.
Original CLAUDE.md created by Sepine Tam for the stata-mcp project. This analysis references the original file under the terms of the AGPL-3.0 License.