Complete command-line interface reference for Agentic Data Scientist.
agentic-data-scientist [OPTIONS] QUERY
The CLI provides a simple interface to run data science analyses using either the full multi-agent workflow or direct coding mode.
You must specify an execution mode with --mode for every query. This makes the trade-off in complexity and API cost explicit.
Choices:
- orchestrated: Full multi-agent workflow with planning, validation, and adaptive execution (recommended for complex analyses)
- simple: Direct coding mode without planning overhead (for quick scripts and simple tasks)
Examples:
# Complex analysis with full workflow
agentic-data-scientist "Perform differential expression analysis" --mode orchestrated --files data.csv
# Quick scripting task
agentic-data-scientist "Write a Python function to parse JSON" --mode simple
--files, -f: Upload files or directories to include in the analysis. Can be specified multiple times for multiple files.
Behavior:
- Files are uploaded to the working directory
- Directories are uploaded recursively
- All uploaded files are accessible to agents
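Because the flag can be repeated, a wrapper can build one -f per file when only part of a directory should be uploaded. The following is a minimal POSIX shell sketch; `analyze_csvs` and the `ADS_BIN` override are hypothetical names introduced here for illustration, not part of the CLI.

```shell
# Hypothetical helper: pass every CSV in a directory as a separate -f flag.
# ADS_BIN is an assumed override so the command can be swapped (e.g. "echo" for a dry run).
ADS_BIN="${ADS_BIN:-agentic-data-scientist}"

analyze_csvs() {
    # $1 = query, $2 = mode, $3 = directory containing CSV files
    query=$1 mode=$2 dir=$3
    set -- "$query" --mode "$mode"      # start the argument list
    for f in "$dir"/*.csv; do
        [ -f "$f" ] || continue         # skip the literal pattern when nothing matches
        set -- "$@" -f "$f"             # append one -f per file
    done
    "$ADS_BIN" "$@"
}
```

Setting ADS_BIN=echo before calling `analyze_csvs "Compare datasets" orchestrated ./data_folder` prints the command line that would be run, which is a cheap way to check the flags first.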
Examples:
# Single file
agentic-data-scientist "Analyze this data" --mode orchestrated --files data.csv
# Multiple files
agentic-data-scientist "Compare datasets" --mode orchestrated -f data1.csv -f data2.csv
# Directory upload (recursive)
agentic-data-scientist "Analyze all files" --mode orchestrated --files ./data_folder/
--working-dir, -w: Specify a custom working directory for the session.
Default: ./agentic_output/ in your current directory
Behavior:
- Files are saved to this location
- Directory is preserved after completion (unless --temp-dir is used)
- Created if it doesn't exist
Examples:
# Custom directory
agentic-data-scientist "Analyze data" --mode orchestrated --files data.csv --working-dir ./my_analysis
# Absolute path
agentic-data-scientist "Process data" --mode orchestrated --files data.csv -w /tmp/analysis_2024
--temp-dir: Use a temporary directory in /tmp with automatic cleanup after completion.
Behavior:
- Creates a unique temporary directory
- Automatically deleted after the session completes
- Overrides --working-dir if both are specified
- Useful for quick analyses where you don't need to keep files
Examples:
# Temporary analysis
agentic-data-scientist "Quick exploration" --mode simple --files data.csv --temp-dir
# Question answering (no files to keep)
agentic-data-scientist "Explain gradient boosting" --mode simple --temp-dir
--keep-files: Explicitly preserve the working directory after completion.
Default: Files are preserved when using --working-dir or the default directory
Note: This flag has no effect when using --temp-dir (temp directories are always cleaned up)
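The interaction between the three directory flags can be summarized as a small decision function. This is an illustrative model of the rules stated in this reference, not the tool's own code; `describe_session_dir` is a hypothetical name.

```shell
# Illustrative model of the documented directory rules (not the CLI's implementation).
# Prints where files would go and whether they are kept.
describe_session_dir() {
    location="./agentic_output/"    # documented default
    preserved="yes"
    temp=0
    while [ $# -gt 0 ]; do
        case $1 in
            --working-dir|-w) [ $# -ge 2 ] && { location=$2; shift; } ;;
            --temp-dir)       temp=1 ;;
            --keep-files)     ;;    # no change: files are kept anyway, temp dirs are always cleaned
        esac
        shift
    done
    if [ "$temp" -eq 1 ]; then
        location="/tmp (unique directory)"   # --temp-dir overrides --working-dir
        preserved="no"
    fi
    printf '%s | preserved: %s\n' "$location" "$preserved"
}
```

For example, `describe_session_dir -w ./x --temp-dir --keep-files` reports a /tmp location that is not preserved, matching the note above.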
Examples:
# Explicitly keep files
agentic-data-scientist "Generate report" --mode orchestrated --files data.csv --keep-files
--log-file: Specify a custom path for the log file.
Default: .agentic_ds.log in the working directory
Examples:
# Custom log location
agentic-data-scientist "Analyze data" --mode orchestrated --files data.csv --log-file ./analysis.log
# Absolute path
agentic-data-scientist "Process data" --mode simple --log-file /var/log/agentic_analysis.log
--verbose: Enable verbose logging for debugging.
Behavior:
- Shows detailed execution logs
- Displays internal agent communication
- Useful for troubleshooting
Examples:
# Verbose output
agentic-data-scientist "Debug issue" --mode simple --files data.csv --verbose
# Combined with other options
agentic-data-scientist "Complex analysis" --mode orchestrated --files data.csv --verbose --log-file debug.log
Knowing how working directories behave helps you manage your analysis files.
When you don't specify any directory options:
- Creates ./agentic_output/ in your current directory
- Preserves all files after completion
- Agents can read and write files here
agentic-data-scientist "Analyze data" --mode orchestrated --files data.csv
# Files saved to: ./agentic_output/
# Preserved: Yes
When you use --temp-dir:
- Creates a unique directory in /tmp
- Automatically deleted after completion
- Use for quick analyses where you don't need files
agentic-data-scientist "Quick test" --mode simple --files data.csv --temp-dir
# Files saved to: /tmp/agentic_ds_XXXXXX/
# Preserved: No (auto-cleanup)
When you specify --working-dir:
- Uses your specified directory
- Preserves files after completion
- Directory is created if it doesn't exist
agentic-data-scientist "Project analysis" --mode orchestrated --files data.csv --working-dir ./my_project
# Files saved to: ./my_project/
# Preserved: Yes
Orchestrated mode: Full multi-agent workflow with planning, validation, and adaptive execution.
When to Use:
- Complex data analyses
- Multi-step workflows
- Tasks requiring validation and quality assurance
- Situations where planning improves outcomes
- Tasks where requirements might evolve during execution
What Happens:
- Plan Maker creates a comprehensive plan
- Plan Reviewer validates the plan
- For each stage:
  - Coding Agent implements the stage
  - Review Agent validates implementation
  - Criteria Checker tracks progress
  - Stage Reflector adapts remaining work
- Summary Agent creates final report
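The ordering of these steps can be traced with a short loop. The sketch below only prints the sequence for a plan with a given number of stages; it assumes the four agent steps listed under "For each stage" repeat once per stage, and `trace_orchestrated` is a hypothetical name. No agents are invoked.

```shell
# Illustrative trace of the documented step order; nothing is actually executed.
trace_orchestrated() {
    stages=$1
    echo "Plan Maker: create plan"
    echo "Plan Reviewer: validate plan"
    i=1
    while [ "$i" -le "$stages" ]; do
        echo "Stage $i: Coding Agent implements"
        echo "Stage $i: Review Agent validates"
        echo "Stage $i: Criteria Checker tracks progress"
        echo "Stage $i: Stage Reflector adapts remaining work"
        i=$((i + 1))
    done
    echo "Summary Agent: create final report"
}
```

The per-stage review and reflection steps are what separate this mode from simple mode, and they are also where the additional API cost comes from.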
Examples:
# Differential expression analysis
agentic-data-scientist "Perform DEG analysis comparing treatment vs control" \
--mode orchestrated \
--files treatment_data.csv \
--files control_data.csv
# Complete analysis pipeline
agentic-data-scientist "Analyze customer churn, create predictive model, and generate report" \
--mode orchestrated \
--files customers.csv \
--working-dir ./churn_analysis
# Multi-file processing
agentic-data-scientist "Analyze all CSV files and create summary statistics" \
--mode orchestrated \
--files ./raw_data/
Simple mode: Direct coding without planning or validation overhead.
When to Use:
- Quick scripting tasks
- Simple code generation
- Question answering
- Rapid prototyping
- Tasks where planning overhead isn't needed
What Happens:
- Direct execution by Claude Code agent
- No planning phase
- No review or validation loops
- Faster but no quality assurance
Examples:
# Generate utility scripts
agentic-data-scientist "Write a Python script to merge CSV files by common column" \
--mode simple
# Technical questions
agentic-data-scientist "Explain the difference between Random Forest and Gradient Boosting" \
--mode simple
# Quick analysis
agentic-data-scientist "Create a basic scatter plot from this data" \
--mode simple \
--files data.csv \
--temp-dir
# Compare multiple datasets
agentic-data-scientist "Compare these datasets and identify trends" \
--mode orchestrated \
-f dataset1.csv \
-f dataset2.csv \
-f dataset3.csv
# Process entire directory
agentic-data-scientist "Analyze all JSON files and create consolidated report" \
--mode orchestrated \
--files ./data_directory/
# Quick exploration without keeping files
agentic-data-scientist "Explore data distributions" \
--mode simple \
--files data.csv \
--temp-dir
# Organized project structure
agentic-data-scientist "Complete statistical analysis with visualizations" \
--mode orchestrated \
--files raw_data.csv \
--working-dir ./projects/analysis_2024 \
--log-file ./projects/analysis_2024.log
# Verbose output for troubleshooting
agentic-data-scientist "Debug data processing issue" \
--mode simple \
--files problematic_data.csv \
--verbose \
--log-file debug.log
Most common method - provide query as a command-line argument:
agentic-data-scientist "Your query here" --mode orchestrated
Pipe input from another command or file:
# From echo
echo "Analyze this dataset" | agentic-data-scientist --mode simple --files data.csv
# From file
cat query.txt | agentic-data-scientist --mode orchestrated --files data.csv
Exit codes:
- 0: Success
- 1: Error (invalid arguments, runtime error, etc.)
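In scripts, the exit code can gate follow-up steps. Here is a minimal sketch, where `run_and_report` is a hypothetical wrapper; the reference documents only 0 and 1, and the sketch treats any non-zero status as a failure.

```shell
# Hypothetical wrapper: run a command and report the outcome from its exit code.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "analysis succeeded"
    else
        echo "analysis failed with exit code $status" >&2
    fi
    return "$status"    # pass the original status on to the caller
}
```

A call such as `run_and_report agentic-data-scientist "Analyze data" --mode orchestrated --files data.csv || exit 1` stops a larger script as soon as an analysis fails.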
The CLI respects these environment variables (set in a .env file or in the shell):
Required:
- OPENROUTER_API_KEY: OpenRouter API key for planning/review agents
- ANTHROPIC_API_KEY: Anthropic API key for coding agent
Optional:
- DEFAULT_MODEL: Model for planning/review (default: google/gemini-2.5-pro)
- CODING_MODEL: Model for coding agent (default: claude-sonnet-4-5-20250929)
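A quick pre-flight check can catch a missing key before a run starts. The sketch below inspects only the current shell environment, so keys defined solely in a .env file will be reported as missing; `check_api_keys` is a hypothetical helper name.

```shell
# Hypothetical pre-flight check: are both required keys set and non-empty in this shell?
check_api_keys() {
    missing=0
    for name in OPENROUTER_API_KEY ANTHROPIC_API_KEY; do
        eval "value=\${$name:-}"        # read the variable whose name is in $name
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"                   # 0 when both keys are present
}
```

It does not validate the keys or check credits; it only confirms that the variables are present.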
The CLI displays:
- Agent activities and progress
- Key decisions and milestones
- File creation notifications
- Completion summary
Detailed logs are written to:
- Default: .agentic_ds.log in working directory
- Custom: Path specified by --log-file
Logs include:
- Full agent conversations
- Tool calls and responses
- Error messages and stack traces
- Token usage statistics
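When a run fails, scanning the log for error lines is usually the fastest first step. The sketch below assumes the log is plain text and that failures mention "error" or "traceback"; the log format is not specified in this reference, so the search terms are guesses. `show_log_errors` is a hypothetical helper name.

```shell
# Hypothetical helper: list log lines that look like errors (search terms are an assumption).
show_log_errors() {
    log=${1:-./agentic_output/.agentic_ds.log}   # default log name in the default working directory
    if [ ! -r "$log" ]; then
        echo "log not readable: $log" >&2
        return 1
    fi
    # -n prints line numbers, -i ignores case, -E enables the alternation
    grep -n -i -E 'error|traceback' "$log" || echo "no error lines found in $log"
}
```

Pass the path given to --log-file as the first argument when a custom log location was used.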
You didn't provide a query. Either:
- Provide query as argument: agentic-data-scientist "query" --mode orchestrated
- Pipe from stdin: echo "query" | agentic-data-scientist --mode orchestrated
The --mode flag is required. Specify either:
- --mode orchestrated for full multi-agent workflow
- --mode simple for direct coding
Check that:
- File paths are correct and files exist
- You have read permissions
- File paths don't contain special characters that need escaping
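These checks can be run up front on the same paths that will be passed to the CLI. This is a minimal sketch with a hypothetical helper name, `check_inputs`; quoting each path in the call also covers the special-character case.

```shell
# Hypothetical helper: verify that every input path exists and is readable.
check_inputs() {
    bad=0
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "not found: $path" >&2
            bad=1
        elif [ ! -r "$path" ]; then
            echo "not readable: $path" >&2
            bad=1
        fi
    done
    return "$bad"    # 0 only when all paths passed both checks
}
```

For example, `check_inputs data.csv "./my data/metadata.csv" && echo ok` prints ok only when both paths are usable.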
Ensure you have:
- OPENROUTER_API_KEY set in environment or .env file
- ANTHROPIC_API_KEY set in environment or .env file
- Valid API keys with sufficient credits
Ensure you have:
- Write permissions to the working directory
- Sufficient disk space
- Parent directories exist (or can be created)
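The same conditions can be tested from the shell before starting a long run. In this sketch, `check_workdir` is a hypothetical helper; it checks the directory itself if it exists, otherwise the parent it would be created in, and then prints the free space reported by df.

```shell
# Hypothetical helper: can the working directory be written to (or created)?
check_workdir() {
    dir=$1
    if [ -d "$dir" ]; then
        target=$dir
    else
        target=$(dirname "$dir")     # parent in which $dir would have to be created
    fi
    if [ ! -d "$target" ] || [ ! -w "$target" ]; then
        echo "cannot write to: $target" >&2
        return 1
    fi
    df -k "$target" | tail -n 1      # filesystem usage in 1K blocks
}
```

Only the immediate parent is checked, so a path with several missing levels is reported as not writable.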
For large analyses:
- Use --temp-dir to ensure cleanup
- Process files in smaller batches
- Use simple mode for less memory overhead
- Use Orchestrated Mode for Important Work
  - Planning catches issues early
  - Validation ensures quality
  - Worth the extra API cost for production analyses
- Use Simple Mode for Quick Tasks
  - Fast iteration during development
  - Question answering
  - Simple script generation
- Organize Your Work
  - Use --working-dir for project organization
  - Use --temp-dir for temporary explorations
  - Keep related analyses in dedicated directories
- Enable Verbose Logging When Needed
  - Use --verbose when debugging
  - Specify --log-file for persistent logs
  - Review logs to understand agent behavior
- Manage File Lifecycle
  - Use --temp-dir for throwaway analyses
  - Use a custom --working-dir for important work
  - Clean up old working directories periodically
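For periodic cleanup, it helps to list stale working directories before deleting anything. The sketch below assumes a hypothetical layout with one directory per analysis under a common root; `list_stale_dirs` is a name introduced here, and it only prints candidates.

```shell
# Hypothetical helper: list directories directly under a root that are older than N days.
# Nothing is deleted; review the output before removing anything.
list_stale_dirs() {
    root=$1
    days=${2:-30}
    # -mtime +N matches entries last modified more than N days ago
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

A directory's modification time changes only when entries are added or removed inside it, so treat the output as a starting point for review rather than a deletion list.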
# Initial exploration (temporary)
agentic-data-scientist "Explore data distributions and missing values" \
--mode simple --files data.csv --temp-dir
# Full analysis (preserved)
agentic-data-scientist "Perform complete statistical analysis with visualizations" \
--mode orchestrated --files data.csv --working-dir ./analysis_results
# Model building
agentic-data-scientist "Build and evaluate multiple regression models" \
--mode orchestrated --files train.csv --files test.csv \
--working-dir ./models
# Differential expression
agentic-data-scientist "Perform DESeq2 differential expression analysis" \
--mode orchestrated \
--files counts.csv --files metadata.csv \
--working-dir ./deg_analysis
# Pathway analysis
agentic-data-scientist "Run GSEA pathway enrichment on DEGs" \
--mode orchestrated --files deg_results.csv \
--working-dir ./pathway_analysis
# Generate utility script
agentic-data-scientist "Write Python script to merge CSV files" \
--mode simple --working-dir ./scripts
# Batch processing script
agentic-data-scientist "Create script to process multiple data files" \
--mode simple --files sample_data.csv --working-dir ./scripts
# Technical questions
agentic-data-scientist "Explain PCA and when to use it" --mode simple --temp-dir
# Code examples
agentic-data-scientist "Show me how to use pandas groupby with multiple aggregations" \
--mode simple --temp-dir