Skip to content

Conversation

aminghadersohi
Copy link
Contributor

@aminghadersohi aminghadersohi commented Sep 16, 2025

SUMMARY

This PR introduces the basic scaffold for the Superset MCP (Model Context Protocol) service, enabling AI agents to interact programmatically with Apache Superset.

PR #2: aminghadersohi#6

Rationale:
Breaking this into smaller PRs makes it easier for reviewers to evaluate the changes incrementally rather than reviewing a large implementation all at once.

What's Included:

  • CLI command: superset mcp run
  • Basic FastMCP server using HTTP transport
  • Flask integration for Superset context
  • Development environment support

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Screenshot 2025-09-22 at 4 10 15 PM

N/A - This PR adds backend service scaffolding only, no UI changes.

TESTING INSTRUCTIONS

# 2. Set up Python environment (Python 3.10 or 3.11 required)
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies
pip install -e .[development,fastmcp]
cd superset-frontend && npm ci && npm run build && cd ..

# 4. Configure Superset manually
# Create superset_config.py in your current directory:
cat > superset_config.py << 'EOF'
# Apache Superset Configuration
SECRET_KEY = '<your secret here - hint: `secrets.token_urlsafe(42)`>'

# MCP Service Configuration
MCP_ADMIN_USERNAME = 'admin'
MCP_DEV_USERNAME = 'admin'
SUPERSET_WEBSERVER_ADDRESS = 'http://localhost:9001'

# WebDriver Configuration for screenshots
WEBDRIVER_BASEURL = 'http://localhost:9001/'
WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL

# MCP Service Connection
MCP_SERVICE_HOST = os.environ.get("MCP_SERVICE_HOST", "localhost")
MCP_SERVICE_PORT = int(os.environ.get("MCP_SERVICE_PORT", 5008))

EOF

# 5. Initialize database
export FLASK_APP=superset
superset db upgrade
superset init

# 6. Create admin user
superset fab create-admin \
  --username admin \
  --firstname Admin \
  --lastname Admin \
  --email admin@localhost \
  --password admin

# 7. Start Superset (in one terminal)
superset run -p 9001 --with-threads --reload --debugger

# 8. Start frontend (in another terminal)
cd superset-frontend && npm run dev

# 9. Start MCP service (in another terminal, only if you want MCP features)
source venv/bin/activate
superset mcp run --port 5008 --debug

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Notes:

  • Foundational scaffold with minimal functionality, completely isolated from existing code
  • Requires fastmcp library (added to requirements/development.in)
  • Zero risk - no impact on existing Superset functionality

Copy link

korbit-ai bot commented Sep 16, 2025

Based on your review schedule, I'll hold off on reviewing this PR until it's marked as ready for review. If you'd like me to take a look now, comment /korbit-review.

Your admin can change your review schedule in the Korbit Console

@aminghadersohi aminghadersohi force-pushed the feat/mcp_service_pr1_scaffold branch from ca040e0 to 42af216 Compare September 16, 2025 21:24
@aminghadersohi aminghadersohi changed the base branch from master to mcp-v1 September 16, 2025 21:28
@aminghadersohi aminghadersohi changed the title feat(mcp): Add MCP service scaffold for Apache Superset feat(mcp): PR1 - Add MCP service scaffold for Apache Superset Sep 16, 2025
@aminghadersohi aminghadersohi marked this pull request as ready for review September 16, 2025 21:31
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Design Mixed Dependency Types ▹ view 🧠 Not in standard
Performance Unconditional MCP profile cleanup ▹ view 🧠 Not in scope
Performance Missing port number validation ▹ view 🧠 Not in standard
Performance Uncached external tool installation ▹ view 🧠 Not in scope
Functionality Non-functional start method ▹ view 🧠 Not in scope
Design Global State Usage ▹ view
Functionality Unconditional MCP profile usage in cleanup ▹ view 🧠 Not in scope
Functionality Incomplete proxy cleanup in signal handler ▹ view 🧠 Not in scope
Design Duplicate Module Import ▹ view 🧠 Incorrect
Functionality Insufficient Docker startup wait time ▹ view 🧠 Not in scope
Files scanned
File Path Reviewed
superset/mcp_service/run_proxy.sh
superset/mcp_service/scripts/init.py
superset/mcp_service/utils/init.py
superset/mcp_service/index.js
superset/mcp_service/init.py
superset/config_mcp.py
docker/docker-bootstrap.sh
superset/cli/mcp.py
superset/mcp_service/simple_proxy.py
.devcontainer/setup-dev.sh
superset/mcp_service/flask_singleton.py
superset/mcp_service/server.py
.devcontainer/start-superset.sh
superset/mcp_service/app.py
superset-frontend/plugins/plugin-chart-echarts/src/Heatmap/transformProps.ts
superset/mcp_service/main.py
superset/mcp_service/scripts/setup.py
superset-frontend/plugins/plugin-chart-echarts/src/BigNumber/BigNumberWithTrendline/transformProps.ts
superset-frontend/plugins/legacy-preset-chart-deckgl/src/layers/Polygon/Polygon.tsx
superset/datasets/schemas.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

from colorama import Fore, Style


def run_setup(force: bool) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? How about using a JINJA template, or just create a base file and append SECRET_KEY = '{secrets.token_urlsafe(42)}' saying it because it does not seem very maintainable this way (specifically talking about the config_content. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall I agree. its not very clean. the rest of superset just provides readme instructions, i wanted to provide something simpler at least for dev. im gonna try the base file approach and see how it looks. i'll also update the README for the manual steps maybe thats all we need in the long term.

if "WEBDRIVER_BASEURL" not in content:
additions.append("\n# WebDriver Configuration for screenshots")
additions.append("WEBDRIVER_BASEURL = 'http://localhost:9001/'")
additions.append("WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh! does MCP need a webdriver, just out of curiosity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it needs it for the screenshots of charts and dashboards.

"sqlalchemy.pool",
"sqlalchemy.dialects",
]:
logging.getLogger(logger_name).setLevel(logging.INFO)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this remove the possibility to configure logging using a logger.ini file?
I find it to be the easiest and most flexible way to do it in production

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah you are right. i think i was trying to fix the stdio approach and this is left over from that :D

logger = logging.getLogger(__name__)

# Create MCP instance without auth for scaffold
mcp = FastMCP(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we could have a configurable factory to create the FastMCP instance, this way users can freely configure it the way they want/need, add custom Auth, Middleware etc. Copying the Flask pattern.
https://gofastmcp.com/servers/server#creating-a-server

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea!

@rusackas
Copy link
Member

Linking to a couple relevant SIPs for good measure:
#33870
#33215

I'll slap the appropriate hold label on this since we probably ought to get consensus on this stuff before merging.

@aminghadersohi aminghadersohi force-pushed the feat/mcp_service_pr1_scaffold branch from 2221f8b to ed73757 Compare September 19, 2025 17:45
@github-actions github-actions bot removed the plugins label Sep 19, 2025
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!!

Copy link

codecov bot commented Sep 22, 2025

Codecov Report

❌ Patch coverage is 0% with 260 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.43%. Comparing base (77a5969) to head (6bb4feb).

Files with missing lines Patch % Lines
superset/mcp_service/__main__.py 0.00% 51 Missing ⚠️
superset/mcp_service/app.py 0.00% 46 Missing ⚠️
...uperset/mcp_service/examples/custom_mcp_factory.py 0.00% 36 Missing ⚠️
superset/mcp_service/server.py 0.00% 33 Missing ⚠️
superset/mcp_service/simple_proxy.py 0.00% 31 Missing ⚠️
superset/mcp_service/mcp_config.py 0.00% 26 Missing ⚠️
superset/mcp_service/flask_singleton.py 0.00% 16 Missing ⚠️
superset/cli/mcp.py 0.00% 15 Missing ⚠️
superset/mcp_service/__init__.py 0.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #35163       +/-   ##
===========================================
+ Coverage        0   71.43%   +71.43%     
===========================================
  Files           0      596      +596     
  Lines           0    43744    +43744     
  Branches        0     4726     +4726     
===========================================
+ Hits            0    31249    +31249     
- Misses          0    11266    +11266     
- Partials        0     1229     +1229     
Flag Coverage Δ
hive 46.00% <0.00%> (?)
mysql 70.46% <0.00%> (?)
postgres 70.51% <0.00%> (?)
presto 49.67% <0.00%> (?)
python 71.40% <0.00%> (?)
sqlite 70.11% <0.00%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…perset

This commit introduces the basic scaffold for the MCP service that enables
AI agents to interact with Apache Superset programmatically.

## Key Components

### Core Structure
- **CLI Interface**: `superset mcp run` and `superset mcp setup` commands
- **FastMCP Server**: HTTP-based MCP server using FastMCP protocol
- **Flask Integration**: Singleton pattern for Superset app integration
- **Configuration Setup**: Automated setup script for MCP service config

### Files Added
- `superset/cli/mcp.py`: CLI commands for running and setting up MCP service
- `superset/mcp_service/`: Core MCP service package with server, app, and config
- `superset/mcp_service/scripts/setup.py`: Configuration setup utilities
- `superset/config_mcp.py`: MCP-specific configuration template
- Development container configurations for MCP-enabled development

### Features
- Basic HTTP transport for MCP communication
- Configuration management for Superset integration
- Development environment setup with Docker support
- Node.js proxy support for alternative connection methods

This scaffold provides the foundation for subsequent PRs that will add:
- Authentication and authorization
- Tool implementations (dashboards, charts, datasets)
- Advanced middleware and permissions
- Production-ready features

The implementation follows Superset's architectural patterns and can be
extended incrementally without breaking changes.
- Add mcp_config.py with centralized MCP configuration
- Refactor setup.py to use import-based config instead of hardcoded strings
- Update README with manual setup instructions following standard Superset patterns
- Replace hidden MCP_DEBUG env var with proper app.config setting
- Remove opinionated warnings filter to let logger.ini handle warnings
- Make configuration more maintainable and discoverable
Add Flask-style application factory for FastMCP instances with support for:
- Custom authentication providers via auth parameter
- Middleware and lifespan handlers
- Tag-based tool filtering (include_tags/exclude_tags)
- Additional configuration options
- Backward compatibility with existing mcp instance

Examples provided in superset/mcp_service/examples/ showing usage
patterns for secure, filtered, and custom configurations.
Fix invalid import from non-existent superset.mcp_service.config module
to use get_mcp_config() from superset.mcp_service.mcp_config instead.
Add health_check tool in system/tool/ that returns:
- Service status and timestamp
- System information (Python version, platform)
- Basic uptime information

Tool is automatically registered via import in __init__.py
for easy connectivity testing of the MCP service.
- Replace complex threading singleton with simple module-level pattern
- Clean up server.py imports and formatting
- Update README.md with better configuration examples
Remove mcp_service/scripts/ directory and superset mcp setup command
to simplify the MCP service implementation.
- Only configure basic logging if no handlers exist (respects logging.ini)
- Only override SQLAlchemy logger levels if they're at default levels
- Prevents interference with production logging configuration
Remove time.sleep(0.1) from signal handler in simple_proxy.py
as it's not a maintainable solution for cleanup. FastMCP
proxy should handle its own cleanup when process exits.
Add required Apache Software Foundation license headers to:
- superset/mcp_service/flask_singleton.py
- superset/mcp_service/simple_proxy.py
- superset/mcp_service/run_proxy.sh
- superset/mcp_service/index.js
- superset/mcp_service/bin/superset-mcp.js

Fixes license check violations for MCP service scaffold.
Add FastMCP package dependency to requirements/base.txt for
MCP service functionality. Include base-constraint.txt for
dependency version management.

Update pyproject.toml with any related configuration changes.
- Make MCP service imports optional in CLI to prevent CI failures
- Add proper exception chaining with "from e" to fix B904 lint error
- MCP service now gracefully handles missing fastmcp dependency
- Provides helpful error message when dependencies are missing

This allows CI to pass without fastmcp while maintaining MCP
functionality when the dependency is installed.
 Please enter the commit message for your changes. Lines starting
Update MCP config loading to properly read from Flask app.config
before falling back to defaults. This allows users to override
MCP settings in their superset_config.py file.

- Refactor get_mcp_config() to accept app_config parameter
- Use Pythonic dict unpacking for cleaner config merging
- Add type hints with modern union syntax
- Update flask_singleton.py to pass app.config to get_mcp_config()
@aminghadersohi aminghadersohi force-pushed the feat/mcp_service_pr1_scaffold branch from be6d3b1 to 6bb4feb Compare September 25, 2025 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants