- FastAPI - High-performance Python web framework with SSE streaming
- JWT Authentication - Secure token-based user authentication
- Yahoo Finance API - Real-time market data integration
- NumPy & Pandas - Vectorized analytics for sub-second responses
- uvicorn - Lightning-fast ASGI server with WebSocket support
- React 18 - Modern component-based UI framework
- Material-UI (MUI) - Professional Material Design component library
- React Router - Client-side routing and navigation
- Axios - HTTP client for API communication
- Chart.js - Interactive data visualization
- Ollama - Local LLM inference with quantized Llama 3.1 8B model
- FAISS - Vector similarity search index used for hybrid semantic + keyword search
- Sentence Transformers - Financial document embeddings
- CUDA Acceleration - GPU-optimized inference (30-70 tok/s)
- LangChain - LLM orchestration and prompt engineering
- yfinance - Historical market data retrieval
- Enhanced Monte Carlo - Multi-path simulations with correlation modeling
- Black-Scholes with Greeks - Full derivatives pricing (Delta, Gamma, Theta, Vega, Rho)
- Advanced Risk Models - VaR/CVaR, stress testing, and portfolio optimization
- Retool - No-code admin dashboard with real-time metrics
- Zapier - Automated workflows for alerts and onboarding
- n8n - Complex automation for rebalancing and compliance
- Webhooks - Event-driven portfolio management
- Built complete RESTful API with FastAPI featuring 20+ endpoints
- Implemented JWT-based authentication with secure token management
- Created responsive React SPA with Material-UI component system
- Designed scalable backend architecture supporting management of multiple portfolios
- Local LLM Inference: Quantized Llama 3.1 8B with CUDA acceleration (30-70 tok/s)
- RAG Pipeline: FAISS hybrid search across financial databases
- Real-time AI Analysis: SSE streaming for live portfolio insights
- Investment Signals: AI-generated buy/sell recommendations with confidence scores
- Enhanced Monte Carlo: 10,000+ simulations with correlation modeling
- Black-Scholes with Greeks: Full derivatives pricing and risk sensitivity analysis
- Multi-Portfolio Risk Management: Consolidated analysis across multiple accounts
- Developed intuitive dashboard with portfolio management capabilities
- Implemented multiple portfolio views (Standard, Robinhood-style, Analytics)
- Created interactive charts and visualizations for financial data
- Built responsive design optimized for desktop and mobile devices
- Historical position entry for dates back to the year 2000
- Secure data validation and sanitization across all endpoints
- Implemented proper error handling and user feedback systems
- Built comprehensive logging and monitoring capabilities
- Create/Delete Portfolios - Complete CRUD operations with confirmation dialogs
- Add/Remove Holdings - Dynamic position management with real-time updates
- Historical Data Integration - Add positions from any date since 2000
- Real-time Price Updates - Live market data with automatic refresh
- Performance Analytics - P&L tracking, percentage gains/losses, volatility analysis
- Multi-Method VaR - Historical, parametric, and Monte Carlo VaR calculations
- Enhanced Monte Carlo - Correlated asset simulations with 10,000+ scenarios
- AI Risk Explanations - Plain-English interpretation of complex risk metrics
- Cross-Portfolio Analysis - Consolidated risk across multiple portfolios
- Stress Testing - AI-powered scenario analysis and tail risk assessment
- Black-Scholes with Greeks - Complete sensitivity analysis (Delta, Gamma, Theta, Vega, Rho)
- Monte Carlo Option Pricing - Simulation-based pricing for complex derivatives
- Implied Volatility Surface - 3D volatility visualization across strikes and expiries
- AI Options Analysis - Intelligent options strategy recommendations
- Real-time Greeks Monitoring - Live risk sensitivity tracking
- AI-Powered Optimization - Machine learning enhanced portfolio allocation
- Multi-Portfolio Management - Cross-portfolio optimization and rebalancing
- RAG-Enhanced Research - AI-driven market research and analysis
- Automated Rebalancing - Workflow-driven portfolio maintenance
- Performance Attribution - AI-explained return decomposition
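
The Black-Scholes pricing and Greeks listed above can be illustrated with a short closed-form sketch. It uses only the standard library and covers a European call on a non-dividend-paying asset. The function name `black_scholes_call` and the sample inputs are assumptions for illustration, not AuraVest's actual implementation.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def black_scholes_call(spot, strike, rate, vol, t):
    """Price and Greeks of a European call (no dividends); t is in years."""
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    disc = math.exp(-rate * t)  # discount factor
    return {
        "price": spot * norm_cdf(d1) - strike * disc * norm_cdf(d2),
        "delta": norm_cdf(d1),
        "gamma": norm_pdf(d1) / (spot * vol * sqrt_t),
        # Theta is per year; divide by 365 for a per-day figure
        "theta": -spot * norm_pdf(d1) * vol / (2 * sqrt_t)
                 - rate * strike * disc * norm_cdf(d2),
        # Vega and rho are per 1.00 change in volatility and rate respectively
        "vega": spot * norm_pdf(d1) * sqrt_t,
        "rho": strike * t * disc * norm_cdf(d2),
    }


# Illustrative inputs: at-the-money call, 5% rate, 20% volatility, one year
greeks = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, t=1.0)
for name, value in greeks.items():
    print(f"{name}: {value:.4f}")
```

Put values follow from put-call parity, and a real-time Greeks monitor would simply re-evaluate this function as spot and implied volatility change.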
The following table shows the cumulative effect of each optimization step on local LLM inference on consumer RTX hardware:
| Change | Tok/s | TTFT (ms) | p95 (ms) | VRAM (GB) | Δ Quality |
|---|---|---|---|---|---|
| Baseline FP16 | 32 | 350 | 1200 | 18.2 | 0 |
| + vLLM (paged KV) | 54 | 260 | 980 | 16.9 | 0 |
| + Q4_K_M | 68 | 250 | 910 | 13.1 | −1.3% |
| + FlashAttn/fused | 81 | 240 | 820 | 13.1 | −1.3% |
| + batched tokenizers/SSE | 81 | 210 | 620 | 13.1 | −1.3% |
Performance Metrics:
- Tok/s: Tokens per second throughput
- TTFT: Time to first token (latency)
- p95: 95th percentile response time
- VRAM: GPU memory usage
- Δ Quality: Accuracy delta vs baseline
This optimization pipeline achieves a roughly 2.5x throughput improvement (32→81 tok/s) and a 28% reduction in VRAM usage (18.2→13.1 GB), at the cost of a 1.3% accuracy drop relative to the FP16 baseline.
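
The summary figures can be checked directly against the first and last rows of the table. The snippet below is only a worked-arithmetic sketch; the dictionary layout is an assumption made for illustration.

```python
# First and last rows of the optimization table
baseline = {"tok_s": 32, "vram_gb": 18.2}  # Baseline FP16
final = {"tok_s": 81, "vram_gb": 13.1}     # + batched tokenizers/SSE

speedup = final["tok_s"] / baseline["tok_s"]
vram_reduction_pct = (
    100 * (baseline["vram_gb"] - final["vram_gb"]) / baseline["vram_gb"]
)

print(f"throughput: {speedup:.2f}x, VRAM reduction: {vram_reduction_pct:.1f}%")
# throughput: 2.53x, VRAM reduction: 28.0%
```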
# JWT Authentication with FastAPI
from fastapi import Depends, HTTPException

@app.post("/auth/login", response_model=Token)
async def login(user_credentials: UserLogin):
    user = authenticate_user(user_credentials.email, user_credentials.password)
    if not user:
        # Assumes authenticate_user returns a falsy value on bad credentials
        raise HTTPException(status_code=401, detail="Incorrect email or password")
    access_token = create_access_token(data={"sub": user_credentials.email})
    return {"access_token": access_token, "token_type": "bearer"}

# Real-time Portfolio Analytics
@app.get("/portfolio/{portfolio_id}")
async def get_portfolio(portfolio_id: int, current_user: User = Depends(get_current_user)):
    # Advanced portfolio calculations with risk metrics (not shown in this excerpt)
    return sanitize_for_json(portfolio_data)

// React Component with Material-UI
function Portfolio() {
  const [portfolio, setPortfolio] = useState(null);
  const [deleteDialog, setDeleteDialog] = useState({ open: false });
  // portfolioId and fetchPortfolio are assumed to be defined elsewhere in the component (not shown)
  const handleDeleteHolding = async (holdingId) => {
    await axios.delete(`/portfolio/${portfolioId}/holdings/${holdingId}`);
    fetchPortfolio(); // Real-time updates
  };
}

# Monte Carlo Simulation Implementation
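import numpy as np

def run_monte_carlo_simulation(portfolio_data, n_simulations=10000):
    # Assumed key; adapt to the actual portfolio schema
    initial_value = portfolio_data["total_value"]
    mean_return = 0.08  # assumed expected return per period
    volatility = 0.2    # assumed standard deviation of returns
    random_returns = np.random.normal(mean_return, volatility, n_simulations)
    # Vectorized: one simulated end-of-period value per draw
    final_values = initial_value * (1 + random_returns)
    return calculate_percentiles(final_values)

- Portfolio Transparency - Complete visibility into holdings and performance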
- Risk Assessment - Quantitative risk analysis with actionable insights
- Historical Analysis - Track performance from any starting date since 2000
- Professional Tools - Institution-grade analytics accessible to retail investors
- Modular Architecture - Easily extensible for additional features
- API-First Design - RESTful endpoints suitable for mobile app integration
- Real-time Data - Live market data integration with automated updates
- Performance Optimized - Efficient data processing and responsive UI
- Real-time stock prices from Yahoo Finance
- Historical price data dating back to 2000
- Automatic portfolio value calculations
- Daily P&L tracking with percentage changes
- Interactive charts and visualizations
- Risk metrics and correlation analysis
- Monte Carlo simulation results
- Portfolio optimization recommendations
- Secure JWT-based authentication
- User registration and login
- Protected routes and API endpoints
- Session management and token refresh
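
The token-based authentication listed above can be sketched with the standard library alone. The example shows how an HS256-signed JWT is assembled and verified; the secret, the helper names `create_token` and `verify_token`, and the 30-minute lifetime are assumptions for illustration. A real deployment would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # hypothetical secret; load from configuration in practice


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def create_token(subject: str, ttl_seconds: int = 1800) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str) -> dict:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        SECRET, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload


token = create_token("demo@auravest.com")
claims = verify_token(token)
print(claims["sub"])
```

Token refresh then amounts to verifying a still-valid token and issuing a new one with a later `exp`.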
# One-command automated setup (recommended)
python3 setup.py
# Or test dependencies first
python3 test_dependencies.py

The automated setup will:
- Detect your platform (macOS Intel/Silicon, Linux GPU, Windows)
- Install appropriate dependencies for optimal performance
- Download Ollama and LLM models automatically
- Test the installation and provide feedback
- Create platform-specific startup scripts
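
The platform detection step can be sketched roughly as follows. This is not the contents of `setup.py`; the function names and the mapping from platform to requirements file are assumptions based on the file names used in this README, and the presence of `nvidia-smi` on the PATH is only a rough proxy for a usable NVIDIA GPU.

```python
import platform
import shutil


def recommend_requirements(system: str, has_nvidia_gpu: bool) -> str:
    """Map a platform to one of the requirements files named in this README."""
    if system == "Darwin":
        return "requirements-mac.txt"
    if system == "Linux" and has_nvidia_gpu:
        return "requirements-gpu.txt"
    # Windows and GPU-less Linux fall back to the minimal install
    return "requirements-core.txt"


def detect() -> str:
    """Inspect the current machine and return the suggested requirements file."""
    has_gpu = shutil.which("nvidia-smi") is not None
    return recommend_requirements(platform.system(), has_gpu)


print(f"Suggested: pip install -r {detect()}")
```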
Requirements File: requirements-mac.txt
- Optimized for macOS Metal Performance Shaders
- CPU-optimized FAISS and PyTorch
- Compatible with both Intel and Apple Silicon
# Install dependencies optimized for Mac
pip install -r requirements-mac.txt
# Install Ollama for AI features
brew install ollama
# OR: curl -fsSL https://ollama.ai/install.sh | sh
# Download LLM model
ollama pull llama3.1:8b-instruct-q4_K_M
# Start services
./start_backend.sh # Terminal 1
./start_frontend.sh # Terminal 2

Requirements File: requirements-gpu.txt
- CUDA-accelerated PyTorch and FAISS
- GPU monitoring utilities
- Optimized for 30-70 tokens/second LLM inference
# Verify GPU first
nvidia-smi
# Install GPU-optimized dependencies
pip install -r requirements-gpu.txt
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Download LLM model
ollama pull llama3.1:8b-instruct-q4_K_M
# Start with GPU acceleration
CUDA_VISIBLE_DEVICES=0 ./start_backend.sh # Terminal 1
./start_frontend.sh # Terminal 2

Requirements File: requirements-core.txt or requirements-mac.txt
- Manual Ollama installation required
- GPU support via CUDA toolkit
# Install Python dependencies
pip install -r requirements-core.txt
# Download Ollama from https://ollama.ai/download/windows
# Then: ollama pull llama3.1:8b-instruct-q4_K_M
# Start services
start_backend.bat # Terminal 1
start_frontend.bat # Terminal 2

Requirements File: requirements-core.txt
- Minimal installation without AI features
- Basic portfolio analytics only
# Minimal installation
pip install -r requirements-core.txt
# Test installation
python3 test_dependencies.py
# Start backend (AI features limited)
python3 main.py
# Frontend (separate terminal)
cd frontend && npm install && npm start

# Step 1: Test your platform
python3 test_dependencies.py
# Step 2: Install based on recommendation
pip install -r [recommended-file]
# Step 3: Test again
python3 test_dependencies.py
# Step 4: Start application
./start_backend.sh
./start_frontend.sh

# Automated setup (detects your platform automatically)
python3 setup.py
# Manual quick start
python3 test_dependencies.py # Check what you need
pip install -r requirements-core.txt # Install basics
./start_backend.sh # Start backend
./start_frontend.sh # Start frontend (new terminal)
# Access application
open http://localhost:3000

- Demo User: demo@auravest.com / demo123
- Admin User: admin@auravest.com / admin123
- Frontend Application: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- Interactive Swagger UI: http://localhost:8000/docs
- Core API: `/portfolio/*`, `/auth/*`, `/market/*`
- AI Features: `/ai/*` - LLM analysis and RAG search
- Multi-Portfolio: `/multi-portfolio/*` - Cross-portfolio management
- Automation: `/automation/*` - Workflow triggers and admin tools
- Advanced Analytics: `/options/*`, `/analysis/*`
| Feature | Core | macOS | GPU | Full AI |
|---|---|---|---|---|
| Portfolio Management | Yes | Yes | Yes | Yes |
| Risk Analytics | Yes | Yes | Yes | Yes |
| Monte Carlo | Yes | Yes | Yes | Yes |
| Options Pricing | Yes | Yes | Yes | Yes |
| Local LLM | No | Yes | Yes | Yes |
| RAG Search | No | Yes | Yes | Yes |
| AI Analysis | No | Yes | Yes | Yes |
| GPU Acceleration | No | Metal | CUDA | CUDA |
1. Python Version
- Requires Python 3.10+
- Check: `python3 --version`
2. Node.js Missing
- Install from https://nodejs.org/
- Requires Node.js 16+
3. Dependency Conflicts
# Try core requirements first
pip install -r requirements-core.txt
# Test what works
python3 test_dependencies.py
# Add AI packages individually
pip install ollama faiss-cpu sentence-transformers

4. Ollama Connection
- macOS/Linux: Service starts automatically
- Windows: Manual installation required
- Test: `ollama list`
5. GPU Not Detected
- Check: `nvidia-smi` (Linux)
- Install CUDA toolkit if needed
- Use CPU-only requirements as fallback
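
The checks in the troubleshooting items above can be combined into one quick preflight script. This is a hypothetical helper, not the project's `test_dependencies.py`; it only confirms the Python version and whether the relevant executables are on the PATH, and it does not verify the Node.js version.

```python
import shutil
import sys


def preflight() -> dict:
    """Return a pass/fail flag for each prerequisite mentioned above."""
    return {
        "python>=3.10": sys.version_info >= (3, 10),
        "node": shutil.which("node") is not None,
        "ollama": shutil.which("ollama") is not None,
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


for name, ok in preflight().items():
    print(f"{'OK' if ok else 'MISSING':8}{name}")
```

A missing `nvidia-smi` is expected on macOS and on CPU-only machines, in which case the CPU-only requirements apply.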
- Local LLM Deployment: Optimized quantized Llama 3.1 8B with CUDA acceleration achieving 30-70 tok/s
- RAG Pipeline Architecture: Built FAISS hybrid search with semantic + keyword matching
- Real-time AI Streaming: Implemented FastAPI SSE for live analysis and research
- Performance Optimization: Vectorized analytics with NumPy/Pandas for sub-second responses
- Multi-Portfolio Risk Management: Advanced correlation analysis and cross-portfolio optimization
- Created responsive React application with Material-UI component system
- Implemented complex state management for real-time portfolio updates
- Designed intuitive user interface for financial data visualization
- Built interactive charts and dashboards for quantitative analysis
- Developed high-performance FastAPI backend with JWT authentication
- Integrated Yahoo Finance API for live market data processing
- Implemented advanced financial calculations including risk metrics and portfolio optimization
- Created comprehensive API documentation with automatic OpenAPI generation
- Enhanced Monte Carlo Engine: Multi-path simulations with correlation modeling and stress testing
- Black-Scholes with Greeks: Complete derivatives pricing with sensitivity analysis
- AI-Powered Risk Analysis: LLM-generated explanations and investment signals
- RAG Financial Research: Semantic search across earnings, news, and market data
- Automated Risk Management: Workflow-driven compliance and portfolio rebalancing
- LLM Throughput: 30-70 tokens/second on consumer RTX GPUs
- Monte Carlo Speed: 10,000 simulations in <2 seconds
- Portfolio Analytics: Sub-second response for 100+ holdings
- RAG Search: <500ms hybrid search across financial databases
- Real-time Streaming: Live AI analysis with FastAPI SSE
- Retool Admin Dashboard: Real-time portfolio monitoring and user analytics
- Zapier Integration: Automated email alerts and user onboarding workflows
- n8n Complex Workflows: Portfolio rebalancing and compliance automation
- Webhook-Driven Events: Real-time notifications and risk management
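
The correlation modeling and VaR/CVaR items above can be illustrated with a small standard-library sketch. It simulates one-period returns for a two-asset portfolio with correlated normal shocks, then reads VaR and CVaR off the simulated distribution. The weights, means, volatilities, correlation, and function names are illustrative assumptions, not values or code from AuraVest.

```python
import math
import random


def simulate_portfolio_returns(weights, means, vols, corr, n_paths=10_000, seed=42):
    """One-period portfolio returns for two assets with correlated normal shocks."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_paths):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # The 2x2 Cholesky factor of the correlation matrix turns the
        # independent draws (z1, z2) into shocks with correlation `corr`
        e1 = z1
        e2 = corr * z1 + math.sqrt(1.0 - corr ** 2) * z2
        r1 = means[0] + vols[0] * e1
        r2 = means[1] + vols[1] * e2
        results.append(weights[0] * r1 + weights[1] * r2)
    return results


def var_cvar(returns, confidence=0.95):
    """Value at Risk and Conditional VaR, reported as positive loss fractions."""
    losses = sorted(-r for r in returns)  # ascending: worst losses at the end
    cutoff = int(confidence * len(losses))
    tail = losses[cutoff:]
    return losses[cutoff], sum(tail) / len(tail)


returns = simulate_portfolio_returns(
    weights=(0.6, 0.4), means=(0.08, 0.05), vols=(0.20, 0.10), corr=0.3
)
var_95, cvar_95 = var_cvar(returns)
print(f"95% VaR: {var_95:.1%}, 95% CVaR: {cvar_95:.1%}")
```

For more than two assets the same idea applies with a full Cholesky decomposition of the covariance matrix, which is where a vectorized NumPy implementation pays off.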
AuraVest represents a cutting-edge demonstration of AI-powered financial engineering, combining local LLM inference, advanced quantitative models, and automated workflow orchestration in a production-ready platform.