Real-time AI object detection desktop application.
Point your webcam at anything — VisionTrack AI identifies objects instantly using two AI models running fully on your machine.
Features · Getting Started · Tech Stack · Architecture · Build · Author
- Real-time webcam detection with live bounding boxes and confidence scores
- Dual AI pipeline — COCO-SSD for fast bounding boxes + MobileNet for specific labels
- 1000+ object classes including everyday items: water bottle, coffee mug, pencil, laptop, phone, book, chair, and many more
- GPU-accelerated inference via WebGL backend (TensorFlow.js)
- CPU fallback when GPU is unavailable — works on any machine
- Detection history stored locally in SQLite — fully offline, no cloud
- Auto screenshot captured on every new object detection
- System notifications when a new object appears
- Search & export history to JSON or CSV
- Dark mode UI built with React + TailwindCSS
- Works offline after first model download
| Tool | Version |
|---|---|
| Node.js | 18 or higher |
| npm | 9 or higher |
| Git | Any recent version |
# 1. Clone the repository
git clone https://github.com/jovbcorreia/visiontrack-ai.git
cd visiontrack-ai
# 2. Install dependencies
# (also rebuilds better-sqlite3 for Electron automatically)
npm install
# 3. Start in development mode
npm run dev

First launch: The COCO-SSD model (~25 MB) and MobileNet model (~16 MB) are downloaded from Google's servers. This only happens once; after that the app works fully offline.
- Wait for the status badge to turn green (COCO-SSD ready)
- Click Start Camera and allow camera permission when prompted
- Point the camera at any object
- Detection results appear instantly in the sidebar with name, confidence %, and bounding boxes
- Everything is automatically saved to the History tab
- Use the History tab to search, view, and export detections
| Script | Description |
|---|---|
| `npm run dev` | Start Vite dev server + Electron with hot reload |
| `npm run build` | Build React app to dist/ |
| `npm run dist:mac` | Build macOS .dmg installer |
| `npm run dist:win` | Build Windows .exe installer |
| `npm run dist` | Build for both platforms |
| Technology | Version | Role |
|---|---|---|
| Electron | 29 | Cross-platform desktop shell (macOS + Windows) |
| electron-builder | 24 | Packaging and distribution (.dmg, .exe) |
| Technology | Version | Role |
|---|---|---|
| React | 18 | UI component framework |
| TailwindCSS | 3 | Utility-first styling |
| Vite | 5 | Bundler and dev server |
| Technology | Version | Role |
|---|---|---|
| TensorFlow.js | 4 | Core ML runtime (browser/Electron) |
| COCO-SSD | 2.2 | Object detection with bounding boxes (80 classes) |
| MobileNet v1 | 2.1 | Fine-grained image classification (1000 ImageNet classes) |
| WebGL backend | — | GPU-accelerated inference via Chromium |
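
The WebGL-first, CPU-fallback behavior listed above could be sketched roughly as follows. This is a minimal sketch under assumptions, not the app's actual code: `selectBackend` is a hypothetical helper, and `tf` stands for the imported `@tensorflow/tfjs` module, passed in as an argument so the logic can be exercised without a GPU. It relies on `tf.setBackend()` resolving to `false` (or throwing) when a backend cannot be initialized.

```javascript
// Hypothetical helper: try each backend in order and return the first that initializes.
// `tf` is the @tensorflow/tfjs module, injected so the logic can be tested with a stub.
async function selectBackend(tf, candidates = ['webgl', 'cpu']) {
  for (const name of candidates) {
    try {
      // setBackend resolves to true when the backend initialized successfully.
      if (await tf.setBackend(name)) {
        await tf.ready();
        return name;
      }
    } catch (err) {
      // Initialization threw (for example, no WebGL context); try the next candidate.
    }
  }
  throw new Error(`No usable TensorFlow.js backend among: ${candidates.join(', ')}`);
}
```

In the renderer, something like `await selectBackend(tf)` would run once before the models are loaded, and the returned name could feed the status badge.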
| Technology | Version | Role |
|---|---|---|
| better-sqlite3 | 9 | Fast synchronous SQLite for detection history |
| Node.js | 20 (embedded in Electron) | File system, IPC, screenshot saving |
| Pattern | Description |
|---|---|
| `contextBridge` | Secure bridge between renderer and main process |
| `ipcMain.handle` / `ipcRenderer.invoke` | Async two-way communication |
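
To illustrate how these two patterns usually fit together, here is a minimal sketch of what a preload script might expose. The method names match the `window.electronAPI` calls described in this README; the channel names and the `buildElectronApi` helper are assumptions made for illustration, and the real `preload.js` may be organized differently.

```javascript
// Hypothetical channel names; the real app may use different ones.
const CHANNELS = {
  addDetection: 'history:add',
  getHistory: 'history:get',
  saveScreenshot: 'screenshot:save',
  exportCSV: 'history:export-csv',
};

// Build the object that a preload script would expose to the renderer.
// Each method forwards its arguments over one fixed channel, so the renderer
// never gets direct access to ipcRenderer itself.
function buildElectronApi(ipcRenderer) {
  const api = {};
  for (const [method, channel] of Object.entries(CHANNELS)) {
    api[method] = (...args) => ipcRenderer.invoke(channel, ...args);
  }
  return Object.freeze(api);
}

// In electron/preload.js this would be wired up roughly as:
//   const { contextBridge, ipcRenderer } = require('electron');
//   contextBridge.exposeInMainWorld('electronAPI', buildElectronApi(ipcRenderer));
```

On the main-process side, each channel would need a matching `ipcMain.handle(channel, handler)` registration; the value the handler returns is what the renderer's promise resolves to.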
visiontrack-ai/
├── electron/ # Main process (Node.js / Electron APIs)
│ ├── main.js # Window management, IPC handlers, app lifecycle
│ ├── preload.js # Secure contextBridge — exposes window.electronAPI
│ └── database.js # SQLite CRUD via better-sqlite3
│
├── src/ # Renderer process (React, runs in Chromium)
│ ├── index.html # HTML entry point
│ ├── main.jsx # React root
│ ├── App.jsx # Root component — wires all hooks together
│ │
│ ├── components/
│ │ ├── CameraFeed.jsx # <video> + <canvas> overlay for bounding boxes
│ │ ├── Sidebar.jsx # Live detections tab + History tab
│ │ ├── Controls.jsx # Start / Stop / Clear buttons
│ │ ├── StatusBar.jsx # Top bar: logo, model status, FPS, LIVE badge
│ │ └── Toast.jsx # In-app detection notifications
│ │
│ ├── hooks/
│ │ ├── useCamera.js # MediaStream lifecycle (getUserMedia)
│ │ ├── useDetection.js # COCO-SSD + MobileNet inference loop (rAF)
│ │ └── useHistory.js # SQLite history via IPC calls
│ │
│ ├── utils/
│ │ ├── canvas.js # Bounding box drawing with corner accents
│ │ └── colors.js # Class → hex color + emoji (1000+ patterns)
│ │
│ └── styles/
│ └── index.css # Tailwind base + custom component classes
│
├── assets/ # App icons and macOS entitlements
├── vite.config.js
├── tailwind.config.js
├── electron-builder.config.js
└── package.json
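
The tree describes `useDetection.js` as a `requestAnimationFrame` (rAF) loop, while inference is stated to run on a frame roughly every 130 ms. One common way to reconcile the two is a small time gate inside the loop. The sketch below is an assumption about how that could look, with `createFrameGate` as a hypothetical helper; it is not taken from the project's source.

```javascript
// Hypothetical helper: returns a function that reports whether enough time
// has passed since the last accepted frame to run inference again.
function createFrameGate(intervalMs = 130) {
  let lastRun = -Infinity;
  return function shouldRun(nowMs) {
    if (nowMs - lastRun < intervalMs) return false;
    lastRun = nowMs;
    return true;
  };
}

// Sketch of use inside a requestAnimationFrame loop (renderer only):
//   const gate = createFrameGate(130);
//   function tick(now) {
//     if (gate(now)) runInference();   // runInference is a placeholder
//     requestAnimationFrame(tick);
//   }
//   requestAnimationFrame(tick);
```

Keeping the gate separate from the loop means the display can still repaint at the screen's refresh rate while the models run at a lower, steadier cadence.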
Webcam frame (every ~130ms)
│
▼
COCO-SSD model
──────────────
• Detects bounding boxes
• 80 generic classes
• e.g. "bottle", "laptop", "person"
│
▼ (on NEW objects only)
MobileNet model — crops each bbox region
──────────────────────────────────────────
• Classifies what's inside the box
• 1000 ImageNet classes
• e.g. "water bottle", "notebook computer", "coffee mug"
│
▼
Enhanced label displayed
Screenshot captured
SQLite history updated
System notification fired
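
The two-stage flow above can be sketched as a single function. This is a hedged illustration rather than the app's implementation: `processFrame`, the `crop` helper, and the `tracked` map are hypothetical, and treating an object as "new" when its class has not been seen for 30 seconds is an assumption borrowed from the 30 s session window mentioned in the database schema. The result shapes follow the published COCO-SSD (`detect`) and MobileNet (`classify`) model APIs.

```javascript
// Hypothetical sketch of the two-stage pipeline.
// detector.detect(frame)     -> [{ bbox: [x, y, width, height], class, score }]  (COCO-SSD shape)
// classifier.classify(image) -> [{ className, probability }]                     (MobileNet shape)
// crop(frame, bbox)          -> image of the box region (assumed helper)
async function processFrame(frame, models, tracked, nowMs, windowMs = 30000) {
  const { detector, classifier, crop } = models;
  const results = [];
  for (const det of await detector.detect(frame)) {
    const entry = tracked.get(det.class);
    const isNew = !entry || nowMs - entry.lastSeenMs > windowMs;

    // Reuse the refined label for objects already being tracked.
    let label = isNew ? det.class : entry.label;
    if (isNew) {
      // Only new objects pay for the second model.
      const [top] = await classifier.classify(crop(frame, det.bbox));
      if (top) label = top.className;
    }

    tracked.set(det.class, { label, lastSeenMs: nowMs });
    results.push({ label, confidence: det.score, bbox: det.bbox, isNew });
  }
  return results;
}
```

Results flagged `isNew` are the ones that would go on to trigger the screenshot, history write, and notification steps; the rest only refresh the overlay.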
Renderer (React) Main Process (Node.js)
│ │
│ window.electronAPI.addDetection() │
│ ────────────────────────────────────►│ better-sqlite3 INSERT
│ │
│ window.electronAPI.getHistory() │
│ ────────────────────────────────────►│ better-sqlite3 SELECT
│ ◄────────────────────────────────────│ returns rows[]
│ │
│ window.electronAPI.saveScreenshot() │
│ ────────────────────────────────────►│ fs.writeFileSync (PNG)
│ │
│ window.electronAPI.exportCSV() │
│ ────────────────────────────────────►│ dialog.showSaveDialog
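
Before `exportCSV()` can hand anything to the save dialog, the history rows have to be serialized. The following is a minimal sketch of that step, assuming RFC 4180 style quoting and a subset of the column names from the `detections` table documented in this README. `toCsv` and `csvField` are hypothetical names, and the real export may choose different columns or ordering.

```javascript
// Column order is an assumption; names come from the detections table.
const COLUMNS = ['id', 'object_name', 'confidence', 'first_seen', 'last_seen', 'count', 'screenshot_path'];

// Quote a field if it contains a comma, quote, or line break (RFC 4180 style).
function csvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize an array of row objects into CSV text with a header line.
function toCsv(rows, columns = COLUMNS) {
  const lines = [columns.join(',')];
  for (const row of rows) {
    lines.push(columns.map((col) => csvField(row[col])).join(','));
  }
  return lines.join('\r\n') + '\r\n';
}
```

Quoting matters here because MobileNet labels often contain commas, so an unquoted export would shift columns for those rows.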
| Data | Location |
|---|---|
| SQLite database (macOS) | ~/Library/Application Support/visiontrack-ai/visiontrack.db |
| SQLite database (Windows) | %APPDATA%\visiontrack-ai\visiontrack.db |
| Screenshots | <userData>/screenshots/capture-<timestamp>.png |
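
As a rough illustration of the screenshot location above, the sketch below builds the file path and decodes a canvas capture into bytes. Both helpers are hypothetical. It assumes `<timestamp>` is a Unix millisecond value and that the renderer sends the capture as a base64 PNG data URL; neither detail is confirmed by this README. In the main process, `userDataDir` would typically come from Electron's `app.getPath('userData')`.

```javascript
const path = require('node:path');

// Hypothetical helper: build <userData>/screenshots/capture-<timestamp>.png.
// The timestamp format (Unix milliseconds) is an assumption.
function screenshotPath(userDataDir, timestampMs = Date.now()) {
  return path.join(userDataDir, 'screenshots', `capture-${timestampMs}.png`);
}

// Hypothetical helper: turn a canvas.toDataURL('image/png') string into a
// Buffer suitable for fs.writeFileSync.
function decodePngDataUrl(dataUrl) {
  const prefix = 'data:image/png;base64,';
  if (!dataUrl.startsWith(prefix)) {
    throw new Error('Expected a base64-encoded PNG data URL');
  }
  return Buffer.from(dataUrl.slice(prefix.length), 'base64');
}
```

The `screenshots` directory would need to exist before the first write, for example via `fs.mkdirSync(dir, { recursive: true })`.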
CREATE TABLE detections (
id INTEGER PRIMARY KEY AUTOINCREMENT,
object_name TEXT NOT NULL, -- Detected label (refined by MobileNet)
confidence REAL NOT NULL, -- COCO-SSD confidence 0.0–1.0
first_seen TEXT NOT NULL, -- ISO 8601 timestamp
last_seen TEXT NOT NULL,
first_seen_ms INTEGER NOT NULL, -- Unix ms for fast comparison
last_seen_ms INTEGER NOT NULL,
count INTEGER DEFAULT 1, -- Times seen in same session (30s window)
screenshot_path TEXT -- Absolute path to PNG capture
);

# macOS (.dmg, Universal: Intel + Apple Silicon)
npm run dist:mac
# Windows (.exe NSIS installer)
npm run dist:win
# Both platforms
npm run dist

Packaged files are output to the release/ directory.
macOS note: The binary is re-signed with an ad-hoc signature during development. For App Store distribution or notarization, a paid Apple Developer certificate is required.
person · bicycle · car · motorcycle · airplane · bus · train · truck · boat · traffic light · bench · bird · cat · dog · horse · cow · elephant · bear · backpack · umbrella · bottle · wine glass · cup · fork · knife · bowl · banana · apple · orange · pizza · chair · couch · bed · tv · laptop · mouse · remote · keyboard · cell phone · book · clock · scissors · teddy bear · and more
water bottle · wine bottle · coffee mug · pencil · ballpoint pen · notebook computer · mobile phone · headphone · sunglasses · backpack · wristwatch · toaster · spatula · frying pan · measuring cup · basketball · soccer ball · tennis ball · and 980+ more
electron react tensorflow coco-ssd mobilenet object-detection computer-vision ai desktop-app real-time webcam sqlite tailwindcss vite cross-platform macos windows webgl gpu typescript
João Vilas-Boas Correia
| Email | joaopsn3@gmail.com |
| Business | contact@jovbcorreia.com |
| Website | jovbcorreia.com |
| GitHub | @jovbcorreia |
MIT License
Copyright (c) 2025 João Vilas-Boas Correia
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.