krishvadhani19/README.md

Krish Mehul Vadhani

AI Systems · Full-Stack · DevOps · Startup Engineer

I build things that scale. I've done it multiple times. I'll do it again.



Open to Work

I'm actively looking for roles in Software Engineering and DevOps / Cloud Infrastructure where I can own hard problems end-to-end and move fast. I thrive in startup-speed environments, but I bring the engineering discipline of a much larger org. If you're building something ambitious, let's talk.

vadhani.k@northeastern.edu · (617) 560-0171


Who I Am

I'm the engineer who joined Resemble AI and built their entire voice agents platform from zero. Architecture, infrastructure, real-time streaming, multi-agent orchestration, and enterprise deployment. All of it. Before that I engineered event-driven microservices at a Sequoia-backed company, led a full platform redesign, and built a multi-agent RAG system for a Northeastern-backed research lab. From scratch. Every time.

I don't wait for tickets. I identify the problem, design the solution, build it, instrument it, and ship it. That's just how I work.

I've operated across the full stack: frontend, backend, cloud infrastructure, AI/ML pipelines, DevOps. I'm as comfortable optimizing a Kubernetes cluster as I am tuning vLLM inference or building a React dashboard. What stays constant is the standard: production-grade, observable, scalable, and fast.

MS Computer Software Engineering · Northeastern University, Boston · GPA 3.8

BS Computer Engineering · University of Mumbai · GPA 3.6


What I've Actually Built

Voice Agents Platform

Resemble AI · Google & Sony backed

  • Built the entire platform from scratch: SIP trunk integration across Twilio, Telnyx, and BYO-SIP, full call lifecycle management, MCP integration, RAG pipelines, tool calling, and multi-agent orchestration for real-time in-call handoffs between sales, technical, and billing agents. Scaled to 1,000+ concurrent enterprise calls.
  • Went deep into vLLM internals: KV cache allocation, continuous batching, speculative decoding. Cut TTFT by 37%.
  • Built custom WebSocket streaming pipelines with backpressure handling and connection multiplexing to hit sub-800ms p95 end-to-end latency.
  • GPU-aware auto-scaling and model sharding brought inference costs down by 38%.
  • Built the full observability layer: per-session tracking of end-of-utterance delay, STT transcription latency, LLM TTFT, TTS TTFB, tokens per second, and character counts for cost profiling.
  • Extended the platform into a post-call suite with recording, transcription, and a management dashboard that competed with Otter.ai and Fireflies and brought in paying customers.
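
The backpressure handling mentioned in the streaming bullet above can be illustrated with a small sketch. This is not the platform's code: it assumes a bounded `asyncio.Queue` standing in for the WebSocket send buffer, and the names `producer`, `consumer`, and `run_pipeline` are hypothetical. The point is only that a full buffer suspends the upstream producer instead of letting it run ahead of a slow client.

```python
import asyncio

async def producer(queue: asyncio.Queue, n_frames: int) -> None:
    """Simulates an upstream source (e.g. synthesized audio) emitting frames."""
    for i in range(n_frames):
        # put() suspends while the queue is full. That suspension is the
        # backpressure: the producer cannot outrun the slow consumer.
        await queue.put(f"frame-{i}")
    await queue.put(None)  # sentinel: no more frames

async def consumer(queue: asyncio.Queue, sent: list, max_depth: list) -> None:
    """Simulates a slower downstream writer (e.g. a socket send)."""
    while True:
        max_depth[0] = max(max_depth[0], queue.qsize())
        frame = await queue.get()
        if frame is None:
            break
        await asyncio.sleep(0.001)  # pretend the network write is slow
        sent.append(frame)

async def run_pipeline(n_frames: int = 20, max_buffered: int = 4):
    queue = asyncio.Queue(maxsize=max_buffered)
    sent, max_depth = [], [0]
    await asyncio.gather(
        producer(queue, n_frames),
        consumer(queue, sent, max_depth),
    )
    return sent, max_depth[0]

sent, depth = asyncio.run(run_pipeline())
```

Every frame arrives in order and the buffer never grows past `max_buffered`, so memory per connection stays bounded even when the client is slow.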

Real-Time Deepfake Detection

Resemble AI · Google & Sony backed

  • Designed and shipped a live deepfake detection system for Google Meet, Teams, and Zoom.
  • Engineered per-participant audio and video stream ingestion feeding proprietary deepfake models in real time.
  • Wired up automated host alerting at 90%+ confidence. Real-time. Production.
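
As a rough sketch of the alerting rule above (an illustration, not the shipped logic): scores at or above the 90% threshold raise a host alert. The `AlertGate` class and the one-alert-per-participant behavior are assumptions added here to keep the example small; the real models and meeting integrations are not shown.

```python
from dataclasses import dataclass, field

ALERT_THRESHOLD = 0.90  # the "90%+ confidence" figure from the bullet above

@dataclass
class AlertGate:
    """Decides when to notify the host; alerts once per participant (assumption)."""
    threshold: float = ALERT_THRESHOLD
    alerted: set = field(default_factory=set)

    def observe(self, participant_id: str, fake_score: float) -> bool:
        """Return True if this score should trigger a new host alert."""
        if fake_score >= self.threshold and participant_id not in self.alerted:
            self.alerted.add(participant_id)
            return True
        return False

gate = AlertGate()
# Synthetic scores for illustration only.
scores = [("alice", 0.42), ("bob", 0.93), ("bob", 0.97), ("alice", 0.91)]
alerts = [pid for pid, score in scores if gate.observe(pid, score)]
```

Here `bob` triggers one alert despite two high scores, and `alice` triggers one only after crossing the threshold.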

Multi-Agent RAG System

Dash Labs · Northeastern University backed

  • Built a multi-agent RAG pipeline from scratch: LangChain orchestration, FAISS vector indexing, and sentence-transformer embeddings for automated essay evaluation.
  • Built the entire product around it: frontend, backend, API layer, and integration points between the agent pipeline and application layer.
  • Benchmarked grading accuracy across Llama 3.1 and DeepSeek-R1 using single-shot prompting, chain-of-thought, and multi-agent strategies to pick the right model for production.
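
The retrieval step of a RAG pipeline like the one above can be sketched without any dependencies. This is a simplified stand-in, not the Dash Labs implementation: a bag-of-words `embed` replaces the sentence-transformer model, and a brute-force sort replaces the FAISS index. The function names and the rubric text are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-transformer: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Brute-force top-k search; a vector index replaces this loop at scale."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

# Hypothetical rubric passages an essay-grading agent might retrieve.
rubric = [
    "thesis statement is clear and arguable",
    "evidence supports each claim",
    "grammar and spelling are correct",
]
top = retrieve("is the thesis clear", rubric, k=1)
```

The retrieved passages would then be placed in the prompt for whichever grading agent handles that criterion.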

Platform Engineering

Avataar · Sequoia & Tiger Global backed

  • Engineered a JWT session governance system with server-side token rotation and invalidation. Result: +33% paid conversions among free-tier users.
  • Led a platform-wide responsive redesign with a breakpoint-driven layout system, adaptive component rendering, and lazy-loaded viewport-specific assets. Result: +57% mobile signups.
  • Architected event-driven microservices on Node.js and Express for social interactions with async message queues and read-replica routing. Result: -43% API response times.
  • Built a multi-tier caching layer using AWS CloudFront, S3, and cache invalidation policies with origin shield and edge-optimized distribution. Result: -38% thumbnail load times globally.
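
Server-side rotation and invalidation, as in the session governance bullet above, comes down to the server keeping the list of live tokens. The sketch below is illustrative only: it uses opaque random tokens from the standard library instead of signed JWTs, an in-memory dict instead of a real store, and a hypothetical `SessionStore` class.

```python
import secrets

class SessionStore:
    """In-memory stand-in for a server-side token store (hypothetical)."""

    def __init__(self) -> None:
        self._active: dict[str, str] = {}  # token -> user id

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._active[token] = user_id
        return token

    def validate(self, token: str):
        """Return the user id for a live token, or None if it was invalidated."""
        return self._active.get(token)

    def rotate(self, old_token: str) -> str:
        """Swap a live token for a fresh one; the old one stops working at once."""
        user_id = self._active.pop(old_token, None)
        if user_id is None:
            raise PermissionError("token is unknown or already rotated")
        return self.issue(user_id)

    def invalidate_user(self, user_id: str) -> None:
        """Revoke every session for a user, e.g. on logout."""
        self._active = {t: u for t, u in self._active.items() if u != user_id}

store = SessionStore()
first = store.issue("user-1")
second = store.rotate(first)
```

Because validity is decided by the store rather than by the token alone, a rotated or revoked token is rejected immediately instead of lingering until it expires.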

Technical Stack

Languages

Python TypeScript JavaScript C++ Java SQL Ruby

AI / ML & LLMs

Python LangChain HuggingFace Ollama OpenAI FAISS

Full-Stack

React Next.js Node.js Express Vue React Native Django Rails

Cloud & DevOps

AWS Docker Kubernetes GitHub Actions Airflow Twilio

Databases

PostgreSQL MySQL MongoDB Redis InfluxDB OpenSearch


How I Work

I own the outcome, not the task. I don't execute tickets and hand off. I identify the problem, design the solution, build it, ship it, and monitor it. If something breaks at 2am, my alerts are already in place to catch it.

I move startup fast with production standards. Short cycles, fast feedback, aggressive iteration. But with proper observability, error handling, and architecture baked in from the start. I've seen what happens when you skip those steps. I don't skip them.

I go deep. I've debugged vLLM KV cache behavior. I've traced WebSocket backpressure to the byte level. I've profiled SIP signaling latency end-to-end. When something is slow or broken, I don't guess. I instrument, measure, and fix.

I build for scale from day one. Not premature optimization. But architecture decisions that don't require a full rewrite at 10x traffic. Concurrency models, resource management, and deployment infrastructure designed to grow.

Agile isn't a process to me. It's a mindset. Deliver incrementally. Validate constantly. Adapt quickly. Ship.


Currently Exploring

Advanced LLM inference optimization · Real-time multimodal AI systems · Large-scale DevOps and platform engineering


"Move fast, build things that last."

Pinned

  1. java-spring-microservices · Java
  2. messenger_chat · TypeScript
  3. messenger-chat · Messenger chat application · TypeScript
  4. React-WebRTC · TypeScript
  5. krishvadhani19