  • Cactus Compute
  • London

HenryNdubuaku/README.md

Henry Ndubuaku

LinkedIn Twitter Email Spotify

I build AI/ML models & systems, from the fundamental maths to JAX/Torch on large-scale infra and low-level implementations on accelerators. Author of cactus (2.5k stars), now backed by Y Combinator & Oxford Seed Fund.

Core Expertise

Maths Computing AI/ML/RL Distributed Systems GPU

Main Tools

Python C++ PyTorch Jax CUDA Vulkan Neon Cloud

Career Progression

  • 2025-XX: Cactus (YC S25) - Founder & CTO (tiny inference engine for phones and wearables).
  • 2024-25: Deep Render - AI Research Engineer (realtime video models that run on phone GPU/NPU).
  • 2021-24: Wisdm - ML Software Engineer (distributed perception AI for Maxar Defence satellite views).
  • 2019-21: MSc + Open-source activities (JAX/NanoDl, Torch/SuperLazyAutograd, CUDARepo, etc.).
  • 2018-19: Google GADS Scholarship Programme with Andela (pre-MSc), around systems design.
  • 2017-18: National Youth Service, posted to software engineering after bootcamp, mostly ARM.
  • 2011-16: Started uni at 15, covered EECS, data structures, algorithms, maths, physics.

Fun Highlights

  • Wrote Math For ML (with code).
  • Gave this lecture to a small ML group in Nigeria, on optimising large-scale ML in JAX.
  • Co-host this monthly dinner for AI researchers, engineers and founders in London.
  • Kevin Murphy (DeepMind Principal), Daniel Holtz (Midjourney Founder), Steve Messina (IBM CTO) followed back on X.
  • After CUDARepo, Nvidia reached out: I did 7 technical rounds and got a verbal offer, went back and forth over YOE/pay, then got into YC.
  • Did my MSc at QMUL just to work with Prof Matt Purver (ex-Stanford researcher on CALO), and did my project/thesis with his team.
  • Did my BEng under Prof Onyema Uzoamaka (rumoured to be the first Nigerian CS grad from MIT), who taught computing architectures off-head!

Pinned

  1. cactus-compute/cactus

    Cross-platform framework for deploying LLM/VLM/TTS models locally on smartphones.

    C++ 2.7k 155

  2. nanodl

    A JAX-based library for building transformers, with implementations of GPT, Gemma, LLaMA, Mixtral, Whisper, Swin, ViT and more.

    Python 290 11

  3. cuda-tutorials

    CUDA tutorials for Maths & ML with examples, covering multi-GPU, fused attention, Winograd convolution and reinforcement learning.

    Cuda 187 5

  4. super-lazy-autograd

    Hand-derived, memory-efficient, super lazy PyTorch VJPs for training LLMs on a laptop, all using one op (bundled scaled matmuls).

    Python 73 1

  5. pete

    Parameter-efficient transformer embeddings replace learned embeddings with hardware-aware polynomial expansions of token IDs.

    Python 6

  6. tango

    Decentralised ML engine to which tiny edge devices such as smart watches, phones, VR headsets and game consoles can contribute.

    Go 1 1