
Module 3: ML Workstation Setup

This module covers setting up and using shared GPU workstations for ML development. It is specifically designed for teams where multiple people share one or more machines with NVIDIA GPUs.

A shared GPU workstation is often the most cost-effective way to do ML work. Cloud GPU instances typically cost $0.50-3.00/hour, so a desktop workstation with an RTX 4090 can pay for itself within weeks to months of heavy use, depending on which cloud rate it replaces. But sharing a machine requires discipline: one person's careless sudo pip install can break everyone's environment.
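The break-even claim above can be checked with a little arithmetic. The sketch below is only an illustration: the $3,000 purchase price is a hypothetical figure, not one taken from this module, and the calculation ignores electricity, maintenance, and resale value.

```python
def break_even_hours(workstation_cost_usd: float, cloud_rate_usd_per_hour: float) -> float:
    """GPU-hours after which buying costs less than renting.

    Deliberately simple: ignores power, maintenance, and resale value.
    """
    if cloud_rate_usd_per_hour <= 0:
        raise ValueError("cloud rate must be positive")
    return workstation_cost_usd / cloud_rate_usd_per_hour


# Hypothetical purchase price, chosen only for illustration.
ASSUMED_WORKSTATION_COST_USD = 3000.0

for rate in (0.50, 3.00):  # the hourly range quoted in the text
    hours = break_even_hours(ASSUMED_WORKSTATION_COST_USD, rate)
    weeks = hours / (24 * 7)  # weeks of round-the-clock use
    print(f"${rate:.2f}/hour: {hours:.0f} GPU-hours, about {weeks:.1f} weeks of continuous use")
```

Under that assumed price, break-even is 1,000 GPU-hours against a $3.00/hour instance and 6,000 GPU-hours against a $0.50/hour instance. Substitute your own hardware quote and cloud rate before drawing conclusions.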

This module teaches you how to set up a workstation correctly from day one, how to work remotely on it, how to use Docker as your daily development environment, how to share GPUs across a team, and how to run a complete MLOps stack locally. It is designed to be thorough enough to follow entirely on your own, without an instructor.

Estimated time: about 5.5 hours if you follow every guide (the per-file estimates below total 5 hours 25 minutes). Read the guides, then work through the exercises.

Topics

File	What You Will Learn	Time
system-setup-guide.md	Platform-specific setup (Linux, macOS, Windows)	1 hour
remote-development-setup.md	SSH, VS Code Remote, port forwarding, tmux	45 min
workstation-best-practices.md	Golden rules for shared GPU workstations	30 min
docker-development-workflow.md	Docker as your daily ML dev environment	45 min
gpu-sharing-guide.md	Sharing GPUs across a team, monitoring, MIG	30 min
team-image-management.md	Building, naming, storing, cleaning images	20 min
local-mlops-stack.md	MLflow + MinIO + K3s locally, no cloud needed	30 min
adding-new-workstations.md	Bringing a new GPU machine online end to end	20 min
exercises.md	Hands-on practice with workstation workflow	45 min

Prerequisites

Completed Modules 0-1 (Python venvs, bash, Docker basics)
Access to a Linux machine with an NVIDIA GPU (or plans to set one up)
An SSH client on your laptop (built-in on macOS/Linux, available on Windows)
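A quick way to confirm the tooling side of these prerequisites is to check that the relevant commands are on your PATH. The sketch below assumes the command names ssh, docker, and nvidia-smi, which are typical but not specified by this module; run it on the laptop for the first group and on the workstation for the second.

```python
import shutil

# Command names are assumptions about a typical setup; adjust for your machines.
LAPTOP_TOOLS = ("ssh",)
WORKSTATION_TOOLS = ("docker", "nvidia-smi")


def find_tools(names):
    """Map each command name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


def report(label, names):
    """Print one line per tool showing where it was found."""
    for name, path in find_tools(names).items():
        status = path if path else "NOT FOUND"
        print(f"[{label}] {name}: {status}")


if __name__ == "__main__":
    report("laptop", LAPTOP_TOOLS)
    report("workstation", WORKSTATION_TOOLS)
```

Finding nvidia-smi on PATH only suggests that the NVIDIA driver is installed. It does not prove the GPU is usable from inside a container.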

Who Needs This Module

Definitely read this if:

Your team shares a GPU workstation
You are setting up a new GPU machine for ML work
You work remotely and connect to a GPU machine via SSH

You can skip this if:

You only use cloud GPU instances (AWS, GCP, Azure)
You have a personal GPU workstation that no one else uses
You already have an established workstation workflow

How This Connects to the Pipeline

This repo is designed to run on AWS EC2 (a cloud GPU instance). But the same Docker images, Kubernetes manifests, and MLflow setup work on a local GPU workstation. Module 3 bridges the gap:

The Dockerfile builds the same training image on your workstation
K3s can be installed locally instead of on EC2
MLflow can run locally instead of on a cloud server
You can develop and test locally, then deploy to AWS for production runs

The workstation is your development environment. AWS is your production environment. The tools (Docker, K3s, MLflow) are the same in both.
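One way to keep the tooling identical in both environments is to let configuration decide where runs are logged, not the code. The sketch below assumes a local MLflow server on http://localhost:5000 (the default port for mlflow server) as the fallback. MLFLOW_TRACKING_URI is the environment variable MLflow itself reads; the fallback convention and the function name are illustrative, not something this repo defines.

```python
import os

# Assumed local default: `mlflow server` listens on port 5000 unless configured otherwise.
LOCAL_TRACKING_URI = "http://localhost:5000"


def resolve_tracking_uri(env=None):
    """Return the MLflow tracking URI for the current environment.

    Uses MLFLOW_TRACKING_URI when it is set and non-empty (e.g. on AWS),
    otherwise falls back to the assumed local server.
    """
    env = os.environ if env is None else env
    return env.get("MLFLOW_TRACKING_URI") or LOCAL_TRACKING_URI


if __name__ == "__main__":
    print("Logging runs to:", resolve_tracking_uri())
```

On the workstation you would leave the variable unset. On AWS you would export it to point at the remote tracking server. The returned value can be passed to mlflow.set_tracking_uri in a training script.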

Checklist Before Moving to Module 4

