Add --device argument to run examples on a specific device #1288

Open · wants to merge 12 commits into main
3 changes: 2 additions & 1 deletion dcgan/README.md
@@ -24,7 +24,7 @@ usage: main.py [-h] --dataset DATASET --dataroot DATAROOT [--workers WORKERS]
[--batchSize BATCHSIZE] [--imageSize IMAGESIZE] [--nz NZ]
[--ngf NGF] [--ndf NDF] [--niter NITER] [--lr LR]
[--beta1 BETA1] [--cuda] [--ngpu NGPU] [--netG NETG]
[--netD NETD] [--mps]
[--netD NETD] [--mps] [--device DEVICE]

optional arguments:
-h, --help show this help message and exit
@@ -41,6 +41,7 @@ optional arguments:
--beta1 BETA1 beta1 for adam. default=0.5
--cuda enables cuda
--mps enables macOS GPU
--device backend device
--ngpu NGPU number of GPUs to use
--netG NETG path to netG (to continue training)
--netD NETD path to netD (to continue training)
3 changes: 2 additions & 1 deletion dcgan/main.py
@@ -34,6 +34,7 @@
parser.add_argument('--manualSeed', type=int, help='manual seed')
parser.add_argument('--classes', default='bedroom', help='comma separated list of classes for the lsun data set')
parser.add_argument('--mps', action='store_true', default=False, help='enables macOS GPU training')
parser.add_argument('--device', type=str, default='cpu', help='backend device')

opt = parser.parse_args()
print(opt)
@@ -112,7 +113,7 @@
elif use_mps:
device = torch.device("mps")
else:
device = torch.device("cpu")
device = torch.device(opt.device)

ngpu = int(opt.ngpu)
nz = int(opt.nz)
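With this change the DCGAN example hands `opt.device` straight to `torch.device()` whenever neither `--cuda` nor `--mps` is passed. A minimal sketch of such an invocation follows; the `xpu` device string is purely illustrative and assumes a PyTorch build that exposes that backend:

```bash
# Illustrative only: run the DCGAN smoke test on a hypothetical 'xpu' backend.
# --cuda and --mps still take precedence over --device when they are set.
python dcgan/main.py --dataset fake --device xpu --dry-run
```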
2 changes: 2 additions & 0 deletions fast_neural_style/README.md
@@ -28,6 +28,7 @@ python neural_style/neural_style.py eval --content-image </path/to/content/image
- `--content-scale`: factor for scaling down the content image if memory is an issue (eg: value of 2 will halve the height and width of content-image)
- `--cuda`: set it to 1 for running on GPU, 0 for CPU.
- `--mps`: set it to 1 for running on macOS GPU
- `--device DEVICE`: backend device to run on, 'cpu' by default.

Train model

@@ -42,6 +43,7 @@ There are several command line arguments, the important ones are listed below
- `--save-model-dir`: path to folder where trained model will be saved.
- `--cuda`: set it to 1 for running on GPU, 0 for CPU.
- `--mps`: set it to 1 for running on macOS GPU
- `--device DEVICE`: backend device to run on, 'cpu' by default.

Refer to `neural_style/neural_style.py` for other command line arguments. For training new models you might have to tune the values of `--content-weight` and `--style-weight`. The mosaic style model shown above was trained with `--content-weight 1e5` and `--style-weight 1e10`. The remaining 3 models were also trained with similar order of weight parameters with slight variation in the `--style-weight` (`5e10` or `1e11`).

9 changes: 6 additions & 3 deletions fast_neural_style/neural_style/neural_style.py
@@ -34,7 +34,7 @@ def train(args):
elif args.mps:
device = torch.device("mps")
else:
device = torch.device("cpu")
device = torch.device(args.device)

np.random.seed(args.seed)
torch.manual_seed(args.seed)
@@ -125,7 +125,7 @@ def train(args):


def stylize(args):
device = torch.device("cuda" if args.cuda else "cpu")
device = torch.device("cuda" if args.cuda else args.device)

content_image = utils.load_image(args.content_image, scale=args.content_scale)
content_transform = transforms.Compose([
@@ -205,8 +205,10 @@ def main():
help="size of training images, default is 256 X 256")
train_arg_parser.add_argument("--style-size", type=int, default=None,
help="size of style-image, default is the original size of style image")
train_arg_parser.add_argument("--cuda", type=int, required=True,
train_arg_parser.add_argument("--cuda", type=int, required=True, default=0,
help="set it to 1 for running on GPU, 0 for CPU")
train_arg_parser.add_argument('--device', type=str, default='cpu',
help='backend device')
train_arg_parser.add_argument("--seed", type=int, default=42,
help="random seed for training")
train_arg_parser.add_argument("--content-weight", type=float, default=1e5,
@@ -234,6 +236,7 @@ def main():
eval_arg_parser.add_argument("--export_onnx", type=str,
help="export ONNX model to a given file")
eval_arg_parser.add_argument('--mps', action='store_true', default=False, help='enable macOS GPU training')
eval_arg_parser.add_argument('--device', type=str, default='cpu', help='backend device')

args = main_arg_parser.parse_args()

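Since `stylize()` now falls back to `args.device` when `--cuda` is 0, an evaluation run on a non-CUDA backend could be sketched as below; the paths mirror the ones already used by `run_python_examples.sh`, and the `mps` device string is illustrative:

```bash
# Illustrative sketch: evaluate a saved style model via the new flag.
# --cuda 0 keeps the CUDA branch off so the --device value is used.
python neural_style/neural_style.py eval \
  --content-image images/content-images/amber.jpg \
  --model saved_models/candy.pth \
  --output-image images/output-images/amber-candy.jpg \
  --cuda 0 --device mps
```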
3 changes: 2 additions & 1 deletion gat/README.md
@@ -69,7 +69,7 @@ python main.py --epochs 300 --lr 0.005 --l2 5e-4 --dropout-p 0.6 --num-heads 8 -
In more detail, the `main.py` script receives the following arguments:
```
usage: main.py [-h] [--epochs EPOCHS] [--lr LR] [--l2 L2] [--dropout-p DROPOUT_P] [--hidden-dim HIDDEN_DIM] [--num-heads NUM_HEADS] [--concat-heads] [--val-every VAL_EVERY]
[--no-cuda] [--no-mps] [--dry-run] [--seed S]
[--no-cuda] [--no-mps] [--dry-run] [--seed S] [--device DEVICE]

PyTorch Graph Attention Network

@@ -89,6 +89,7 @@ options:
epochs to wait for print training and validation evaluation (default: 20)
--no-cuda disables CUDA training
--no-mps disables macOS GPU training
--device DEVICE backend device
--dry-run quickly check a single pass
--seed S random seed (default: 13)
```
4 changes: 3 additions & 1 deletion gat/main.py
@@ -311,6 +311,8 @@ def test(model, criterion, input, target, mask):
help='disables CUDA training')
parser.add_argument('--no-mps', action='store_true', default=False,
help='disables macOS GPU training')
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--dry-run', action='store_true', default=False,
help='quickly check a single pass')
parser.add_argument('--seed', type=int, default=13, metavar='S',
@@ -327,7 +329,7 @@ def test(model, criterion, input, target, mask):
elif use_mps:
device = torch.device('mps')
else:
device = torch.device('cpu')
device = torch.device(args.device)
print(f'Using {device} device')

# Load the dataset
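One thing worth noting: CUDA/MPS auto-detection still takes priority, so `torch.device(args.device)` is only reached once both `--no-cuda` and `--no-mps` are given (or neither backend is available). A sketch of pinning the run to an explicit device string under that assumption:

```bash
# Illustrative sketch: the --no-* flags only bypass the automatic selection;
# the explicit string still resolves through torch.device(), e.g. a CUDA index.
python gat/main.py --epochs 300 --no-cuda --no-mps --device cuda:1
```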
4 changes: 3 additions & 1 deletion gcn/main.py
@@ -220,6 +220,8 @@ def test(model, criterion, input, target, mask):
help='disables CUDA training')
parser.add_argument('--no-mps', action='store_true', default=False,
help='disables macOS GPU training')
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--dry-run', action='store_true', default=False,
help='quickly check a single pass')
parser.add_argument('--seed', type=int, default=42, metavar='S',
@@ -236,7 +238,7 @@ def test(model, criterion, input, target, mask):
elif use_mps:
device = torch.device('mps')
else:
device = torch.device('cpu')
device = torch.device(args.device)
print(f'Using {device} device')

cora_url = 'https://linqs-data.soe.ucsc.edu/public/lbc/cora.tgz'
8 changes: 4 additions & 4 deletions language_translation/main.py
@@ -272,9 +272,9 @@ def main(opts):
help="Default learning rate")
parser.add_argument("--batch", type=int, default=128,
help="Batch size")
parser.add_argument("--backend", type=str, default="cpu",
help="Batch size")
parser.add_argument("--device", type=str, default="cpu",
help="backend device")

# Transformer settings
parser.add_argument("--attn_heads", type=int, default=8,
help="Number of attention heads")
@@ -298,7 +298,7 @@

args = parser.parse_args()

DEVICE = torch.device("cuda" if args.backend == "gpu" and torch.cuda.is_available() else "cpu")
DEVICE = torch.device("cuda" if args.device == "gpu" and torch.cuda.is_available() else args.device)

if args.inference:
inference(args)
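Because this hunk renames `--backend` to `--device`, existing invocations that passed `--backend cpu` (including the one in `run_python_examples.sh`, updated below) need the new spelling. A sketch of the updated call:

```bash
# Illustrative sketch: the former '--backend cpu' becomes '--device cpu';
# passing 'gpu' still selects CUDA when torch.cuda.is_available() is true.
python main.py -e 1 --enc_layers 1 --dec_layers 1 --device cpu --logging_dir output/ --dry_run
```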
2 changes: 1 addition & 1 deletion legacy/snli/README.md
@@ -25,7 +25,7 @@ spacy
Start the training process with:

```bash
python train.py --lower --word-vectors [PATH_TO_WORD_VECTORS] --vector-cache [PATH_TO_VECTOR_CACHE] --epochs [NUMBER_OF_EPOCHS] --batch-size [BATCH_SIZE] --save-path [PATH_TO_SAVE_MODEL] --gpu [GPU_NUMBER]
python train.py --lower --word-vectors [PATH_TO_WORD_VECTORS] --vector-cache [PATH_TO_VECTOR_CACHE] --epochs [NUMBER_OF_EPOCHS] --batch-size [BATCH_SIZE] --save-path [PATH_TO_SAVE_MODEL] --gpu [GPU_NUMBER] --device [BACKEND_DEVICE]
```

## 🏋️‍♀️ Training
2 changes: 1 addition & 1 deletion legacy/snli/train.py
@@ -20,7 +20,7 @@
elif torch.backends.mps.is_available():
device = torch.device('mps')
else:
device = torch.device('cpu')
device = torch.device(args.device)

inputs = data.Field(lower=args.lower, tokenize='spacy')
answers = data.Field(sequential=False)
2 changes: 2 additions & 0 deletions legacy/snli/util.py
@@ -20,6 +20,8 @@ def makedirs(name):

def get_args():
parser = ArgumentParser(description='PyTorch/torchtext SNLI example')
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--epochs', type=int, default=50,
help='the number of total epochs to run.')
parser.add_argument('--batch_size', type=int, default=128,
4 changes: 3 additions & 1 deletion mnist/main.py
@@ -86,6 +86,8 @@ def main():
help='disables CUDA training')
parser.add_argument('--no-mps', action='store_true', default=False,
help='disables macOS GPU training')
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--dry-run', action='store_true', default=False,
help='quickly check a single pass')
parser.add_argument('--seed', type=int, default=1, metavar='S',
@@ -105,7 +107,7 @@
elif use_mps:
device = torch.device("mps")
else:
device = torch.device("cpu")
device = torch.device(args.device)

train_kwargs = {'batch_size': args.batch_size}
test_kwargs = {'batch_size': args.test_batch_size}
1 change: 1 addition & 0 deletions mnist_forward_forward/README.md
@@ -18,6 +18,7 @@ optional arguments:
--lr LR learning rate (default: 0.03)
--no_cuda disables CUDA training
--no_mps disables MPS training
--device DEVICE backend device
--seed SEED random seed (default: 1)
--save_model For saving the current Model
--train_size TRAIN_SIZE
5 changes: 4 additions & 1 deletion mnist_forward_forward/main.py
@@ -108,6 +108,9 @@ def train(self, x_pos, x_neg):
parser.add_argument(
"--no_mps", action="store_true", default=False, help="disables MPS training"
)
parser.add_argument(
'--device', type=str, default='cpu', help='backend device'
)
parser.add_argument(
"--seed", type=int, default=1, metavar="S", help="random seed (default: 1)"
)
@@ -145,7 +148,7 @@ def train(self, x_pos, x_neg):
elif use_mps:
device = torch.device("mps")
else:
device = torch.device("cpu")
device = torch.device(args.device)

train_kwargs = {"batch_size": args.train_size}
test_kwargs = {"batch_size": args.test_size}
1 change: 1 addition & 0 deletions mnist_hogwild/README.md
@@ -21,6 +21,7 @@ optional arguments:
--log_interval how many batches to wait before logging training status
--num_process how many training processes to use (default: 2)
--cuda enables CUDA training
--device DEVICE backend device
--dry-run quickly check a single pass
--save-model For Saving the current Model
```
4 changes: 3 additions & 1 deletion mnist_hogwild/main.py
@@ -31,6 +31,8 @@
help='enables CUDA training')
parser.add_argument('--mps', action='store_true', default=False,
help='enables macOS GPU training')
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--save_model', action='store_true', default=False,
help='save the trained model to state_dict')
parser.add_argument('--dry-run', action='store_true', default=False,
@@ -65,7 +67,7 @@ def forward(self, x):
elif use_mps:
device = torch.device("mps")
else:
device = torch.device("cpu")
device = torch.device(args.device)

transform=transforms.Compose([
transforms.ToTensor(),
4 changes: 3 additions & 1 deletion mnist_rnn/main.py
@@ -95,6 +95,8 @@ def main():
help='enables CUDA training')
parser.add_argument('--mps', action="store_true", default=False,
help="enables MPS training")
parser.add_argument('--device', type=str, default='cpu',
help='backend device')
parser.add_argument('--dry-run', action='store_true', default=False,
help='quickly check a single pass')
parser.add_argument('--seed', type=int, default=1, metavar='S',
@@ -110,7 +112,7 @@
elif args.mps and not args.cuda:
device = "mps"
else:
device = "cpu"
device = args.device

device = torch.device(device)

37 changes: 23 additions & 14 deletions run_python_examples.sh
@@ -13,6 +13,15 @@
BASE_DIR="$(pwd)/$(dirname $0)"
source $BASE_DIR/utils.sh

# Run on a specific backend device with 'export BACKEND_DEVICE=cpu'. It will
# be set to 'cpu' by default and has lower priority than '--cuda' and '--mps'.
# See https://github.com/pytorch/examples/pull/1288 for more information.
if [ -n "${BACKEND_DEVICE}" ]; then
DEVICE_FLAG="--device ${BACKEND_DEVICE}"
else
DEVICE_FLAG=""
fi

USE_CUDA=$(python -c "import torchvision, torch; print(torch.cuda.is_available())")
case $USE_CUDA in
"True")
@@ -32,7 +41,7 @@ esac

function dcgan() {
start
python main.py --dataset fake $CUDA_FLAG --mps --dry-run || error "dcgan failed"
python main.py --dataset fake $CUDA_FLAG --mps $DEVICE_FLAG --dry-run || error "dcgan failed"
}

function fast_neural_style() {
@@ -44,7 +53,7 @@ function fast_neural_style() {
test -d "saved_models" || { error "saved models not found"; return; }

echo "running fast neural style model"
python neural_style/neural_style.py eval --content-image images/content-images/amber.jpg --model saved_models/candy.pth --output-image images/output-images/amber-candy.jpg --cuda $CUDA --mps || error "neural_style.py failed"
python neural_style/neural_style.py eval --content-image images/content-images/amber.jpg --model saved_models/candy.pth --output-image images/output-images/amber-candy.jpg --cuda $CUDA --mps $DEVICE_FLAG || error "neural_style.py failed"
}

function imagenet() {
Expand All @@ -63,36 +72,36 @@ function language_translation() {
start
python -m spacy download en || error "couldn't download en package from spacy"
python -m spacy download de || error "couldn't download de package from spacy"
python main.py -e 1 --enc_layers 1 --dec_layers 1 --backend cpu --logging_dir output/ --dry_run || error "language translation example failed"
python main.py -e 1 --enc_layers 1 --dec_layers 1 $DEVICE_FLAG --logging_dir output/ --dry_run || error "language translation example failed"
}

function mnist() {
start
python main.py --epochs 1 --dry-run || error "mnist example failed"
python main.py --epochs 1 --dry-run $DEVICE_FLAG || error "mnist example failed"
}
function mnist_forward_forward() {
start
python main.py --epochs 1 --no_mps --no_cuda || error "mnist forward forward failed"
python main.py --epochs 1 --no_mps --no_cuda $DEVICE_FLAG || error "mnist forward forward failed"

}
function mnist_hogwild() {
start
python main.py --epochs 1 --dry-run $CUDA_FLAG || error "mnist hogwild failed"
python main.py --epochs 1 --dry-run $CUDA_FLAG $DEVICE_FLAG || error "mnist hogwild failed"
}

function mnist_rnn() {
start
python main.py --epochs 1 --dry-run || error "mnist rnn example failed"
python main.py --epochs 1 --dry-run $DEVICE_FLAG || error "mnist rnn example failed"
}

function regression() {
start
python main.py --epochs 1 $CUDA_FLAG || error "regression failed"
python main.py --epochs 1 $CUDA_FLAG $DEVICE_FLAG || error "regression failed"
}

function siamese_network() {
start
python main.py --epochs 1 --dry-run || error "siamese network example failed"
python main.py --epochs 1 --dry-run $DEVICE_FLAG || error "siamese network example failed"
}

function reinforcement_learning() {
@@ -123,7 +132,7 @@ function fx() {

function super_resolution() {
start
python main.py --upscale_factor 3 --batchSize 4 --testBatchSize 100 --nEpochs 1 --lr 0.001 --mps || error "super resolution failed"
python main.py --upscale_factor 3 --batchSize 4 --testBatchSize 100 --nEpochs 1 --lr 0.001 --mps $DEVICE_FLAG || error "super resolution failed"
}

function time_sequence_prediction() {
@@ -134,7 +143,7 @@ function time_sequence_prediction() {

function vae() {
start
python main.py --epochs 1 || error "vae failed"
python main.py --epochs 1 $DEVICE_FLAG || error "vae failed"
}

function vision_transformer() {
@@ -144,17 +153,17 @@ function vision_transformer() {

function word_language_model() {
start
python main.py --epochs 1 --dry-run $CUDA_FLAG --mps || error "word_language_model failed"
python main.py --epochs 1 --dry-run $CUDA_FLAG --mps $DEVICE_FLAG || error "word_language_model failed"
}

function gcn() {
start
python main.py --epochs 1 --dry-run || error "graph convolutional network failed"
python main.py --epochs 1 --dry-run $DEVICE_FLAG || error "graph convolutional network failed"
}

function gat() {
start
python main.py --epochs 1 --dry-run || error "graph attention network failed"
python main.py --epochs 1 --dry-run $DEVICE_FLAG || error "graph attention network failed"
}

function clean() {
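Taken together, the script-level change lets the whole suite be retargeted through a single environment variable. A sketch of that workflow; the `xpu` value is a hypothetical backend, and the example-selection argument shown is illustrative (see the script's own usage notes):

```bash
# Illustrative sketch: export the variable once and every example that
# understands --device receives it through $DEVICE_FLAG.
export BACKEND_DEVICE=xpu
./run_python_examples.sh mnist   # example selection shown for illustration only
```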