I installed learn2learn using "pip install learn2learn". When I try to run maml_miniimagenet.py (from learn2learn/examples/vision/maml_miniimagenet.py) with a batch size of 2 and shot = 1, I get the error below after 63 iterations. When I change to shot = 5, I get the same error after only 3 iterations.
Iteration 63
Meta Train Error 2.0417345762252808
Meta Train Accuracy 0.20000000298023224
Meta Valid Error 1.8002310991287231
Meta Valid Accuracy 0.20000000298023224
Traceback (most recent call last):
File "/home/deep/Desktop/IMPLEMENTATION/MyTry/MetaSGD/mini_Temp_Test.py", line 156, in
main()
File "/home/deep/Desktop/IMPLEMENTATION/MyTry/MetaSGD/mini_Temp_Test.py", line 106, in main
evaluation_error.backward()
File "/home/deep/.local/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/deep/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 5.79 GiB total capacity; 3.60 GiB already allocated; 77.56 MiB free; 3.62 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
When I look at nvidia-smi, the memory usage gradually increases with each iteration.
However, if I comment out the meta-validation loss part (line 114-112 in this script), the memory leak goes away. I think the issue is similar to Potential Memory Leak #278. Why does this happen, and how can it be solved?
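A common cause of this symptom (memory growing each iteration only while the meta-validation block is active) is accumulating the validation loss as a tensor, which keeps every task's autograd graph alive across iterations. A minimal sketch of the usual fix, assuming the loop sums the loss for logging: accumulate `loss.item()` (a Python float) instead, so each graph can be freed immediately. The `run_meta_validation` helper and the `torch.nn.Linear` stand-in model below are hypothetical, purely for illustration; they are not part of the learn2learn example script.

```python
import torch

def run_meta_validation(model, n_tasks=3):
    """Hypothetical validation loop: accumulate the loss as a Python
    float so each task's autograd graph is released right away."""
    total = 0.0
    for _ in range(n_tasks):
        x = torch.randn(4, 2)
        loss = (model(x) ** 2).mean()
        # .item() returns a plain float detached from autograd; the
        # graph built for this task can be garbage-collected at once
        # instead of piling up across iterations.
        total += loss.item()
    return total / n_tasks

model = torch.nn.Linear(2, 1)  # stand-in for the meta-learner
avg_valid_loss = run_meta_validation(model)
print(type(avg_valid_loss))  # a float, not a Tensor holding a graph
```

If the validation loss tensor is only needed for printing (no `backward()` call on it), this change alone is often enough to flatten the memory curve in nvidia-smi.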