Potential Memory Leak #278

@Phoveran

I'm running examples/vision/anil_fc100.py exactly as provided, with

  1. learn2learn 0.1.6 (installed with pip install learn2learn)
  2. PyTorch 1.10
  3. CUDA 11.4
  4. a single RTX 2080Ti

and it crashes after 3 iterations with a CUDA out-of-memory error:

Iteration 0
Meta Train Error 1.496383372694254
Meta Train Accuracy 0.3362499892245978
Meta Valid Error 1.6015896834433079
Meta Valid Accuracy 0.2937499925028533
Meta Test Error 1.5358335226774216
Meta Test Accuracy 0.3562499899417162


Iteration 1
Meta Train Error 1.4316311068832874
Meta Train Accuracy 0.39999998873099685
Meta Valid Error 1.588501501828432
Meta Valid Accuracy 0.28749999054707587
Meta Test Error 1.45738809928298
Meta Test Accuracy 0.3774999915622175


Iteration 2
Meta Train Error 1.3444917295128107
Meta Train Accuracy 0.47624998819082975
Meta Valid Error 1.5741207413375378
Meta Valid Accuracy 0.28874999145045877
Meta Test Error 1.4722651988267899
Meta Test Accuracy 0.3599999900907278
Traceback (most recent call last):
  File "/home/stan/work/icml2022/test.py", line 207, in <module>
    main()
  File "/home/stan/work/icml2022/test.py", line 179, in main
    evaluation_error, evaluation_accuracy = fast_adapt(batch,
  File "/home/stan/work/icml2022/test.py", line 32, in fast_adapt
    data = features(data)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/learn2learn/vision/models/cnn4.py", line 247, in forward
    x = super(CNN4Backbone, self).forward(x)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/learn2learn/vision/models/cnn4.py", line 96, in forward
    x = self.relu(x)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/modules/activation.py", line 98, in forward
    return F.relu(input, inplace=self.inplace)
  File "/home/stan/tool/miniconda3/lib/python3.9/site-packages/torch/nn/functional.py", line 1299, in relu
    result = torch.relu(input)
RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 10.76 GiB total capacity; 9.25 GiB already allocated; 2.31 MiB free; 9.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CON
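
The hint in the error message about max_split_size_mb only applies when reserved memory is much larger than allocated memory, so it is worth checking the figures the message quotes. Below is a minimal sketch of that arithmetic. `parse_oom_message` and its regular expressions are hypothetical helpers written for this one message format, not part of PyTorch or learn2learn, and the message string is copied from the traceback above, cut off before the truncated tail.

```python
import re

# Error text copied from the traceback above, shortened to the part with numbers.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 14.00 MiB "
    "(GPU 0; 10.76 GiB total capacity; 9.25 GiB already allocated; "
    "2.31 MiB free; 9.45 GiB reserved in total by PyTorch)"
)

# Conversion factors to MiB for the units that appear in the message.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Hypothetical helper: return the sizes quoted in the message, in MiB."""
    size = r"([\d.]+) (KiB|MiB|GiB)"
    patterns = {
        "requested": r"Tried to allocate " + size,
        "total": size + r" total capacity",
        "allocated": size + r" already allocated",
        "free": size + r" free",
        "reserved": size + r" reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            sizes[name] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom_message(OOM_MESSAGE)

# Memory held by the caching allocator but not backing any live tensor.
cached_but_unused = sizes["reserved"] - sizes["allocated"]

print(f"reserved - allocated: {cached_but_unused:.0f} MiB")            # 205 MiB
print(f"allocated / total: {sizes['allocated'] / sizes['total']:.0%}")  # 86%
```

The gap between reserved and allocated memory is only about 205 MiB, while live tensors account for roughly 86% of the card. That suggests fragmentation is not the main problem here and that the memory is actually being held by tensors, which would be consistent with allocations accumulating across iterations.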
