We should not let the datamanager actively reside in memory when we are not using it. For example, there is no need to have a datamanager in smbo.
Also, once search has saved the datamanager to disk, we can delete it and let it be garbage collected.
We should also collect data on, and challenge, the need for a datamanager in the evaluator.
We should improve the cross-validation handling of the out-of-fold (OOF) predictions. Rather than keeping a list that contains the OOF predictions, we should create a fixed array of `n_samples` once at the beginning. OOF predictions from each k-fold model should then be written into this pre-existing array, something like `self.Y_optimization[test_indices] = opt_pred`. This way predictions are already sorted by sample index and can be used directly by ensemble selection, without the need to save and re-sort per-fold lists.
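A minimal sketch of the proposed scheme, with illustrative names only (`n_samples`, `n_classes`, the fold layout, and `predict_proba` stand in for the project's actual objects):

```python
import numpy as np

n_samples, n_classes = 10, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, 4))

# Allocate the OOF array once; NaN marks rows not yet predicted.
Y_optimization = np.full((n_samples, n_classes), np.nan)

def predict_proba(X_fold):
    # Stand-in for a fitted fold model's predict_proba.
    p = rng.random((len(X_fold), n_classes))
    return p / p.sum(axis=1, keepdims=True)

folds = [np.array([0, 1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
for test_indices in folds:
    opt_pred = predict_proba(X[test_indices])
    # Each fold's rows land at their original positions, so the array is
    # already ordered by sample index when ensemble selection reads it.
    Y_optimization[test_indices] = opt_pred
```

Because the array is filled in place, no per-fold list ever accumulates, and a final NaN check can verify that every sample received an OOF prediction.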
Calculating the train loss should be optional rather than done by default here. We should avoid calling predict when it is not strictly needed.
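One way to gate this, sketched with hypothetical names (`finish_up`, `_loss`, and the dummy model are illustrative, not the project's API): the extra predict pass on the training split only happens when the caller opts in.

```python
class DummyModel:
    """Stand-in for a fitted model."""
    def predict(self, X):
        return [0 for _ in X]

def _loss(y_true, y_pred):
    # Simple misclassification rate as a placeholder metric.
    return sum(a != b for a, b in zip(y_true, y_pred)) / len(y_true)

def finish_up(model, X_train, y_train, compute_train_loss=False):
    """Return the train loss only when explicitly requested."""
    if not compute_train_loss:
        return None  # default path: skip the extra predict entirely
    return _loss(y_train, model.predict(X_train))
```

With the flag defaulting to `False`, the common evaluation path never pays for the additional predict call.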
As reported already by @nabenabe0928, the biggest contribution comes from module imports. In particular, just doing `import torch` consumes 2 GB of peak virtual memory, and most of the time these imports exist only for mypy typing. We should encapsulate these calls under `typing.TYPE_CHECKING` and import only the strictly needed classes from PyTorch.
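The pattern looks like this (`to_device` is an illustrative function, not the project's API): mypy executes the `TYPE_CHECKING` block, but at runtime the heavy import is deferred until a function actually needs it.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker evaluates this block, so the runtime never
    # pays the ~2 GB peak virtual memory cost of importing torch here.
    import torch

def to_device(model: "torch.nn.Module", device: str) -> "torch.nn.Module":
    # String annotations keep the signature checkable by mypy; torch is
    # imported lazily, and only if this function is actually called.
    import torch
    return model.to(torch.device(device))
```

Annotations that reference the guarded module must be written as strings (or the file must use `from __future__ import annotations`), since the name `torch` is not bound at runtime module level.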