We should not let the datamanager actively reside in memory when we are not using it. For example, there is no need to have a datamanager in smbo.
Also, once search has saved the datamanager to disk, we can delete it and let it be garbage collected.
We should also collect data on, and challenge, the need for a datamanager in the evaluator.
We should improve the cross-validation handling of the out-of-fold (OOF) predictions. Rather than keeping a list that contains the OOF predictions, we should create a fixed array of `n_samples` once at the beginning. OOF predictions from each k-fold model should then be written into this pre-existing array, something like `self.Y_optimization[test_indices] = opt_pred`. This way predictions are already sorted by sample index and can be used directly by ensemble selection, without the need to save and re-sort per-fold lists.
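A minimal sketch of the proposed scheme, with illustrative names only (`n_samples`, `n_classes`, the fold layout, and `predict_proba` stand in for the project's actual objects):

```python
import numpy as np

n_samples, n_classes = 10, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, 4))

# Allocate the OOF array once; NaN marks rows not yet predicted.
Y_optimization = np.full((n_samples, n_classes), np.nan)

def predict_proba(X_fold):
    # Stand-in for a fitted fold model's predict_proba.
    p = rng.random((len(X_fold), n_classes))
    return p / p.sum(axis=1, keepdims=True)

folds = [np.array([0, 1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
for test_indices in folds:
    opt_pred = predict_proba(X[test_indices])
    # Each fold's rows land at their original positions, so the array is
    # already ordered by sample index when ensemble selection reads it.
    Y_optimization[test_indices] = opt_pred
```

Because the array is filled in place, no per-fold list ever accumulates, and a final NaN check can verify that every sample received an OOF prediction.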
Calculating the train loss should be optional rather than done by default here. We should avoid calling predict when it is not strictly needed.
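One way to gate this, sketched with hypothetical names (`finish_up`, `_loss`, and the dummy model are illustrative, not the project's API): the extra predict pass on the training split only happens when the caller opts in.

```python
class DummyModel:
    """Stand-in for a fitted model."""
    def predict(self, X):
        return [0 for _ in X]

def _loss(y_true, y_pred):
    # Simple misclassification rate as a placeholder metric.
    return sum(a != b for a, b in zip(y_true, y_pred)) / len(y_true)

def finish_up(model, X_train, y_train, compute_train_loss=False):
    """Return the train loss only when explicitly requested."""
    if not compute_train_loss:
        return None  # default path: skip the extra predict entirely
    return _loss(y_train, model.predict(X_train))
```

With the flag defaulting to `False`, the common evaluation path never pays for the additional predict call.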
As reported already by @nabenabe0928, the biggest contribution comes from module imports. In particular, just doing `import torch` consumes 2 GB of peak virtual memory, and most of the time these imports exist only for mypy typing. We should encapsulate these calls under `typing.TYPE_CHECKING` and import only the strictly needed classes from PyTorch.
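The pattern looks like this (`to_device` is an illustrative function, not the project's API): mypy executes the `TYPE_CHECKING` block, but at runtime the heavy import is deferred until a function actually needs it.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker evaluates this block, so the runtime never
    # pays the ~2 GB peak virtual memory cost of importing torch here.
    import torch

def to_device(model: "torch.nn.Module", device: str) -> "torch.nn.Module":
    # String annotations keep the signature checkable by mypy; torch is
    # imported lazily, and only if this function is actually called.
    import torch
    return model.to(torch.device(device))
```

Annotations that reference the guarded module must be written as strings (or the file must use `from __future__ import annotations`), since the name `torch` is not bound at runtime module level.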