Fix bugs in cutout training #233

Merged

ravinkohli
Contributor

This PR fixes the following bugs:

  1. We were not sampling the indices with replacement.
  2. We were taking the minimum of the batch size and the number of features, which is not necessary.
  3. Numerical features were being assigned -1, whereas 0 makes more sense.
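
As a rough illustration of the three points above, here is a minimal NumPy sketch of column cutout on a batch. It is not the code from this PR: the helper name `cutout_columns` and the `patch_ratio`, `rng`, `replace`, and `fill_value` parameters are assumptions made for the example. The PR text does not show the final sampling call, so `replace` is exposed as a parameter and its default here is an assumption.

```python
import numpy as np


def cutout_columns(X, patch_ratio, rng, replace=False, fill_value=0.0):
    """Mask a random subset of feature columns (hypothetical helper)."""
    # Point 2: the candidate pool is the number of features only,
    # not min(batch size, number of features).
    n_features = X.shape[1]
    n_cut = max(1, int(n_features * patch_ratio))
    # Point 1: the replacement behaviour is controlled explicitly.
    # The default used here is an assumption, not taken from the PR.
    cut = rng.choice(n_features, size=n_cut, replace=replace)
    # Point 3: the selected columns are filled with 0 rather than -1.
    X_cut = X.copy()
    X_cut[:, cut] = fill_value
    return X_cut, cut


rng = np.random.RandomState(0)
X = np.ones((4, 6))  # batch of 4 rows, 6 features
X_cut, cut = cutout_columns(X, patch_ratio=0.5, rng=rng)
```

The sketch copies the batch before masking so the caller's array is left untouched; an in-place variant would skip the copy.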

@ArlindKadra ArlindKadra self-requested a review May 21, 2021 14:17
@@ -39,7 +39,8 @@ def data_preparation(self, X: np.ndarray, y: np.ndarray,
# It is unlikely that the batch size is lower than the number of features, but
# be safe
size = min(X.shape[0], X.shape[1])

This should also be changed to size = X.shape[1], right?

Contributor Author

True, I missed that


# We use an ordinal encoder on the tabular data
if not isinstance(self.numerical_columns, typing.Iterable):

What if the numerical columns are None? Shouldn't we still continue with only categorical imputing in that case?

Also, if there are only numerical columns, there should be no conversion for categorical ones.

Contributor Author

Actually, when there are no numerical columns, the value is not None but an empty list. Indexing with an empty list does not affect the tensor, so this should work.
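
The empty-list behaviour described here can be checked with a short NumPy sketch. The array `X` and its values are made up for the example, and NumPy is used as a stand-in for whatever array type the trainer receives.

```python
import numpy as np

X = np.arange(6, dtype=float).reshape(2, 3)
before = X.copy()

numerical_columns = []  # no numerical columns: an empty list, not None
# An empty index list selects zero columns, so the assignment writes nothing.
X[:, numerical_columns] = 0

unchanged = np.array_equal(X, before)
selected_shape = X[:, numerical_columns].shape
```

Note that this only holds for an empty list; a value of `None` used as an index would add a new axis instead of selecting columns, which is why the distinction in the comment above matters.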

@ArlindKadra ArlindKadra May 21, 2021

numerical_columns=X['dataset_properties']['numerical_columns'] if 'numerical_columns' in X[
'dataset_properties'] else None

Is numerical_columns always in dataset_properties?

Contributor Author

When it's tabular data, then yes.
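
For readers wondering what the quoted lookup does when the key is absent, the following sketch shows an equivalent form using `dict.get`. The helper name `get_numerical_columns` and the sample dictionaries are hypothetical; only the `'dataset_properties'` and `'numerical_columns'` keys come from the snippet quoted above.

```python
def get_numerical_columns(X):
    """Return the numerical column list, or None if the key is missing
    (hypothetical helper mirroring the conditional quoted in the review)."""
    return X['dataset_properties'].get('numerical_columns')


tabular = {'dataset_properties': {'numerical_columns': [0, 2]}}
other = {'dataset_properties': {}}

cols_tabular = get_numerical_columns(tabular)
cols_other = get_numerical_columns(other)
```

With tabular data the key is present and the list is returned as-is; without it the result is `None`, which matches the `else None` branch of the original expression.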

@ArlindKadra ArlindKadra merged commit 463c166 into refactor_development_regularization_cocktails May 21, 2021
github-actions bot pushed a commit that referenced this pull request May 21, 2021
@ravinkohli ravinkohli mentioned this pull request Jun 17, 2021
@ravinkohli ravinkohli deleted the fix_cutTrainer branch October 22, 2021 09:36
ravinkohli added a commit that referenced this pull request Dec 8, 2021
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Dec 8, 2021
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Dec 21, 2021
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Jan 24, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Jan 28, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Feb 28, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Feb 28, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Mar 9, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit to ravinkohli/Auto-PyTorch that referenced this pull request Apr 12, 2022
* Fix bugs in cutout training

* Address comments from arlind
ravinkohli added a commit that referenced this pull request Jul 26, 2022
* Fix bugs in cutout training

* Address comments from arlind
2 participants