set fill value to max of full dataset + 1 #36

Merged
LMZimmer merged 1 commit into automl:master from jonathanburns:nanmax_X
Mar 23, 2020

Conversation

@jonathanburns
Contributor

@jonathanburns jonathanburns commented Mar 23, 2020

Hello, I'm somewhat new to this repo 👋 .

It looks like if the global max is not in the current cross_validation split, using X[train_indices] would lead to different fill values in different CV splits. My impression is that this is undesired, but correct me if I am wrong.

Similarly, all_nan_columns should be the same across all CV splits, I think?
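
To make the fill-value concern concrete, here is a minimal sketch. It assumes the fill value is computed as `np.nanmax(...) + 1`, which is what the PR title and branch name suggest; the toy matrix, the split indices, and the variable names are hypothetical and not taken from the repo.

```python
import numpy as np

# Toy feature matrix (hypothetical data); NaN marks a missing value.
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 30.0],
    [100.0, 40.0],  # this row holds the global maximum
])

# Two hypothetical CV training splits; the first one omits the row with the global max.
splits = [np.array([0, 1, 2]), np.array([1, 2, 3])]

# Fill value computed per split: depends on which rows land in the split.
per_split_fill = [float(np.nanmax(X[idx]) + 1) for idx in splits]

# Fill value computed on the full dataset: identical for every split.
global_fill = float(np.nanmax(X) + 1)

print(per_split_fill)  # the two splits disagree
print(global_fill)
```

With this data the first split yields a fill value of 31 and the second 101, while the full-dataset value is 101 for both.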

@LMZimmer
Contributor

Hi there,
I believe the idea was that if NaNs occur for a feature in the validation set but not the train set, we can still train on the train set with that feature (in particular for .refit). However, this might throw errors when trying to validate, so I also think this is cleaner.
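
One way to picture the all_nan_columns point from the thread is the sketch below. It is an illustration only: it assumes all-NaN columns are detected with a boolean mask over columns, and the data and variable names are hypothetical rather than the repo's actual code.

```python
import numpy as np

# Toy feature matrix (hypothetical data): the second feature is observed in only one row.
X = np.array([
    [1.0, np.nan],
    [2.0, np.nan],
    [3.0, 5.0],
])

# Hypothetical training split that misses the only row where feature 1 is observed.
train_idx = np.array([0, 1])

# Mask of columns that are entirely NaN, computed on the split vs. on the full data.
all_nan_train = np.isnan(X[train_idx]).all(axis=0)
all_nan_full = np.isnan(X).all(axis=0)

print(all_nan_train)  # feature 1 looks entirely missing within this split
print(all_nan_full)   # on the full dataset no column is entirely missing
```

Computing the mask on the full dataset keeps the set of columns treated as all-NaN the same in every split, so the train and validation data go through the same preprocessing.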

@LMZimmer LMZimmer merged commit d603692 into automl:master Mar 23, 2020