The weird thing is that, in a dataset with no nulls, adding a SimpleImputer in front of a TargetEncoder() makes several null values appear.
I'm not sure if I'm doing something wrong, but running SimpleImputer on data with no null values should change nothing. In fact, I ran each step separately and SimpleImputer does not output any null values. But once its NumPy array output goes into TargetEncoder(), the encoder outputs more than 2000 nulls.
Why is that?
Expected Behavior
If no nulls are provided, no nulls should be output. See the attached notebook, for example the case where TargetEncoder is run on its own.
Actual Behavior
Steps to Reproduce the Problem
Refer to attached notebook with example code.
Specifications
Version: 2.2.2
Platform: Windows 10
Subsystem: Python 3.7.7
Thanks Guys,
Alfonso
Just to add something extra: I've noticed that running the steps inside a Pipeline and running them on their own calculates completely different values:
You can see some NaN coming up in columns 2 and 3. I think this is because those columns are numeric, and since they arrive as a NumPy array the encoder has no way to determine the original dtype of each column.
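To illustrate the dtype point, here is a minimal sketch using only pandas and NumPy. The column names and values are made up for illustration and are not taken from the attached notebook. It shows that a mixed-type DataFrame turned into a NumPy array collapses to a single `object` dtype, so a DataFrame rebuilt from that array no longer knows which columns were numeric. Whether this is what changes the encoder's output here is an assumption, not something confirmed in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type frame: one categorical column, two numeric columns.
df = pd.DataFrame({
    "neighborhood": ["A", "B", "A"],
    "lot_area": [8450, 9600, 11250],
    "year_built": [2003, 1976, 2001],
})

# Converting mixed columns to one array forces a common dtype: object.
arr = df.to_numpy()

# A frame rebuilt from that array keeps the numeric columns as object dtype,
# so they now look like categorical columns to anything that selects
# columns by dtype.
rebuilt = pd.DataFrame(arr, columns=df.columns)

# infer_objects() recovers numeric dtypes where every value is numeric.
restored = rebuilt.infer_objects()

print(df.dtypes, rebuilt.dtypes, restored.dtypes, sep="\n\n")
```

If the dtype loss is the cause, passing an explicit column list to the encoder, or restoring dtypes before encoding, would be one thing to try.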
I updated to pandas 1.1.2 to check if that helped but I had no luck.
I've noticed that this only happens when the input is, or used to be, a NumPy array. So it looks as if something about data that came from NumPy confuses the encoder and makes it produce null values.
Reading some other issues out there, I noticed that this could be related? Sorry to post so much, I'm just trying to contribute as much as I can to solve this issue.
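One possible explanation, stated here as an assumption rather than a confirmed diagnosis, is pandas index alignment: a DataFrame rebuilt from a NumPy array gets a fresh 0..n-1 index, while a target Series taken from a shuffled split keeps its original row labels. Any operation that aligns the two on index labels then produces NaN wherever the labels do not match. The sketch below reproduces that effect with plain pandas; the data and names are hypothetical, and it does not call the encoder itself.

```python
import numpy as np
import pandas as pd

# Target as it might look after a shuffled split: original row labels kept.
y = pd.Series([10.0, 20.0, 30.0], index=[7, 12, 5], name="target")

# Features rebuilt from a NumPy array: pandas assigns a new RangeIndex 0..2.
X = pd.DataFrame(np.array([["a"], ["b"], ["a"]], dtype=object), columns=["cat"])

# join() aligns on index labels; none of 0, 1, 2 appear in y's index,
# so every target value comes back as NaN.
joined = X.join(y)
n_missing = int(joined["target"].isna().sum())

# Giving y the same positional index removes the mismatch.
joined_fixed = X.join(y.reset_index(drop=True))
n_missing_fixed = int(joined_fixed["target"].isna().sum())

print(n_missing, n_missing_fixed)
```

If this is what happens in the pipeline, resetting the index of `y` (or rebuilding `X` with the original index) before fitting would be a workaround to test.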
I have been modelling using the ames_housing dataset with the code attached in the following zip file.
rep_example.zip