BUG? merging on column of empty frame with index of right frame #15692
Comments
Changed here: e8d9e79. This tries to coerce keys back to the original dtype, and it might be buggy.
So the question is: when not merging on a common column (in this case left: column, right: index), do we want both in the output, and should both be unique (or only unique in the overlapping values)? Some further exploration (with the same example as above): setting the right index to other values makes it clear that the index and the key column both duplicate the 'merge' values:
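The original output was not preserved here, so the following is only a minimal sketch of that exploration. The frame contents, column names (`a`, `b`) and index values are assumptions, not the exact example from the issue, and what gets printed depends on the installed pandas version.

```python
import pandas as pd

# Hypothetical frames (not the original example): an empty left frame with
# explicit dtypes, and a right frame with a non-default index.
left = pd.DataFrame({
    "key": pd.Series([], dtype="int64"),
    "a": pd.Series([], dtype="float64"),
})
right = pd.DataFrame({"b": [1, 2, 3]}, index=[10, 11, 12])

# Right join of the left 'key' column against the right index.
shifted = pd.merge(left, right, how="right", left_on="key", right_index=True)

print(pd.__version__)
print(shifted)        # compare the 'key' column with the result index
print(shifted.index)
```

Comparing `shifted["key"]` with `shifted.index` shows whether both carry the values from the right index.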
When the left frame is not fully empty, but is just missing values compared to the right frame, you also get strange behaviour:
The 'key' column is filled in like before, but now the index is gone (it does not even hold default values, just all 0's).
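A rough sketch of this partially filled case, again with made-up data (the single left row and the column names are assumptions). It does not assert what the index looks like, since that is exactly the version-dependent part.

```python
import pandas as pd

# Hypothetical data: the left frame only knows about key 1,
# the right frame has index values 0, 1 and 2.
partial_left = pd.DataFrame({"key": [1], "a": [10.0]})
right_full = pd.DataFrame({"b": [1, 2, 3]})

partial = pd.merge(
    partial_left, right_full, how="right", left_on="key", right_index=True
)

print(partial)
print(partial.index)  # the thread reports all 0's here on 0.18/0.19
```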
Behaviour in 0.18 of the above:
So the bug with the 0's in the index is the same, and here too the key column gets filled in (only the dtype was not retained, which is what was fixed in 0.19). So filling the key column is probably the desired behaviour?
The behaviour of the resulting index is rather buggy:
Yeah, I would say the index of the result is wrong. It should just be a range index. Maybe it is not getting set up somehow.
It is a right join on the index (at least on the right index), so another option is to preserve the right index. That means the merge values are duplicated (in a column and in the index), but the same is true when merging on two differently named columns. BTW, when merging on two different columns, the key columns also don't get filled in on non-matching rows:
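As a small illustration of the two-column case (the frames and the `rkey` column name are hypothetical), the left key column stays missing on rows that have no match in the left frame:

```python
import pandas as pd

# Hypothetical frames with differently named key columns.
left_cols = pd.DataFrame({"key": [1], "a": [10.0]})
right_cols = pd.DataFrame({"rkey": [1, 2, 3], "b": [1, 2, 3]})

two_cols = pd.merge(
    left_cols, right_cols, how="right", left_on="key", right_on="rkey"
)

print(two_cols)
# 'key' is only populated where the left frame had a matching row;
# 'rkey' keeps all values from the right frame.
print(two_cols["key"].isna().sum())
```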
So this is a further inconsistency with the
seems to work correctly when using dates for the keys ... except for the weird index.
@randomgambit sorry, I don't see how this example is different from the example without dates. The buggy index (all 0's) is exactly one of the issues.
Hello, dear sir @jorisvandenbossche! Yes, you are actually right. I misread my console, sorry about that. Well, I guess my point is that this bug also carries over to datetime variables :) I have also noticed this problem a couple of times, and I always do my merges after resetting the index anyway.
No problem!
It is a rather specific corner case, but there has been a change in behaviour when merging an empty frame:
vs
So with 0.19 the 'key' column has values, whereas in 0.18 it holds NaNs. The key column comes from the empty frame (so it had no values; how can it have values now?), but it is merged with the index of the right frame (which of course has values; should these end up in the 'key' column of the resulting frame?). It is such a strange case that I am actually not sure which of the two is the expected behaviour (and also not sure whether this was an intentional change in behaviour).
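The two outputs being compared are missing from this copy of the issue. A hedged sketch of the kind of call being described is below; the frames are assumptions rather than the reporter's exact data, and whether 'key' ends up filled or NaN depends on the pandas version in use.

```python
import pandas as pd

# Assumed reproduction: an empty left frame (explicit dtypes) merged on its
# 'key' column against the default index of a non-empty right frame.
empty_left = pd.DataFrame({
    "key": pd.Series([], dtype="int64"),
    "a": pd.Series([], dtype="float64"),
})
right_frame = pd.DataFrame({"b": [1, 2, 3]})

merged = pd.merge(
    empty_left, right_frame, how="right", left_on="key", right_index=True
)

print(pd.__version__)
print(merged)
print(merged["key"].isna().all())  # True would match the 0.18 behaviour described
print(merged["key"].dtype)
```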
Encountered here: geopandas/geopandas#422