@itholic
Contributor

@itholic itholic commented May 11, 2020

Resolves #1469

>>> import pandas as pd
>>> import databricks.koalas as ks

>>> df = ks.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]}, columns=['A', 'B', 'C'])

>>> ks.get_dummies(df, prefix={'B': 'ohe_B', 'C': 'ohe_C'}, columns=['B', 'C'])
   A  ohe_B_a  ohe_B_b  ohe_B_c  ohe_C_1  ohe_C_2  ohe_C_3
0  a        0        1        0        1        0        0
1  b        1        0        0        0        1        0
2  a        0        0        1        0        0        1

>>> pd.get_dummies(df.to_pandas(), prefix={'B': 'ohe_B', 'C': 'ohe_C'}, columns=['B', 'C'])
   A  ohe_B_a  ohe_B_b  ohe_B_c  ohe_C_1  ohe_C_2  ohe_C_3
0  a        0        1        0        1        0        0
1  b        1        0        0        0        1        0
2  a        0        0        1        0        0        1
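
For readers unfamiliar with the dict form of `prefix`, the following is a minimal sketch of how a per-column prefix mapping can be turned into the column names shown above. It is plain Python and is not the change made in this PR; the helper names `resolve_prefix` and `dummy_column_names` are hypothetical, and the handling of `None`, `str`, and list prefixes is an assumption modeled on the behavior the pandas output suggests.

```python
from typing import Dict, List, Sequence, Union

# Hypothetical type alias for the accepted forms of `prefix`.
PrefixArg = Union[None, str, Sequence[str], Dict[str, str]]


def resolve_prefix(prefix: PrefixArg, columns: Sequence[str]) -> List[str]:
    """Return one prefix per encoded column, in column order (illustrative only)."""
    if prefix is None:
        # Assumption: with no prefix, each column name is its own prefix.
        return list(columns)
    if isinstance(prefix, str):
        # A single string is reused for every encoded column.
        return [prefix] * len(columns)
    if isinstance(prefix, dict):
        # Look prefixes up by column name so the dict's own ordering is irrelevant.
        missing = [col for col in columns if col not in prefix]
        if missing:
            raise KeyError(f"no prefix given for columns: {missing}")
        return [prefix[col] for col in columns]
    prefixes = list(prefix)
    if len(prefixes) != len(columns):
        raise ValueError("length of prefix must match the number of encoded columns")
    return prefixes


def dummy_column_names(prefixes, values_per_column, prefix_sep="_"):
    """Build '<prefix><sep><value>' names for each distinct value of each column."""
    return [
        f"{p}{prefix_sep}{value}"
        for p, values in zip(prefixes, values_per_column)
        for value in values
    ]


prefixes = resolve_prefix({'B': 'ohe_B', 'C': 'ohe_C'}, ['B', 'C'])
names = dummy_column_names(prefixes, [['a', 'b', 'c'], [1, 2, 3]])
print(names)
```

With the inputs from the example, the printed names match the dummy columns in the output above (`ohe_B_a` through `ohe_C_3`). Resolving the dict by column name before building names is one way to avoid indexing the dict positionally.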

@itholic itholic changed the title Fix get dummies when uses the prefix parameter whose type is dict [WIP] Fix get dummies when uses the prefix parameter whose type is dict May 11, 2020
@itholic itholic changed the title [WIP] Fix get dummies when uses the prefix parameter whose type is dict Fix get dummies when uses the prefix parameter whose type is dict May 11, 2020
@codecov-io

codecov-io commented May 11, 2020

Codecov Report

Merging #1478 into master will decrease coverage by 1.21%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1478      +/-   ##
==========================================
- Coverage   93.76%   92.55%   -1.22%     
==========================================
  Files          36       36              
  Lines        8409     8411       +2     
==========================================
- Hits         7885     7785     -100     
- Misses        524      626     +102     
Impacted Files                                    Coverage Δ
databricks/koalas/namespace.py                    86.09% <100.00%> (+0.07%) ⬆️
databricks/koalas/usage_logging/__init__.py       26.12% <0.00%> (-68.47%) ⬇️
databricks/koalas/usage_logging/usage_logger.py   50.00% <0.00%> (-50.00%) ⬇️
databricks/koalas/__init__.py                     86.53% <0.00%> (-5.77%) ⬇️
databricks/koalas/utils.py                        94.14% <0.00%> (-2.44%) ⬇️
databricks/koalas/generic.py                      96.65% <0.00%> (-0.38%) ⬇️
databricks/koalas/frame.py                        95.33% <0.00%> (-0.25%) ⬇️
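
The headline percentages in the report can be reproduced from the Hits and Lines rows of the coverage diff. The sketch below assumes coverage is simply hits divided by lines; the function name `coverage_pct` is hypothetical, and Codecov's exact rounding of the displayed figures is not modeled here.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines that were hit."""
    return 100.0 * hits / lines


# Figures taken from the coverage diff in the report.
master = coverage_pct(hits=7885, lines=8409)
pr = coverage_pct(hits=7785, lines=8411)

# Hits plus misses should account for every tracked line.
assert 7885 + 524 == 8409
assert 7785 + 626 == 8411

print(f"master: {master:.4f}%  PR: {pr:.4f}%  change: {pr - master:+.4f}")
```

The computed change is a drop of roughly 1.21 percentage points, which agrees with the summary line of the report.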

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 998f8f7...c0eee47.

@HyukjinKwon
Member

Looks pretty good otherwise

@HyukjinKwon HyukjinKwon merged commit b14d3fa into databricks:master May 11, 2020
@itholic itholic deleted the fix_get_dummies branch May 29, 2020 00:52
Development

Successfully merging this pull request may close these issues.

get_dummies uses the prefix parameter whose type is dict return KeyError

3 participants