@HyukjinKwon
Member

There was a concern that the name 'one-by-one' is confusing (see https://groups.google.com/forum/m/#!topic/koalas-dev/xzZ0VvXFsGM).
This PR proposes renaming the default index 'one-by-one' to 'sequence'.

This is conceptually equivalent to the Spark example below:
>>> import databricks.koalas as ks
>>> spark_df = ks.range(3).to_spark()
>>> spark_df.rdd.zipWithIndex().map(lambda p: p[1]).collect()
Member Author

Actually, the RDD API is not used in the actual index implementation. A pandas grouped map UDF is used instead, and it mimics zipWithIndex from the RDD API.
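
For readers unfamiliar with zipWithIndex, here is a rough sketch of the idea the comment above refers to: count the rows in each partition, derive a starting offset per partition, then number rows locally from that offset. This is plain Python over lists standing in for partitions, not the Koalas implementation; `sequence_index` and the sample `partitions` are hypothetical names made up for illustration.

```python
from itertools import accumulate


def sequence_index(partitions):
    """Assign 0, 1, 2, ... to rows across partitions, in partition order.

    Hypothetical helper for illustration only; lists stand in for partitions.
    """
    # Pass 1: count the rows in each partition.
    counts = [len(part) for part in partitions]
    # A partition's starting offset is the total row count of all earlier ones.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: each partition numbers its own rows, starting from its offset.
    return [
        [(offset + i, row) for i, row in enumerate(part)]
        for offset, part in zip(offsets, partitions)
    ]


# Four "partitions" of uneven size, including an empty one.
partitions = [["a", "b"], [], ["c"], ["d", "e"]]
indexed = sequence_index(partitions)
```

The point of the two-pass shape is that the second pass only needs one small number per partition, so each partition can be numbered independently once the counts are known.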

@softagram-bot

Softagram Impact Report for pull/679 (head commit: a527e4e)


@HyukjinKwon
Member Author

Merged for now.

@HyukjinKwon merged commit 34088a4 into databricks:master on Aug 26, 2019
@HyukjinKwon deleted the rename-to-sequence branch on November 6, 2019 02:20