Add spark namespace in DataFrame, Series, Index and MultiIndex #1530
The PR turned out to be too big 😓. I will split it up next time.
    with self.assertRaisesRegex(ValueError, msg):
        kdf.truncate("C", "B", axis=1)

    def test_spark_schema(self):
Here the tests are for deprecated methods.
Force-pushed from 172fc98 to e1908ec.
Codecov Report
@@           Coverage Diff           @@
##           master   #1530    +/-   ##
=======================================
  Coverage   94.14%  94.14%
=======================================
  Files          36      37     +1
  Lines        8396    8487    +91
=======================================
+ Hits         7904    7990    +86
- Misses        492     497     +5
ueshin left a comment:
Great work!!
Basically LGTM, except for a few comments.
Force-pushed from e1908ec to efd8d5e.
Force-pushed from d9fb2c1 to 289475e.
Merged! Thanks @ueshin.

Gorgeous!
This PR proposes adding a `spark` namespace to `DataFrame`, `Series`, `Index` and `MultiIndex`. Spark-related features are placed under this namespace.

- `(Series|Index|MultiIndex).spark_type` -> `(Series|Index|MultiIndex).spark.data_type`
  - `spark_type` is deprecated
- `(Series|Index|MultiIndex).spark_column` -> `(Series|Index|MultiIndex).spark.column`
  - `spark_column` is deprecated
- New API: `(Series|Index).transform`
  - I intentionally named it `transform` because the result needs to have the same length as the input.
- `DataFrame.spark_schema` -> `DataFrame.spark.schema`
  - `DataFrame.spark_schema` is deprecated
- `DataFrame.print_schema` -> `DataFrame.spark.print_schema`
  - `DataFrame.print_schema` is deprecated
- `DataFrame.to_spark` -> `DataFrame.spark.frame`
  - `DataFrame.to_spark` is NOT deprecated, to keep the symmetry between `to_koalas` and `to_spark`. It's just an alias of `DataFrame.spark.frame`.
- `DataFrame.cache` -> `DataFrame.spark.cache`
  - `DataFrame.cache` is deprecated
- `DataFrame.persist` -> `DataFrame.spark.persist`
  - `DataFrame.persist` is deprecated
- `DataFrame.hint` -> `DataFrame.spark.hint`
  - `DataFrame.hint` is deprecated
- `DataFrame.unpersist` -> `DataFrame.spark.unpersist`
  - `DataFrame.unpersist` is deprecated
- `DataFrame.storage_level` -> `DataFrame.spark.storage_level`
  - `DataFrame.storage_level` is deprecated
- `DataFrame.to_table` -> `DataFrame.spark.to_table`
  - `DataFrame.to_table` is NOT deprecated, to keep the symmetry between `ks.read_table` and `to_table`. It's just an alias of `DataFrame.spark.to_table`. It's also similar to `DataFrame.to_parquet`, `DataFrame.to_csv`, etc.
- `DataFrame.to_spark_io` -> `DataFrame.spark.to_spark_io`
  - `DataFrame.to_spark_io` is NOT deprecated, to keep the symmetry between `ks.read_spark_io` and `to_spark_io`. It's just an alias of `DataFrame.spark.to_spark_io`. It's also similar to `DataFrame.to_parquet`, `DataFrame.to_csv`, etc.
- `DataFrame.explain` -> `DataFrame.spark.explain`
  - `DataFrame.explain` is deprecated
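
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a namespace accessor with one deprecated wrapper and one plain alias. It is not the code from this PR: `ToyDataFrame` and `SparkFrameMethods` are hypothetical stand-ins that hold plain Python data instead of a Spark DataFrame, and the use of `FutureWarning` is an assumption made for illustration.

```python
import warnings


class SparkFrameMethods:
    """Hypothetical accessor that groups Spark-related methods of a frame."""

    def __init__(self, frame):
        self._frame = frame

    def schema(self):
        # Stand-in for a real Spark schema: only the column names.
        return list(self._frame._columns)

    def frame(self):
        # Stand-in for returning the underlying Spark DataFrame.
        return dict(self._frame._data)


class ToyDataFrame:
    """Hypothetical frame type; not the Koalas DataFrame."""

    def __init__(self, data):
        self._data = dict(data)
        self._columns = list(data)

    @property
    def spark(self):
        # Everything Spark-specific is reached through this namespace.
        return SparkFrameMethods(self)

    def spark_schema(self):
        # Deprecated entry point: warn, then delegate to the namespace.
        warnings.warn(
            "DataFrame.spark_schema is deprecated; "
            "use DataFrame.spark.schema instead.",
            FutureWarning,  # assumed warning category
            stacklevel=2,
        )
        return self.spark.schema()

    def to_spark(self):
        # Kept as a plain alias (no warning), mirroring the PR's choice
        # to leave to_spark undeprecated.
        return self.spark.frame()
```

With this layout, `df.spark.schema()` is the preferred call, `df.spark_schema()` still works but emits a warning, and `df.to_spark()` silently forwards to `df.spark.frame()`.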