-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
EHN: Add index parameter to to_json #18591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks pretty good
pandas/io/json/json.py
Outdated
@@ -108,6 +115,19 @@ def _format_axes(self): | |||
raise ValueError("Series index must be unique for orient=" | |||
"'{orient}'".format(orient=self.orient)) | |||
|
|||
def write(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you refactor the super-class to make .write()
be ._write(self, .......)
(IOW to accept parameters), then these 2 can just be a super call
pandas/tests/io/json/test_pandas.py
Outdated
def test_index_false_to_json(self): | ||
# GH 17394 | ||
# Testing index parameter in to_json | ||
import json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import this at the top
pandas/tests/io/json/test_pandas.py
Outdated
'name': 'A', | ||
'data': [1, 2, 3] | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add comments before each section of tests
Codecov Report
@@ Coverage Diff @@
## master #18591 +/- ##
==========================================
- Coverage 91.44% 91.43% -0.02%
==========================================
Files 157 157
Lines 51379 51393 +14
==========================================
+ Hits 46985 46992 +7
- Misses 4394 4401 +7
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18591 +/- ##
==========================================
- Coverage 91.44% 91.43% -0.02%
==========================================
Files 157 157
Lines 51379 51393 +14
==========================================
+ Hits 46985 46992 +7
- Misses 4394 4401 +7
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18591 +/- ##
==========================================
- Coverage 91.44% 91.43% -0.02%
==========================================
Files 157 157
Lines 51379 51393 +14
==========================================
+ Hits 46985 46992 +7
- Misses 4394 4401 +7
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18591 +/- ##
==========================================
- Coverage 91.6% 91.56% -0.05%
==========================================
Files 153 153
Lines 51273 51290 +17
==========================================
- Hits 46970 46965 -5
- Misses 4303 4325 +22
Continue to review full report at Codecov.
|
pandas/io/json/json.py
Outdated
|
||
self.is_copy = None | ||
self._format_axes() | ||
|
||
def _format_axes(self): | ||
raise AbstractMethodError(self) | ||
|
||
def write(self): | ||
def _write(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really what I meant
.write()
should remain in the superclass as the public interface with no arguments, _write(obj)
will take an obj
for example, so we dont' mutate in the subclasses (you will instead pass the revised obj in the super call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So should there be a .write(self)
and ._write(self, obj)
method in the Writer
class and then a ._write(self, obj)
method in each of the subclasses (SeriesWriter
, FrameWriter
and JSONTableWriter
)?
Then the ._write(self, obj)
methods in each of the subclasses call super()._write(obj)
after making any necessary adjustments to obj
?
And in the to_json()
function at the top of the file we use ._write(obj)
on an instance of the relevant class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.write() will just call ._write()
so the original api is preserved
looks good. rebase and ping on green. |
@jreback thanks! Green now. |
pandas/tests/io/json/test_pandas.py
Outdated
@@ -1147,3 +1148,84 @@ def test_data_frame_size_after_to_json(self): | |||
size_after = df.memory_usage(index=True, deep=True).sum() | |||
|
|||
assert size_before == size_after | |||
|
|||
def test_index_false_to_json(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you parametrize this, will be a bit more readable i think.
pandas/core/generic.py
Outdated
@@ -1671,6 +1672,13 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None, | |||
|
|||
.. versionadded:: 0.21.0 | |||
|
|||
index : boolean, default True | |||
Whether to include the index values in the JSON string. A | |||
ValueError will be thrown if index is False when orient is not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say something like "Not including the index (index=False
) is only supported for orient 'split' and 'table'" instead
pandas/io/json/json.py
Outdated
def _write(self, obj, orient, double_precision, ensure_ascii, | ||
date_unit, iso_dates, default_handler): | ||
if not self.index: | ||
obj = obj.drop('index', axis=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not robust (you can have a MultiIndex, or an index with another name, etc ..)
You need to handle this above when reset_index
is called (do drop=True/False
depending on the value of self.index
)
(and also make sure to add a test for this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a good point. I think I have it fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you push the new commits? (I don't see any change here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sorry I forgot to push the new commit. I've pushed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks better now!
934a1d3
to
e3df081
Compare
@@ -1147,3 +1148,64 @@ def test_data_frame_size_after_to_json(self): | |||
size_after = df.memory_usage(index=True, deep=True).sum() | |||
|
|||
assert size_before == size_after | |||
|
|||
@pytest.mark.parametrize('data, expected', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
lgtm. @jorisvandenbossche merge when satisfied. |
@reidy-p Thanks! |
git diff upstream/master -u -- "*.py" | flake8 --diff
Examples: