ENH: Add index parameter to to_json #18591

Merged on Dec 10, 2017 (4 commits)

Conversation

reidy-p
Contributor

@reidy-p reidy-p commented Dec 1, 2017

Examples:

In [1]: df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

In [2]: df.to_json(orient='split', index=True)
Out[2]: 
{"columns":["a","b","c"],"index":[0,1],"data":[[1,2,3],[4,5,6]]}

In [3]: df.to_json(orient='split', index=False)
Out[3]:
{"columns":["a","b","c"],"data":[[1,2,3],[4,5,6]]}

In [4]: df.to_json(orient='table', index=True)
Out[4]:
{"schema": {"fields":[{"name":"index","type":"integer"},{"name":"a","type":"integer"},{"name":"b","type":"integer"},{"name":"c","type":"integer"}],"primaryKey":["index"],"pandas_version":"0.20.0"}, "data": [{"index":0,"a":1,"b":2,"c":3},{"index":1,"a":4,"b":5,"c":6}]}

In [5]: df.to_json(orient='table', index=False)
Out[5]:
{"schema": {"fields":[{"name":"a","type":"integer"},{"name":"b","type":"integer"},{"name":"c","type":"integer"}],"pandas_version":"0.20.0"}, "data": [{"a":1,"b":2,"c":3},{"a":4,"b":5,"c":6}]}

Contributor

@jreback jreback left a comment

looks pretty good

@@ -108,6 +115,19 @@ def _format_axes(self):
raise ValueError("Series index must be unique for orient="
"'{orient}'".format(orient=self.orient))

def write(self):
Contributor

if you refactor the super-class to make .write() be ._write(self, .......) (IOW to accept parameters), then these 2 can just be a super call

def test_index_false_to_json(self):
# GH 17394
# Testing index parameter in to_json
import json
Contributor

import this at the top

'name': 'A',
'data': [1, 2, 3]
}

Contributor

can you add comments before each section of tests

@jreback jreback added Enhancement IO JSON read_json, to_json, json_normalize labels Dec 1, 2017
@codecov

codecov bot commented Dec 1, 2017

Codecov Report

Merging #18591 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18591      +/-   ##
==========================================
- Coverage   91.44%   91.43%   -0.02%     
==========================================
  Files         157      157              
  Lines       51379    51393      +14     
==========================================
+ Hits        46985    46992       +7     
- Misses       4394     4401       +7
| Flag | Coverage Δ |
|---|---|
| #multiple | 89.3% <100%> (ø) ⬆️ |
| #single | 40.56% <12.5%> (-0.12%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/generic.py | 95.78% <ø> (ø) ⬆️ |
| pandas/io/json/json.py | 91.78% <100%> (+0.03%) ⬆️ |
| pandas/io/gbq.py | 25% <0%> (-58.34%) ⬇️ |
| pandas/core/frame.py | 97.81% <0%> (-0.1%) ⬇️ |
| pandas/util/testing.py | 81.8% <0%> (+0.19%) ⬆️ |
| pandas/io/json/table_schema.py | 97.22% <0%> (+1.38%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d74ac70...df61cee. Read the comment docs.

@codecov

codecov bot commented Dec 1, 2017

Codecov Report

Merging #18591 into master will decrease coverage by 0.04%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18591      +/-   ##
==========================================
- Coverage    91.6%   91.56%   -0.05%     
==========================================
  Files         153      153              
  Lines       51273    51290      +17     
==========================================
- Hits        46970    46965       -5     
- Misses       4303     4325      +22
| Flag | Coverage Δ |
|---|---|
| #multiple | 89.43% <100%> (-0.03%) ⬇️ |
| #single | 40.7% <19.04%> (-0.09%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/generic.py | 95.9% <ø> (ø) ⬆️ |
| pandas/io/json/json.py | 92.08% <100%> (+0.33%) ⬆️ |
| pandas/io/gbq.py | 25% <0%> (-58.34%) ⬇️ |
| pandas/plotting/_converter.py | 64.78% <0%> (-1.74%) ⬇️ |
| pandas/util/testing.py | 81.82% <0%> (-0.2%) ⬇️ |
| pandas/core/frame.py | 97.81% <0%> (-0.1%) ⬇️ |
| pandas/io/json/table_schema.py | 97.22% <0%> (+1.38%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34a8d36...b204d09. Read the comment docs.


self.is_copy = None
self._format_axes()

def _format_axes(self):
raise AbstractMethodError(self)

def write(self):
def _write(self):
Contributor

not really what I meant

.write() should remain in the superclass as the public interface with no arguments. _write(obj) will take an obj, for example, so we don't mutate in the subclasses (you will instead pass the revised obj in the super call)

Contributor Author

So should there be a .write(self) and ._write(self, obj) method in the Writer class and then a ._write(self, obj) method in each of the subclasses (SeriesWriter, FrameWriter and JSONTableWriter)?

Then the ._write(self, obj) methods in each of the subclasses call super()._write(obj) after making any necessary adjustments to obj?

And in the to_json() function at the top of the file we use ._write(obj) on an instance of the relevant class?

Contributor

.write() will just call ._write()

so the original api is preserved
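
A minimal sketch of the arrangement described in this thread. The class names `Writer` and `SplitWriter`, the single-argument `_write(obj)` signature, and the plain dict payload are assumptions made for illustration; they are not the actual pandas classes or signatures.

```python
import json


class Writer:
    """Hypothetical base class; names follow the discussion, not pandas."""

    def __init__(self, obj, index=True):
        self.obj = obj
        self.index = index

    def write(self):
        # Public interface: takes no arguments and just calls _write.
        return self._write(self.obj)

    def _write(self, obj):
        # Shared serialisation step; works on whatever object it is handed.
        return json.dumps(obj, separators=(",", ":"))


class SplitWriter(Writer):
    """Hypothetical subclass that can leave out the index."""

    def _write(self, obj):
        # Build a revised copy rather than mutating self.obj, then pass
        # the revised object up in the super call.
        if not self.index:
            obj = {key: value for key, value in obj.items() if key != "index"}
        return super()._write(obj)


payload = {"columns": ["a"], "index": [0, 1], "data": [[1], [4]]}
result = SplitWriter(payload, index=False).write()
print(result)
```

Callers keep using `.write()` with no arguments, so the original API is preserved, and the stored object is left untouched.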

@jreback jreback added this to the 0.22.0 milestone Dec 6, 2017
@jreback
Contributor

jreback commented Dec 6, 2017

looks good. rebase and ping on green.

@reidy-p
Contributor Author

reidy-p commented Dec 6, 2017

@jreback thanks! Green now.

@@ -1147,3 +1148,84 @@ def test_data_frame_size_after_to_json(self):
size_after = df.memory_usage(index=True, deep=True).sum()

assert size_before == size_after

def test_index_false_to_json(self):
Contributor

can you parametrize this, will be a bit more readable i think.

@@ -1671,6 +1672,13 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None,

.. versionadded:: 0.21.0

index : boolean, default True
Whether to include the index values in the JSON string. A
ValueError will be thrown if index is False when orient is not
Member

I would say something like "Not including the index (index=False) is only supported for orient 'split' and 'table'" instead
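
A short illustration of the behaviour this wording describes, assuming a pandas version that includes the `index` parameter from this PR. `orient='columns'` is used here as one example of an orient that does not support `index=False`.

```python
import json

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])

# orient='split' supports index=False: the "index" key is left out.
split_payload = json.loads(df.to_json(orient="split", index=False))

# orient='columns' keys every value by its index label, so asking for
# index=False is rejected with a ValueError.
try:
    df.to_json(orient="columns", index=False)
    raised = False
except ValueError:
    raised = True

print(split_payload, raised)
```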

def _write(self, obj, orient, double_precision, ensure_ascii,
date_unit, iso_dates, default_handler):
if not self.index:
obj = obj.drop('index', axis=1)
Member

this is not robust (you can have a MultiIndex, or an index with another name, etc ..)
You need to handle this above when reset_index is called (do drop=True/False depending on the value of self.index)

(and also make sure to add a test for this)
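
A small sketch of why the hard-coded column name is fragile, and of the suggested alternative. The frame, the index name `key`, and the `include_index` flag (standing in for `self.index`) are assumptions for illustration, not code from this PR.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["x", "y"], name="key"))

# reset_index() names the new column after the index ("key" here), so a
# hard-coded drop('index', axis=1) cannot find it.
flat = df.reset_index()
try:
    flat.drop("index", axis=1)
    hardcoded_drop_failed = False
except KeyError:
    hardcoded_drop_failed = True

# Deciding at reset_index time works whatever the index is called.
include_index = False  # stands in for self.index in the writer
robust = df.reset_index(drop=not include_index)

print(list(flat.columns), hardcoded_drop_failed, list(robust.columns))
```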

Contributor Author

Thanks, that's a good point. I think I have it fixed now.

Member

Did you push the new commits? (I don't see any change here)

Contributor Author

Yeah sorry I forgot to push the new commit. I've pushed now.

Member

That looks better now!

@reidy-p reidy-p force-pushed the json_index branch 2 times, most recently from 934a1d3 to e3df081 Compare December 9, 2017 15:44
@@ -1147,3 +1148,64 @@ def test_data_frame_size_after_to_json(self):
size_after = df.memory_usage(index=True, deep=True).sum()

assert size_before == size_after

@pytest.mark.parametrize('data, expected', [
Contributor

nice!

@jreback
Contributor

jreback commented Dec 10, 2017

lgtm. @jorisvandenbossche merge when satisfied.

@jorisvandenbossche jorisvandenbossche merged commit 2efd67f into pandas-dev:master Dec 10, 2017
@jorisvandenbossche
Member

@reidy-p Thanks!

@reidy-p reidy-p deleted the json_index branch December 10, 2017 16:13
Labels
Enhancement IO JSON read_json, to_json, json_normalize
Development

Successfully merging this pull request may close these issues.

ENH: to_json(index=False)
3 participants