ENH: Add index parameter to to_json #18591

reidy-p · 2017-12-01T14:33:43Z

- closes #17394 (ENH: to_json(index=False))
- tests added / passed
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry

Examples:

In [1]: df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

In [2]: df.to_json(orient='split', index=True)
Out[2]: 
{"columns":["a","b","c"],"index":[0,1],"data":[[1,2,3],[4,5,6]]}

In [3]: df.to_json(orient='split', index=False)
Out[3]:
{"columns":["a","b","c"],"data":[[1,2,3],[4,5,6]]}

In [4]: df.to_json(orient='table', index=True)
Out[4]:
{"schema": {"fields":[{"name":"index","type":"integer"},{"name":"a","type":"integer"},{"name":"b","type":"integer"},{"name":"c","type":"integer"}],"primaryKey":["index"],"pandas_version":"0.20.0"}, "data": [{"index":0,"a":1,"b":2,"c":3},{"index":1,"a":4,"b":5,"c":6}]}

In [5]: df.to_json(orient='table', index=False)
Out[5]:
{"schema": {"fields":[{"name":"a","type":"integer"},{"name":"b","type":"integer"},{"name":"c","type":"integer"}],"pandas_version":"0.20.0"}, "data": [{"a":1,"b":2,"c":3},{"a":4,"b":5,"c":6}]}

jreback

looks pretty good

jreback · 2017-12-01T14:52:24Z

pandas/io/json/json.py

@@ -108,6 +115,19 @@ def _format_axes(self):
            raise ValueError("Series index must be unique for orient="
                             "'{orient}'".format(orient=self.orient))

+    def write(self):


if you refactor the superclass so that .write() becomes ._write(self, .......) (in other words, so that it accepts parameters), then these two can just be a super call

jreback · 2017-12-01T14:53:04Z

pandas/tests/io/json/test_pandas.py

+    def test_index_false_to_json(self):
+        # GH 17394
+        # Testing index parameter in to_json
+        import json


import this at the top

jreback · 2017-12-01T14:54:30Z

pandas/tests/io/json/test_pandas.py

+            'name': 'A',
+            'data': [1, 2, 3]
+        }
+


can you add comments before each section of tests

codecov · 2017-12-01T15:48:28Z

Codecov Report

Merging #18591 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18591      +/-   ##
==========================================
- Coverage   91.44%   91.43%   -0.02%     
==========================================
  Files         157      157              
  Lines       51379    51393      +14     
==========================================
+ Hits        46985    46992       +7     
- Misses       4394     4401       +7

Flag	Coverage Δ
#multiple	`89.3% <100%> (ø)`	⬆️
#single	`40.56% <12.5%> (-0.12%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/generic.py	`95.78% <ø> (ø)`	⬆️
pandas/io/json/json.py	`91.78% <100%> (+0.03%)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/core/frame.py	`97.81% <0%> (-0.1%)`	⬇️
pandas/util/testing.py	`81.8% <0%> (+0.19%)`	⬆️
pandas/io/json/table_schema.py	`97.22% <0%> (+1.38%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d74ac70...df61cee.

codecov · 2017-12-01T15:48:45Z

Codecov Report

Merging #18591 into master will decrease coverage by 0.04%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #18591      +/-   ##
==========================================
- Coverage    91.6%   91.56%   -0.05%     
==========================================
  Files         153      153              
  Lines       51273    51290      +17     
==========================================
- Hits        46970    46965       -5     
- Misses       4303     4325      +22

Flag	Coverage Δ
#multiple	`89.43% <100%> (-0.03%)`	⬇️
#single	`40.7% <19.04%> (-0.09%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/generic.py	`95.9% <ø> (ø)`	⬆️
pandas/io/json/json.py	`92.08% <100%> (+0.33%)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/plotting/_converter.py	`64.78% <0%> (-1.74%)`	⬇️
pandas/util/testing.py	`81.82% <0%> (-0.2%)`	⬇️
pandas/core/frame.py	`97.81% <0%> (-0.1%)`	⬇️
pandas/io/json/table_schema.py	`97.22% <0%> (+1.38%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 34a8d36...b204d09.

jreback · 2017-12-03T15:48:09Z

pandas/io/json/json.py


        self.is_copy = None
        self._format_axes()

    def _format_axes(self):
        raise AbstractMethodError(self)

-    def write(self):
+    def _write(self):


not really what I meant

.write() should remain in the superclass as the public interface with no arguments; _write(obj) will take an obj, for example, so we don't mutate in the subclasses (you will instead pass the revised obj in the super call)

reidy-p · 2017-12-03

So should there be a .write(self) and a ._write(self, obj) method in the Writer class, and then a ._write(self, obj) method in each of the subclasses (SeriesWriter, FrameWriter and JSONTableWriter)?

Then the ._write(self, obj) methods in each of the subclasses call super()._write(obj) after making any necessary adjustments to obj?

And in the to_json() function at the top of the file we use ._write(obj) on an instance of the relevant class?

jreback · 2017-12-03

.write() will just call ._write()

so the original api is preserved
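The split between `.write()` and `._write(obj)` discussed above can be sketched in plain Python. This is only an illustration of the pattern, not the code from `pandas/io/json/json.py`: the `SplitFrameWriter` class is hypothetical, a plain dict stands in for the DataFrame, and the real `_write` takes more parameters than shown here.

```python
import json


class Writer:
    """Base writer: write() is the public, argument-free entry point."""

    def __init__(self, obj, index=True):
        self.obj = obj      # a plain dict standing in for a DataFrame
        self.index = index

    def write(self):
        # Public API is unchanged: no arguments, it only delegates.
        return self._write(self.obj)

    def _write(self, obj):
        # Serialize whatever object the caller hands over.
        return json.dumps(obj, separators=(",", ":"))


class SplitFrameWriter(Writer):
    """Hypothetical subclass for a 'split'-style payload."""

    def _write(self, obj):
        if not self.index:
            # Build a revised copy instead of mutating self.obj,
            # then pass it up through the super call.
            obj = {k: v for k, v in obj.items() if k != "index"}
        return super()._write(obj)


payload = {"columns": ["a", "b"], "index": [0, 1], "data": [[1, 2], [3, 4]]}
result = SplitFrameWriter(payload, index=False).write()
print(result)
```

Because the subclass only rebinds the local `obj`, the object stored on the writer is left untouched, which is the point of passing it as a parameter.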

jreback · 2017-12-06T11:31:41Z

looks good. rebase and ping on green.

reidy-p · 2017-12-06T20:10:26Z

@jreback thanks! Green now.

jreback · 2017-12-08T02:19:02Z

pandas/tests/io/json/test_pandas.py

@@ -1147,3 +1148,84 @@ def test_data_frame_size_after_to_json(self):
        size_after = df.memory_usage(index=True, deep=True).sum()

        assert size_before == size_after
+
+    def test_index_false_to_json(self):


can you parametrize this, will be a bit more readable i think.

jorisvandenbossche · 2017-12-08T10:52:42Z

pandas/core/generic.py

@@ -1671,6 +1672,13 @@ def to_json(self, path_or_buf=None, orient=None, date_format=None,

            .. versionadded:: 0.21.0

+        index : boolean, default True
+            Whether to include the index values in the JSON string. A
+            ValueError will be thrown if index is False when orient is not


I would say something like "Not including the index (index=False) is only supported for orient 'split' and 'table'" instead
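As a rough illustration of the behaviour this docstring describes, the snippet below drops the index with `orient='split'` and then tries the same with `orient='index'`, where each row is keyed by its index label. It assumes a pandas version that includes the `index` parameter; the exact set of orients that accept `index=False` and the wording of the error message may differ between releases.

```python
import json

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])

# 'split' supports leaving the index out of the output.
split_no_index = json.loads(df.to_json(orient="split", index=False))
print(sorted(split_no_index))

# orient='index' keys every row by its index label, so the index
# cannot be omitted; a ValueError is expected here.
try:
    df.to_json(orient="index", index=False)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```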

jorisvandenbossche · 2017-12-08T10:57:44Z

pandas/io/json/json.py

+    def _write(self, obj, orient, double_precision, ensure_ascii,
+               date_unit, iso_dates, default_handler):
+        if not self.index:
+            obj = obj.drop('index', axis=1)


this is not robust (you can have a MultiIndex, or an index with another name, etc ..)
You need to handle this above when reset_index is called (do drop=True/False depending on the value of self.index)

(and also make sure to add a test for this)

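reidy-p · 2017-12-08

Thanks, that's a good point. I think I have it fixed now.

jorisvandenbossche · 2017-12-10

Did you push the new commits? (I don't see any change here)

reidy-p · 2017-12-10

Yeah sorry I forgot to push the new commit. I've pushed now.

jorisvandenbossche · 2017-12-10

That looks better now!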
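The robustness concern raised in this thread can be reproduced with a small frame. The sketch below is not the PR's code; `include_index` is a hypothetical stand-in for `self.index`. It shows why dropping a hard-coded `'index'` column breaks for a named index, and how choosing `drop` at `reset_index` time avoids the problem, including for a MultiIndex.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2], "b": [3, 4]},
    index=pd.Index(["x", "y"], name="id"),
)

# With a named index, reset_index() creates a column called 'id',
# not 'index', so dropping a hard-coded 'index' column fails.
flat = df.reset_index()
try:
    flat.drop("index", axis=1)
    drop_failed = False
except KeyError:
    drop_failed = True

# Deciding at reset time works regardless of the index name.
include_index = False  # hypothetical stand-in for self.index
no_index = df.reset_index(drop=not include_index)
print(list(no_index.columns))

# The same call handles a MultiIndex (levels 'id' and 'a').
multi = df.set_index("a", append=True)
print(list(multi.reset_index(drop=True).columns))
```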

jreback · 2017-12-10T14:49:32Z

pandas/tests/io/json/test_pandas.py

@@ -1147,3 +1148,64 @@ def test_data_frame_size_after_to_json(self):
        size_after = df.memory_usage(index=True, deep=True).sum()

        assert size_before == size_after
+
+    @pytest.mark.parametrize('data, expected', [


jreback · 2017-12-10T14:50:20Z

lgtm. @jorisvandenbossche merge when satisfied.

jorisvandenbossche · 2017-12-10T15:26:40Z

@reidy-p Thanks!

jreback requested changes Dec 1, 2017

jreback added the Enhancement and IO JSON (read_json, to_json, json_normalize) labels Dec 1, 2017

reidy-p force-pushed the json_index branch from df61cee to c867166 Compare December 2, 2017 15:56

jreback requested changes Dec 3, 2017

reidy-p force-pushed the json_index branch from c867166 to 176934f Compare December 4, 2017 15:05

jreback added this to the 0.22.0 milestone Dec 6, 2017

jreback approved these changes Dec 6, 2017

reidy-p force-pushed the json_index branch from 176934f to 753afc9 Compare December 6, 2017 11:41

jreback requested changes Dec 8, 2017

jorisvandenbossche requested changes Dec 8, 2017

reidy-p force-pushed the json_index branch 2 times, most recently from 934a1d3 to e3df081 Compare December 9, 2017 15:44

reidy-p added 4 commits December 10, 2017 13:36

ENH: Add index parameter to to_json

f8618b8

Use super() in derived class and add comments to tests

90c1a32

rewriting .write and ._write

1a3e41a

parametrize tests and allow index parameter to handle MultiIndex and index name

b204d09

reidy-p force-pushed the json_index branch from e3df081 to b204d09 Compare December 10, 2017 13:45

jreback reviewed Dec 10, 2017

jreback approved these changes Dec 10, 2017

jorisvandenbossche approved these changes Dec 10, 2017

jorisvandenbossche merged commit 2efd67f into pandas-dev:master Dec 10, 2017

reidy-p deleted the json_index branch December 10, 2017 16:13
