-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
COMPAT: reading generic PyTables Table format fails with sub-selection #26818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/tests/io/test_pytables.py
Outdated
(not written by pandas). | ||
""" | ||
|
||
def _create_simple_hdf5_file_with_pytables(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary or could you just add a file in pandas\tests\io\data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary or could you just add a file in pandas\tests\io\data
Oh, that's some early feedback. Thank you! Much appreciated.
I think a proper test for the fix for #11188 should confirm that the minimal working example for reproducing the problem does not fail anymore.
I also think this is more readable (when the test itself shows how the data file was created, in contrast to committing a binary file of which it's not 100 % clear and discoverable of how it has been created).
Also, I'd like to think of this (using the PyTables
API to create an HDF5 file) as an important aspect of this test: it effectively confirms a common way to use PyTables
and pandas
in tandem; PyTables
for writing an HDF5 file, and pandas
for reading it. These tests are not so much about the contents in the file, but about that a specific way to create the file matches a specific way to parse the file.
I'll try to make this a little more compact, though.
Well, let's see about the "fix", looking forward to your feedback there (commit upcoming). I think this will be more controversial.
Codecov Report
@@ Coverage Diff @@
## master #26818 +/- ##
===========================================
- Coverage 91.86% 41.1% -50.77%
===========================================
Files 179 179
Lines 50707 50710 +3
===========================================
- Hits 46583 20842 -25741
- Misses 4124 29868 +25744
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26818 +/- ##
==========================================
- Coverage 91.98% 91.98% -0.01%
==========================================
Files 180 180
Lines 50772 50774 +2
==========================================
- Hits 46704 46702 -2
- Misses 4068 4072 +4
Continue to review full report at Codecov.
|
Funny |
@gfyoung @simonjayhawkins @jreback I would appreciate your feedback, especially about the "fix" itself for starters. The workaround is not necessarily nice. All tests pass, but could it be that this approach (introducing |
pandas/io/pytables.py
Outdated
# Piggy-back normalized `nrows` (considering start and stop) onto | ||
# PyTables tables object so that the GenericIndexCol constructor | ||
# knows how large the index should be. | ||
self.s.table._nrows_to_read = stop - start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
start
and stop
- why not make these instance attributes of self
?
Or even just compute nrows
as an attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huhu!
start and stop - why not make these instance attributes of self ?
Hm, what do you have in mind about doing with them after having set them as instance attributes of self
? In the three lines below we're doing exactly that (we set self.start
and self.stop
; did you mean something else?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or even just compute nrows as an attribute?
Interesting. I thought about this a bit but I am not sure what you mean. Where; on the PyTables
table object?
That would be quite invasive. A PyTables
table object might use its nrows
property internally.
Adding a new private attribute to an object (guaranteed to not mess with the object's intrinsics) is not beautiful, but a little less invasive than overwriting an object property which might mess with that object's internal consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jgehrcke : I'm talking about attaching a new attribute to the TableIterator
instance itself.
Attaching attributes to the TableIterator
's s.table
is riskier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jgehrcke : I'm talking about attaching a new attribute to the TableIterator instance itself.
Gotcha! I'll look into that. Thanks for the feedback, much appreciated.
Attaching attributes to the TableIterator's s.table is riskier.
Ack!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is however no obvious way for me to access the TableIterator
instance from the GenericIndexCol
instance. The latter inherits from IndexCol
which gets its .table
attribute set through a set_table()
method, but as far as I can see there is no existing obvious relationship to reach back to the TableIterator
... I think I had the problem before which is why I chose the table object to piggy-back the piece of information on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gfyoung so this is the context where we call into the convert()
method of the GenericIndexCol
:
Line 3437 in a60888c
a.convert(values, nan_rep=self.nan_rep, encoding=self.encoding, |
There are certainly other ways to pass along the _nrows_to_read
concept but I didn't see a cleaner way yet, specifically I didn't see a way that does not add additional scaffold, passing arguments around etc.
Did you have something very specific in mind that I simply didn't see?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting...we might have to then propagate an nrows
parameter throughout the IndexCol
instantiations.
Attaching a parameter to an object we have no information about is definitely riskier and hackier (and is therefore a practice I would discourage) compared to passing arguments around.
@jreback : Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attaching a parameter to an object we have no information about is definitely riskier and hackier (and is therefore a practice I would discourage) compared to passing arguments around.
I do agree.
@gfyoung @jreback after following the quite entangled code path I think I found a cleaner solution. Key insight was the following: In # function to call on iteration
def func(_start, _stop, _where):
return s.read(start=_start, stop=_stop,
where=_where,
columns=columns)
# create the iterator
it = TableIterator(self, s, func, where=where, nrows=s.nrows,
start=start, stop=stop, iterator=iterator,
chunksize=chunksize, auto_close=auto_close)
return it.get_result() In the problematic code path def read_axes(self, where, **kwargs):
"""create and return the axes sniffed from the table: return boolean
for success
"""
... As of the callback being
|
doc/source/whatsnew/v0.25.0.rst
Outdated
@@ -656,6 +656,7 @@ I/O | |||
- Bug in :func:`read_csv` not properly interpreting the UTF8 encoded filenames on Windows on Python 3.6+ (:issue:`15086`) | |||
- Improved performance in :meth:`pandas.read_stata` and :class:`pandas.io.stata.StataReader` when converting columns that have missing values (:issue:`25772`) | |||
- Bug in :meth:`DataFrame.to_html` where header numbers would ignore display options when rounding (:issue:`17280`) | |||
- Bug in :func:`read_hdf` where the `start` and `stop` arguments would raise a ``ValueError`` indicating an index size mismatch (:issue:`11188`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not very informative of what you are actually addressing, can you re-word
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit upcoming
@@ -1627,7 +1627,7 @@ def infer(self, handler): | |||
new_self.read_metadata(handler) | |||
return new_self | |||
|
|||
def convert(self, values, nan_rep, encoding, errors): | |||
def convert(self, values, nan_rep, encoding, errors, **kwargs): | |||
""" set the values from this selection: take = take ownership """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a doc-string here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
types are a bonus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just add start=None
and stop=None
instead of kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just add start=None and stop=None instead of kwargs
done, commit upcoming
can you add a doc-string here
I would love to add one! I have difficulties understanding the general purpose of the convert()
method though and documenting values
, nan_rep
, encoding
, and errors
is pretty challenging given my lack of understanding. As far as I see the start
and stop
arguments are not really meaningful arguments here (in IndexCol.convert()
), so I am also a bit helpless with documenting them! :-) How would you document start
and stop
here?
pandas/io/pytables.py
Outdated
@@ -1816,10 +1816,14 @@ class GenericIndexCol(IndexCol): | |||
def is_indexed(self): | |||
return False | |||
|
|||
def convert(self, values, nan_rep, encoding, errors): | |||
def convert(self, values, nan_rep, encoding, errors, **kwargs): | |||
""" set the values from this selection: take = take ownership """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc-string here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 adding a docstring explaining the start
and stop` parameters, commit upcoming
@@ -2162,7 +2166,7 @@ def validate_attr(self, append): | |||
raise ValueError("appended items dtype do not match existing " | |||
"items dtype in table!") | |||
|
|||
def convert(self, values, nan_rep, encoding, errors): | |||
def convert(self, values, nan_rep, encoding, errors, **kwargs): | |||
"""set the data from this selection (and convert to the correct dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
pandas/tests/io/test_pytables.py
Outdated
] | ||
|
||
# This returns a path and does not open the file. | ||
tmpfilepath = create_tempfile(self.path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this should just be a fixture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I now turned the entire thing into a module-scoped fixture, providing the path to the HDF5 file and details about its contents.
pandas/tests/io/test_pytables.py
Outdated
|
||
def _compare(self, df, samples): | ||
"""Compare the reference `samples` with the contents in DataFrame `df`. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't write a custom comparator, use assert_frame_equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Interesting:
E AssertionError: Attributes are different
E
E Attribute "dtype" are different
E [left]: uint32
E [right]: int64
But that's probably expected, right? So, we expect the values in the UInt32Col
in the PyTables file to end up being of type int64, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks for the review, I have committed and pushed changes and would appreciate another round of feedback.
🚀
Table row number: the start of the sub-selection. | ||
stop : int, optional | ||
Table row number: the end of the sub-selection. Values larger than | ||
the underlying table's row count are normalized to that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least here I understand the meaning of start
and stop
and have tried to document it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, can you add all the parameters. See if you can put something down for them, some doc-string is better than none.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Adding statements for values
, nan_rep
, encoding
, and errors
.
The biggest challenge for me still is that I do not feel like I understand the general purpose of the convert() method and its arguments.
Will polish (squash commits, address line length issues) after another round of comments. Update: just force-pushed after rebase on current master:
|
--start and --stop upon read_hdf() do not yet work as of pandas-dev/pandas#26818
Table row number: the start of the sub-selection. | ||
stop : int, optional | ||
Table row number: the end of the sub-selection. Values larger than | ||
the underlying table's row count are normalized to that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, can you add all the parameters. See if you can put something down for them, some doc-string is better than none.
pandas/io/pytables.py
Outdated
""" | ||
|
||
start = start if start is not None else 0 | ||
stop = min(stop, self.table.nrows) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use perens rather than a line-continuation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit upcoming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/io/pytables.py
Outdated
a.convert(values, nan_rep=self.nan_rep, encoding=self.encoding, | ||
errors=self.errors) | ||
errors=self.errors, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this kwargs coming from? do you mean to pass ing start/stop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this kwargs coming from
I have tried to answer this question with the code comment added right above that line, and all I know is that kwargs may contain start
and stop
. I could not infer any guarantees.
Following the kwargs flow(s) is a challenging part of reading io/pytables.py
. It's pretty non-obvious.
There are two calls into that function that pass **kwargs
where it's non-obvious where it's coming from:
Line 3888 in d47947a
if not self.read_axes(where=where, **kwargs): |
Line 4143 in d47947a
if not self.read_axes(where=where, **kwargs): |
For following just the latter flow I am now looking for read()
being called and this is where things get tricky...
$ grep -nr 'read(' pandas/io/pytables.py | grep -v 'def ' | grep -v validate_read
721: return s.read(start=_start, stop=_stop,
837: objs = [t.read(where=_where, columns=columns, start=_start,
1421: return s.read(**kwargs)
2934: sdict[c] = s.read()
4241: s = super().read(columns=columns, **kwargs)
4342: df = super().read(**kwargs)
4740: return self.table.table.read(start=self.start, stop=self.stop)
From here it looks like there is more kwargs to follow. I know one definite call path that comes along line 837 and there we know that start
and stop
are being passed explicitly (that's the code path coming from read_hdf()
). But I am not sure about other possible code paths.
Relying on the test suite to not break, but also relying on not relying on start
and stop
being passed explicitly.
I am now doing this instead:
- errors=self.errors, **kwargs)
+ errors=self.errors, start=kwargs.get('start'),
+ stop=kwargs.get('stop'))
So that things still don't break when start
or stop
are not in kwargs
. Sound good?
pandas/tests/io/test_pytables.py
Outdated
@@ -5167,3 +5168,71 @@ def test_dst_transitions(self): | |||
store.append('df', df) | |||
result = store.select('df') | |||
assert_frame_equal(result, df) | |||
|
|||
|
|||
@pytest.fixture(scope='module') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not necessary, function scoped is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 commit upcoming
pandas/tests/io/test_pytables.py
Outdated
{'c0': t0 + 2, 'c1': 'ccccc', 'c2': 10**5}, | ||
{'c0': t0 + 3, 'c1': 'ddddd', 'c2': 4294967295}, | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use ensure_clean_path here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, commit upcoming.
Food for thought (I don't expect answers for this pull request here, it's probably a rabbit hole):
-
If a test fails, don't we want to retain the corresponding HDF5 file in the file system to enable later inspection? If the answer is "yes" then we should not use a context manager for (quite reliable) file removal but do it in the fixture teardown only upon test success.
-
What do we think about using a pytest-provided fixture for temporary file management?
@pytest.fixture()
def pytables_hdf5_file(tmp_path):
[...]
In that case the file path would contain the test name, e.g. tmp_path = PosixPath('PYTEST_TMPDIR/test_create_file0')
as documented here: https://docs.pytest.org/en/latest/tmpdir.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensure_clean_path already does this, not really necessary to do anything else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensure_clean_path already does this
The way I read the code it will always remove the file, also upon test failure.
Anyway :).
pandas/tests/io/test_pytables.py
Outdated
return tmpfilepath, objectname, pd.DataFrame(testsamples) | ||
|
||
|
||
class TestReadPyTablesHDF5(Base): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to inherit from Base?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to inherit from Base?
I do not really know. Not sure if tm.reset_testing_mode()
is important at all in my context here, # Pytables 3.0.0 deprecates lots of things
kind of sounds important though:
@classmethod
def setup_class(cls):
# Pytables 3.0.0 deprecates lots of things
tm.reset_testing_mode()
@classmethod
def teardown_class(cls):
# Pytables 3.0.0 deprecates lots of things
tm.set_testing_mode()
Given that you doubt that I need it I have added a commit and don't inherit from Base
anymore. The tests still pass on my system.
pandas/tests/io/test_pytables.py
Outdated
|
||
def test_read_complete(self, pytables_hdf5_file): | ||
path, objname, expected_df = pytables_hdf5_file | ||
assert_frame_equal(pd.read_hdf(path, key=objname), expected_df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write the asserts like
result =
expected =
assert_frame_equal(result, expected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks
@jreback I tried to address all points in your last round of feedback. I feel like we're getting closer. Would very much appreciate another look, thanks for your time 🕙 ! |
pandas/io/pytables.py
Outdated
---------- | ||
|
||
values : | ||
Expected to be passed but ignored in this implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you write that these are, e.g.
values: np.ndarray
nan_rep : str
encoding: str
errors : str (I think)
pandas/io/pytables.py
Outdated
values : | ||
Expected to be passed but ignored in this implementation. | ||
nan_rep : | ||
Expected to be passed but ignored in this implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't need these comments
pandas/tests/io/test_pytables.py
Outdated
@@ -5,6 +5,7 @@ | |||
from io import BytesIO | |||
import os | |||
import tempfile | |||
import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are you using this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used this for dynamically generating unix time stamps (time.time()
) before writing them to the HDF5 file, but made this static now.
pandas/tests/io/test_pytables.py
Outdated
{'c0': t0 + 2, 'c1': 'ccccc', 'c2': 10**5}, | ||
{'c0': t0 + 3, 'c1': 'ddddd', 'c2': 4294967295}, | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensure_clean_path already does this, not really necessary to do anything else
pandas/tests/io/test_pytables.py
Outdated
@@ -5167,3 +5168,70 @@ def test_dst_transitions(self): | |||
store.append('df', df) | |||
result = store.select('df') | |||
assert_frame_equal(result, df) | |||
|
|||
|
|||
@pytest.fixture() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need the ()
pandas/tests/io/test_pytables.py
Outdated
@@ -5167,3 +5168,70 @@ def test_dst_transitions(self): | |||
store.append('df', df) | |||
result = store.select('df') | |||
assert_frame_equal(result, df) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move these to a separate (new) file: test_pytables_compat.py (even better to move to
pandas/tests/io/pytables/test_compat.py (you will need to add an init.py in the subdir)
we are starting to split the test_pytables up a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(even better to move to pandas/tests/io/pytables/test_compat.py )
Doing so!
(you will need to add an init.py in the subdir)
Why?
It does not need to be a Python package for pytest to discover tests in the directory:
$ pytest --collect-only pandas/tests
[...]
<Module pandas/tests/io/pytables/test_compat.py>
<Class TestReadPyTablesHDF5>
<Function test_read_complete>
<Function test_read_with_start>
<Function test_read_with_stop>
<Function test_read_with_startstop>
[...]
Any other reason?
@jreback thanks for your feedback once again. I have
In particular, I have now moved the new tests to I have also not used
|
|
||
objname = 'pandas_test_timeseries' | ||
|
||
# The `tmp_path` fixture provides a temporary directory unique to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use ensure_clean_path; we don't use the pytest fixture at this time, though you can open an issue to change (many occurrences of this I suppose).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, but need to clean up the test files, ping on green.
def pytables_hdf5_file(tmp_path): | ||
"""Use PyTables to create a simple HDF5 file. | ||
|
||
There is no need for temporary file cleanup: pytest's `tmp_path` fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this comment as well. see my comment below. we don't want to introduce a different idiom for testing (but as I said you can open an issue for discussion).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can open an issue for discussion
#26984 :)
|
||
|
||
@pytest.fixture | ||
def pytables_hdf5_file(tmp_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this fixture anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this fixture anymore
Thanks man. I hope one of the linters would have told me! :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i don’t think we have that check in linting :)
|
||
class TestReadPyTablesHDF5: | ||
""" | ||
A group of tests which covers reading HDF5 files written by plain PyTables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number here in the comment
@jreback it's green now. Wanna have another look before I squash some commits? |
thanks @jgehrcke nice patch. We squash on merge anyhow, so no need! |
If interested would like to move the existing test_pytables test files (test_pytables.py and test_pytables_missing.py) to pandas/test/io/pytables (just a simple move). We want to split the main file, but that will be separate. and I think we usually add an |
Thank you for the nice collaboration! |
Sure, PR upcoming. Update: #26986 |
git diff upstream/master -u -- "*.py" | flake8 --diff