-
Notifications
You must be signed in to change notification settings - Fork 356
Add Detailed Docstrings in pyiceberg/table/inspect.py
file.
#2406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@Fokko, Can you please review this PR and let me know if any additions or modifications are needed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I left some nits for typos. One thing Im on the fence about, are the table's schemas (especially when they are big schemas) since most of them like data_file
are well documented objects.
pyiceberg/table/inspect.py
Outdated
@@ -34,6 +34,12 @@ | |||
|
|||
|
|||
class InspectTable: | |||
"""A utility class for inspecting and analysing Iceberge table metadata. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""A utility class for inspecting and analysing Iceberge table metadata. | |
"""A utility class for inspecting and analyzing Iceberg table metadata. |
pyiceberg/table/inspect.py
Outdated
snapshot_id (Optional[int]): ID of the snapshot to read entries from. If None, the current snapshot is used. | ||
|
||
Returns: | ||
pa.Table: A PyArraow table where each row represent a manifest entry with fields: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pa.Table: A PyArraow table where each row represent a manifest entry with fields: | |
pa.Table: A PyArrow table where each row represent a manifest entry with fields: |
pyiceberg/table/inspect.py
Outdated
"""Generate a PyArrow table containing metadata references from a table. | ||
|
||
Returns: | ||
pa.Table: A PyArraow table with the following schema: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pa.Table: A PyArraow table with the following schema: | |
pa.Table: A PyArrow table with the following schema: |
pyiceberg/table/inspect.py
Outdated
"""Process a manifest file and extract partition-level statistics. | ||
|
||
Args: | ||
manifest: The manifest file containing metadata about data files and delete files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manifest: The manifest file containing metadata about data files and delete files. | |
manifest: The manifest file containing metadata about data files and delete files. |
Thanks for the review! I agree that the table's schemas are already well-documented objects. I added them in the docstrings just as part of documentation. If you think they are unnecessary, I can remove them. |
Thanks for applying the changes! CI have a warning for this: could you run |
@@ -612,6 +858,37 @@ def _get_files_from_manifest( | |||
) | |||
|
|||
def _get_files_schema(self) -> "pa.Schema": | |||
"""Build the PyArrow schema for the files metadata table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good then to remove the schemas from the docstring since it is easy to infer from the code. But I'll tag @sungwy for another opinion on this.
|
Fixes #1191
In this PR, I added detailed docstrings to the
pyiceberg/table/inspect
file. I verified that everything works correctly by runningmake lint
, and all checks passed successfully.