
Contributor

@ayushjariyal ayushjariyal commented Aug 30, 2025

Fixes #1191

In this PR, I added detailed docstrings to pyiceberg/table/inspect.py. I ran make lint to verify the change, and all checks passed.

Screenshot from 2025-08-30 13-04-04

@ayushjariyal
Contributor Author

@Fokko, can you please review this PR and let me know if any additions or modifications are needed?

Contributor

@gabeiglio gabeiglio left a comment

Thanks for the PR! I left some nits for typos. One thing I'm on the fence about is including the tables' schemas in the docstrings (especially the big ones), since most of them, like data_file, are already well-documented objects.

@@ -34,6 +34,12 @@


class InspectTable:
"""A utility class for inspecting and analysing Iceberge table metadata.
Contributor

Suggested change
"""A utility class for inspecting and analysing Iceberge table metadata.
"""A utility class for inspecting and analyzing Iceberg table metadata.

snapshot_id (Optional[int]): ID of the snapshot to read entries from. If None, the current snapshot is used.

Returns:
pa.Table: A PyArraow table where each row represent a manifest entry with fields:
Contributor

Suggested change
pa.Table: A PyArraow table where each row represent a manifest entry with fields:
pa.Table: A PyArrow table where each row represent a manifest entry with fields:

"""Generate a PyArrow table containing metadata references from a table.

Returns:
pa.Table: A PyArraow table with the following schema:
Contributor

Suggested change
pa.Table: A PyArraow table with the following schema:
pa.Table: A PyArrow table with the following schema:

"""Process a manifest file and extract partition-level statistics.

Args:
manifest: The manifest file containing metadata about data files and delete files.
Contributor

Suggested change
manifest: The manifest file containing metadata about data files and delete files.
manifest: The manifest file containing metadata about data files and delete files.

@ayushjariyal
Contributor Author

Thanks for the PR! I left some nits for typos. One thing I'm on the fence about is including the tables' schemas in the docstrings (especially the big ones), since most of them, like data_file, are already well-documented objects.

Thanks for the review! I agree that the table's schemas are already well-documented objects. I added them in the docstrings just as part of documentation. If you think they are unnecessary, I can remove them.

@ayushjariyal ayushjariyal requested a review from gabeiglio August 31, 2025 06:44
@gabeiglio
Contributor

Thanks for applying the changes! CI has a warning for this:
WARNING - griffe: pyiceberg/table/inspect.py:1013: Confusing indentation for continuation line 7 in docstring, should be 4 * 2 = 8 spaces, not 7

Could you run make lint to see if it fixes it automatically?

@@ -612,6 +858,37 @@ def _get_files_from_manifest(
)

def _get_files_schema(self) -> "pa.Schema":
"""Build the PyArrow schema for the files metadata table.
Contributor

I think it would be good, then, to remove the schemas from the docstrings, since they are easy to infer from the code. But I'll tag @sungwy for another opinion on this.

@ayushjariyal
Contributor Author

Thanks for applying the changes! CI has a warning for this: WARNING - griffe: pyiceberg/table/inspect.py:1013: Confusing indentation for continuation line 7 in docstring, should be 4 * 2 = 8 spaces, not 7

Could you run make lint to see if it fixes it automatically?

make lint didn’t fix it automatically, so I corrected the indentation manually.
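
For readers who hit the same griffe warning, here is a minimal sketch of the kind of indentation it complains about, assuming Google-style docstrings with 4-space indentation. The function `entries` below is a hypothetical stand-in, not the code from this PR, and `confusing_continuations` is an illustrative checker written for this example; it is not griffe's implementation and only approximates the rule stated in the warning (item lines at 4 spaces, continuation lines at 4 * 2 = 8 spaces).

```python
import inspect


def entries(snapshot_id=None):
    """Hypothetical stand-in for a documented method.

    Args:
        snapshot_id: ID of the snapshot to read entries from.
            If None, the current snapshot is used.

    Returns:
        A placeholder value; the real method returns a PyArrow table.
    """
    return snapshot_id


def confusing_continuations(doc, indent=4):
    """Return (line_number, spaces) pairs for ambiguously indented lines.

    A line is flagged when its indentation falls strictly between one
    level (the start of an item) and two levels (a continuation line).
    """
    flagged = []
    for number, line in enumerate(inspect.cleandoc(doc).splitlines(), start=1):
        spaces = len(line) - len(line.lstrip(" "))
        if line.strip() and indent < spaces < 2 * indent:
            flagged.append((number, spaces))
    return flagged


# inspect.getdoc removes the common leading indentation, so the
# continuation line sits at 8 spaces relative to "Args:".
good = inspect.getdoc(entries)

# Simulate the mistake from the warning: 7 spaces instead of 8.
bad = good.replace(" " * 8 + "If None", " " * 7 + "If None")

print(confusing_continuations(good))
print(confusing_continuations(bad))
```

In this sketch the well-indented docstring produces no findings, while the 7-space continuation line is reported, which mirrors the "should be 4 * 2 = 8 spaces, not 7" message. The line numbers reported here are relative to the cleaned docstring and may not match how griffe counts lines.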

Development

Successfully merging this pull request may close these issues.

Add Docstrings to pyiceberg/table/inspect.py