Open
Labels
bug: Incorrect behavior inside of ibis
Description
What happened?
If the source table schema has a nullable struct column whose child fields are non-nullable, calling to_pyarrow() fails with:
ArrowInvalid: field 'X' of type T has nulls. Can't cast to non-nullable field 'X' of type T
I think the failure occurs on rows where the parent struct is NULL: the downstream conversion materializes the child fields of those rows as NULLs, which conflicts with the non-nullable child schema.
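To make the suspected mechanism concrete, here is a minimal pure-Python model of it. It does not use ibis or pyarrow and is not their implementation; the helper names (`flatten_child`, `naive_has_nulls`, `parent_aware_has_nulls`) are hypothetical and exist only for this illustration. It assumes, as guessed above, that a NULL parent struct shows up as a NULL in each flattened child column.

```python
from typing import Optional

# Rows of a nullable struct column, mirroring the data in this report.
rows: list[Optional[dict]] = [
    {"street": "Main St", "number": 10},
    None,  # the parent struct itself is NULL
    {"street": "High St", "number": 42},
]


def flatten_child(rows, name):
    """Materialize one child field as its own column.

    Assumption: a NULL parent yields a NULL slot in the child column.
    """
    return [None if row is None else row[name] for row in rows]


def naive_has_nulls(rows, name):
    """Check the child column in isolation, ignoring the parent's validity."""
    return any(value is None for value in flatten_child(rows, name))


def parent_aware_has_nulls(rows, name):
    """Count a child NULL only when the parent struct is present."""
    return any(row is not None and row[name] is None for row in rows)


street_naive = naive_has_nulls(rows, "street")
street_aware = parent_aware_has_nulls(rows, "street")
print(street_naive, street_aware)  # True False
```

In this model the isolated check reports a NULL in `street` even though every present struct has a street, which is the kind of conflict the error message describes. A check that accounts for the parent's validity does not flag the column.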
Code to reproduce the error
import ibis
from ibis.expr import datatypes as dt
# Define the struct type:
# - The struct itself is nullable
# - Its children are required (non-nullable) string and int64
address_type = dt.Struct({
    "street": "!string",  # required child
    "number": "!int",  # required child
})
# Build the table schema with the nullable struct column
schema = ibis.schema({
    "id": "int",
    "address": address_type,
})
ibis.set_backend("polars")
data = [
    {"id": 1, "address": {"street": "Main St", "number": 10}},
    {"id": 2, "address": None},  # NULL parent struct
    {"id": 3, "address": {"street": "High St", "number": 42}},
]
# Create a memtable with the explicit schema
table = ibis.memtable(data, schema=schema)
# Show the table schema (for verification)
print(table.schema())
# This fails with ArrowInvalid: field 'street' has nulls and can't be cast to a non-nullable string field
print(table.select("address").to_pyarrow())
What version of ibis are you using?
10.3.0
What backend(s) are you using, if any?
BigQuery
Relevant log output
ibis.Schema {
id int64
address struct<street: !string, number: !int64>
}
Traceback (most recent call last):
File "/opt/projects/scratch/ibisbug.py", line 39, in <module>
print(table.select("address").to_pyarrow())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/expr/types/core.py", line 579, in to_pyarrow
return self._find_backend(use_default=True).to_pyarrow(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/backends/polars/__init__.py", line 547, in to_pyarrow
result = self._to_pyarrow_table(expr, params=params, limit=limit, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/backends/polars/__init__.py", line 536, in _to_pyarrow_table
return PyArrowData.convert_table(df.to_arrow(), expr.as_table().schema())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/formats/pyarrow.py", line 332, in convert_table
arrays = [
^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/formats/pyarrow.py", line 333, in <listcomp>
cls.convert_column(table[name], dtype) for name, dtype in schema.items()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/ibis/formats/pyarrow.py", line 323, in convert_column
return column.cast(desired_type)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "pyarrow/table.pxi", line 597, in pyarrow.lib.ChunkedArray.cast
File "/opt/projects/scratch/venv/lib/python3.11/site-packages/pyarrow/compute.py", line 412, in cast
return call_function("cast", [arr], options, memory_pool)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pyarrow/_compute.pyx", line 604, in pyarrow._compute.call_function
File "pyarrow/_compute.pyx", line 399, in pyarrow._compute.Function.call
File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: field 'street' of type large_string has nulls. Can't cast to non-nullable field 'street' of type string
Status
backlog