BUG: Fix incorrect FutureWarning for logical ops on pyarrow bool Series (#62260) #62290
…e for logical operations
the docstring check required me to return a true or false bool only
…or truthy values like 1
Kindly review my PR for improving the logical operations on arrays and aligning them with Kleene's principles. Please tell me if there are any issues. Finally heading out to touch some grass :)
@simonjayhawkins Kindly review my PR and let me know 👍 (whether it is in the testing phase or not)
This Wikipedia article may be useful to understand the changes in this PR: https://en.wikipedia.org/wiki/Three-valued_logic#Kleene_and_Priest_logics It's also important to point out that this is a breaking change and should have an entry in
Also,
import pandas as pd

index = list("FUT")
a = pd.Series([False, None, True], index=index, dtype="bool[pyarrow]")
t = pd.Series([True] * 3, index=index, dtype="bool[pyarrow]")
u = pd.Series([None] * 3, index=index, dtype="bool[pyarrow]")
f = pd.Series([False] * 3, index=index, dtype="bool[pyarrow]")

print("negation")
print(~a)

methods = ["__and__", "__or__", "__xor__"]
for method in methods:
    print(method)
    fn = getattr(a, method)
    observed = pd.DataFrame(dict(F=fn(f), U=fn(u), T=fn(t)), index=index)
    print(observed)
Output
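For reference, the truth tables that Kleene's rules prescribe for the same F/U/T layout can be generated in plain Python. This is only a sketch of the logic described in the linked article, with None standing in for the unknown value; the function names are illustrative, and the tables are not the captured output of the snippet above.

```python
from typing import Optional

Tri = Optional[bool]  # None stands in for the unknown value (NA)


def kleene_not(x: Tri) -> Tri:
    # Negating an unknown value stays unknown.
    return None if x is None else not x


def kleene_and(x: Tri, y: Tri) -> Tri:
    # A known False decides the result regardless of the other operand.
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def kleene_or(x: Tri, y: Tri) -> Tri:
    # A known True decides the result regardless of the other operand.
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def kleene_xor(x: Tri, y: Tri) -> Tri:
    # XOR can never be decided by one operand alone, so unknown propagates.
    if x is None or y is None:
        return None
    return x != y


LABELS = {False: "F", None: "U", True: "T"}
VALUES = [False, None, True]


def truth_table(op):
    # Rows are the left operand, columns the right operand, both in F, U, T order.
    return [[LABELS[op(row, col)] for col in VALUES] for row in VALUES]


for name, op in [("and", kleene_and), ("or", kleene_or), ("xor", kleene_xor)]:
    print(name)
    print("  F U T")
    for row, cells in zip(VALUES, truth_table(op)):
        print(LABELS[row], " ".join(cells))
```

Comparing these tables against the DataFrames printed by the pandas snippet is one way to check whether a given dtype follows Kleene logic.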
@Alvaro-Kothe Yes, I agree: the pyarrow implementation follows the Kleene principle, whereas bool does not. Thank you for attaching the wiki article 👍 The core members of the pandas library are the ones who fill in the whatsnew, right?
@Tarun2605 Usually, whoever creates the pull request should fill the
@Alvaro-Kothe Ohhhh!! Could you please tell me where I fill it out? Thank you so much btw man
Utilize your own discretion or await guidance from a core member.
Okkk
This pull request introduces support for Kleene's three-valued logic (handling True, False, and NA) in pandas logical operations on arrays containing missing values. The main changes include new helper functions to safely evaluate boolean logic with missing values, modifications to the logical operation implementation, and updates to tests to reflect the new behavior.
Enhancements to logical operations with missing values:
- Added is_nullable_bool, safe_is_true, and alignOutputWithKleene helper functions in pandas/core/ops/array_ops.py to enable elementwise logical operations that correctly handle NA values using Kleene logic.
- Modified logical_op in pandas/core/ops/array_ops.py to use Kleene logic when both operands are boolean arrays (possibly with NA), ensuring correct propagation of unknowns.
Test updates for new logic:
- Updated TestDataFrameLogicalOperators in pandas/tests/frame/test_logical_ops.py to match the new Kleene logic semantics, where logical operations with NA now return NA or propagate True/False according to Kleene's rules.
- Updated the test_logical_with_nas test to expect results consistent with Kleene logic, ensuring that logical operations involving NA and True/False yield the correct outcomes.
- [x] closes BUG: Incorrect Future warning using a logical operation between two pyarrow boolean series #62260
- pre-commit run --all-files
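To illustrate what "correct propagation of unknowns" means for whole arrays, here is a minimal NumPy sketch that stores each operand as a values array plus a mask marking the NA positions. The names masked_and, masked_or, and to_objects are hypothetical and are not the helpers added in this PR; the sketch only assumes the Kleene rules summarized above.

```python
import numpy as np


def masked_and(left, left_mask, right, right_mask):
    # A position is known False when either side is False and not masked.
    known_false = (~left & ~left_mask) | (~right & ~right_mask)
    # The result is NA only where an operand is NA and no known False decides it.
    mask = (left_mask | right_mask) & ~known_false
    # Values under the result mask are zeroed so they carry no stale data.
    values = left & right & ~mask
    return values, mask


def masked_or(left, left_mask, right, right_mask):
    # A position is known True when either side is True and not masked.
    known_true = (left & ~left_mask) | (right & ~right_mask)
    # The result is NA only where an operand is NA and no known True decides it.
    mask = (left_mask | right_mask) & ~known_true
    return known_true, mask


def to_objects(values, mask):
    # Convert a (values, mask) pair to a Python list, using None for NA.
    return [None if m else bool(v) for v, m in zip(values, mask)]


# a = [False, NA, True]; the value stored under the mask is arbitrary.
a_values = np.array([False, True, True])
a_mask = np.array([False, True, False])

# u = [NA, NA, NA]
u_values = np.zeros(3, dtype=bool)
u_mask = np.ones(3, dtype=bool)

print(to_objects(*masked_and(a_values, a_mask, u_values, u_mask)))
print(to_objects(*masked_or(a_values, a_mask, u_values, u_mask)))
```

Keeping the mask separate from the values means the stored value at an NA position never leaks into the result, which is why both functions check the mask before treating an entry as a known True or False.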
What does this PR change?
This pull request updates the Arrow extension array implementation in pandas to improve how missing values are handled during logical operations on bool.
Previously, operations like & or | between two bool Series did not follow Kleene's logic.
How was this fixed?