Describe the bug
Upgrading Comet to use DataFusion 48.0.0-rc2 causes tests to fail with an "attempt to subtract with overflow"
panic. This did not happen with rc1. I have not yet debugged this to find the root cause.
PR: apache/datafusion-comet#1853
failing build: https://github.com/apache/datafusion-comet/actions/runs/15491877086/job/43619110943?pr=1853
The relevant part of the stack trace is:
2025-06-06T13:57:54.1903145Z at datafusion_expr::window_state::WindowAggState::update(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/expr/src/window_state.rs:95)
2025-06-06T13:57:54.1905310Z at datafusion_physical_expr::window::window_expr::AggregateWindowExpr::aggregate_evaluate_stateful(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-expr/src/window/window_expr.rs:260)
2025-06-06T13:57:54.1920612Z at <datafusion_physical_expr::window::aggregate::PlainAggregateWindowExpr as datafusion_physical_expr::window::window_expr::WindowExpr>::evaluate_stateful(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-expr/src/window/aggregate.rs:148)
2025-06-06T13:57:54.1924024Z at datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream::compute_aggregates(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:983)
2025-06-06T13:57:54.1927398Z at datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream::poll_next_inner(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:1033)
2025-06-06T13:57:54.1930653Z at <datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream as futures_core::stream::Stream>::poll_next(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:949)
There was one PR between rc1 and rc2 specifically related to evaluating window expressions, so I wonder if that is the issue. I will try and confirm.
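For context on the panic message itself: Rust raises "attempt to subtract with overflow" when an integer subtraction underflows in a build with overflow checks enabled (the default for debug builds). The sketch below is illustrative only and is not the code at window_state.rs:95. The function names and the idea of subtracting two window-frame end offsets are assumptions made for the example.

```rust
/// Hypothetical helper (not DataFusion's API): how many rows a window frame
/// gained when its end offset moved from `prev_end` to `new_end`.
///
/// Writing `new_end - prev_end` directly on `usize` panics with
/// "attempt to subtract with overflow" when `new_end < prev_end` and
/// overflow checks are enabled. `checked_sub` reports that case as `None`.
fn newly_covered_rows(prev_end: usize, new_end: usize) -> Option<usize> {
    new_end.checked_sub(prev_end)
}

/// Hypothetical variant that clamps at zero instead of reporting the
/// underflow. This hides the condition, so it is only appropriate when
/// "no new rows" is a valid interpretation of `new_end < prev_end`.
fn newly_covered_rows_saturating(prev_end: usize, new_end: usize) -> usize {
    new_end.saturating_sub(prev_end)
}
```

Under these assumptions, `newly_covered_rows(7, 3)` returns `None` rather than panicking, which makes it possible to surface an error with context instead of aborting the task. Whether the real fix belongs in the subtraction or in whatever produces the unexpected offsets still needs to be confirmed against the rc2 changes.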
Full stack trace:
2025-06-06T13:57:54.1864287Z - aggregate window function for all types *** FAILED *** (406 milliseconds)
2025-06-06T13:57:54.1871363Z org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2045.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2045.0 (TID 5401) (62bae2d9d85a executor driver): org.apache.comet.CometNativeException: attempt to subtract with overflow
2025-06-06T13:57:54.1873529Z at comet::errors::init::{{closure}}(/__w/datafusion-comet/datafusion-comet/native/core/src/errors.rs:151)
2025-06-06T13:57:54.1883399Z at <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/alloc/src/boxed.rs:1980)
2025-06-06T13:57:54.1894489Z at std::panicking::rust_panic_with_hook(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panicking.rs:841)
2025-06-06T13:57:54.1895884Z at std::panicking::begin_panic_handler::{{closure}}(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panicking.rs:699)
2025-06-06T13:57:54.1897662Z at std::sys::backtrace::__rust_end_short_backtrace(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/sys/backtrace.rs:168)
2025-06-06T13:57:54.1899012Z at __rustc::rust_begin_unwind(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panicking.rs:697)
2025-06-06T13:57:54.1900180Z at core::panicking::panic_fmt(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/core/src/panicking.rs:75)
2025-06-06T13:57:54.1901495Z at core::panicking::panic_const::panic_const_sub_overflow(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/core/src/panicking.rs:178)
2025-06-06T13:57:54.1903145Z at datafusion_expr::window_state::WindowAggState::update(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/expr/src/window_state.rs:95)
2025-06-06T13:57:54.1905310Z at datafusion_physical_expr::window::window_expr::AggregateWindowExpr::aggregate_evaluate_stateful(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-expr/src/window/window_expr.rs:260)
2025-06-06T13:57:54.1920612Z at <datafusion_physical_expr::window::aggregate::PlainAggregateWindowExpr as datafusion_physical_expr::window::window_expr::WindowExpr>::evaluate_stateful(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-expr/src/window/aggregate.rs:148)
2025-06-06T13:57:54.1924024Z at datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream::compute_aggregates(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:983)
2025-06-06T13:57:54.1927398Z at datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream::poll_next_inner(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:1033)
2025-06-06T13:57:54.1930653Z at <datafusion_physical_plan::windows::bounded_window_agg_exec::BoundedWindowAggStream as futures_core::stream::Stream>::poll_next(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs:949)
2025-06-06T13:57:54.1933599Z at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
2025-06-06T13:57:54.1935713Z at futures_util::stream::stream::StreamExt::poll_next_unpin(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
2025-06-06T13:57:54.1938604Z at <datafusion_physical_plan::projection::ProjectionStream as futures_core::stream::Stream>::poll_next(/usr/local/cargo/git/checkouts/datafusion-11a8b534adb6bd68/85f6621/datafusion/physical-plan/src/projection.rs:354)
2025-06-06T13:57:54.1940894Z at <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:130)
2025-06-06T13:57:54.1942871Z at futures_util::stream::stream::StreamExt::poll_next_unpin(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638)
2025-06-06T13:57:54.1945055Z at <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/next.rs:32)
2025-06-06T13:57:54.1947663Z at futures_util::future::future::FutureExt::poll_unpin(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/future/future/mod.rs:558)
2025-06-06T13:57:54.1949835Z at <futures_util::async_await::poll::PollOnce<F> as core::future::future::Future>::poll(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/async_await/poll.rs:37)
2025-06-06T13:57:54.1952041Z at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}::{{closure}}(/__w/datafusion-comet/datafusion-comet/native/core/src/execution/jni_api.rs:438)
2025-06-06T13:57:54.1954070Z at tokio::runtime::park::CachedParkThread::block_on::{{closure}}(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/park.rs:284)
2025-06-06T13:57:54.1955846Z at tokio::task::coop::with_budget(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/task/coop/mod.rs:167)
2025-06-06T13:57:54.1957636Z at tokio::task::coop::budget(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/task/coop/mod.rs:133)
2025-06-06T13:57:54.1959325Z at tokio::runtime::park::CachedParkThread::block_on(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/park.rs:284)
2025-06-06T13:57:54.1961375Z at tokio::runtime::context::blocking::BlockingRegionGuard::block_on(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/context/blocking.rs:66)
2025-06-06T13:57:54.1963697Z at tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/scheduler/multi_thread/mod.rs:87)
2025-06-06T13:57:54.1965887Z at tokio::runtime::context::runtime::enter_runtime(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/context/runtime.rs:65)
2025-06-06T13:57:54.1968188Z at tokio::runtime::scheduler::multi_thread::MultiThread::block_on(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/scheduler/multi_thread/mod.rs:86)
2025-06-06T13:57:54.1970246Z at tokio::runtime::runtime::Runtime::block_on_inner(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/runtime.rs:358)
2025-06-06T13:57:54.1972087Z at tokio::runtime::runtime::Runtime::block_on(/usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.45.1/src/runtime/runtime.rs:330)
2025-06-06T13:57:54.1974189Z at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}::{{closure}}(/__w/datafusion-comet/datafusion-comet/native/core/src/execution/jni_api.rs:438)
2025-06-06T13:57:54.1975895Z at comet::execution::tracing::with_trace(/__w/datafusion-comet/datafusion-comet/native/core/src/execution/tracing.rs:117)
2025-06-06T13:57:54.1977694Z at comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}(/__w/datafusion-comet/datafusion-comet/native/core/src/execution/jni_api.rs:395)
2025-06-06T13:57:54.1979212Z at comet::errors::curry::{{closure}}(/__w/datafusion-comet/datafusion-comet/native/core/src/errors.rs:485)
2025-06-06T13:57:54.1980462Z at std::panicking::try::do_call(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panicking.rs:589)
2025-06-06T13:57:54.1981370Z at __rust_try(__internal__:0)
2025-06-06T13:57:54.1982193Z at std::panicking::try(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panicking.rs:552)
2025-06-06T13:57:54.1983410Z at std::panic::catch_unwind(/rustc/17067e9ac6d7ecb70e50f92c1944e545188d2359/library/std/src/panic.rs:359)
2025-06-06T13:57:54.1984614Z at comet::errors::try_unwrap_or_throw(/__w/datafusion-comet/datafusion-comet/native/core/src/errors.rs:499)
2025-06-06T13:57:54.1985938Z at Java_org_apache_comet_Native_executePlan(/__w/datafusion-comet/datafusion-comet/native/core/src/execution/jni_api.rs:375)
2025-06-06T13:57:54.1987315Z at <unknown>(__internal__:0)
To Reproduce
No response
Expected behavior
No response
Additional context
No response