Skip to content

feat(function): Add Spark CAST string to TIME with ANSI mode support#16123

Open
malinjawi wants to merge 9 commits intofacebookincubator:mainfrom
malinjawi:spark-cast-string-to-time-ansi
Open

feat(function): Add Spark CAST string to TIME with ANSI mode support#16123
malinjawi wants to merge 9 commits intofacebookincubator:mainfrom
malinjawi:spark-cast-string-to-time-ansi

Conversation

@malinjawi
Copy link
Contributor

Implements string-to-TIME casting following Spark semantics with ANSI mode
support. TIME values are represented as microseconds since midnight
(0-86,399,999,999).

Added castStringToTime() to CastHooks interface and implemented
Spark-specific behavior in SparkCastHooks. Supports format H:m:s[.SSS]
with whitespace trimming. ANSI mode integration via isAnsiSupported()
helper determines whether invalid input throws error (ANSI ON) or
returns NULL (ANSI OFF).

Comprehensive tests added covering valid/invalid inputs for both ANSI
modes with verified microsecond precision calculations.

Part of apache/incubator-gluten#10134

@netlify
Copy link

netlify bot commented Jan 25, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 20d7bae
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/6983d12566584800083dc326

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 25, 2026
Copy link
Collaborator

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.


::

SELECT cast('24:00:00' as time); -- NULL (hour out of range)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps combine the ANSI on/off behaviors with /.

} else {
VELOX_USER_FAIL(
makeErrorMessage(input, row, TIME()) + " " +
timeResult.error().message());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extract this lambda function for reuse?

auto setError = [&](const std::string& details) INLINE_LAMBDA {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rui-mo I've addressed both pieces of feedback:

  1. Combined ANSI on/off documentation examples using / separator for more concise format
  2. Extracted the setError lambda inside the applyToSelectedNoThrowLocal callback to avoid duplication and properly capture the row parameter

The branch has been rebased on latest main and all changes are formatted. Ready for re-review. Thanks!

@malinjawi malinjawi force-pushed the spark-cast-string-to-time-ansi branch from b5432ee to 1f0c2f7 Compare February 4, 2026 11:59
- Implements ANSI-compliant string-to-time casting
- Adds scope guard in test for proper config cleanup
- Aligns with string-to-boolean PR feedback
- Combines ANSI support checks for both boolean and time casts
@malinjawi malinjawi force-pushed the spark-cast-string-to-time-ansi branch from 1f0c2f7 to 28da2ed Compare February 4, 2026 13:05
…mbda

- Combined ANSI on/off documentation examples using '/' separator
- Extracted setError lambda inside applyToSelectedNoThrowLocal for code reuse
- Follows pattern from existing code at line 104

Addresses feedback from facebookincubator#16123
@malinjawi malinjawi requested a review from rui-mo February 4, 2026 13:51
Mohammad Linjawi and others added 7 commits February 4, 2026 16:55
Commit 28da2ed introduced hooks architecture for VARCHAR to TIME casting
but incorrectly made PrestoCastHooks return 'not supported', breaking the
existing Presto VARCHAR to TIME functionality that was added in 2191ecb.

This commit restores the original behavior by implementing castStringToTime()
to call TIME()->valueToTime(), which is what Presto expects based on the
605 lines of tests in velox/expression/tests/CastExprTest.cpp.

The Spark-specific ANSI mode behavior is handled separately in SparkCastHooks.
The original Presto implementation of VARCHAR to TIME casting (commit 2191ecb)
required timezone and session start time for proper DST handling. When the hooks
architecture was introduced, these parameters were missing from the hook signature,
causing Presto tests to fail with incorrect time values.

This commit:
- Updates CastHooks interface to include timeZone and sessionStartTimeMs parameters
- Updates CastExpr.cpp to pass these parameters from the query context
- Updates PrestoCastHooks to use TIME()->valueToTime() with all 3 parameters
- Updates SparkCastHooks to accept but ignore these parameters (Spark doesn't use timezone for TIME)

This fixes the Presto varcharToTimeCast and varcharToTimeDSTGapHandling test failures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants