
[FEA] Align the exception type with Spark for JSON read #12943

Open
@firestarman

Description

Spark 4.0.0 raises a pyspark.errors.exceptions.captured.SparkUpgradeException when reading an invalid date string, and DateTimeException no longer appears in the error message. This caused some test failures (a quick fix is at #12945), e.g.

```
FAILED ../../../../integration_tests/src/main/python/json_test.py::test_json_read_invalid_dates[EXCEPTION-yyyy-MM-dd-false-read_json_sql-schema0-dates_invalid.json][DATAGEN_SEED=1750041138, TZ=UTC, INJECT_OOM, APPROXIMATE_FLOAT] - AssertionError: Expected error 'DateTimeException' did not appear in 'pyspark.errors.exceptions.captured.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
Fail to parse '2020-09-32' in the new parser.
You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string. SQLSTATE: 42K0B'
```
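
For reference, a minimal repro sketch of what Spark 4.0 now raises. This is not taken from the issue: the session setup and the input file name/content are assumptions, with `dates_invalid.json` holding a row like `{"a": "2020-09-32"}`.

```python
# Repro sketch (assumptions: a local Spark 4.0 session, and a hypothetical
# file dates_invalid.json containing {"a": "2020-09-32"}).
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DateType
from pyspark.errors.exceptions.captured import SparkUpgradeException

spark = SparkSession.builder.getOrCreate()
schema = StructType([StructField("a", DateType())])

try:
    (spark.read.schema(schema)
        .option("dateFormat", "yyyy-MM-dd")
        .option("mode", "FAILFAST")
        .json("dates_invalid.json")
        .collect())
except SparkUpgradeException as e:
    # On Spark 4.0 the captured type is SparkUpgradeException, and the
    # message no longer mentions DateTimeException.
    print(type(e).__name__)
```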

It would be better if the GPU could throw the same exception type as Spark.
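
Until then, the test side can gate the expected error class on the Spark version, which is presumably what the quick fix does. A hedged sketch only: `assert_gpu_and_cpu_error`, `is_before_spark_400`, and the `std_input_path` fixture are assumed to match the integration-test framework's helpers and may differ in detail.

```python
# Version-gated expectation sketch; helper names are assumptions modeled
# on the integration-test framework (asserts.py / spark_session.py).
from pyspark.sql.types import StructType, StructField, DateType

from asserts import assert_gpu_and_cpu_error
from spark_session import is_before_spark_400

schema = StructType([StructField("a", DateType())])

def test_json_read_invalid_dates(std_input_path):
    # Spark 4.0 surfaces SparkUpgradeException at the top level; earlier
    # versions still mention DateTimeException in the error message.
    expected = 'DateTimeException' if is_before_spark_400() \
        else 'SparkUpgradeException'
    assert_gpu_and_cpu_error(
        lambda spark: spark.read.schema(schema)
            .json(std_input_path + '/dates_invalid.json').collect(),
        conf={'spark.sql.legacy.timeParserPolicy': 'EXCEPTION'},
        error_message=expected)
```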

Labels: Spark 4.0+, task
