Description
Spark 4.0.0 raises a pyspark.errors.exceptions.captured.SparkUpgradeException
when reading an invalid date string, and DateTimeException
no longer appears in the error message. This caused some test failures (a quick fix is at #12945), e.g.
```
FAILED ../../../../integration_tests/src/main/python/json_test.py::test_json_read_invalid_dates[EXCEPTION-yyyy-MM-dd-false-read_json_sql-schema0-dates_invalid.json][DATAGEN_SEED=1750041138, TZ=UTC, INJECT_OOM, APPROXIMATE_FLOAT] - AssertionError: Expected error 'DateTimeException' did not appear in 'pyspark.errors.exceptions.captured.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
Fail to parse '2020-09-32' in the new parser.
You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string. SQLSTATE: 42K0B'
```
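For reference, a minimal CPU-only reproduction sketch. It assumes a local Spark 4.0 session and a file `dates_invalid.json` containing a record such as `{"d": "2020-09-32"}`; the file name and single-column schema here are illustrative, not taken from the actual test.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DateType

spark = SparkSession.builder.master("local[1]").getOrCreate()
# EXCEPTION matches the failing test parameter; set it explicitly for clarity.
spark.conf.set("spark.sql.legacy.timeParserPolicy", "EXCEPTION")

schema = StructType([StructField("d", DateType())])
# Hypothetical input file containing an out-of-range date like "2020-09-32".
# On Spark 4.0 this is expected to raise SparkUpgradeException, whose message
# no longer mentions DateTimeException.
spark.read.schema(schema).json("dates_invalid.json").collect()
```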
It would be better if the GPU could throw the same exception type as Spark.
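Once the types match, the test expectation could be stated against the Spark exception directly. A hedged pytest sketch, illustrative only and not the actual change in #12945:

```python
import pytest
from pyspark.errors import SparkUpgradeException

# Expect the same exception type and error class on both CPU and GPU.
with pytest.raises(SparkUpgradeException, match="INCONSISTENT_BEHAVIOR_CROSS_VERSION"):
    spark.read.schema(schema).json("dates_invalid.json").collect()
```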