[SPARK-50762][SQL][TESTS] Add more scalar SQL UDF SQL query tests #50898

allisonwang-db · 2025-05-15T00:57:14Z

What changes were proposed in this pull request?

This PR adds more SQL query tests for scalar SQL UDFs.

Why are the changes needed?

To make sure SQL UDFs work with different operators and to prevent regressions.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Test only

Was this patch authored or co-authored using generative AI tooling?

No

allisonwang-db · 2025-05-15T01:01:09Z

sql/core/src/test/resources/sql-tests/results/sql-udf.sql.out

+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.catalyst.analysis.FunctionAlreadyExistsException


This needs investigation. I created a follow-up ticket: SPARK-52148

allisonwang-db · 2025-05-15T01:02:31Z

cc @cloud-fan

cloud-fan · 2025-05-15T08:01:26Z

@allisonwang-db Can you rebase your branch and regenerate the golden files? The test fails:

[info] - sql-udf.sql *** FAILED *** (6 seconds, 861 milliseconds)
[info]   sql-udf.sql
[info]   Expected "...s" : "\"sum(c2) AS `[outer(sum(]c2))`\""
[info]     },
[info]     "que...", but got "...s" : "\"sum(c2) AS `[sum(outer(spark_catalog.default.t1.]c2))`\""
[info]     },
[info]     "que..." Result did not match for query #159
[info]   SELECT c1, SUM(c2) + foo3_1a(MIN(c2), MAX(c2)) + (SELECT SUM(c2)) FROM t1 GROUP BY c1 (SQLQueryTestSuite.scala:683)
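For readers unfamiliar with this failure format: the square brackets in the `Expected ... but got ...` message mark the region where the two strings differ, and the text outside the brackets is shared. The following is a minimal Python sketch of that notation, assuming a plain common-prefix/common-suffix comparison. The function name `bracket_diff` is hypothetical, and this is not the test framework's actual implementation (which also shortens long shared prefixes to `...`).

```python
def bracket_diff(expected, actual):
    """Wrap the differing middle of two strings in [ ] (illustrative only)."""
    limit = min(len(expected), len(actual))

    # Length of the longest common prefix.
    p = 0
    while p < limit and expected[p] == actual[p]:
        p += 1

    # Length of the longest common suffix that does not overlap the prefix.
    s = 0
    while s < limit - p and expected[-1 - s] == actual[-1 - s]:
        s += 1

    def mark(text):
        end = len(text) - s
        return f"{text[:p]}[{text[p:end]}]{text[end:]}"

    return mark(expected), mark(actual)


# Alias strings taken from the failure message above, with the brackets removed.
old_alias = "sum(c2) AS `outer(sum(c2))`"
new_alias = "sum(c2) AS `sum(outer(spark_catalog.default.t1.c2))`"

exp, act = bracket_diff(old_alias, new_alias)
print(exp)
print(act)
```

Read this way, the golden file still records the alias as `outer(sum(c2))`, while the rebased code produces `sum(outer(spark_catalog.default.t1.c2))`, which is why the golden files need to be regenerated.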

xinrong-meng · 2025-05-15T21:54:16Z

The failed test org.apache.spark.sql.kafka010.KafkaContinuousSourceTopicDeletionSuite seems unrelated to this change. Would you please retrigger it?

cloud-fan · 2025-05-16T09:17:04Z

thanks, merging to master/4.0!

### What changes were proposed in this pull request?

This PR adds more SQL query tests for scalar SQL UDFs.

### Why are the changes needed?

To make sure SQL UDF works with different operators and prevent regressions.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Test only

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #50898 from allisonwang-db/spark-50762-tests.

Authored-by: Allison Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 458cf70)
Signed-off-by: Wenchen Fan <[email protected]>

dongjoon-hyun · 2025-05-16T23:33:03Z

Hi, @allisonwang-db and @cloud-fan.

This seems to break branch-4.0.

https://github.com/apache/spark/actions/runs/15070364465/job/42364916342

- sql-udf.sql_analyzer_test *** FAILED ***
  sql-udf.sql_analyzer_test
  Expected "...g.default.foo3_1b(c2[))#xL > cast(0 as bigint))
     +- Project [c1#x, count(1)#xL, spark_catalog.default.foo3_1b(x#x) AS spark_catalog.default.foo3_1b(sum(c2))#x, sum(spark_catalog.default.foo3_1b(c2))#xL]
        +- Project [c1#x, count(1)#xL, sum(c2)#xL, sum(spark_catalog.default.foo3_1b(c2))#xL, cast(sum(c2)#xL as int) AS x#x]
           +- Aggregate [c1#x], [c1#x, count(1) AS count(1)#xL, sum(c2#x) AS sum(c2)#xL, sum(spark_catalog.default.foo3_1b(x#x)) AS sum(spark_catalog.default.foo3_1b(c2]))#xL]
              +...", but got "...g.default.foo3_1b(c2[#x))#xL > cast(0 as bigint))
     +- Project [c1#x, count(1)#xL, spark_catalog.default.foo3_1b(x#x) AS spark_catalog.default.foo3_1b(sum(c2))#x, sum(spark_catalog.default.foo3_1b(c2#x))#xL]
        +- Project [c1#x, count(1)#xL, sum(c2)#xL, sum(spark_catalog.default.foo3_1b(c2#x))#xL, cast(sum(c2)#xL as int) AS x#x]
           +- Aggregate [c1#x], [c1#x, count(1) AS count(1)#xL, sum(c2#x) AS sum(c2)#xL, sum(spark_catalog.default.foo3_1b(x#x)) AS sum(spark_catalog.default.foo3_1b(c2#x]))#xL]
              +..." Result did not match for query #152
  SELECT c1, COUNT(*), foo3_1b(SUM(c2)) FROM t1 GROUP BY c1 HAVING SUM(foo3_1b(c2)) > 0 (SQLQueryTestSuite.scala:683)
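A note on reading the plan above: the `#x` suffixes are placeholders. As far as I understand the SQL query test harness, it rewrites numeric expression IDs (for example `c1#12`) to `#x` before comparing against the golden file, so that runs with different IDs still match. The Python sketch below illustrates that idea under this assumption. The function name `normalize_expr_ids` and the sample IDs are hypothetical, and the real harness is written in Scala and normalizes more than this.

```python
import re


def normalize_expr_ids(plan_line):
    """Replace numeric expression IDs such as #12 with #x (illustrative only)."""
    return re.sub(r"#\d+", "#x", plan_line)


# Hypothetical raw plan lines with made-up expression IDs.
raw_without_id = "sum(spark_catalog.default.foo3_1b(c2))#34L"
raw_with_id = "sum(spark_catalog.default.foo3_1b(c2#12))#34L"

print(normalize_expr_ids(raw_without_id))
print(normalize_expr_ids(raw_with_id))
```

Normalization hides which ID was assigned, but not whether an ID appears at all. The expected output names the column `foo3_1b(c2)` while the actual output names it `foo3_1b(c2#x)`, so the two strings still differ and the golden file for branch-4.0 has to be regenerated.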

dongjoon-hyun · 2025-05-16T23:43:38Z

I created a follow-up for branch-4.0.

[SPARK-50762][SQL][TEST][FOLLOWUP][4.0] Regenerate sql-udf.sql.out #50928

### What changes were proposed in this pull request?

This is a follow-up of #50898 for branch-4.0.
- #50898

### Why are the changes needed?

#50898 broke `branch-4.0` CIs.
- https://github.com/apache/spark/actions/runs/15070364465/job/42364916342
- https://github.com/apache/spark/actions/runs/15070303045/job/42364700177
- https://github.com/apache/spark/actions/runs/15070364465/job/42364916342

```
- sql-udf.sql_analyzer_test *** FAILED ***
  sql-udf.sql_analyzer_test
  Expected "...g.default.foo3_1b(c2[))#xL > cast(0 as bigint))
     +- Project [c1#x, count(1)#xL, spark_catalog.default.foo3_1b(x#x) AS spark_catalog.default.foo3_1b(sum(c2))#x, sum(spark_catalog.default.foo3_1b(c2))#xL]
        +- Project [c1#x, count(1)#xL, sum(c2)#xL, sum(spark_catalog.default.foo3_1b(c2))#xL, cast(sum(c2)#xL as int) AS x#x]
           +- Aggregate [c1#x], [c1#x, count(1) AS count(1)#xL, sum(c2#x) AS sum(c2)#xL, sum(spark_catalog.default.foo3_1b(x#x)) AS sum(spark_catalog.default.foo3_1b(c2]))#xL]
              +...", but got "...g.default.foo3_1b(c2[#x))#xL > cast(0 as bigint))
     +- Project [c1#x, count(1)#xL, spark_catalog.default.foo3_1b(x#x) AS spark_catalog.default.foo3_1b(sum(c2))#x, sum(spark_catalog.default.foo3_1b(c2#x))#xL]
        +- Project [c1#x, count(1)#xL, sum(c2)#xL, sum(spark_catalog.default.foo3_1b(c2#x))#xL, cast(sum(c2)#xL as int) AS x#x]
           +- Aggregate [c1#x], [c1#x, count(1) AS count(1)#xL, sum(c2#x) AS sum(c2)#xL, sum(spark_catalog.default.foo3_1b(x#x)) AS sum(spark_catalog.default.foo3_1b(c2#x]))#xL]
              +..." Result did not match for query #152
  SELECT c1, COUNT(*), foo3_1b(SUM(c2)) FROM t1 GROUP BY c1 HAVING SUM(foo3_1b(c2)) > 0 (SQLQueryTestSuite.scala:683)
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #50928 from dongjoon-hyun/SPARK-50762.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

github-actions bot added the SQL label May 15, 2025

allisonwang-db added 2 commits May 15, 2025 11:05

tests

9dd5040

update

082dfb1

allisonwang-db force-pushed the spark-50762-tests branch from 252216c to 082dfb1 Compare May 15, 2025 18:10

cloud-fan closed this in 458cf70 May 16, 2025

dongjoon-hyun mentioned this pull request May 16, 2025

[SPARK-50762][SQL][TEST][FOLLOWUP][4.0] Regenerate sql-udf.sql.out #50928

Closed
