[SPARK-52597][SS][TESTS] Fix the execution failure of `StateStoreBasicOperationsBenchmark` #51304

LuciferYang · 2025-06-27T15:14:22Z

What changes were proposed in this pull request?

This PR makes the following changes to fix StateStoreBasicOperationsBenchmark:

- Following the suggestion from @zecookiez, set "spark.sql.streaming.stateStore.coordinatorReportSnapshotUploadLag" to false when initializing StateStoreConf.
- When initializing the RocksDBStateStoreProvider, populate StreamExecution.RUN_ID_KEY in the Hadoop Configuration that is passed in.

Why are the changes needed?

Fix the execution failure of StateStoreBasicOperationsBenchmark:

build/sbt "sql/Test/runMain org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark"
[error] Exception in thread "main" java.lang.AssertionError: assertion failed
[error] 	at scala.Predef$.assert(Predef.scala:264)
[error] 	at org.apache.spark.sql.execution.streaming.state.StateStoreProvider$.getRunId(StateStore.scala:673)
[error] 	at org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider.init(RocksDBStateStoreProvider.scala:394)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.newRocksDBStateProvider(StateStoreBasicOperationsBenchmark.scala:484)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.$anonfun$runPutBenchmark$3(StateStoreBasicOperationsBenchmark.scala:92)
[error] 	at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.scala:18)
[error] 	at scala.collection.immutable.List.foreach(List.scala:334)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.$anonfun$runPutBenchmark$2(StateStoreBasicOperationsBenchmark.scala:87)
[error] 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[error] 	at org.apache.spark.benchmark.BenchmarkBase.runBenchmark(BenchmarkBase.scala:42)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.runPutBenchmark(StateStoreBasicOperationsBenchmark.scala:83)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.runBenchmarkSuite(StateStoreBasicOperationsBenchmark.scala:55)
[error] 	at org.apache.spark.benchmark.BenchmarkBase.main(BenchmarkBase.scala:72)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark.main(StateStoreBasicOperationsBenchmark.scala)
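
The failure mode in the trace above can be illustrated with a minimal sketch. It is not Spark's code: a plain mutable map stands in for the Hadoop Configuration, `RunIdKey` is a hypothetical placeholder for whatever `StreamExecution.RUN_ID_KEY` holds, and `getRunId` here only assumes that the real method asserts the key is present, as the stack trace suggests.

```scala
import java.util.UUID
import scala.collection.mutable

object RunIdSketch {
  // Hypothetical placeholder key; the real name is whatever StreamExecution.RUN_ID_KEY holds.
  val RunIdKey = "example.runId"

  // Strict lookup (assumed behaviour): a missing key trips Predef.assert,
  // which throws java.lang.AssertionError("assertion failed").
  def getRunId(conf: mutable.Map[String, String]): String = {
    val runId = conf.get(RunIdKey)
    assert(runId.isDefined)
    runId.get
  }

  def main(args: Array[String]): Unit = {
    // A fresh configuration with no run id, as in the benchmark before the fix.
    val conf = mutable.Map.empty[String, String]
    val failed =
      try { getRunId(conf); false }
      catch { case _: AssertionError => true }
    println(s"lookup without run id failed: $failed")

    // The fix: populate the key with a random UUID before the provider is initialized.
    conf(RunIdKey) = UUID.randomUUID().toString
    println(s"lookup with run id: ${getRunId(conf)}")
  }
}
```

Under these assumptions, any caller that builds its own configuration (such as a benchmark that bypasses StreamExecution) has to set the run id itself.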

Does this PR introduce any user-facing change?

No

How was this patch tested?

Confirmed locally that StateStoreBasicOperationsBenchmark now runs successfully.

Was this patch authored or co-authored using generative AI tooling?

No

LuciferYang · 2025-06-27T15:20:39Z

...test/scala/org/apache/spark/sql/execution/benchmark/StateStoreBasicOperationsBenchmark.scala

    val storeConf = new StateStoreConf(sqlConf)

+    val configuration = new Configuration
+    configuration.set(StreamExecution.RUN_ID_KEY, UUID.randomUUID().toString)


This modification is also needed; otherwise, the execution will still encounter errors:

build/sbt "sql/Test/runMain org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark"
[error] Exception in thread "main" java.lang.AssertionError: assertion failed
[error] 	at scala.Predef$.assert(Predef.scala:264)
[error] 	at org.apache.spark.sql.execution.streaming.state.StateStoreProvider$.getRunId(StateStore.scala:673)
[error] 	at org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider.init(RocksDBStateStoreProvider.scala:394)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.newRocksDBStateProvider(StateStoreBasicOperationsBenchmark.scala:484)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.$anonfun$runPutBenchmark$3(StateStoreBasicOperationsBenchmark.scala:92)
[error] 	at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.scala:18)
[error] 	at scala.collection.immutable.List.foreach(List.scala:334)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.$anonfun$runPutBenchmark$2(StateStoreBasicOperationsBenchmark.scala:87)
[error] 	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[error] 	at org.apache.spark.benchmark.BenchmarkBase.runBenchmark(BenchmarkBase.scala:42)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.runPutBenchmark(StateStoreBasicOperationsBenchmark.scala:83)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark$.runBenchmarkSuite(StateStoreBasicOperationsBenchmark.scala:55)
[error] 	at org.apache.spark.benchmark.BenchmarkBase.main(BenchmarkBase.scala:72)
[error] 	at org.apache.spark.sql.execution.benchmark.StateStoreBasicOperationsBenchmark.main(StateStoreBasicOperationsBenchmark.scala)

Actually, according to the original comment, a random run ID should be generated when it's not found:

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala

Line 667 in c09ac23

* Get the runId from the provided hadoopConf. If it is not found, generate a random UUID.

If we follow the comment, maybe we should generate a random one there and do a logWarning, so we could remove this random generation at test time?

Based on the description, it seems so, but the code implementation doesn't appear to match. It looks like when @ericm-db added the assertions, the comments weren't modified accordingly.

@ericm-db Maybe we should submit a follow-up for SPARK-52188 to fix this comment?

Based on the submission timestamps, it appears that the assertions were added after the comments. Since I'm not very familiar with this part of the code, I hope the original author, @ericm-db, can come up with a solution.

It seems that in the original logic, only test scenarios would automatically generate a UUID. So, personally, I think we should set it manually here, and the comment is already outdated. WDYT? @WweiL
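
For comparison, the alternative floated earlier in this thread (generate a random ID and log a warning when the key is missing) might look like the sketch below. This is an illustration of the idea only, not what was merged: the map again stands in for the Hadoop Configuration, `RunIdKey` and `getOrCreateRunId` are hypothetical names, and the warning is written to stderr instead of going through Spark's logWarning.

```scala
import java.util.UUID
import scala.collection.mutable

object LenientRunIdSketch {
  // Hypothetical placeholder key; the real name is whatever StreamExecution.RUN_ID_KEY holds.
  val RunIdKey = "example.runId"

  // Lenient lookup: return the stored run id, or generate one, store it,
  // and emit a warning so the missing configuration does not go unnoticed.
  def getOrCreateRunId(conf: mutable.Map[String, String]): String =
    conf.getOrElseUpdate(RunIdKey, {
      val generated = UUID.randomUUID().toString
      System.err.println(s"WARN: $RunIdKey not set, generated $generated")
      generated
    })

  def main(args: Array[String]): Unit = {
    val conf = mutable.Map.empty[String, String]
    val first = getOrCreateRunId(conf)  // generates and stores a new id
    val second = getOrCreateRunId(conf) // reuses the stored id, no warning
    println(s"stable across calls: ${first == second}")
  }
}
```

The trade-off is that a silent fallback can hide a caller that forgot to set the run id, which is one reason to prefer setting it explicitly in the benchmark.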

I see, thank you for the explanation! Yeah, I agree with you. Let me just do a follow-up on the comment there.

Thank you @WweiL ~

Can you merge this, thank you!

#51307

done

LuciferYang · 2025-06-27T15:23:57Z

I've submitted a job that runs via GitHub Actions to verify the effect of the changes:

https://github.com/LuciferYang/spark/actions/runs/15929832530/job/44936065130

LuciferYang · 2025-06-27T15:24:17Z

cc @HeartSaVioR @zecookiez

zecookiez

Looks good, just saw that the benchmark job is producing proper output too. Thanks for putting in this change! 😃

WweiL

LGTM

### What changes were proposed in this pull request?

#50924 removed the logic of generating a random id in `StateStoreProvider.getRunId` but didn't update the comment. Following the discussion here, update the comment: #51304 (comment)

### Why are the changes needed?

Code readability

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

No need

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #51307 from WweiL/SPARK-52188-followup.

Authored-by: Wei Liu <[email protected]>
Signed-off-by: yangjie01 <[email protected]>

HeartSaVioR

+1, thanks for the quick turnaround, everyone!

LuciferYang · 2025-06-28T03:11:06Z

Merged into master. Thanks @HeartSaVioR @zecookiez and @WweiL

init

79dfc8c

github-actions bot added the SQL label Jun 27, 2025

LuciferYang mentioned this pull request Jun 27, 2025

[SPARK-51358] [SS] Introduce snapshot upload lag detection through StateStoreCoordinator #50123

Closed

zecookiez approved these changes Jun 27, 2025

WweiL mentioned this pull request Jun 27, 2025

[SPARK-52188] [FOLLOWUP] Update comment in getRunId #51307

Closed

WweiL approved these changes Jun 27, 2025

HeartSaVioR approved these changes Jun 28, 2025

LuciferYang closed this in dd497fb Jun 28, 2025
