[Flink] Optimize CDC sink serde with Fury #307

Merged
xuchen-plus merged 6 commits into lakesoul-io:main from xuchen-plus:flink_serde_opt
Aug 30, 2023

@xuchen-plus xuchen-plus commented Aug 25, 2023

Fury is an open-source serialization library that uses JIT compilation to improve performance.

In LakeSoul's CDC sync job, each record has to carry the before and after RowData as well as the RowType. Flink's flame graph confirms that serializing and deserializing these objects adds excessive overhead.

With Fury, a single-core benchmark shows a ~80% improvement in end-to-end throughput (numRecordsInPerSecond rose from 5800 to 10400).

Before:

img_v2_c9f87b2a-667c-43f9-b970-f378f6fd006g

After using Fury:

img_v2_757278f5-424d-4b8d-8359-dca3a151eddg
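
To illustrate why carrying the schema in every record is costly, here is a minimal sketch that uses plain JDK serialization as a stand-in. It uses neither Fury nor Flink's own serializers, and `Schema`, `ChangeRecord`, `serializedSize`, and `percentGain` are hypothetical names invented for this example, not classes from this PR. It compares the serialized size of a record with and without an embedded schema, and recomputes the throughput gain from the two numbers quoted above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerdeOverheadSketch {

    // Hypothetical stand-in for a row schema (NOT Flink's RowType).
    static final class Schema implements Serializable {
        final String[] fieldNames;
        final String[] fieldTypes;

        Schema(String[] fieldNames, String[] fieldTypes) {
            this.fieldNames = fieldNames;
            this.fieldTypes = fieldTypes;
        }
    }

    // Hypothetical stand-in for a CDC record: before image, after image,
    // and an optional schema travelling with the data.
    static final class ChangeRecord implements Serializable {
        final Object[] before;
        final Object[] after;
        final Schema schema; // null means "schema not shipped with the record"

        ChangeRecord(Object[] before, Object[] after, Schema schema) {
            this.before = before;
            this.after = after;
            this.schema = schema;
        }
    }

    // Serializes the value with JDK serialization and returns the byte count.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Relative improvement in percent when going from `before` to `after`.
    static double percentGain(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        Schema schema = new Schema(
                new String[] {"id", "name", "updated_at"},
                new String[] {"INT", "STRING", "TIMESTAMP"});
        Object[] before = {1, "alice", 1692950400000L};
        Object[] after = {1, "alice-renamed", 1692954000000L};

        int withSchema = serializedSize(new ChangeRecord(before, after, schema));
        int withoutSchema = serializedSize(new ChangeRecord(before, after, null));

        System.out.printf("record with schema:    %d bytes%n", withSchema);
        System.out.printf("record without schema: %d bytes%n", withoutSchema);
        System.out.printf("throughput gain 5800 -> 10400: %.1f%%%n",
                percentGain(5800, 10400));
    }
}
```

The byte counts depend on the JDK and on the field values, so treat them as indicative only. The point is that the schema bytes are paid again on every record, which is the kind of per-record cost a faster serializer reduces.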

@xuchen-plus xuchen-plus added the enhancement and flink labels Aug 25, 2023
Signed-off-by: chenxu <chenxu@dmetasoul.com>
@xuchen-plus xuchen-plus merged commit c3271e1 into lakesoul-io:main Aug 30, 2023
@xuchen-plus xuchen-plus deleted the flink_serde_opt branch August 30, 2023 07:02
@chaokunyang

This is great! Glad to see Fury speed up LakeSoul's performance.
