This repository was archived by the owner on May 9, 2024. It is now read-only.

Add TableStats to TableFragmentsInfo. #559

Merged 1 commit on Jul 31, 2023

Conversation

ienkovich
Contributor

Chunk metadata is used in every work unit execution, and in many cases the stats are used too. The main use of stats, though, is serving range info requests for column references, and to compute a range we simply merge all chunk stats.
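To illustrate what "merge all chunk stats" means here, below is a minimal Python sketch rather than the project's actual code. `ChunkStats` and `merge_chunk_stats` are hypothetical names, and the stats are assumed to be just a min, a max, and a null flag for one integer column.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ChunkStats:
    """Hypothetical per-chunk stats for a single integer column."""
    min_val: int
    max_val: int
    has_nulls: bool


def merge_chunk_stats(chunks: Iterable[ChunkStats]) -> Optional[ChunkStats]:
    """Fold per-chunk stats into one column-wide range (None if no chunks)."""
    merged: Optional[ChunkStats] = None
    for chunk in chunks:
        if merged is None:
            merged = chunk
        else:
            merged = ChunkStats(
                min(merged.min_val, chunk.min_val),
                max(merged.max_val, chunk.max_val),
                merged.has_nulls or chunk.has_nulls,
            )
    return merged


# One entry per fragment of the same column (made-up values).
fragment_stats = [
    ChunkStats(5, 40, False),
    ChunkStats(-3, 12, True),
    ChunkStats(7, 99, False),
]
column_range = merge_chunk_stats(fragment_stats)
```

The point of the sketch is that every range request walks all chunks, so the cost grows with the number of fragments even though only one merged result is needed.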

This patch introduces TableStats, which can be used to compute column range info. Table stats are initially computed from chunk stats, but in some cases we can avoid that by propagating the input table's stats to the resulting table. E.g., simple projection, sort, and shuffle do not modify table stats, so the stats can simply be assigned to the result. This lets us skip computing chunk stats whose only direct use is skipping fragments during filter execution.
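The propagation idea can be sketched as follows. This is an illustrative Python model under stated assumptions, not the TableStats implementation: `Table`, `compute_stats`, `project`, and `sort_by` are hypothetical, and stats are reduced to a `(min, max)` pair per column.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]  # (min, max) of one column


@dataclass
class Table:
    """Hypothetical table: column name -> values, plus optional table stats."""
    columns: Dict[str, List[int]]
    stats: Optional[Dict[str, Range]] = None


def compute_stats(table: Table) -> Dict[str, Range]:
    """Expensive path: scan the data (assumes non-empty columns)."""
    return {name: (min(v), max(v)) for name, v in table.columns.items()}


def project(table: Table, names: List[str]) -> Table:
    """Simple projection: keep the stats of the selected columns as-is."""
    cols = {n: table.columns[n] for n in names}
    stats = None if table.stats is None else {n: table.stats[n] for n in names}
    return Table(cols, stats)


def sort_by(table: Table, key: str) -> Table:
    """Sort rows by one column; reordering rows cannot change min/max."""
    keys = table.columns[key]
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    cols = {n: [v[i] for i in order] for n, v in table.columns.items()}
    return Table(cols, table.stats)


src = Table({"a": [3, 1, 2], "b": [10, 30, 20]})
src.stats = compute_stats(src)  # computed once, on the input table
result = sort_by(project(src, ["b"]), "b")  # stats carried along, no rescan
```

In this model the result's stats are available without touching its data, which is the saving the description refers to.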

This PR only introduces the new stats and does not use them yet. It also adds checks to the ExpressionRange computation to make sure that the new stats give the same range as the existing ones. A follow-up PR will stop using chunk stats in range computations and will propagate table stats through shuffling, avoiding stats computation for partitioned data.
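The kind of consistency check described above might look roughly like this. It is a hedged Python sketch, not the actual ExpressionRange code: `range_from_chunks` and `checked_column_range` are hypothetical helpers, and ranges are plain `(min, max)` tuples.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (min, max)


def range_from_chunks(chunk_ranges: List[Range]) -> Range:
    """Existing path: merge the per-chunk ranges (assumes at least one chunk)."""
    return (
        min(lo for lo, _ in chunk_ranges),
        max(hi for _, hi in chunk_ranges),
    )


def checked_column_range(table_range: Range, chunk_ranges: List[Range]) -> Range:
    """Return the chunk-based range, verifying the table stats agree with it."""
    expected = range_from_chunks(chunk_ranges)
    if table_range != expected:
        raise AssertionError(
            f"table stats {table_range} disagree with chunk stats {expected}"
        )
    # The chunk-based result stays authoritative until the new stats are used.
    return expected


col_range = checked_column_range((1, 9), [(1, 4), (3, 9)])
```

Running both paths side by side keeps behavior unchanged while surfacing any divergence before the chunk-based path is dropped.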

@ienkovich
Contributor Author

@alexbaden @kurapov-peter
Looks like this one was missed

Contributor

@alexbaden alexbaden left a comment

Looks good to me. My only concern is making sure it is clear how to get table stats vs chunk stats from storage - I find the current way of getting chunk stats somewhat convoluted (unless it goes through expression range, which has a nice outward API).

@ienkovich ienkovich merged commit a8074d6 into main Jul 31, 2023
@ienkovich ienkovich deleted the ienkovich/table-stats branch July 31, 2023 19:09

2 participants