Commit 2ddd2be

Author: Tom McCormick
Add comprehensive ORC batching tests demonstrating how stripe size, batch size, and compression interact.
The tests show that ORC batching is stripe-based (analogous to Parquet row groups). A near 1:1 mapping is achievable with large stripe sizes (2-5 MB) and hard-to-compress data, which yields ratios of 0.91-0.97 between stripe size and actual file size.
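As a rough illustration of why hard-to-compress data matters for that ratio, the sketch below compares the compressed-to-raw size ratio of random and repetitive payloads. It is not taken from the commit: it uses `zlib` from the Python standard library as a stand-in for whatever codec the tests configure, and the names `compression_ratio`, `random_payload`, and `repetitive_payload` are hypothetical.

```python
import random
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by raw size (1.0 means no savings)."""
    return len(zlib.compress(payload)) / len(payload)


SIZE = 1 << 20  # 1 MiB of raw data per payload (illustrative value)

# Seeded pseudo-random bytes: effectively incompressible.
rng = random.Random(0)
random_payload = rng.getrandbits(8 * SIZE).to_bytes(SIZE, "little")

# A short repeated pattern: highly compressible.
pattern = b"orc-stripe"
repetitive_payload = (pattern * (SIZE // len(pattern) + 1))[:SIZE]

print(f"random:     {compression_ratio(random_payload):.3f}")
print(f"repetitive: {compression_ratio(repetitive_payload):.3f}")
```

With incompressible input the written size stays close to the raw size, which is consistent with the commit's observation that file size tracks the configured stripe size only when the data resists compression.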
1 parent 71143f6 commit 2ddd2be

1 file changed, with 1257 additions and 63 deletions
