Description
Describe the situation
SELECT avg(i) FROM file('/data/t.parquet') group by round(log10(i));
chdb takes 400 s to run this query; clickhouse local takes 100 s, so chdb is about 4x slower.
How to reproduce
- Which ClickHouse server version to use: 23.6
- Which interface to use, if it matters: CLI.py
- Non-default settings, if any
- CREATE TABLE statements for all tables involved
select number::int i FROM numbers_mt(1,1000000000)t into outfile '/data/t.parquet';
- Sample data for all these tables, use clickhouse-obfuscator if necessary
- Queries to run that lead to slow performance
SELECT avg(i) FROM file('/data/t.parquet') group by round(log10(i));
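To make the comparison easier to repeat, here is a minimal timing sketch in Python. It assumes chdb is installed and exposes `chdb.query(sql, output_format)`, and that a `clickhouse` binary on the PATH accepts `local --query`; the helper names `time_call`, `run_in_chdb`, and `run_in_clickhouse_local` are hypothetical and not part of either tool.

```python
import subprocess
import time

# Query from this report; the parquet file must already exist at this path.
QUERY = "SELECT avg(i) FROM file('/data/t.parquet') GROUP BY round(log10(i))"


def time_call(run, *args):
    """Call run(*args) once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = run(*args)
    return time.perf_counter() - start, result


def run_in_chdb(sql):
    """Run the query in-process through chdb (assumed to be installed)."""
    # Imported lazily so the timing helper works without chdb present.
    import chdb
    return chdb.query(sql, "CSV")


def run_in_clickhouse_local(sql, binary="clickhouse"):
    """Run the query through `clickhouse local` (binary name is an assumption)."""
    completed = subprocess.run(
        [binary, "local", "--query", sql],
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

Calling `time_call(run_in_chdb, QUERY)` and `time_call(run_in_clickhouse_local, QUERY)` on the same machine gives the two wall-clock times to compare. Note that the subprocess variant also includes process start-up time, which should be negligible next to a 100 s query.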
Expected performance
What are your performance expectations, and why do you think they are realistic? Has it been working faster in older ClickHouse releases? Is it working faster in some specific other system?
I hope chdb runs as fast as clickhouse local.
Additional context
By the way, the data-generation statement runs as fast in chdb as in clickhouse local:
select number::int i FROM numbers_mt(1,1000000000)t into outfile '/data/t.parquet';