[ArrowResultSet] Support LIMIT and OFFSET #199
Force-pushed from 8a72bce to e6ae372
        ? results_->entryCount()
        : std::min(size_t(top_n_), results_->entryCount());
// Take all limit/offset parameters into account and produce a size, not an index.
const size_t first_entry =
Iteration through the whole result set might not be the best way to count the number of rows. Let's look at the different cases.
- Columnar conversion. You just need the total number of rows to convert. ResultSet::rowCount should provide the actual value; it also has caching and more efficient parallel algorithms for computing it.
- Row-wise conversion. Here we have a loop that uses logical positions in the buffer for the start and end indexes. That means we have to determine those positions in advance, and your solution should give the correct result. The problem is that we would run a serial scan over the whole table just to get the number of rows, and then another scan (now in parallel) to build the arrow buffers. So in all cases (even with no OFFSET and LIMIT expressions) we get an additional serial scan. What we can do here is add versioning for row-wise conversion: use the existing parallel_for loop when there is no limit or offset, and go single-threaded when those parameters are set. You can add offset and limit args to the convert_rowwise function for that; a sketch follows.
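A minimal sketch of that versioned row-wise conversion, assuming a tbb-style parallel_for and a hypothetical fetch_row() helper standing in for the real per-entry conversion (only convert_rowwise and parallel_for are named in this review; the signature and isRowAtEmpty usage are assumptions):

```cpp
#include <tbb/parallel_for.h>

// Sketch only: dispatch between the existing parallel conversion and a
// serial one when LIMIT/OFFSET are set, so the no-truncation path keeps
// its parallel_for and the truncated path avoids a preliminary row count.
void convert_rowwise(const ResultSet& rows, size_t limit, size_t offset) {
  const size_t entry_count = rows.entryCount();
  if (limit == 0 && offset == 0) {
    tbb::parallel_for(tbb::blocked_range<size_t>(0, entry_count),
                      [&](const tbb::blocked_range<size_t>& r) {
                        for (size_t i = r.begin(); i != r.end(); ++i) {
                          fetch_row(rows, i);  // hypothetical per-entry conversion
                        }
                      });
    return;
  }
  // Serial pass: count skipped and emitted rows on the fly.
  size_t seen = 0;
  size_t emitted = 0;
  for (size_t i = 0; i < entry_count; ++i) {
    if (limit != 0 && emitted == limit) {
      break;  // LIMIT satisfied
    }
    if (rows.isRowAtEmpty(i)) {
      continue;  // empty hash-table entry, not a row
    }
    if (seen++ < offset) {
      continue;  // still inside the OFFSET prefix
    }
    fetch_row(rows, i);
    ++emitted;
  }
}
```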
Force-pushed from 92c7119 to 7968a3b
This patch uses the right idea and should work for ResultSets with a single storage. But I think it doesn't handle multiple storages correctly for columnar conversion. Also, the tests don't include the multiple-storages case. Please work in this direction.
omniscidb/QueryEngine/ResultSet.cpp (outdated)
if (isTruncated()) {
  rows_to_fetch = std::min(rows_to_fetch, keep_first_);
}
retval.emplace_back(storage_->getUnderlyingBuffer() +
This code is incorrect for cases when drop_first_ + keep_first_ > storage_->binSearchRowCount(): it would reference past the storage buffer. Big offsets might force you to skip one or more storages, but your code doesn't do it.
I think it will only be incorrect in the case of drop_first_ >= storage_->binSearchRowCount(). If only the sum is larger, I expect data from the offset to the end of the table (I also check this in the OrderedLimitOverSizeSelection test).
The logic here is getting a little hard to understand. Maybe it's better to wrap it in some method; I'm not sure what the best approach is here.
storage_ doesn't hold the whole table, so it's legal to drop more rows than storage_ has.
Should this fail with an error or return an empty selection?
BTW, why would it be illegal? We should simply skip the first storage and use the next one.
I didn't say it was illegal; I said the opposite. I mentioned the missing storage skip in my first comment.
config().rs.enable_columnar_output = true;
config().rs.enable_lazy_fetch = false;
auto res = runSqlQuery(
    "SELECT * FROM test INNER JOIN join_table ON test.i=join_table.i offset 4 limit 2;",
For all non-groupby tests use test_chunked or run the tests twice, once on test and once on test_chunked. The chunked table holds the same data but is split into different fragments. This will produce ResultSets with multiple storages and improve your testing coverage.
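A minimal sketch of running the same check against both tables (runSqlQuery, ExecutorDeviceType, and the table names come from the tests in this review; the loop itself is an assumption):

```cpp
// Sketch: run each LIMIT/OFFSET test on the single-fragment table and on
// the chunked one, covering single- and multi-storage ResultSets.
for (const std::string table : {"test", "test_chunked"}) {
  auto res = runSqlQuery("SELECT * FROM " + table + " INNER JOIN join_table ON " +
                             table + ".i=join_table.i offset 4 limit 2;",
                         ExecutorDeviceType::CPU,
                         true);
  ASSERT_EQ(res.getRows()->rowCount(), size_t(2));
}
```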
if (is_truncated && seg_row_count <= offset) {
  continue;
}
if (is_truncated && seg_row_count > offset + limit) {
We don't need to look for the next valid entry if we have already converted enough rows. So please move this exit condition to the very beginning of the loop.
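A minimal sketch of the reordered loop (is_truncated, seg_row_count, offset, and limit come from the diff above; the emptiness check and conversion body are assumptions):

```cpp
for (size_t entry_idx = start_entry; entry_idx < end_entry; ++entry_idx) {
  // Exit first: once `limit` rows past the offset are converted, there is
  // no reason to keep scanning for the next valid entry.
  if (is_truncated && seg_row_count >= offset + limit) {
    break;
  }
  if (results->isRowAtEmpty(entry_idx)) {  // assumed emptiness check
    continue;
  }
  ++seg_row_count;
  if (is_truncated && seg_row_count <= offset) {
    continue;  // still skipping the OFFSET prefix
  }
  // ... convert entry_idx into the arrow segment buffers ...
}
```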
Agreed.
Force-pushed from 7968a3b to 2b6de3e
const auto col_count = results->colCount();
CHECK_EQ(value_seg.size(), col_count);
CHECK_EQ(null_bitmap_seg.size(), col_count);
const auto local_entry_count = end_entry - start_entry;
size_t seg_row_count = 0;
size_t limit = 0;
size_t offset = end_entry;
This default value is misleading. I know it is never used, but it still raises the question of why this value is used for initialization, with no good answer. I guess simply initializing the limit and offset with values from results would be better.
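A minimal sketch of the suggested initialization (getLimit() and getOffset() are mentioned later in this review; that they are the accessors on results is an assumption):

```cpp
// Sketch: seed the truncation parameters from the result set itself
// instead of a meaningless default like end_entry.
size_t limit = results->getLimit();
size_t offset = results->getOffset();
```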
omniscidb/QueryEngine/ResultSet.cpp (outdated)
auto curr_storage_rows_to_fetch = storage_->binSearchRowCount();
size_t curr_storage_start_row = drop_first_;
size_t curr_storage_end_row =
    keep_first_ != 0 ? keep_first_ : curr_storage_rows_to_fetch;
From the name of the variable, I'd guess it should hold the index one past the last row to fetch. But this is not the case when the limit is set.
omniscidb/QueryEngine/ResultSet.cpp (outdated)
  curr_storage_start_row -= curr_storage_rows_to_fetch;
} else {
  size_t to_fetch = std::min(curr_storage_rows_to_fetch - curr_storage_start_row,
                             curr_storage_end_row - curr_storage_start_row);
curr_storage_end_row holds the limit and curr_storage_start_row holds the offset, so you compute limit - offset, which is not what you need. Consider a storage with 4 rows, limit=2, and offset=2: you would fetch 0 rows then. BTW, it also means your tests don't have enough coverage here.
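A worked version of that example, as a sketch (names follow the diff; the corrected form illustrates the reviewer's point, not the final patch):

```cpp
// Storage with 4 rows, limit = 2, offset = 2: rows 2 and 3 should be fetched.
const size_t storage_rows = 4, limit = 2, offset = 2;
// Buggy form: min(storage_rows - offset, limit - offset) == min(2, 0) == 0.
// Correct form caps the rows remaining after the offset by the limit itself:
const size_t to_fetch = std::min(storage_rows - offset, limit);  // == 2
```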
Agreed, the error can be triggered only when the limit/offset falls in the first storage. If it falls in the next one, everything is OK.
omniscidb/QueryEngine/ResultSet.cpp (outdated)
    storage_uptr->getUnderlyingBuffer() + storage_uptr->getColOffInBytes(column_idx);
size_t row_count = storage_uptr->binSearchRowCount();
curr_storage_rows_to_fetch = row_count;
if (curr_storage_end_row == 0)
curr_storage_end_row == 0 means we have to stop, right? If we got all the required rows from the first storage, then curr_storage_end_row should be 0 here and we should stop.
Overall, it's hard to read the code because of the misleading variable names. curr_storage_rows_to_fetch doesn't hold the number of rows to fetch. curr_storage_start_row and curr_storage_end_row sound like iterators to something, but in fact they are not. expected_row_count is named and used as if it held the total number of rows to fetch, but it holds the total number of rows in the result set instead, and the (total_handled_rows >= expected_row_count) check is useless.
Could you please just use rows_to_fetch and rows_to_skip variables instead of the 5 variables you currently have? Initialize them like

```cpp
size_t rows_to_skip = getOffset();
size_t rows_to_fetch = getLimit() ? std::min(getLimit(), rowCount() - getOffset())
                                  : rowCount() - getOffset();
```

Then use them to compute the number of rows to skip/fetch from each storage and adjust them accordingly, stopping when rows_to_fetch is zero. I believe that would make the code more readable.
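A minimal sketch of that two-variable loop (getOffset(), getLimit(), rowCount(), and binSearchRowCount() are named in this review; the storages range and the appendStorageBuffer helper are hypothetical):

```cpp
// Sketch: consume the offset first, then the limit, one storage at a time.
size_t rows_to_skip = getOffset();
size_t rows_to_fetch = getLimit()
                           ? std::min(getLimit(), rowCount() - getOffset())
                           : rowCount() - getOffset();
for (const auto* storage : storages) {
  if (rows_to_fetch == 0) {
    break;  // LIMIT satisfied
  }
  const size_t storage_rows = storage->binSearchRowCount();
  if (rows_to_skip >= storage_rows) {
    rows_to_skip -= storage_rows;  // the whole storage precedes the offset
    continue;
  }
  const size_t fetched = std::min(rows_to_fetch, storage_rows - rows_to_skip);
  appendStorageBuffer(storage, /*first_row=*/rows_to_skip, /*row_count=*/fetched);
  rows_to_skip = 0;
  rows_to_fetch -= fetched;
}
```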
Also, please pay attention to your tests: try to trigger the current errors in tests before fixing them.
done
auto res = runSqlQuery(
    "SELECT * FROM test_chunked offset 3 limit 7;", ExecutorDeviceType::CPU, true);

ASSERT_EQ(res.getRows()->rowCount(), (int64_t)3);
I definitely see problems in the current columnar conversion implementation. That makes me think these tests don't actually give you the required result set layouts. Please add asserts for isChunkedZeroCopyColumnarConversionPossible(), getLimit(), and getOffset() to make sure you have what you expect.
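A minimal sketch of those asserts (the three accessors are named in the comment; that they live on the ResultSet behind res.getRows(), and the offset 3 / limit 7 values from the test above, are assumptions):

```cpp
// Sketch: confirm the test actually exercises the intended layout.
const auto rows = res.getRows();
ASSERT_TRUE(rows->isChunkedZeroCopyColumnarConversionPossible());
ASSERT_EQ(rows->getOffset(), size_t(3));
ASSERT_EQ(rows->getLimit(), size_t(7));
```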
done
Force-pushed from 9f9d04e to e68da90
Looks good! Thanks!
@@ -145,7 +145,7 @@ void compare_columns(const std::array<TYPE, len>& expected,
  const arrow::ArrayVector& chunks = actual->chunks();

  TYPE null_val = null_builder<TYPE>();

  auto total_row_count = 0;
You don't need to compute the number of rows; simply use actual->length().
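A minimal sketch of the simplification (actual is the arrow::ChunkedArray from the diff above; length() is the standard Arrow accessor that already sums the chunk lengths):

```cpp
// Sketch: no manual total_row_count accumulation needed.
ASSERT_EQ(actual->length(), static_cast<int64_t>(expected.size()));
```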
Agreed, done.
Force-pushed from 50a19bd to f7ec4e3
LGTM
Force-pushed from f7ec4e3 to 0c3c804
Force-pushed from 0c3c804 to 19562ca
Force-pushed from 19562ca to 2331bae
This commit takes the limit and offset parameters into account in the ArrowResultSet.
Also added several tests checking limit/offset with ascending/descending
ordering and joins to the ResultSetArrowConversion suite.
Resolves: #183
Signed-off-by: Dmitrii Makarenko [email protected]