Skip to content

Commit ba95b79

Browse files
pbhandar2facebook-github-bot
authored andcommitted
OSS test fix (#436)
Summary: Fix NvmCacheTest.EvictToNvmGet failing in OSS CI. ``` /home/runner/work/CacheLib/CacheLib/cachelib/allocator/nvmcache/tests/NvmCacheTests.cpp:252: Failure Expected equality of these values: 0 nvm.getNumActiveHandles() Which is: 1 I0318 19:45:43.067236 184036 Driver.cpp:84] Driver: finish scheduler I0318 19:45:43.067307 184036 Driver.cpp:86] Driver: finish scheduler successful I0318 19:45:43.067613 184036 ThreadPoolJobScheduler.cpp:86] JobScheduler: join threads for reader_pool I0318 19:45:43.069042 184036 ThreadPoolJobScheduler.cpp:90] JobScheduler: join threads for reader_pool successful I0318 19:45:43.072589 184036 ThreadPoolJobScheduler.cpp:86] JobScheduler: join threads for writer_pool I0318 19:45:43.074025 184036 ThreadPoolJobScheduler.cpp:90] JobScheduler: join threads for writer_pool successful [ FAILED ] NvmCacheTest.EvictToNvmGet (433 ms) [----------] 1 test from NvmCacheTest (433 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test suite ran. (433 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] NvmCacheTest.EvictToNvmGet 1 FAILED TEST ``` Two bugs: 1. Unsigned underflow in loop index. The loop uses the i-- > 0 idiom which already decrements i before the body runs (values nKeys+99 down to 0). The extra index = i - 1 double-decremented, so on the last iteration (i=0), index = (unsigned)0 - 1 = UINT_MAX. This created a spurious lookup for a nonsensical key and skipped key1123. * **Fix**: index = i. 2. Race between async NVM callback and handle count check. When a key is fetched from NVM, the async callback runs onGetComplete →removeFromFillMap → ~GetCtx → wakeUpWaiters → baton_.post(). The baton wakes the test thread, but GetCtx::it (the fill handle) is destroyed after the baton post, still inside the same callback. Until that destruction completes, the async thread's handle count is +1. If the test thread reaches `getNumActiveHandles()` before the callback fully returns, it observes the stale +1. * **Fix**: call flushNvmCache() before checking handle counts. This calls scheduler_->finish() which busy-waits until all navy jobs (including the callback that destroys GetCtx::it) have fully completed. Reviewed By: byahn0996 Differential Revision: D97184057
1 parent 3aefa9d commit ba95b79

File tree

1 file changed

+7
-3
lines changed

1 file changed

+7
-3
lines changed

cachelib/allocator/nvmcache/tests/NvmCacheTests.cpp

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -215,11 +215,11 @@ TEST_F(NvmCacheTest, EvictToNvmGet) {
215215
const auto nEvictions = this->evictionCount() - evictBefore;
216216
ASSERT_LT(0, nEvictions);
217217

218-
// read from ram cache first so that we will not cause evictions
218+
// Read from ram cache first so that we will not cause evictions
219219
// to navy for items that are still in ram-cache until we start
220-
// reading items from navy
220+
// reading items from navy.
221221
for (unsigned int i = nKeys + 100; i-- > 0;) {
222-
unsigned int index = i - 1;
222+
unsigned int index = i;
223223
auto key = folly::sformat("key{}", index);
224224
auto hdl = this->fetch(key, false /* ramOnly */);
225225
hdl.wait();
@@ -248,6 +248,10 @@ TEST_F(NvmCacheTest, EvictToNvmGet) {
248248
}
249249
}
250250

251+
// Flush to ensure all async navy callbacks (including GetCtx destruction
252+
// which decrements handle counts) have fully completed.
253+
nvm.flushNvmCache();
254+
251255
// Reads are done. We should be at "0" active handle count across all threads.
252256
ASSERT_EQ(0, nvm.getNumActiveHandles());
253257
ASSERT_EQ(0, nvm.getHandleCountForThread());

0 commit comments

Comments
 (0)