Commit ba95b79
OSS test fix (#436)
Summary:
Fix NvmCacheTest.EvictToNvmGet failing in OSS CI.
```
/home/runner/work/CacheLib/CacheLib/cachelib/allocator/nvmcache/tests/NvmCacheTests.cpp:252: Failure
Expected equality of these values:
0
nvm.getNumActiveHandles()
Which is: 1
I0318 19:45:43.067236 184036 Driver.cpp:84] Driver: finish scheduler
I0318 19:45:43.067307 184036 Driver.cpp:86] Driver: finish scheduler successful
I0318 19:45:43.067613 184036 ThreadPoolJobScheduler.cpp:86] JobScheduler: join threads for reader_pool
I0318 19:45:43.069042 184036 ThreadPoolJobScheduler.cpp:90] JobScheduler: join threads for reader_pool successful
I0318 19:45:43.072589 184036 ThreadPoolJobScheduler.cpp:86] JobScheduler: join threads for writer_pool
I0318 19:45:43.074025 184036 ThreadPoolJobScheduler.cpp:90] JobScheduler: join threads for writer_pool successful
[ FAILED ] NvmCacheTest.EvictToNvmGet (433 ms)
[----------] 1 test from NvmCacheTest (433 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (433 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] NvmCacheTest.EvictToNvmGet
1 FAILED TEST
```
Two bugs:
1. Unsigned underflow in loop index. The loop uses the i-- > 0 idiom which already decrements i before the body runs (values nKeys+99 down to 0). The extra index = i - 1 double-decremented, so on the last iteration (i=0), index = (unsigned)0 - 1 = UINT_MAX. This created a spurious lookup for a nonsensical key and skipped key1123.
* **Fix**: index = i.
2. Race between async NVM callback and handle count check. When a key is fetched from NVM, the async callback runs onGetComplete →removeFromFillMap → ~GetCtx → wakeUpWaiters → baton_.post(). The baton wakes the test thread, but GetCtx::it (the fill handle) is destroyed after the baton post, still inside the same callback. Until that destruction completes, the async thread's handle count is +1. If the test thread reaches `getNumActiveHandles()` before the callback fully returns, it observes the stale +1.
* **Fix**: call flushNvmCache() before checking handle counts. This calls scheduler_->finish() which busy-waits until all navy jobs (including the callback that destroys GetCtx::it) have fully completed.
Reviewed By: byahn0996
Differential Revision: D971840571 parent 3aefa9d commit ba95b79
1 file changed
+7
-3
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
215 | 215 | | |
216 | 216 | | |
217 | 217 | | |
218 | | - | |
| 218 | + | |
219 | 219 | | |
220 | | - | |
| 220 | + | |
221 | 221 | | |
222 | | - | |
| 222 | + | |
223 | 223 | | |
224 | 224 | | |
225 | 225 | | |
| |||
248 | 248 | | |
249 | 249 | | |
250 | 250 | | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
251 | 255 | | |
252 | 256 | | |
253 | 257 | | |
| |||
0 commit comments