Description
Our caching strategy is the following: try to get the value from the cache and, if that does not succeed within a prescribed timeout, query the database directly. We've noticed in prod that after a single key times out, all subsequent keys time out too.
I managed to recreate the issue by writing a poorly-behaved proxy that silently drops a specific key but passes all other keys through. I don't claim the root cause in prod is a proxy misbehaving in exactly the way I coded it, just that this setup perfectly replicates our issue.
Here's the proxy: proxy.js
And here's the example that replicates our issue: client.js
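For readers without access to the linked files, here is a minimal sketch of what such a key-dropping proxy could look like. It is not the linked proxy.js: the listening port, the dropped key, and the helper names `parseCommands`, `shouldDrop`, and `startProxy` are all assumptions made for illustration, and the parser assumes each command arrives whole within a single chunk.

```javascript
const net = require('node:net');

// Hypothetical values for this sketch, not taken from the linked proxy.js.
const DROPPED_KEY = 'key:10';
const LISTEN_PORT = 6380; // assumed port the client connects to
const REDIS_PORT = 6379;  // port published by the docker command

// Parse every complete RESP command (an array of bulk strings) in a chunk.
// Assumption: commands are never split across chunks.
function parseCommands(chunk) {
  const commands = [];
  let pos = 0;
  while (pos < chunk.length && chunk[pos] === 0x2a /* '*' */) {
    const start = pos;
    let lineEnd = chunk.indexOf('\r\n', pos);
    const count = Number(chunk.toString('latin1', pos + 1, lineEnd));
    pos = lineEnd + 2;
    const args = [];
    for (let i = 0; i < count; i++) {
      lineEnd = chunk.indexOf('\r\n', pos); // end of the "$<length>" line
      const len = Number(chunk.toString('latin1', pos + 1, lineEnd));
      pos = lineEnd + 2;
      args.push(chunk.toString('utf8', pos, pos + len));
      pos += len + 2; // skip the payload and its trailing CRLF
    }
    commands.push({ args, raw: chunk.subarray(start, pos) });
  }
  return commands;
}

// True only for a GET on the one key we want to swallow.
function shouldDrop(args) {
  return args.length === 2 && args[0].toUpperCase() === 'GET' && args[1] === DROPPED_KEY;
}

function startProxy() {
  return net.createServer((clientSocket) => {
    const upstream = net.connect(REDIS_PORT, '127.0.0.1');
    clientSocket.on('data', (chunk) => {
      for (const { args, raw } of parseCommands(chunk)) {
        if (shouldDrop(args)) {
          console.log(`Blocking ${args.join(' ')}`);
          continue; // never forwarded, so the server never replies to it
        }
        console.log(args);
        upstream.write(raw);
      }
    });
    upstream.pipe(clientSocket); // replies flow back untouched
    clientSocket.on('close', () => upstream.destroy());
    upstream.on('close', () => clientSocket.destroy());
    clientSocket.on('error', () => {});
    upstream.on('error', () => {});
  }).listen(LISTEN_PORT);
}
```

Calling `startProxy()` and pointing the client at the assumed port 6380 would route every command except `GET key:10` to the real server.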
How to run the test-case:
docker run -p 6379:6379 redis:8.4.0-bookworm
# in a new shell
node proxy.js
# in a new shell
node client.js
The weird behavior:
After the first timeout, all subsequent requests time out as well. The redis-server receives and replies to the subsequent requests, but the redis-client doesn't seem to return the responses.
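One plausible reading of this behavior, which nothing in this report confirms, is that replies are matched to commands purely by order: replies in the Redis protocol carry no request identifier, so a client hands each incoming reply to the oldest command still waiting. The toy model below is not node-redis code, and the `FifoClient` class is hypothetical. It only illustrates how one swallowed command would shift every later reply by one slot.

```javascript
// Toy model of a pipelined client: each reply goes to the oldest waiting command.
class FifoClient {
  constructor() {
    this.waiting = []; // commands sent whose reply has not arrived yet
  }
  send(key) {
    const entry = { key, reply: undefined };
    this.waiting.push(entry);
    return entry;
  }
  onReply(value) {
    const entry = this.waiting.shift();
    if (entry) entry.reply = value;
  }
}

const model = new FifoClient();

// The proxy drops this command, so no reply ever arrives for it.
// The caller gives up after its timeout, but the entry stays at the head of the queue.
const dropped = model.send('key:10');

// The next command reaches the server, which replies normally...
const next = model.send('key:11');
model.onReply('value for key:11');

// ...but the reply is consumed by the stale entry for key:10.
console.log(dropped.reply); // 'value for key:11'
console.log(next.reply);    // undefined, so this caller times out too
```

Under this assumption every later request waits on a reply that has already been handed to its predecessor, which would match the logs: the server answers, yet each caller times out.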
What didn't work:
This is the initial timeout logic that breaks after the first timeout.
async function getWithTimeout(key, defaultValue, timeoutMs = 2000) {
  try {
    // Rejects after timeoutMs; the pending GET itself is not cancelled.
    const timeoutPromise = new Promise((_, reject) =>
      setTimeout(() => reject(new Error('Redis operation timeout')), timeoutMs)
    );
    const getPromise = client.get(key);
    const result = await Promise.race([getPromise, timeoutPromise]);
    return result !== null ? result : defaultValue;
  } catch (err) {
    console.error(`Error fetching key "${key}":`, err.message);
    return "timeout";
  }
}
We also tried the AbortController method, but it just hangs on the dropped key:
async function getWithTimeout(key, defaultValue, timeoutMs = 2000) {
  const ac = new AbortController();
  // Abort the command once timeoutMs has elapsed.
  const t = setTimeout(() => ac.abort(), timeoutMs);
  try {
    const result = await client.withCommandOptions({abortSignal: ac.signal}).get(key);
    return result !== null ? result : defaultValue;
  } catch (err) {
    console.error(`Error fetching key "${key}":`, err.message);
    return "timeout";
  } finally {
    clearTimeout(t);
  }
}
What kinda worked:
Setting clientConfig.socket.socketTimeout: 2000 force-disconnects the socket after 2s, but all subsequent requests then fail with "The client is closed" errors. Since the client has to be reconnected anyway, the next method handles this better.
What works, but it doesn't feel right:
If there's a timeout, the retry logic will disconnect from the server, and open a new connection.
async function getWithTimeout(key, defaultValue, timeoutMs = 2000) {
  try {
    // Time out via Promise.race, as in the first attempt.
    const timeoutPromise = new Promise((_, reject) =>
      setTimeout(() => reject(new Error('Redis operation timeout')), timeoutMs)
    );
    const result = await Promise.race([client.get(key), timeoutPromise]);
    return result !== null ? result : defaultValue;
  } catch (err) {
    console.error(`Error fetching key "${key}":`, err.message);
    try {
      // HERE: drop the current connection and open a fresh one
      client.destroy();
      await client.connect();
    } catch (e) { }
    return "timeout";
  }
}
While this solves our immediate issue, it feels like we're throwing the baby out with the bathwater.
This issue feels like it should be handled at the library level, but it may well be expected behavior. In that case, perhaps the documentation could be updated to clarify this gotcha.
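Until there is an answer on that, the reconnect workaround can be packaged a little more defensively. The sketch below is only one way to do it: `withTimeout` and `createResilientGetter` are hypothetical helper names, and the code assumes nothing about the client beyond the `get()`, `destroy()`, and `connect()` methods already used in this report. It clears the timer once the race settles and shares a single reconnect among callers that fail at the same time.

```javascript
// Race a promise against a timer, and always clear the timer afterwards.
function withTimeout(promise, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('Redis operation timeout')), timeoutMs);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `client` is any object exposing get(), destroy() and connect().
function createResilientGetter(client, timeoutMs = 2000) {
  let reconnecting = null; // shared so concurrent failures trigger one reconnect

  function reconnect() {
    const attempt = (async () => {
      client.destroy();
      await client.connect();
    })();
    // Swallow reconnect errors here; the next get() will simply fail again.
    reconnecting = attempt.catch(() => {}).finally(() => {
      reconnecting = null;
    });
    return reconnecting;
  }

  return async function getWithTimeout(key, defaultValue) {
    if (reconnecting) await reconnecting; // wait for a reconnect already in flight
    try {
      const result = await withTimeout(client.get(key), timeoutMs);
      return result !== null ? result : defaultValue;
    } catch (err) {
      console.error(`Error fetching key "${key}":`, err.message);
      // Any failure, timeout or otherwise, resets the connection.
      await (reconnecting ?? reconnect());
      return "timeout";
    }
  };
}
```

With a connected client this would be used as `const getWithTimeout = createResilientGetter(client, 2000)`, followed by `await getWithTimeout('key:10', 'default value 10')`. It still throws away the whole connection on a single slow key, so it does not remove the underlying concern.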
Node.js Version
v22.17.1
Redis Server Version
8.4.0
Node Redis Version
5.10.0
Platform
Linux
Logs
In client.js:
User data: 0: default value 0
User data: 1: default value 1
User data: 2: default value 2
User data: 3: default value 3
User data: 4: default value 4
User data: 5: default value 5
User data: 6: default value 6
User data: 7: default value 7
User data: 8: default value 8
User data: 9: default value 9
Error fetching key "key:10": Redis operation timeout
User data: 10: timeout
Error fetching key "key:11": Redis operation timeout
User data: 11: timeout
Error fetching key "key:12": Redis operation timeout
User data: 12: timeout
Error fetching key "key:13": Redis operation timeout
User data: 13: timeout
Error fetching key "key:14": Redis operation timeout
User data: 14: timeout
Error fetching key "key:15": Redis operation timeout
User data: 15: timeout
Error fetching key "key:16": Redis operation timeout
User data: 16: timeout
Error fetching key "key:17": Redis operation timeout
User data: 17: timeout
Error fetching key "key:18": Redis operation timeout
User data: 18: timeout
Error fetching key "key:19": Redis operation timeout
User data: 19: timeout
In proxy.js:
[ 'CLIENT', 'SETINFO', 'LIB-VER', '5.10.0' ]
[ 'GET', 'key:0' ]
[ 'GET', 'key:1' ]
[ 'GET', 'key:2' ]
[ 'GET', 'key:3' ]
[ 'GET', 'key:4' ]
[ 'GET', 'key:5' ]
[ 'GET', 'key:6' ]
[ 'GET', 'key:7' ]
[ 'GET', 'key:8' ]
[ 'GET', 'key:9' ]
Blocking GET key:10
[ 'GET', 'key:11' ]
[ 'GET', 'key:12' ]
[ 'GET', 'key:13' ]
[ 'GET', 'key:14' ]
[ 'GET', 'key:15' ]
[ 'GET', 'key:16' ]
[ 'GET', 'key:17' ]
[ 'GET', 'key:18' ]
[ 'GET', 'key:19' ]
[ 'QUIT' ]