Proposal: Hash Field Expiration (HFE) #3432
PragmaTwice
started this conversation in
General
Replies: 2 comments 2 replies
-
|
Nice proposal.
For the HLEN, I'm wondering if it would be better to have an option like |
Beta Was this translation helpful? Give feedback.
2 replies
-
|
NOTE: The proposal is updated. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Motivation
Redis 7.4 introduced per-field expiration for hashes (
HEXPIRE,HPERSIST,HPTTL, etc.). This proposal describes how to implement HFE in Kvrocks on top of the existing hash encoding in RocksDB.Goals
Design
Metadata
We extend the existing hash metadata with four new fields (
mode,persist,lower,upper):sizemode0= legacy (compatible with old data),1= field expiration mode.persistexpire == 0). Default0.lower0.upper0.The number of TTL candidates is derived from metadata:
ttl_candidatesis not necessarily the number of physical TTL subkeys. It may include expired TTL subkeys that havealready been dropped by the compaction filter without updating metadata. This is intentional:
sizemay over-count untilthe next repair, while
persistremains a reliable count of persistent physical fields because compaction never removespersistent subkeys.
The metadata must maintain these invariants:
0 <= persist <= size.ttl_candidates == 0means there are no known TTL candidates; in that casesize == persistandlower == upper == 0.lowerandupperare conservative hints for TTL candidates. Writes may expand the interval, but they must not tightenit unless they can prove there are no TTL candidates left. Exact bounds are recomputed by
SCAN_AND_REPAIR.Migration rule:
mode=0(extra fields absent, metadata decoded as legacy).hash-encoding-mode.mode=1hash created with only persistent fields hassize == persistandlower == upper == 0.Subkey Encoding
When
mode=1, each subkey value is prefixed with an 8-byte expiration timestamp (ms).0means the field is persistent.HLEN Algorithm
The fast-path branches avoid scanning. The last branch scans all subkeys, removes expired ones, and updates metadata.
Metadata Update Rules
Only write paths and repair paths that observe a physical subkey can update counters for that field. A missing subkey
must not be assumed to be a deleted TTL subkey: a single-field path cannot distinguish a field that never existed from a
TTL ghost removed by compaction.
In the transition table below, "TTL field" means a physical subkey with
expire != 0. It can be live or alreadyexpired. A "TTL ghost" means a TTL candidate that remains in metadata after compaction has removed the physical subkey.
Single-field writes cannot identify or subtract TTL ghosts; only
SCAN_AND_REPAIRcan recompute them.For a touched field:
size++,persist++size++, expandlower/upperpersist--, expandlower/upperpersist++size--,persist--size--lower/upperif the new timestamp falls outside the current hintsExpanding
lower/upperwith a new TTL timestamptsmeans:When a write removes the last TTL candidate (
size == persist), it must resetlower = upper = 0. Otherwise, writesmust leave stale
lower/uppervalues in place unless they need to expand the interval for a newly written TTL.Command semantics still decide whether an expired physical TTL field is treated as existing. Ordinary read commands treat
expired physical TTL fields as logically missing and do not update metadata or subkeys. Write paths such as
HEXPIREandHPERSISTalso treat expired physical TTL fields as logically missing; when they observe one, they may delete thephysical subkey as cleanup by using the "TTL field -> deleted field" transition, but must not turn it into a persistent
field or refresh its TTL.
Overwrite-style writes such as
HSET,HSETNXwhen allowed, andHINCRBY/HINCRBYFLOATmay replace an expiredphysical TTL field with a new persistent value. In that case they use the "TTL field -> persistent field" transition.
For
HEXPIRE, setting an expiration on a live persistent field uses the "Persistent field -> TTL field" transition.Updating a live TTL field keeps
sizeandpersistunchanged and only expands the hints if needed. A zero TTL deletesthe live field immediately and uses the corresponding delete transition.
For
HPERSIST, a live TTL field uses the "TTL field -> persistent field" transition. A persistent field is unchanged.For
HDEL, deleting a persistent field decrements bothsizeandpersist; deleting a TTL field decrements onlysize.Expired Field Cleanup
Passive repairing — foreground commands (e.g.
HLEN) use the fast-path conditions above and triggerSCAN_AND_REPAIRwhen the exact live field count cannot be proven from metadata alone.Compaction filter — drops expired TTL subkeys by checking the 8-byte expire prefix, but does NOT update metadata.
This avoids data races and keeps compaction fast. After compaction,
sizemay still include removed TTL candidates.persistremains valid because persistent subkeys are not removed by the compaction filter. Metadata is corrected by thenext passive repair.
Active repairing (optional, may not be in the first version) — a background thread with an expiration index in a separate CF:
Scans entries up to
nowand deletes corresponding fields. Trade-off:HEXPIRE/HSETEXneed an extra write to maintain this index.The encoding here is just a rough draft; we still need to consider slots and namespaces for this CF.
New Config Options
Example of HLEN fast-path & repair logic
To illustrate why actual slow scans are infrequent, consider a hash with two TTL fields:
field1(expire=5) andfield2(expire=10).Initial metadata:
lower=5,upper=10,size=2,persist=0.now=0or2:HLENdirectly returns2(Fast path:now < lower).now=6: TriggersSCAN_AND_REPAIR. It removesfield1, updates metadata tolower=10, upper=10, size=1, persist=0, and returns1.now=7or9:HLENdirectly returns1(Fast path:now < lower).now=11: Directly deletes the key and returns0(Fast path:now > upperandpersist == 0).As shown, thanks to the lazy bounds (
lowerandupper),SCAN_AND_REPAIRis only triggered when absolutely necessary, minimizing full scans in most scenarios.Example of compaction-safe metadata
Consider a hash with one expired TTL field:
If the compaction filter removes
field1, metadata is unchanged:If a later
HSET key field1 valuesees the field as missing, it must use the "Missing field -> persistent field"transition:
sizenow over-counts by one TTL candidate, butpersist=1prevents the HLEN fast path from deleting the whole key whennow > upper. The nextSCAN_AND_REPAIRscans physical subkeys and rewrites metadata to:HLEN spec
Beta Was this translation helpful? Give feedback.
All reactions