 The configuration parameters @confMergePolicy@, @confSizeRatio@, and @confWriteBufferAlloc@ affect how the table organises its data.
-To understand what effect these parameters have, one must have a basic understand of how an LSM-tree stores its data.
+To understand what effect these parameters have, one must have a basic understanding of how an LSM-tree stores its data.
 The physical entries in an LSM-tree are key–operation pairs, which pair a key with an operation such as an @Insert@ with a value or a @Delete@.
 These key–operation pairs are organised into /runs/, which are sequences of key–operation pairs sorted by their key.
 Runs are organised into /levels/, which are unordered sequences or runs.
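The structure described in this hunk (key–operation pairs, runs sorted by key, levels as unordered sequences of runs) can be sketched roughly as follows. This is an illustrative Python model only, not the library's representation: the names `Run`, `make_run`, and `lookup_in_run` are hypothetical, and the sketch assumes that each key occurs at most once within a run.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Insert:
    """An operation that associates a value with a key."""
    value: bytes


@dataclass(frozen=True)
class Delete:
    """An operation that marks a key as deleted."""


Operation = Union[Insert, Delete]


@dataclass(frozen=True)
class Run:
    """Key-operation pairs sorted by key, stored as two parallel tuples."""
    keys: Tuple[bytes, ...]
    ops: Tuple[Operation, ...]


def make_run(pairs: List[Tuple[bytes, Operation]]) -> Run:
    # Sort by key only; assumes keys are unique within one run (hypothetical).
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return Run(tuple(k for k, _ in ordered), tuple(op for _, op in ordered))


def lookup_in_run(run: Run, key: bytes) -> Optional[Operation]:
    # Binary search is possible because the keys of a run are sorted.
    i = bisect_left(run.keys, key)
    if i < len(run.keys) and run.keys[i] == key:
        return run.ops[i]
    return None


# A level is an unordered collection of runs; a plain list stands in for it.
Level = List[Run]

level: Level = [
    make_run([(b"b", Insert(b"2")), (b"a", Insert(b"1"))]),
    make_run([(b"a", Delete())]),
]
```

Because a level imposes no order between its runs, the same key may appear in several runs of one level, as `b"a"` does here.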
@@ -331,10 +331,10 @@ description:
 which balances the performance of lookups against the in-memory size of the table.
 
 Tables maintain a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) in memory for each run on disk.
-These Bloom filter are probablilistic datastructure that are used to track which keys are present in their corresponding run.
+These Bloom filters are probablilistic datastructure that are used to track which keys are present in their corresponding run.
 Querying a Bloom filter returns either \"maybe\" meaning the key is possibly in the run or \"no\" meaning the key is definitely not in the run.
 When a query returns \"maybe\" while the key is /not/ in the run, this is referred to as a /false positive/.
-While the database executes a lookup operation, any Bloom filter query that returns a false positive causes the database to unnecessarily read a run from disk.
+While the database executes a lookup operation, any Bloom filter query that returns a false positive causes the database to unnecessarily read a page from disk.
 The probabliliy of these spurious reads follow a [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution) \(\text{Binomial}(r,\text{FPR})\)
 where \(r\) refers to the number of runs and \(\text{FPR}\) refers to the false-positive rate of the Bloom filters.
 Hence, the expected number of spurious reads for each lookup operation is \(r\cdot\text{FPR}\).
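The expectation \(r\cdot\text{FPR}\) can be checked numerically against the binomial distribution it is derived from. The sketch below is a minimal illustration under two assumptions that the text implies but does not spell out: the looked-up key is absent from all \(r\) runs, and the Bloom filters produce false positives independently. The function names and the sample values of `r` and `fpr` are hypothetical.

```python
from math import comb
from typing import List


def spurious_read_pmf(r: int, fpr: float) -> List[float]:
    """P(exactly k spurious reads) for k = 0..r under Binomial(r, fpr)."""
    return [comb(r, k) * fpr**k * (1 - fpr) ** (r - k) for k in range(r + 1)]


def expected_spurious_reads(r: int, fpr: float) -> float:
    """Mean of Binomial(r, fpr), i.e. r * fpr."""
    return r * fpr


# Hypothetical example values: 10 runs, 1% false-positive rate per filter.
r, fpr = 10, 0.01
pmf = spurious_read_pmf(r, fpr)

# The mean computed from the full distribution should match r * fpr.
mean_from_pmf = sum(k * p for k, p in enumerate(pmf))
expected = expected_spurious_reads(r, fpr)
```

With these sample values the expectation is about one spurious page read per ten lookups of absent keys.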
@@ -401,7 +401,7 @@ description:
 Ordinary indexes are designed for any use case.
 
 Ordinary indexes store one serialised key per page of memory.
-The total in-memory size of all indexes is \(K \cdot \frac{n}{P}\) bits,
+The average total in-memory size of all indexes is \(K \cdot \frac{n}{P}\) bits,
 where \(K\) refers to the average size of a serialised key in bits.
 
 [@CompactIndex@]
@@ -410,7 +410,7 @@ description:
 Compact indexes store the 64 most significant bits of the minimum serialised key of each page of memory.
 This requires that serialised keys are /at least/ 64 bits in size.
 Compact indexes store 1 additional bit per page of memory to resolve collisions, 1 additional bit per page of memory to mark entries that are larger than one page, and a negligible amount of memory for tie breakers.
-The total in-memory size of all indexes is \(66 \cdot \frac{n}{P}\) bits.
+The average total in-memory size of all indexes is \(66 \cdot \frac{n}{P}\) bits.
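The two size formulas touched by these hunks, \(K \cdot \frac{n}{P}\) bits for ordinary indexes and \(66 \cdot \frac{n}{P}\) bits for compact indexes, can be compared with a small calculation. The symbols \(n\) and \(P\) are defined outside the lines shown here; the sketch assumes \(n\) is the number of entries in the table and \(P\) the average number of entries per page, so that \(\frac{n}{P}\) is the number of pages. The function names and the sample values are hypothetical.

```python
def ordinary_index_bits(key_bits: float, n: int, entries_per_page: float) -> float:
    """K * n / P: one serialised key of average size K bits per page."""
    return key_bits * n / entries_per_page


def compact_index_bits(n: int, entries_per_page: float) -> float:
    """66 * n / P: 64 key bits plus 2 flag bits per page."""
    return 66 * n / entries_per_page


# Hypothetical example: one million entries, 50 entries per page,
# and serialised keys that average 256 bits.
n, P, K = 1_000_000, 50, 256

ordinary = ordinary_index_bits(K, n, P)   # 256 * 20_000 pages
compact = compact_index_bits(n, P)        # 66 * 20_000 pages
```

Under these assumptions the compact index is smaller whenever the average serialised key exceeds 66 bits, and the ratio between the two sizes is simply \(K / 66\).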
 
 ==== Fine-tuning: Disk Cache Policy #fine_tuning_disk_cache_policy#