From 865e5ee78d29982159842fb3190e6c4ab3f6a017 Mon Sep 17 00:00:00 2001
From: Samiul Monir <samiul.monir@elastic.co>
Date: Mon, 28 Apr 2025 12:44:22 -0400
Subject: [PATCH 1/3] adding default value for oversampling in the
 documentation

---
 solutions/search/vector/knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md
index ef7b5ed46..9c3709cd5 100644
--- a/solutions/search/vector/knn.md
+++ b/solutions/search/vector/knn.md
@@ -901,7 +901,7 @@ Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elas
 
 When using [quantized vectors](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:
 
-* **Oversampling**: Retrieve more candidates per shard.
+* **Oversampling**: Retrieve more candidates per shard. Starting in `9.1.0`, the default value for oversample is `3.0`.
 * **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates.
 
 As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:

From 251291c6c5fd8b78cee0ba66a66c26db440406dc Mon Sep 17 00:00:00 2001
From: Samiul Monir <samiul.monir@elastic.co>
Date: Tue, 29 Apr 2025 09:50:09 -0400
Subject: [PATCH 2/3] updating the documentation for bbq and oversample

---
 solutions/search/vector/knn.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md
index 9c3709cd5..3490962b6 100644
--- a/solutions/search/vector/knn.md
+++ b/solutions/search/vector/knn.md
@@ -901,7 +901,7 @@ Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elas
 
 When using [quantized vectors](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:
 
-* **Oversampling**: Retrieve more candidates per shard. Starting in `9.1.0`, the default value for oversample is `3.0`.
+* **Oversampling**: Retrieve more candidates per shard. The default is `3.0` in `bbq`.
 * **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates.
 
 As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:
@@ -913,7 +913,7 @@ All forms of quantization will result in some accuracy loss and as the quantizat
 
 * `int8` requires minimal if any rescoring
 * `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss.
-* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required.
+* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that 3x-5x oversampling is generally sufficient, but for fewer dimensions or vectors that do not quantize well, higher oversampling may be required. As noted above, `bbq` defaults to an oversampling factor of `3.0`.
 
 You can use the `rescore_vector` [preview] option to automatically perform reranking. When a rescore `oversample` parameter is specified, the approximate kNN search will:
 

From 85feae8731e81e47b5ddab252e9950846e2ce7f1 Mon Sep 17 00:00:00 2001
From: Samiul Monir <samiul.monir@elastic.co>
Date: Mon, 5 May 2025 16:06:36 -0400
Subject: [PATCH 3/3] Update documentation wording and add version

---
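Reviewer note: a minimal sketch of what "specify an oversample value at query time" could look like for a non-`bbq` field, written as a Python helper that only builds the request body. The field name `my_int4_vector`, the query vector, and the helper `knn_search_body` are hypothetical and chosen for illustration; the `rescore_vector` / `oversample` keys are the ones this page describes.

```python
import json


def knn_search_body(field, query_vector, k, num_candidates, oversample=None):
    """Build a kNN search request body.

    `rescore_vector` is included only when an oversample factor is passed,
    which mirrors the "no default outside bbq" wording in this patch.
    """
    knn = {
        "field": field,  # hypothetical field name supplied by the caller
        "query_vector": query_vector,
        "k": k,
        "num_candidates": num_candidates,
    }
    if oversample is not None:
        knn["rescore_vector"] = {"oversample": oversample}
    return {"knn": knn}


# Illustrative values only: an int4-quantized field with 2x oversampling.
body = knn_search_body(
    "my_int4_vector", [0.1, 0.2, 0.3], k=10, num_candidates=100, oversample=2.0
)
print(json.dumps(body, indent=2))
```

Leaving `oversample` unset omits `rescore_vector` from the body entirely, so this sketch never sends an implicit factor for methods that have no default.
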
 solutions/search/vector/knn.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/solutions/search/vector/knn.md b/solutions/search/vector/knn.md
index 3490962b6..1072439f8 100644
--- a/solutions/search/vector/knn.md
+++ b/solutions/search/vector/knn.md
@@ -901,7 +901,7 @@ Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elas
 
 When using [quantized vectors](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:
 
-* **Oversampling**: Retrieve more candidates per shard. The default is `3.0` in `bbq`.
+* **Oversampling**: Retrieve more candidates per shard. Starting in `9.1.0`, the default oversampling factor is `3.0`, but only for the `bbq` quantization method. Other quantization methods have no default, so you must specify an `oversample` value explicitly, either in the field mapping or at query time.
 * **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates.
 
 As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines: