spring-projects · mp911de · Apr 10, 2025 · Apr 10, 2025 · Apr 28, 2025 · Apr 29, 2025
diff --git a/pom.xml b/pom.xml
@@ -5,7 +5,7 @@
 
 	<groupId>org.springframework.data</groupId>
 	<artifactId>spring-data-commons</artifactId>
-	<version>4.0.0-SNAPSHOT</version>
+	<version>4.0.0-SEARCH-RESULT-SNAPSHOT</version>
 
 	<name>Spring Data Core</name>
 	<description>Core Spring concepts underpinning every Spring Data module.</description>

diff --git a/src/main/antora/modules/ROOT/nav.adoc b/src/main/antora/modules/ROOT/nav.adoc
@@ -7,6 +7,7 @@
 ** xref:repositories/query-methods.adoc[]
 ** xref:repositories/definition.adoc[]
 ** xref:repositories/query-methods-details.adoc[]
+** xref:repositories/vector-search.adoc[]
 ** xref:repositories/create-instances.adoc[]
 ** xref:repositories/custom-implementations.adoc[]
 ** xref:repositories/core-domain-events.adoc[]

diff --git a/src/main/antora/modules/ROOT/pages/repositories/vector-search.adoc b/src/main/antora/modules/ROOT/pages/repositories/vector-search.adoc
@@ -0,0 +1,167 @@
+[[vector-search]]
+= Vector Search
+
+With the rise of Generative AI, vector databases have gained strong traction.
+These databases enable efficient storage and querying of high-dimensional vectors, making them well-suited for tasks such as semantic search, recommendation systems, and natural language understanding.
+
+Vector search is a technique that retrieves semantically similar data by comparing vector representations (also known as embeddings) rather than relying on traditional exact-match queries.
+This approach enables intelligent, context-aware applications that go beyond keyword-based retrieval.
+
+In the context of Spring Data, vector search opens new possibilities for building intelligent, context-aware applications, particularly in domains like natural language processing, recommendation systems, and generative AI.
+By modelling vector-based querying using familiar repository abstractions, Spring Data allows developers to integrate vector-capable databases and similarity-based search with the simplicity and consistency of the Spring Data programming model.
+
+ifdef::vector-search-intro-include[]
+include::{vector-search-intro-include}[]
+endif::[]
+
+[[vector-search.model]]
+== Vector Model
+
+To support vector search in a type-safe and idiomatic way, Spring Data introduces the following core abstractions:
+
+* <<vector-search.model.vector,`Vector`>>
+* <<vector-search.model.search-result,`SearchResults<T>` and `SearchResult<T>`>>
+* <<vector-search.model.scoring,`Score`, `Similarity` and Scoring Functions>>
+
+[[vector-search.model.vector]]
+=== `Vector`
+
+The `Vector` type represents an n-dimensional numerical embedding, typically produced by embedding models.
+In Spring Data, it is defined as a lightweight wrapper around an array of floating-point numbers, ensuring immutability and consistency.
+This type can be used as an input for search queries or as a property on a domain entity to store the associated vector representation.
+
+====
+[source,java]
+----
+Vector vector = Vector.of(0.23f, 0.11f, 0.77f);
+----
+====
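
To illustrate what "lightweight immutable wrapper" can mean in practice, here is a minimal plain-Java sketch. `VectorSketch` is a hypothetical stand-in written for this note, not Spring Data's `Vector`; it only assumes that immutability is achieved by copying the array on the way in and on the way out.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an immutable vector wrapper; NOT Spring Data's Vector. */
final class VectorSketch {

	private final float[] values;

	private VectorSketch(float[] values) {
		this.values = values;
	}

	/** Copies the input so later changes to the caller's array do not leak in. */
	static VectorSketch of(float... values) {
		return new VectorSketch(values.clone());
	}

	int size() {
		return values.length;
	}

	/** Returns a copy so callers cannot mutate the internal state. */
	float[] toFloatArray() {
		return values.clone();
	}

	@Override
	public boolean equals(Object o) {
		return o instanceof VectorSketch other && Arrays.equals(values, other.values);
	}

	@Override
	public int hashCode() {
		return Arrays.hashCode(values);
	}

	public static void main(String[] args) {
		float[] source = { 0.23f, 0.11f, 0.77f };
		VectorSketch vector = VectorSketch.of(source);
		source[0] = 9f; // does not affect the wrapper
		System.out.println(Arrays.toString(vector.toFloatArray()));
	}
}
```

Copying on both sides trades a small allocation cost for safety; a real implementation may offer a non-copying variant for performance-sensitive paths.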
+
+Using `Vector` in your domain model removes the need to work with raw arrays or lists of numbers, providing a more type-safe and expressive way to handle vector data.
+This abstraction also allows for easy integration with various vector databases and libraries.
+It also allows for implementing vendor-specific optimizations such as binary or quantized vectors that do not map to a standard floating-point (`float` and `double` as defined by https://en.wikipedia.org/wiki/IEEE_754[IEEE 754]) representation.
+A domain object can have a vector property, which can be used for similarity searches.
+Consider the following example:
+
+ifdef::vector-search-model-include[]
+include::{vector-search-model-include}[]
+endif::[]
+
+NOTE: Associating a vector with a domain object results in the vector being loaded and stored as part of the entity lifecycle, which may introduce additional overhead on retrieval and persistence operations.
+
+[[vector-search.model.search-result]]
+=== Search Results
+
+The `SearchResult<T>` type encapsulates the results of a vector similarity query.
+It includes both the matched domain object and a relevance score that indicates how closely it matches the query vector.
+This abstraction provides a structured way to handle result ranking and enables developers to easily work with both the data and its contextual relevance.
+
+ifdef::vector-search-repository-include[]
+include::{vector-search-repository-include}[]
+endif::[]
+
+In this example, the `searchByCountryAndEmbeddingNear` method returns a `SearchResults<Comment>` object, which contains a list of `SearchResult<Comment>` instances.
+Each result includes the matched `Comment` entity and its relevance score.
+
+The relevance score is a numerical value that indicates how closely the matched vector aligns with the query vector.
+Depending on whether a score represents distance or similarity, a higher score can mean either a more distant match or a closer one.
+
+The scoring function used to calculate this score can vary based on the underlying database, index or input parameters.
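
Because the meaning of "higher" depends on the kind of score, client code that ranks results itself has to pick the sort direction accordingly. The following plain-Java sketch shows the idea; `RankingSketch`, `Hit`, and the `scoreIsDistance` flag are hypothetical names introduced for this note and are not part of Spring Data.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: ordering hits so that the closest match comes first. */
final class RankingSketch {

	/** Simplified stand-in for a search result: an identifier plus a raw score. */
	record Hit(String id, double score) {}

	private RankingSketch() {}

	/**
	 * Distances rank ascending (smaller is closer);
	 * similarities rank descending (larger is closer).
	 */
	static List<Hit> closestFirst(List<Hit> hits, boolean scoreIsDistance) {
		Comparator<Hit> byScore = Comparator.comparingDouble(Hit::score);
		return hits.stream()
				.sorted(scoreIsDistance ? byScore : byScore.reversed())
				.toList();
	}

	public static void main(String[] args) {
		List<Hit> hits = List.of(new Hit("a", 0.2), new Hit("b", 0.9), new Hit("c", 0.5));
		System.out.println(closestFirst(hits, true));
		System.out.println(closestFirst(hits, false));
	}
}
```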
+
+[[vector-search.model.scoring]]
+=== Score, Similarity, and Scoring Functions
+
+The `Score` type holds a numerical value indicating the relevance of a search result.
+It can be used to rank results based on their similarity to the query vector.
+The `Score` type is typically a floating-point number, and its interpretation (higher is better or lower is better) depends on the specific similarity function used.
+Scores are a by-product of vector search and are not required for a successful search operation.
+Score values are not part of a domain model and are therefore best represented as out-of-band data.
+
+Generally, a `Score` is computed by a `ScoringFunction`.
+The actual scoring function used to calculate this score depends on the underlying database and can be obtained from a search index or input parameters.
+
+Spring Data declares constants for commonly used functions such as:
+
+Euclidean Distance:: Calculates the straight-line distance in n-dimensional space involving the square root of the sum of squared differences.
+Cosine Similarity:: Measures the angle between two vectors by first calculating the dot product and then normalizing the result by dividing it by the product of their lengths.
+Dot Product:: Computes the sum of element-wise multiplications.
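
The three functions listed above can be written out in a few lines of plain Java. This is only an illustrative sketch of the math as described here; `ScoringSketch` and its method names are made up for this note, and databases typically compute these functions natively (possibly with different normalization).

```java
/** Plain-Java sketch of the three scoring functions; NOT Spring Data API. */
final class ScoringSketch {

	private ScoringSketch() {}

	/** Dot product: sum of element-wise multiplications. */
	static double dotProduct(float[] a, float[] b) {
		requireSameLength(a, b);
		double sum = 0;
		for (int i = 0; i < a.length; i++) {
			sum += (double) a[i] * b[i];
		}
		return sum;
	}

	/** Euclidean distance: square root of the sum of squared differences. */
	static double euclideanDistance(float[] a, float[] b) {
		requireSameLength(a, b);
		double sum = 0;
		for (int i = 0; i < a.length; i++) {
			double diff = (double) a[i] - b[i];
			sum += diff * diff;
		}
		return Math.sqrt(sum);
	}

	/** Cosine similarity: dot product divided by the product of both lengths (NaN for zero vectors). */
	static double cosineSimilarity(float[] a, float[] b) {
		double lengthA = Math.sqrt(dotProduct(a, a));
		double lengthB = Math.sqrt(dotProduct(b, b));
		return dotProduct(a, b) / (lengthA * lengthB);
	}

	private static void requireSameLength(float[] a, float[] b) {
		if (a.length != b.length) {
			throw new IllegalArgumentException("Vectors must have the same number of dimensions");
		}
	}

	public static void main(String[] args) {
		float[] a = { 1f, 2f, 3f };
		float[] b = { 4f, 5f, 6f };
		System.out.println(dotProduct(a, b));
		System.out.println(euclideanDistance(a, b));
		System.out.println(cosineSimilarity(a, b));
	}
}
```

Note that Euclidean distance is a distance (smaller means closer), whereas cosine similarity and dot product grow as vectors become more alike.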
+
+The choice of similarity function can impact both the performance and semantics of the search and is often determined by the underlying database or index being used.
+Spring Data adapts to the database's native scoring function capabilities and to whether the score can be used to limit results.
+
+ifdef::vector-search-scoring-include[]
+include::{vector-search-scoring-include}[]
+endif::[]
+
+[[vector-search.methods]]
+== Vector Search Methods
+
+Vector search methods are defined in repositories using the same conventions as standard Spring Data query methods.
+These methods return `SearchResults<T>` and require a `Vector` parameter to define the query vector.
+The actual implementation depends on the internals of the underlying data store and its capabilities around vector search.
+
+NOTE: If you are new to Spring Data repositories, make sure to familiarize yourself with the xref:repositories/core-concepts.adoc[basics of repository definitions and query methods].
+
+Generally, you have the choice of declaring a search method using two approaches:
+
+* Query Derivation
+* Declaring a String-based Query
+
+Vector Search methods must declare a `Vector` parameter to define the query vector.
+
+[[vector-search.method.derivation]]
+=== Derived Search Methods
+
+A derived search method uses the name of the method to derive the query.
+Vector Search supports the following keywords to run a Vector search when declaring a search method:
+
+.Query predicate keywords
+[options="header",cols="1,3"]
+|===============
+|Logical keyword|Keyword expressions
+|`NEAR`|`Near`, `IsNear`
+|`WITHIN`|`Within`, `IsWithin`
+|===============
+
+ifdef::vector-search-method-derived-include[]
+include::{vector-search-method-derived-include}[]
+endif::[]
+
+Derived search methods are typically easier to read and maintain, as they rely on the method name to express the query intent.
+However, a derived search method requires declaring either a `Score`, `Range<Score>`, or `ScoringFunction` as the second argument to the `Near`/`Within` keyword to limit search results by their score.
+
+[[vector-search.method.string]]
+=== Annotated Search Methods
+
+Annotated methods provide full control over the query semantics and parameters.
+Unlike derived methods, they do not rely on method name conventions.
+
+ifdef::vector-search-method-annotated-include[]
+include::{vector-search-method-annotated-include}[]
+endif::[]
+
+With more control over the actual query, Spring Data can make fewer assumptions about the query and its parameters.
+For example, `Similarity` normalization uses the native score function within the query to normalize the given similarity into a score predicate value and vice versa.
+If an annotated query does not define, for example, the score, then the score value in the returned `SearchResult<T>` will be zero.
+
+[[vector-search.method.sorting]]
+=== Sorting
+
+By default, search results are ordered according to their score.
+You can override sorting by using the `Sort` parameter:
+
+.Using `Sort` in Repository Search Methods
+====
+[source,java]
+----
+interface CommentRepository extends Repository<Comment, String> {
+
+  SearchResults<Comment> searchByEmbeddingNearOrderByCountry(Vector vector, Score score);
+
+  SearchResults<Comment> searchByEmbeddingWithin(Vector vector, Score score, Sort sort);
+}
+----
+====
+
+Please note that custom sorting does not allow expressing the score as a sorting criterion.
+You can only refer to domain properties.
diff --git a/src/main/java/org/springframework/data/domain/Page.java b/src/main/java/org/springframework/data/domain/Page.java
@@ -69,4 +69,5 @@ static <T> Page<T> empty(Pageable pageable) {
 	 */
 	@Override
 	<U> Page<U> map(Function<? super T, ? extends U> converter);
+
 }
diff --git a/src/main/java/org/springframework/data/domain/Range.java b/src/main/java/org/springframework/data/domain/Range.java
@@ -223,7 +223,7 @@ public boolean contains(T value, Comparator<T> comparator) {
 	/**
 	 * Apply a mapping {@link Function} to the lower and upper boundary values.
 	 *
-	 * @param mapper must not be {@literal null}. If the mapper returns {@code null}, then the corresponding boundary
+	 * @param mapper must not be {@literal null}. If the mapper returns {@literal null}, then the corresponding boundary
 	 *          value represents an {@link Bound#unbounded()} boundary.
 	 * @return a new {@link Range} after applying the value to the mapper.
 	 * @param <R> target type of the mapping function.
@@ -430,7 +430,7 @@ public boolean isInclusive() {
 		/**
 		 * Apply a mapping {@link Function} to the boundary value.
 		 *
-		 * @param mapper must not be {@literal null}. If the mapper returns {@code null}, then the boundary value
+		 * @param mapper must not be {@literal null}. If the mapper returns {@literal null}, then the boundary value
 		 *          corresponds with {@link Bound#unbounded()}.
 		 * @return a new {@link Bound} after applying the value to the mapper.
 		 * @param <R>

diff --git a/src/main/java/org/springframework/data/domain/Score.java b/src/main/java/org/springframework/data/domain/Score.java
@@ -0,0 +1,118 @@
+/*
+ * Copyright 2025 the original author or authors.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.springframework.data.domain;
+
+import java.io.Serializable;
+
+import org.springframework.util.ObjectUtils;
+
+/**
+ * Value object representing a search result score computed via a {@link ScoringFunction}.
+ * <p>
+ * Encapsulates the numeric score and the scoring function used to derive it. Scores are primarily used to rank search
+ * results. Depending on the used {@link ScoringFunction} higher scores can indicate either a higher distance or a
+ * higher similarity. Use the {@link Similarity} class to indicate a normalized score that effectively represents
+ * similarity.
+ * <p>
+ * Instances of this class are immutable and suitable for use in comparison, sorting, and range operations.
+ *
+ * @author Mark Paluch
+ * @since 4.0
+ * @see Similarity
+ */
+public sealed class Score implements Serializable permits Similarity {
+
+	private final double value;
+	private final ScoringFunction function;
+
+	Score(double value, ScoringFunction function) {
+		this.value = value;
+		this.function = function;
+	}
+
+	/**
+	 * Creates a new {@link Score} from a plain {@code score} value using {@link ScoringFunction#unspecified()}.
+	 *
+	 * @param score the score value without a specific {@link ScoringFunction}.
+	 * @return the new {@link Score}.
+	 */
+	public static Score of(double score) {
+		return of(score, ScoringFunction.unspecified());
+	}
+
+	/**
+	 * Creates a new {@link Score} from a {@code score} value using the given {@link ScoringFunction}.
+	 *
+	 * @param score the score value.
+	 * @param function the scoring function that has computed the {@code score}.
+	 * @return the new {@link Score}.
+	 */
+	public static Score of(double score, ScoringFunction function) {
+		return new Score(score, function);
+	}
+
+	/**
+	 * Creates a {@link Range} from the given minimum and maximum {@code Score} values.
+	 *
+	 * @param min the lower score value, must not be {@literal null}.
+	 * @param max the upper score value, must not be {@literal null}.
+	 * @return a {@link Range} over {@link Score} bounds.
+	 */
+	public static Range<Score> between(Score min, Score max) {
+		return Range.from(Range.Bound.inclusive(min)).to(Range.Bound.inclusive(max));
+	}
+
+	/**
+	 * Returns the raw numeric value of the score.
+	 *
+	 * @return the score value.
+	 */
+	public double getValue() {
+		return value;
+	}
+
+	/**
+	 * Returns the {@link ScoringFunction} that was used to compute this score.
+	 *
+	 * @return the associated scoring function.
+	 */
+	public ScoringFunction getFunction() {
+		return function;
+	}
+
+	@Override
+	public boolean equals(Object o) {
+		if (!(o instanceof Score other)) {
+			return false;
+		}
+		if (Double.compare(value, other.value) != 0) {
+			return false;
+		}
+		return ObjectUtils.nullSafeEquals(function, other.function);
+	}
+
+	@Override
+	public int hashCode() {
+		return ObjectUtils.nullSafeHash(value, function);
+	}
+
+	@Override
+	public String toString() {
+		return function instanceof UnspecifiedScoringFunction ? Double.toString(value)
+				: "%s (%s)".formatted(Double.toString(value), function.getName());
+	}
+
+}
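
`Score.between(…)` builds a range with inclusive lower and upper bounds. The following self-contained sketch shows what that means when the range is used to filter scores. `SimpleScore` and `InclusiveRange` are hypothetical stand-ins written for this note; they are not the `Score` and `Range` types from this change and ignore the scoring function when comparing.

```java
/** Hypothetical stand-ins illustrating an inclusive score range; NOT the Spring Data types. */
final class ScoreRangeSketch {

	/** Simplified score: a raw value plus the name of the function that produced it. */
	record SimpleScore(double value, String function) {}

	/** Range whose lower and upper bounds both count as part of the range. */
	record InclusiveRange(SimpleScore min, SimpleScore max) {

		/** Compares raw values only; the function name is not taken into account. */
		boolean contains(SimpleScore score) {
			return score.value() >= min.value() && score.value() <= max.value();
		}
	}

	private ScoreRangeSketch() {}

	static InclusiveRange between(SimpleScore min, SimpleScore max) {
		return new InclusiveRange(min, max);
	}

	public static void main(String[] args) {
		InclusiveRange range = between(new SimpleScore(0.5, "cosine"), new SimpleScore(0.9, "cosine"));
		System.out.println(range.contains(new SimpleScore(0.5, "cosine"))); // on the lower bound
		System.out.println(range.contains(new SimpleScore(0.95, "cosine"))); // above the upper bound
	}
}
```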