Implement a useful re-ranking strategy through azureSearchUseSemanticSearch #2093

@bobome-ola

Motivation

The codebase currently exposes the azureSearchUseSemanticSearch parameter, which activates semantic ranking in Azure AI Search.

As currently implemented, the parameter has no effect on what reaches the LLM:

  • When azureSearchUseSemanticSearch is "False", the maximum number of chunks retrieved is based on azureSearchTopK. The retrieved chunks are then sent to the LLM.
  • When azureSearchUseSemanticSearch is "True", the maximum number of chunks retrieved is still based on azureSearchTopK. A rerank is then applied, but because every retrieved chunk is still sent to the LLM, the rerank changes only the order of the chunks, not which ones are selected, which defeats the purpose of reranking.

A more useful flow when azureSearchUseSemanticSearch is "True" would be:

  • Retrieve up to 50 chunks from an initial RRF-ranked search result (this requires hybrid search).
  • Rerank the chunks using "title", "keyword" and "content". Here, "title" is the title of the document, "keyword" is the keyword(s) of the document, and "content" is the chunk itself.
  • Keep at most azureSearchTopK chunks, selecting those with the highest search.rerankerScore (scores range from 4 down to 0, where a higher score indicates higher relevance), and send only those to the LLM.
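The selection step above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function name `select_top_chunks`, the constant `RERANK_CANDIDATES`, and the shape of the result dictionaries are all assumptions. It also assumes the hybrid query and semantic rerank have already run and that each result carries its score under an `@search.rerankerScore` key.

```python
from typing import Any, Iterable

# Candidate pool size proposed in this issue (hypothetical constant name).
RERANK_CANDIDATES = 50


def select_top_chunks(
    results: Iterable[dict[str, Any]],
    top_k: int,
    score_key: str = "@search.rerankerScore",  # assumed key name in each result
) -> list[dict[str, Any]]:
    """Keep the top_k results with the highest reranker score."""
    # Only the first RERANK_CANDIDATES results are considered for reranking.
    candidates = list(results)[:RERANK_CANDIDATES]
    # Assumption: results that carry no reranker score are dropped.
    scored = [r for r in candidates if r.get(score_key) is not None]
    # Higher score means higher relevance, so sort in descending order.
    scored.sort(key=lambda r: r[score_key], reverse=True)
    return scored[:top_k]


# Usage with made-up results; top_k plays the role of azureSearchTopK.
hits = [
    {"id": "a", "@search.rerankerScore": 1.2},
    {"id": "b", "@search.rerankerScore": 3.4},
    {"id": "c", "@search.rerankerScore": 2.8},
]
print([h["id"] for h in select_top_chunks(hits, top_k=2)])
```

The point of the sketch is that the candidate pool (50) and the number of chunks sent to the LLM (azureSearchTopK) are two separate limits, which is what makes the rerank affect the final selection.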

Tasks

To be filled in by the engineer picking up the issue

  • Task 1
  • Task 2
  • ...

Labels: enhancement (New feature or request)