Open
Labels
enhancement: New feature or request
Description
Motivation
Chunking is currently possible with the Layout, Page, Fixed-size and Paragraph strategies, each with optional overlap. I suggest an additional strategy focused solely on quality: LLM chunking. An LLM is called to produce coherent, relevant chunks, each expressing a single idea, concept or thought.
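To make the proposal concrete, here is a minimal sketch of what LLM chunking could look like. It is not an existing strategy in this repository. The names `llm_chunk`, `split_sentences`, `PROMPT` and the `complete` callable (any function that sends a prompt to an LLM and returns its text reply) are all hypothetical, and the sentence splitter is deliberately naive.

```python
import json
import re
from typing import Callable, List

# Hypothetical prompt: ask the model where new ideas begin.
PROMPT = (
    "Below are numbered sentences from one document. Return a JSON list of the "
    "sentence numbers at which a new idea, concept or thought begins.\n\n{numbered}"
)


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def llm_chunk(text: str, complete: Callable[[str], str]) -> List[str]:
    """Group sentences into chunks at the boundaries proposed by the LLM."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    numbered = "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
    raw = complete(PROMPT.format(numbered=numbered))
    try:
        starts = json.loads(raw)
    except json.JSONDecodeError:
        starts = []  # Unparseable reply: fall back to a single chunk.
    if not isinstance(starts, list):
        starts = []
    # Keep only valid, unique boundaries; the first chunk always starts at 0.
    valid = sorted({i for i in starts if isinstance(i, int) and 0 < i < len(sentences)})
    bounds = [0] + valid + [len(sentences)]
    return [" ".join(sentences[a:b]) for a, b in zip(bounds, bounds[1:])]
```

Passing the LLM call in as a plain callable keeps the sketch independent of any particular client library and makes it easy to test with a stub. A real implementation would also need to handle documents longer than the model's context window, for example by processing them in windows.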
With any chunking strategy, an LLM can also generate additional metadata to improve reranking by Azure AI Search when semantic search is set to True. For each chunk, the LLM would populate:
- "Title" by summarizing the chunk into one short sentence
- "Keyword" by extracting the main keywords of the chunk
Note that the reranking only becomes useful once ticket 2093 is implemented.
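The metadata step described above could be sketched as follows. This is only an illustration under assumptions: `enrich_chunk`, `ENRICH_PROMPT`, the `complete` callable (any function that sends a prompt to an LLM and returns its text reply) and the output keys `content`, `title` and `keywords` are hypothetical names, not part of the current codebase or index schema.

```python
import json
from typing import Callable, Dict

# Hypothetical prompt asking for both metadata fields in one call.
ENRICH_PROMPT = (
    "Summarize the text below in one short sentence and list its main keywords. "
    'Answer as JSON with the keys "title" (string) and "keywords" (list of strings).'
    "\n\nText:\n"
)


def enrich_chunk(chunk: str, complete: Callable[[str], str]) -> Dict[str, object]:
    """Return the chunk plus an LLM-generated title and keyword list."""
    record: Dict[str, object] = {"content": chunk, "title": "", "keywords": []}
    try:
        data = json.loads(complete(ENRICH_PROMPT + chunk))
    except json.JSONDecodeError:
        return record  # Unparseable reply: keep the chunk without metadata.
    if isinstance(data, dict):
        if isinstance(data.get("title"), str):
            record["title"] = data["title"].strip()
        if isinstance(data.get("keywords"), list):
            # Normalize keywords and drop empty entries.
            cleaned = [str(k).strip() for k in data["keywords"]]
            record["keywords"] = [k for k in cleaned if k]
    return record
```

Because the function works on a single chunk, it can run after any of the chunking strategies. The resulting title and keywords would then still have to be mapped to the corresponding fields of the search index for the reranker to use them.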
Tasks
To be filled in by the engineer picking up the issue
- Task 1
- Task 2
- ...