Conversation
```python
"""Trigger manual context summarization via a pipeline frame."""
logger.info("Tool called: summarize_conversation")

summarization_config = LLMContextSummarizationConfig()
```
Having the user instantiate an LLMContextSummarizationConfig with no arguments just to then grab default values off of it feels a little awkward.
I wonder if LLMContextSummaryRequestFrame could internally create a LLMContextSummarizationConfig as a defaults provider in the case the user hasn't provided all fields?
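The defaults-provider idea could look roughly like this (a sketch with simplified stand-in classes and illustrative defaults, not the actual Pipecat definitions):

```python
import uuid
from dataclasses import dataclass
from typing import Optional

# Simplified stand-ins for illustration; the real Pipecat classes differ.
@dataclass
class LLMContextSummarizationConfig:
    max_context_tokens: int = 8000
    summarization_prompt: str = "Summarize the conversation so far."

@dataclass
class LLMContextSummaryRequestFrame:
    request_id: str
    config: Optional[LLMContextSummarizationConfig] = None

    def __post_init__(self):
        # Defaults-provider pattern: if the caller omits the config, the
        # frame fills in defaults itself, so users never instantiate the
        # config class just to read default values off of it.
        if self.config is None:
            self.config = LLMContextSummarizationConfig()

frame = LLMContextSummaryRequestFrame(request_id=str(uuid.uuid4()))
```

With this shape, the calling code that previously built an empty config by hand can simply omit the `config` argument.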
```python
summarization_config = LLMContextSummarizationConfig()
request_frame = LLMContextSummaryRequestFrame(
    request_id=str(uuid.uuid4()),
    context=params.context,
```
Our public frame-based APIs that affect the context (like LLMMessagesAppendFrame, etc) don't generally require the user to have to pass the context. For one, it's already clear which context they intend to update. Secondly, since it's a frame-based API, the effects occur at the time the frame is processed; an API that takes a context might imply that it will act on a "snapshot" of the context given to it at that time, which isn't how we want users thinking about it.
I understand why you chose to make LLMContextSummaryRequestFrame the public interface for this new manual summarization functionality: it already exists, and is used internally. But the use-case we want to enable is subtly different.
- `LLMContextSummaryRequestFrame` is used to ask an LLM to generate a summary of a given context
- What we need is a new way to trigger the pipeline (the aggregator? the summarizer?) to then trigger the same code path that an auto summarization would trigger.

What if we had a new `LLMSummarizeContextFrame` frame that `LLMContextSummarizer.process_frame` handled, causing it to trigger a summarization request (with frame-provided options)? This approach would have a few advantages:

- it would make greater re-use of more existing code paths
- it would not require users to generate a `request_id` either
- it would update the `LLMContextSummarizer`'s internal bookkeeping (like `_summarization_in_progress`), which I assume is valuable
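To illustrate the suggestion, a minimal sketch of how `process_frame` might route the new frame through the existing summarization path (stand-in classes; method names and fields other than `_summarization_in_progress` are assumptions):

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class LLMSummarizeContextFrame:
    # Optional per-request summary options carried on the frame (assumed).
    summary_config: Optional[Any] = None

class LLMContextSummarizer:
    def __init__(self):
        self._summarization_in_progress = False

    async def process_frame(self, frame):
        # The new public frame is routed through the same code path that
        # auto-summarization uses, so internal bookkeeping stays consistent.
        if isinstance(frame, LLMSummarizeContextFrame):
            await self._request_summarization(frame.summary_config)

    async def _request_summarization(self, summary_config):
        if self._summarization_in_progress:
            return  # a summarization is already running; ignore this request
        self._summarization_in_progress = True
        # ... generate a request_id internally, push the internal request
        # frame, and reset the flag when the summary result arrives ...

summarizer = LLMContextSummarizer()
asyncio.run(summarizer.process_frame(LLMSummarizeContextFrame()))
```

Because the frame is handled inside the summarizer, users never touch `request_id` and the in-progress flag is maintained for both auto and manual requests.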
See a related idea in another comment about refactoring LLMContextSummarizationConfig, below.
Ah...the more I think about it, we really do need to make sure LLMContextSummarizer's bookkeeping is updated on a manual summarization, since you can have both automatic and manual summarization going on in an app.
Yep. I believe I have implemented all the suggestions you made here.
```
@@ -56,17 +63,23 @@ def __init__(
    *,
    context: LLMContext,
    config: Optional[LLMContextSummarizationConfig] = None,
```
I think LLMContextSummarizationConfig could use a refactor to separate out params related to triggering auto-summarization (e.g. max_context_tokens) and those relevant to the summary generation itself and therefore relevant to both auto and manual summarization (e.g. summarization_prompt, min_messages_after_summary). Otherwise this will be messy to maintain.
Maybe we have something like:
- `LLMAutoContextSummarizationConfig`, which contains the auto-trigger-related fields, as well as a nested field that is of type:
- `LLMContextSummaryConfig`, which contains the summary-generation-related fields
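The proposed split could be sketched like this (field names follow the discussion above; the defaults are illustrative only, not Pipecat's actual values):

```python
from dataclasses import dataclass, field

@dataclass
class LLMContextSummaryConfig:
    # Summary-generation params, shared by auto and manual summarization.
    target_context_tokens: int = 4000
    min_messages_after_summary: int = 2
    summarization_prompt: str = "Summarize the conversation so far."

@dataclass
class LLMAutoContextSummarizationConfig:
    # Auto-trigger thresholds, only relevant to automatic summarization.
    max_context_tokens: int = 8000
    max_unsummarized_messages: int = 50
    # Nested summary-generation config, reusable per manual request too.
    summary_config: LLMContextSummaryConfig = field(
        default_factory=LLMContextSummaryConfig
    )

auto_config = LLMAutoContextSummarizationConfig()
```

Keeping trigger thresholds and generation params in separate classes lets a manual request carry just an `LLMContextSummaryConfig` without any of the auto-trigger fields.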
I would then rename this field `auto_summary_config` or something to distinguish it from the new summary config, which can differ between individual manual summarization requests.
Actually, another benefit of the above suggested refactor: we could then use LLMContextSummaryConfig directly in the new frame that users use to request a summary 💡 .
I believe I have implemented all the suggestions you made here. Let me know if this is what you had in mind.
```python
# Context summarization — always create the summarizer so that manually
# pushed LLMContextSummaryRequestFrame results are always handled.
# Auto-triggering based on thresholds is only enabled when
# enable_context_summarization is True.
```
I think we should rename to auto_context_summarization, and rename context_summarization_config to auto_context_summarization_config, to disambiguate between the per-summarization-request configs the user can provide.
I have renamed it to enable_auto_context_summarization and auto_context_summarization_config.
We usually use the enable_ prefix for features that we are turning on, so I thought it would be better to do the same here.
@markbackman, @kompfner, I think it’s ready for another round of review. I have also already merged in the latest changes from main, which include Mark’s updates to context summarization.
```python
"""Trigger manual context summarization via a pipeline frame."""
logger.info("Tool called: summarize_conversation")
await params.result_callback({"status": "summarization_requested"})
await params.llm.queue_frame(LLMSummarizeContextFrame())
```
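For context, a self-contained sketch of what a tool handler like this does, with stand-in types so it runs without a pipeline (in Pipecat the frame would go through `params.llm.queue_frame()`; everything else here is illustrative):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class LLMSummarizeContextFrame:
    """Stand-in for the real Pipecat frame."""

# Captured side effects so the sketch is testable without a pipeline.
results = []
frames = []

async def result_callback(result):
    results.append(result)

async def queue_frame(frame):
    # In Pipecat this is where the frame would be handed to the running
    # pipeline for asynchronous processing.
    frames.append(frame)

async def summarize_conversation(result_callback, queue_frame):
    """Trigger manual context summarization via a pipeline frame."""
    # Acknowledge the tool call immediately; the summary itself is produced
    # later, when the frame reaches the summarizer.
    await result_callback({"status": "summarization_requested"})
    await queue_frame(LLMSummarizeContextFrame())

asyncio.run(summarize_conversation(result_callback, queue_frame))
```

The key design point is that the tool returns right away and the summarization happens asynchronously at frame-processing time.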
```diff
 Args:
     context: The LLM context to monitor and summarize.
-    config: Configuration for summarization behavior. If None, uses default config.
+    config: Auto-summarization configuration controlling both trigger
```
nit: do we want to call this auto_config or something other than config, which might imply LLMContextSummaryConfig? (I don't feel super strong about this, as it's internal)
```python
        a per-request :class:`~pipecat.utils.context.llm_context_summarization.LLMContextSummaryConfig`.
    """
    if self._summarization_in_progress:
        logger.debug(f"{self}: Summarization already in progress, ignoring manual request")
```
markbackman left a comment:
LGTM! Nice clean up 👏
kompfner left a comment:
Excellent! Left a couple of tiny nits. This looks great! No need to loop back.
Thank you @kompfner, @markbackman for the great review. 🙌🚀
Summary

Added

- `LLMSummarizeContextFrame`: push this frame anywhere in the pipeline to trigger on-demand context summarization (e.g. from a function call tool).

Refactored

- `LLMContextSummarizationConfig` into two focused classes:
  - `LLMContextSummaryConfig` — summary generation params (`target_context_tokens`, `min_messages_after_summary`, `summarization_prompt`); shared by both auto and manual modes.
  - `LLMAutoContextSummarizationConfig` — auto-trigger thresholds (`max_context_tokens`, `max_unsummarized_messages`) plus a nested `summary_config: LLMContextSummaryConfig`.

Renamed

- `LLMAssistantAggregatorParams` fields for clarity:
  - `enable_context_summarization` → `enable_auto_context_summarization`
  - `context_summarization_config` → `auto_context_summarization_config`
- `LLMContextSummarizationConfig` is now deprecated (emits `DeprecationWarning`); the old `LLMAssistantAggregatorParams` field names are kept with deprecation warnings and auto-migration for one release cycle.

Added example

- `54b-context-summarization-manual-openai.py` demonstrating on-demand summarization triggered via a function call tool.

Breaking Changes

- `LLMAssistantAggregatorParams.enable_context_summarization` renamed to `enable_auto_context_summarization`.
- `LLMAssistantAggregatorParams.context_summarization_config` renamed to `auto_context_summarization_config` and now accepts `LLMAutoContextSummarizationConfig` instead of `LLMContextSummarizationConfig`.

Migration
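A before/after sketch of the rename, using stand-in classes with illustrative fields (only the renamed parameter names come from this PR):

```python
from dataclasses import dataclass, field
from typing import Optional

# Stand-in classes for illustration; the real ones live in Pipecat.
@dataclass
class LLMContextSummaryConfig:
    summarization_prompt: str = "Summarize the conversation so far."

@dataclass
class LLMAutoContextSummarizationConfig:
    max_context_tokens: int = 8000
    summary_config: LLMContextSummaryConfig = field(
        default_factory=LLMContextSummaryConfig
    )

@dataclass
class LLMAssistantAggregatorParams:
    # New names; enable_context_summarization and
    # context_summarization_config are deprecated aliases for one release.
    enable_auto_context_summarization: bool = False
    auto_context_summarization_config: Optional[
        LLMAutoContextSummarizationConfig
    ] = None

# After migrating to the new names:
params = LLMAssistantAggregatorParams(
    enable_auto_context_summarization=True,
    auto_context_summarization_config=LLMAutoContextSummarizationConfig(
        max_context_tokens=6000
    ),
)
```

During the deprecation window, code using the old field names should keep working but emit a `DeprecationWarning`.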
Testing
Run the new example to see manual summarization in action:
```shell
uv run examples/foundational/54b-context-summarization-manual-openai.py
```

Ask the bot to "summarize the conversation" — it will call the `summarize_conversation` function, which pushes an `LLMContextSummaryRequestFrame` into the pipeline. The LLM service generates the summary in a background task and the assistant aggregator compresses the context when the result arrives.