Support asynchronous/reactive function calling #1778
Hey, as @tzolov mentioned, reactive/asynchronous signatures for functions are currently not supported, but it looks like desirable functionality with streaming scenarios in mind. I suggest closing the discussion #1757 and keeping this issue and related discussions here.

As you noticed:

```java
ArrayList<Object> res = new ArrayList<>();
// ...
.subscribe(res::add);
return new Response(res);
```

this construct immediately returns an object which is incomplete, since the subscription populates the list asynchronously on another thread.

As a solution for the time being, you don't need to introduce

```java
List<Map<String, Object>> data = CompletableFuture.supplyAsync(
        () -> this.dataQueryRepository.query()
                // ...
                .collectList()
                .block())
    .join();
```

Instead,

```java
this.dataQueryRepository.query()
        // ...
        .collectList()
        .block()
```

should get you the same result.

For Spring AI's internal implementation, it's worth keeping in mind that when an imperative user function is executed on an event loop, the blocking call should be offloaded onto a bounded-elastic scheduler:

```java
return Mono.fromCallable(() -> userBlockingFunction())
    .subscribeOn(Schedulers.boundedElastic());
```

regardless of whether the function uses reactive APIs or not, since performing blocking calls in imperative code on the event loop will degrade performance and stall it.
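The offloading pattern above needs reactor-core. As a dependency-free illustration of the same idea, here is a sketch using `CompletableFuture` with a dedicated executor; `blockingQuery` is a hypothetical stand-in for a JDBC query or similar blocking call:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadDemo {
    // Hypothetical blocking call standing in for a JDBC query or similar.
    static List<String> blockingQuery() {
        try {
            Thread.sleep(50); // simulate I/O latency
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return List.of("row1", "row2");
    }

    public static void main(String[] args) {
        // Dedicated pool for blocking work, analogous to Schedulers.boundedElastic():
        // the caller's (event-loop) thread stays free while the query runs elsewhere.
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);
        CompletableFuture<List<String>> result =
                CompletableFuture.supplyAsync(OffloadDemo::blockingQuery, blockingPool);
        System.out.println(result.join()); // prints [row1, row2]
        blockingPool.shutdown();
    }
}
```

The key point in both variants is the same: the blocking work is confined to a pool sized for blocking tasks, rather than executed on the event-loop threads.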
Ok, thank you for your reply @chemicL. I hope you can remind me under this issue if there is any progress on asynchronous/reactive function calling.
Hey folks - keen to see this implemented as well. I spent some time looking into this myself, and quickly found that the entire stack needs to become reactive. Not necessarily a bad idea (e.g., calls to the actual LLM are typically quite long, and would benefit from being non-blocking), but it's not a trivial amount of effort, hence I appreciate it might take a while to arrive. I also wonder whether the JVM's new virtual threads could deliver a similar benefit here, without the same level of refactoring effort? Unsure.

In the interim, I thought I'd share how we've worked around this, in case anyone else hits this issue. Here's a sketch of the implementation we've used, simplified down to the key parts:

```kotlin
// This is our wrapper around ChatClient
// (to help make testing easier - unrelated)
fun sendToAi(
    // inputs...
): Mono<ChatResponse> {
    // TODO: This could/should move to reactive as well...
    val chatResponse: ChatResponse = chatClient
        .prompt(prompt)
        .call()
        .chatResponse()!!

    // Execute tools if necessary...
    return if (chatResponse.hasToolCalls()) {
        executeTools(/* inputs */)
    } else {
        Mono.just(chatResponse)
    }
}

// Execute tools. Returns Mono<ChatResponse>
// to allow tool-calling to be async.
private fun executeTools(
    // inputs...
): Mono<ChatResponse> {
    val toolCallingManager = ToolCallingManager.builder().build()
    return Mono.just(prompt to chatResponse)
        .flatMap {
            invokeTool(toolCallingManager, prompt, chatResponse, session)
        }.flatMap { toolResponsePrompt ->
            // TODO: Make async
            // Send the result of the tool back to the main agent to interpret
            // the result and decide what to do next.
            // It might complete, or it might decide to invoke another tool.
            val toolCallingResponse = chatClient
                .prompt(toolResponsePrompt)
                .call()
                .chatResponse()!!

            // If there's more work to do, recurse...
            if (toolCallingResponse.hasToolCalls()) {
                executeTools(toolParams, toolResponsePrompt, toolCallingResponse, spec, session)
            } else {
                Mono.just(toolCallingResponse)
            }
        }
}

// This is the key function - we may route to a human for intervention,
// before routing back into the function-calling loop.
private fun invokeTool(
    // inputs...
): Mono<Prompt> {
    val toolExecutionResult = toolCallingManager.executeToolCalls(prompt, chatResponse)
    return if (requiresUserIntervention(toolExecutionResult)) {
        routeToUser(toolExecutionResult, session, prompt)
    } else {
        Mono.just(Prompt(toolExecutionResult.conversationHistory(), prompt.options))
    }
}

private fun routeToUser( /* inputs */ ): Mono<Prompt> {
    // Send to the user, process the response, and return the next prompt.
}
```

Hope that helps. And as always, huge thanks to the maintainers for their efforts!
Hello, when I use block() in tool calling (here is my code; screenshot not captured), how can I use this method successfully? When I use it, the AI gives me the following (screenshot not captured).
I did some work to implement non-blocking tool calls on top of Spring AI:
Expected Behavior
Provide a FunctionCallback that can be implemented with Reactor's streaming programming model, for example BiFunction<Request, ToolContext, Mono<Response>>.
Current Behavior
The current FunctionCallback only supports blocking programming. If the underlying implementation of the provided function callback uses Reactor's streaming programming model, it can only be bridged with block() or an asynchronous thread pool.
Context
This would be much friendlier for applications built entirely on WebFlux; using a blocking FunctionCallback in WebFlux causes many difficult problems.
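To make the requested signature concrete, here is a sketch of what such an async callback contract could look like. All type names below are hypothetical (not Spring AI API), and `CompletableFuture` stands in for `Mono` to keep the sketch stdlib-only:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiFunction;

public class AsyncCallbackSketch {
    // Hypothetical request/context/response types, for illustration only.
    record ToolRequest(String argsJson) {}
    record ToolContext(String conversationId) {}
    record ToolResponse(String content) {}

    // The shape the issue asks for: BiFunction<Request, ToolContext, Mono<Response>>.
    // The framework would subscribe to (here: await) the returned async value
    // instead of invoking a blocking function on the calling thread.
    interface AsyncFunctionCallback
            extends BiFunction<ToolRequest, ToolContext, CompletableFuture<ToolResponse>> {}

    public static void main(String[] args) {
        AsyncFunctionCallback cb = (req, ctx) ->
                CompletableFuture.supplyAsync(() ->
                        new ToolResponse("handled " + req.argsJson() + " in " + ctx.conversationId()));
        System.out.println(cb.apply(new ToolRequest("{}"), new ToolContext("c1")).join().content());
        // prints: handled {} in c1
    }
}
```

With a `Mono`-returning variant of this contract, a WebFlux application could compose tool execution into its reactive pipeline without ever calling block().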