[WIP] feat: implement streaming methods for chat models #197
Conversation
Force-pushed the branch from c789fe4 to 1ac5966.
Pull Request Overview
This PR implements streaming methods for UiPath LangChain chat models by adding _stream and _astream methods to two chat model classes. This enables real-time response streaming for both synchronous and asynchronous usage patterns.
- Adds streaming method implementations to two chat model classes
- Includes proper type annotations and documentation for the new streaming methods
- Updates version number to reflect the new functionality
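For context on what these methods are expected to look like: LangChain chat models stream by overriding _stream and _astream on BaseChatModel, with roughly the signatures sketched below. The class name and method bodies here are illustrative only, not this PR's code; only the signatures and chunk types come from langchain_core.

from typing import Any, AsyncIterator, Iterator, List, Optional

from langchain_core.callbacks import (
    AsyncCallbackManagerForLLMRun,
    CallbackManagerForLLMRun,
)
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessageChunk, BaseMessage
from langchain_core.outputs import ChatGenerationChunk, ChatResult


class ExampleStreamingChatModel(BaseChatModel):
    # Illustrative stand-in; the real classes live in src/uipath_langchain/chat/models.py.

    @property
    def _llm_type(self) -> str:
        return "example-streaming-chat-model"

    def _generate(self, messages, stop=None, run_manager=None, **kwargs) -> ChatResult:
        raise NotImplementedError  # not relevant to the streaming sketch

    def _stream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[ChatGenerationChunk]:
        # Yield one or more incremental chunks; the PR's fallback yields exactly one.
        yield ChatGenerationChunk(message=AIMessageChunk(content="partial text"))

    async def _astream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> AsyncIterator[ChatGenerationChunk]:
        yield ChatGenerationChunk(message=AIMessageChunk(content="partial text"))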
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.
File | Description
---|---
src/uipath_langchain/chat/models.py | Implements _stream and _astream methods for two chat model classes with proper typing and documentation
src/uipath_langchain/_cli/_runtime/_conversation.py | Adds handling for string content in message mapping
pyproject.toml | Updates version from 0.0.134 to 0.0.135
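The _conversation.py change is not shown in the snippets below. As a rough illustration of what "handling for string content in message mapping" usually involves: LangChain allows message.content to be either a plain str or a list of content parts, so mapping code has to branch on the type. The helper name and part shape below are assumptions, not this PR's actual code.

from typing import List, Union

from langchain_core.messages import BaseMessage


def _normalized_content_parts(message: BaseMessage) -> List[dict]:
    # Hypothetical helper: message.content may be a plain string or a list of
    # content-part dicts, so mapping code must handle both shapes.
    content: Union[str, List[Union[str, dict]]] = message.content
    if isinstance(content, str):
        # The kind of branch this PR presumably adds: wrap a bare string as a text part.
        return [{"type": "text", "text": content}]
    parts: List[dict] = []
    for part in content:
        parts.append({"type": "text", "text": part} if isinstance(part, str) else part)
    return parts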
response = self._call(self.url, payload, self.auth_headers)

# For non-streaming response, yield single chunk
chat_result = self._create_chat_result(response)
chunk = ChatGenerationChunk(
    message=AIMessageChunk(
        content=chat_result.generations[0].message.content,
        additional_kwargs=chat_result.generations[0].message.additional_kwargs,
        response_metadata=chat_result.generations[0].message.response_metadata,
        usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
    )
)
yield chunk
The streaming implementation returns a single chunk instead of true streaming. Consider implementing actual streaming by making a streaming request to the API, or clearly documenting this as a fallback implementation.
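If the backend exposes a streaming endpoint, a true synchronous implementation would look roughly like the sketch below. It assumes an OpenAI-style SSE response behind a stream flag on the same URL; the actual UiPath API contract is not shown in this PR, so every field name and parameter here is an assumption.

import json
from typing import Any, Dict, Iterator

import httpx
from langchain_core.messages import AIMessageChunk
from langchain_core.outputs import ChatGenerationChunk


def _stream_chunks_sse(
    url: str, payload: Dict[str, Any], headers: Dict[str, str]
) -> Iterator[ChatGenerationChunk]:
    # Hypothetical: POST with a stream flag and read OpenAI-style "data: {...}" SSE lines.
    with httpx.stream("POST", url, json={**payload, "stream": True}, headers=headers) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if not line.startswith("data: "):
                continue
            data = line[len("data: "):]
            if data.strip() == "[DONE]":
                break
            delta = json.loads(data)["choices"][0].get("delta", {})
            yield ChatGenerationChunk(
                message=AIMessageChunk(content=delta.get("content") or "")
            )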
response = await self._acall(self.url, payload, self.auth_headers)

# For non-streaming response, yield single chunk
chat_result = self._create_chat_result(response)
chunk = ChatGenerationChunk(
    message=AIMessageChunk(
        content=chat_result.generations[0].message.content,
        additional_kwargs=chat_result.generations[0].message.additional_kwargs,
        response_metadata=chat_result.generations[0].message.response_metadata,
        usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
    )
)
yield chunk
The async streaming implementation also returns a single chunk instead of true streaming. This duplicates the same non-streaming behavior as the sync version.
Suggested change:
-    response = await self._acall(self.url, payload, self.auth_headers)
-
-    # For non-streaming response, yield single chunk
-    chat_result = self._create_chat_result(response)
-    chunk = ChatGenerationChunk(
-        message=AIMessageChunk(
-            content=chat_result.generations[0].message.content,
-            additional_kwargs=chat_result.generations[0].message.additional_kwargs,
-            response_metadata=chat_result.generations[0].message.response_metadata,
-            usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
-        )
-    )
-    yield chunk
+    # Assume _acall returns an async iterator over streamed response chunks
+    async for chunk_data in self._acall(self.url, payload, self.auth_headers, stream=True):
+        # Convert each streamed chunk to ChatGenerationChunk
+        chat_result = self._create_chat_result(chunk_data)
+        yield ChatGenerationChunk(
+            message=AIMessageChunk(
+                content=chat_result.generations[0].message.content,
+                additional_kwargs=chat_result.generations[0].message.additional_kwargs,
+                response_metadata=chat_result.generations[0].message.response_metadata,
+                usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
+            )
+        )
response = self._call(self.url, payload, self.auth_headers)

# For non-streaming response, yield single chunk
chat_result = self._create_chat_result(response)
chunk = ChatGenerationChunk(
    message=AIMessageChunk(
        content=chat_result.generations[0].message.content,
        additional_kwargs=chat_result.generations[0].message.additional_kwargs,
        response_metadata=chat_result.generations[0].message.response_metadata,
        usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
        tool_calls=getattr(
            chat_result.generations[0].message, "tool_calls", None
        ),
    )
)
yield chunk
This is the third instance of duplicated non-streaming logic in streaming methods. The code pattern is repeated across multiple methods with only minor variations.
response = await self._acall(self.url, payload, self.auth_headers)

# For non-streaming response, yield single chunk
chat_result = self._create_chat_result(response)
chunk = ChatGenerationChunk(
    message=AIMessageChunk(
        content=chat_result.generations[0].message.content,
        additional_kwargs=chat_result.generations[0].message.additional_kwargs,
        response_metadata=chat_result.generations[0].message.response_metadata,
        usage_metadata=chat_result.generations[0].message.usage_metadata,  # type: ignore
        tool_calls=getattr(
            chat_result.generations[0].message, "tool_calls", None
        ),
    )
)
yield chunk
Fourth instance of the same duplicated non-streaming implementation. Consider extracting this chunk creation logic into a shared helper method to reduce code duplication.
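Following that suggestion, the duplicated conversion could be factored into a single helper along these lines (a sketch only; the helper name and the exact fields forwarded are choices, not part of this PR):

from langchain_core.messages import AIMessageChunk
from langchain_core.outputs import ChatGenerationChunk, ChatResult


def _chat_result_to_chunk(chat_result: ChatResult) -> ChatGenerationChunk:
    # Hypothetical shared helper: convert a full ChatResult into the single
    # ChatGenerationChunk that the fallback _stream/_astream methods yield.
    message = chat_result.generations[0].message
    return ChatGenerationChunk(
        message=AIMessageChunk(
            content=message.content,
            additional_kwargs=message.additional_kwargs,
            response_metadata=message.response_metadata,
            usage_metadata=getattr(message, "usage_metadata", None),
            tool_calls=getattr(message, "tool_calls", None) or [],
        )
    )

Each _stream/_astream body would then reduce to one API call followed by yield _chat_result_to_chunk(self._create_chat_result(response)).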
Force-pushed the branch from 1ac5966 to fb4a5cf.
TODO:
Development Package