Bringing Evals into the Langchain::Assistant #853
andreibondarev started this conversation in Ideas

It's time to introduce a lightweight way to run evals on the Langchain::Assistant execution output. Regardless of which metrics are being evaluated, I'd like to figure out a good DSL for how evals are integrated into the Langchain::Assistant.

Agent interactions generally follow this pattern: given a collection of AI agent inputs and corresponding ideal outputs, we should be able to run the AI agent through this dataset and compare its actual outputs against the ideal ones.

A few questions to consider:
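To make the dataset-and-compare loop concrete, here is a minimal sketch of what an eval run could look like. Everything eval-specific below (the dataset shape, the metric, the reporting) is a hypothetical proposal for discussion, not an existing Langchain::Evals API; the assistant calls themselves follow langchainrb's documented `Langchain::Assistant` interface (`add_message_and_run!`, `messages`).

```ruby
require "langchain"

llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

# A dataset is a collection of inputs paired with ideal outputs.
dataset = [
  { input: "What is the capital of France?", ideal_output: "Paris" },
  { input: "What is 2 + 2?",                 ideal_output: "4" }
]

results = dataset.map do |example|
  # A fresh assistant per example keeps runs independent of each other.
  assistant = Langchain::Assistant.new(
    llm: llm,
    instructions: "Answer as concisely as possible."
  )
  assistant.add_message_and_run!(content: example[:input])
  actual = assistant.messages.last.content

  {
    input:  example[:input],
    ideal:  example[:ideal_output],
    actual: actual,
    # Placeholder metric: case-insensitive substring match. A real harness
    # would make this pluggable (exact match, embedding similarity,
    # LLM-as-judge, etc.).
    pass:   actual.to_s.downcase.include?(example[:ideal_output].downcase)
  }
end

pass_rate = results.count { |r| r[:pass] }.fdiv(results.size)
puts "Passed #{(pass_rate * 100).round(1)}% of #{results.size} examples"
```

Whatever surface syntax we settle on, the moving parts are probably these three: a dataset of input/ideal pairs, a pluggable metric, and a runner that executes the assistant once per example and aggregates scores. A more declarative DSL over the same parts might look like this (again, `Langchain::Evals` and its methods are hypothetical names, not part of the library):

```ruby
# Hypothetical declarative shape for the same eval run.
Langchain::Evals.run(assistant) do
  dataset "spec/fixtures/golden_qa.jsonl"   # input/ideal_output pairs
  metric :exact_match
  metric :llm_judge, model: "gpt-4o", threshold: 0.8
end
```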
Replies: 2 comments 2 replies
- @bborn I created this discussion thread to talk about how we could integrate the evals. Maybe we could flesh things out here before implementing?
- @bborn Take a glance: #855 (comment)