What return type for constrained generation? #1546

RobinPicard · 2025-04-17T15:52:03Z

RobinPicard
Apr 17, 2025
Maintainer

After text has been generated by the model, we can either return the raw string to the user or modify it to match the output type initially provided by the user. The latter would concern, for instance, cases in which the user provides a JSON dict or a Pydantic model (they would then receive a dict and a Pydantic model instance, respectively).

Deciding whether to modify the output type instead of always returning a string is not straightforward and we should carefully consider the pros and cons of doing so.

Pros:

It feels intuitive that if you provide a Pydantic type, you also receive a Pydantic type. It's what users would expect.
It makes users' code shorter/simpler as they don't have to manually turn the string output into the type they were using.

Cons:

Some more complex cases will be hard for us to manage and may lead to buggy/unexpected behavior (especially when the output type contains Union).
It makes the life of users paradoxically harder as they have to know the associated return types for different output types and some will be unintuitive. For instance, an int output type will give an int, but the regex for an integer will give a str. Another example: an Enum of booleans would return a str, not a bool.
Turning the string return value into the user's favorite format is almost always a one-liner, so it's not very costly for the user to do it themselves.
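To make the last point concrete, here is a minimal sketch of the conversions a user would write if generation always returned a string. The raw strings below are made-up stand-ins for model output, not real Outlines results, and only the standard library is used (a Pydantic user would call `model_validate_json` on their model class instead).

```python
import json
import re

# Made-up stand-ins for strings returned by constrained generation.
raw_int = "42"
raw_json = '{"name": "Ada", "age": 36}'

# Each conversion is a one-liner on the user's side.
as_int = int(raw_int)          # int output type -> int
as_dict = json.loads(raw_json) # JSON output type -> dict

# With a regex constraint there is no target type to cast to:
# the match is a str, and the user decides whether to convert it.
match = re.fullmatch(r"[0-9]+", raw_int)
regex_result = match.group(0)  # still a str
```

The regex case illustrates the inconsistency mentioned above: the same text is an int under one output type and a str under another.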

What do you think? Are there arguments I did not list out above?

rlouf · 2025-04-17T18:12:31Z

rlouf
Apr 17, 2025
Maintainer

There’s a third possibility which is to do it for simple types and not for others. But I’m afraid this may make things even more confusing. I’d suggest we hold off for now and reconsider should this be a recurrent ask from the community.

Another thing you didn’t mention is code complexity and how hard it is going to be to maintain.


cpfiffer · 2025-04-17T21:33:20Z

cpfiffer
Apr 17, 2025

I would say that this is an important issue to address, in part because receiving only strings that the users have to handle themselves can be annoying, especially since they are currently used to Outlines helping them cast types to the types they expect.

Where possible, we should attempt to return values whose type matches the output type the user provided. I understand that it is more code to maintain and can come with some headaches, but it is a critical part of the developer experience.

It feels intuitive that if you provide a Pydantic type, you also receive a Pydantic type. It's what users would expect.

If I had to pick any case to treat separately, it would be a Pydantic model. IMO it doesn't make any sense for the user to provide a simple, explicit class and receive a string. I would be quite mad if I had to go look up a completely separate step that is (a) easy to handle and (b) previously handled by Outlines.

I would consider providing Pydantic outputs the bare minimum interface.

Failing to provide this functionality would be extremely annoying for users and would cripple our image as the best user interface for structured generation.

That said, I agree that there are a lot of complicated cases that don't have obviously correct approaches. Enums, literals, unions, etc are all annoying to think through and could probably be deferred as they are more complicated and sort of a new interface.

Turning the string return value into the user's favorite format is almost always a one-liner, so it's not very costly for the user to do it themselves.

One liners are still annoying, especially if you did not have to write them before.

Imagine a user stumbling into this and having to write validation calls everywhere. I would expect some/all moderately serious devs to write a wrapper function like

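def generate(prompt, model, output_class):
    # The model returns raw JSON text constrained to output_class's schema.
    thing = model(prompt, output_class)
    # Parse and validate that text into an output_class instance (Pydantic v2).
    return output_class.model_validate_json(thing)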

This is tantamount to a large population of our users reinventing the wheel, especially when we can easily fix this with a three-liner on behalf of the user.

It makes the life of users paradoxically harder as they have to know the associated return types for different output types and some will be unintuitive. For instance, an int output type will give an int, but the regex for an integer will give a str. Another example: an Enum of booleans would return a str, not a bool.

I don't actually think this is that big an issue, at least for the regex case. If you are using regex, you are not assuming that you'll get an integer, as regex is strictly a string tool. Regex in => string out.

The point on enums/literals though is an important one, and I agree that we should likely just return a string in these cases. It's not really clear what we should be doing without brute-forcing the output and trying to force it into every possible enum/literal value. Hard problem.
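A rough sketch of what that brute-forcing might look like, and why it is hard. The helper `candidates` and both enums are hypothetical, and the sketch assumes the generated text equals `str(member.value)`, which may not be how Outlines actually serializes enum values.

```python
from enum import Enum


class Answer(Enum):
    YES = True
    NO = False


class Mixed(Enum):
    INT_ONE = 1
    STR_ONE = "1"


def candidates(raw: str, enum_cls):
    """Hypothetical helper: every member whose value stringifies to `raw`."""
    return [member for member in enum_cls if str(member.value) == raw]


# A boolean enum maps back cleanly under this assumption.
unambiguous = candidates("True", Answer)

# Two distinct members produce the same text, so the cast is ambiguous.
ambiguous = candidates("1", Mixed)
```

When more than one member matches, there is no principled way to pick one, which is an argument for returning the string in these cases.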

I do think the general ethos here should be to try to match the input type as well as you can, especially when it is easy to do (as with the Pydantic case). If we can figure out enums/literals/unions then I would be delighted as well, but there does not seem to be an obvious technical or interface solution to me.
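As a sketch of that ethos, and not a description of how Outlines is or would be implemented: a hypothetical `postprocess` helper could cast only the Pydantic case and pass everything else through as a string. It assumes Pydantic v2 (for `model_validate_json`); the `Character` model and the raw strings are invented for illustration.

```python
from pydantic import BaseModel


def postprocess(raw: str, output_type):
    """Hypothetical helper: cast only the easy case, else return the string."""
    if isinstance(output_type, type) and issubclass(output_type, BaseModel):
        # Pydantic model in, validated Pydantic instance out.
        return output_type.model_validate_json(raw)
    # Regex, enums, literals, unions, ...: leave the raw string untouched.
    return raw


class Character(BaseModel):
    name: str
    age: int


# Made-up stand-ins for generated text.
result = postprocess('{"name": "Ada", "age": 36}', Character)
passthrough = postprocess("42", r"[0-9]+")
```

Here `result` is a `Character` instance while `passthrough` stays a str, which matches the "regex in, string out" expectation.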

