Add Multi model support to java-cfenv #299

Open: wants to merge 13 commits into base: main

@cpage-pivotal cpage-pivotal commented Jul 4, 2025

These changes fully support both the single-model and multi-model plans in the Tanzu Platform GenAI tile.

It uses the /v1/models endpoint for model discovery as described here:
https://techdocs.broadcom.com/us/en/vmware-tanzu/platform-services/genai-on-tanzu-platform-for-cloud-foundry/10-2/ai-cf/how-to-guides-discover-models-and-send-openai-requests-to-them.html
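
As a rough illustration of what a discovery call against `/v1/models` could look like, the sketch below only builds the HTTP request and does not send it. The class name `ModelDiscoveryRequests`, the `apiBase` and `apiKey` parameters, and the bearer-token header are assumptions made for this example; they are not taken from the PR or from java-cfenv.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of java-cfenv: builds (but does not send)
// a model-discovery request against the /v1/models endpoint.
final class ModelDiscoveryRequests {
    private ModelDiscoveryRequests() {}

    static HttpRequest listModels(String apiBase, String apiKey) {
        // Strip one trailing slash so the joined URL has no double slash.
        String base = apiBase.endsWith("/")
                ? apiBase.substring(0, apiBase.length() - 1)
                : apiBase;
        return HttpRequest.newBuilder(URI.create(base + "/v1/models"))
                // Assumption: the endpoint accepts an OpenAI-style bearer token.
                .header("Authorization", "Bearer " + apiKey)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

In a bound application the base URL and key would presumably come from the service binding's credentials rather than from literals.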

*/
public class GenAIModelInfo {

public enum Capability {

I personally wouldn't make this an enum. We're expecting the capabilities returned by the ai-server to increase over time. If this is an enum, we need to ensure it is updated in lockstep with ai-server.

Author

OK, I have made Capability a non-enum class.
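
One way a non-enum capability type might look is a small value class that accepts any string, so values the library has never seen still round-trip. This is only a sketch: the constants `CHAT` and `EMBEDDING`, the `of` factory, and the lower-casing of names are assumptions for illustration, not the class that was actually committed.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical sketch of an open-ended capability value, replacing an enum.
final class Capability {
    // Convenience constants only; the names are assumed, and unknown
    // capability names are accepted just the same.
    static final Capability CHAT = Capability.of("chat");
    static final Capability EMBEDDING = Capability.of("embedding");

    private final String name;

    private Capability(String name) {
        this.name = name;
    }

    // Normalizes whitespace and case so " Chat " and "chat" compare equal.
    static Capability of(String raw) {
        Objects.requireNonNull(raw, "raw");
        return new Capability(raw.trim().toLowerCase(Locale.ROOT));
    }

    String name() {
        return name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Capability && ((Capability) other).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return name;
    }
}
```

With this shape, a capability the server adds later needs no library release: `Capability.of("image-generation")` simply works.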


return models.stream()
        .filter(model -> model.hasCapability(requiredCapability))
        .findFirst();

With this approach, it's only possible to get the first model that matches.
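
To make the difference concrete, here is a minimal sketch contrasting a first-match lookup with one that returns every match. `ModelInfo` and `ModelSelection` are hypothetical stand-ins written for this example; only the `hasCapability` call and the `findFirst` pipeline mirror the snippet under review.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the model type in the PR.
final class ModelInfo {
    private final String name;
    private final Set<String> capabilities;

    ModelInfo(String name, Set<String> capabilities) {
        this.name = name;
        this.capabilities = Set.copyOf(capabilities);
    }

    String name() {
        return name;
    }

    boolean hasCapability(String capability) {
        return capabilities.contains(capability);
    }
}

final class ModelSelection {
    private ModelSelection() {}

    // Same shape as the reviewed code: later matches are never visible.
    static Optional<ModelInfo> firstMatching(List<ModelInfo> models, String capability) {
        return models.stream()
                .filter(model -> model.hasCapability(capability))
                .findFirst();
    }

    // Alternative: return all matches, in list order, and let the caller choose.
    static List<ModelInfo> allMatching(List<ModelInfo> models, String capability) {
        return models.stream()
                .filter(model -> model.hasCapability(capability))
                .collect(Collectors.toList());
    }
}
```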

Author

This is true. I originally experimented with allowing other matching options (getting the last model that matches, preferring a certain model, etc.), but in practice this doesn't work well. It isn't obvious how an end user would configure that sort of preference, and for first/last model matching, the user can't reliably predict the ordering of models anyway.

I don't think it's best practice for a multi-model plan to publish multiple models with the same capability. If this does happen, and an end user needs to exert precise control over which model is selected, I think that needs to be done through code, not through an auto-config library like java-cfenv.

We're actually assuming that multiple models within an endpoint with the same capabilities will be used quite frequently, especially when using things like mcp sampling where we want to select a model based on labels rather than an exact model name.

We're also expecting these models to change over time, and we want apps to be able to handle this without restarting. Hence the introduction of the GenaiLocator, which can be used to query the available models dynamically rather than statically wiring in a model that requires a restart to pick up any changes.
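
The idea of querying at call time instead of wiring a model in at startup can be sketched roughly as follows. This is not the GenaiLocator API, which is not shown in this thread; `DynamicModelLookup` and its supplier-based constructor are hypothetical, with the supplier standing in for whatever fetches the current model list (for example, a call to `/v1/models`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch: resolves models on every call, so changes on the
// server side become visible without restarting the application.
final class DynamicModelLookup {
    // Supplies the current mapping of model name to capability names.
    private final Supplier<Map<String, Set<String>>> currentModels;

    DynamicModelLookup(Supplier<Map<String, Set<String>>> currentModels) {
        this.currentModels = currentModels;
    }

    // Re-reads the model list each time; nothing is cached here.
    List<String> modelsWithCapability(String capability) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : currentModels.get().entrySet()) {
            if (entry.getValue().contains(capability)) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```

A real implementation would likely add caching with a short expiry to avoid one network round trip per lookup; that is omitted here to keep the contrast with static wiring visible.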

Author

@cpage-pivotal cpage-pivotal Jul 8, 2025

This is all good, but again, I don't think MCP Sampling or the selection of individual models is handled elegantly within an autoconfig startup library like java-cfenv.

I think java-cfenv is for developers who want an easy-button default option on startup. Fine-grained model selection (and certainly runtime model selection) needs to be handled in other code.

Similarly, java-cfenv doesn't provide a mechanism for selecting a specific RabbitMQ server if multiple instances are bound on startup. That's outside the scope of java-cfenv.

Author

The only other option would be to refuse to bind any models if multiple models with matching capabilities are found. I can implement that if you like.
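
A rough sketch of that stricter option: bind a model only when exactly one candidate has the required capability, and return nothing when the choice is ambiguous. `BoundModel` and `StrictSelection` are hypothetical names used only for this illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for a discovered model.
final class BoundModel {
    private final String name;
    private final Set<String> capabilities;

    BoundModel(String name, Set<String> capabilities) {
        this.name = name;
        this.capabilities = Set.copyOf(capabilities);
    }

    String name() {
        return name;
    }

    boolean hasCapability(String capability) {
        return capabilities.contains(capability);
    }
}

final class StrictSelection {
    private StrictSelection() {}

    // Returns a model only when the match is unambiguous; zero matches or
    // several matches both yield an empty result, so nothing is bound.
    static Optional<BoundModel> uniqueMatch(List<BoundModel> models, String capability) {
        List<BoundModel> matches = models.stream()
                .filter(model -> model.hasCapability(capability))
                .collect(Collectors.toList());
        return matches.size() == 1 ? Optional.of(matches.get(0)) : Optional.empty();
    }
}
```

Whether an ambiguous match should yield an empty result or raise an error at startup is a design choice this sketch does not settle.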

2 participants