Define + document core API surface for language models, provide language bindings #8767

Open
@GregoryComer

🚀 The feature, motivation and pitch

To make language models easy to use on ExecuTorch, we likely need to define the "core" components for running LLMs on ET. This is an evolving area, of course, but there is a lot of interest right now in running decoder-only autoregressive generation, and we want to provide a clean solution for it. The most important use case, in my opinion, is a streamlined path for runtime integration of HF models, though the more general the solution, the better.
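For readers less familiar with the term, here is a minimal sketch of what a decoder-only autoregressive generation loop does, written in plain Python rather than against any ExecuTorch API. The names `greedy_generate`, `NextTokenLogits`, and `toy_step` are hypothetical and exist only for this illustration; a real runner would call an exported model instead of the toy function, and would typically add a KV cache and a configurable sampler.

```python
from typing import Callable, List, Sequence

# Hypothetical type: maps the token ids seen so far to one score per vocabulary entry.
NextTokenLogits = Callable[[Sequence[int]], List[float]]


def greedy_generate(
    step: NextTokenLogits,
    prompt_ids: Sequence[int],
    eos_id: int,
    max_new_tokens: int,
) -> List[int]:
    """Return the token ids generated after the prompt, using greedy decoding."""
    tokens = list(prompt_ids)
    generated: List[int] = []
    for _ in range(max_new_tokens):
        logits = step(tokens)
        # Greedy choice: pick the index with the highest score.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        # Feed the chosen token back in as input for the next step.
        tokens.append(next_id)
        generated.append(next_id)
    return generated


def toy_step(tokens: Sequence[int]) -> List[float]:
    """Toy stand-in for a model: favors (last token + 1) in a vocabulary of 5."""
    vocab_size = 5
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]
```

Calling `greedy_generate(toy_step, [0], eos_id=4, max_new_tokens=10)` walks through ids 1, 2, 3 and stops when the toy model predicts the end-of-sequence id.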

In a nutshell, I hope to be able to give users "the" way to run LLMs on ET. We shouldn't make them make technical decisions when we can avoid it. We can have lower-level composable components and power-user APIs, but I hope to handle the majority of use cases by being able to tell users, "get your model from HF, export with Optimum, and here are the runner APIs in C++, Java, and Obj-C/Swift". Minimal decision-making or lower-level understanding should be required.

The Android + iOS bindings are likely most critical here, but I want to ensure that we standardize on the components and API design before building these.

Requirements

  1. Define "core" runtime components needed to run text generation decoder models, particularly with a focus on models from HF through Optimum.
  2. Move all reusable components out of the examples directory (may already be done).
  3. Clearly document these components.
  4. Provide API bindings for these components from Java and Objective-C / Swift. Ideally, the API surface should match as closely as possible while respecting the appropriate language conventions, such that there is parity in usage and capability between Java, Obj-C/Swift, and C++ runners.
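To make the discussion of "core" components a bit more concrete, here is one possible way to split them, sketched in Python purely for illustration. None of these names (`Tokenizer`, `DecoderModel`, `GenerationConfig`, `TextRunner`, `CharTokenizer`, `CycleModel`) come from ExecuTorch or Optimum; they are assumptions, and the actual C++, Java, and Obj-C/Swift surfaces may look quite different. The point is only that a small, shared shape (tokenizer, model, config, runner with a streaming callback) is the kind of thing that could be mirrored across bindings.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol, Sequence


class Tokenizer(Protocol):
    """Hypothetical tokenizer interface."""
    def encode(self, text: str) -> List[int]: ...
    def decode(self, ids: Sequence[int]) -> str: ...


class DecoderModel(Protocol):
    """Hypothetical model interface: token ids in, next-token scores out."""
    def forward(self, token_ids: Sequence[int]) -> List[float]: ...


@dataclass
class GenerationConfig:
    max_new_tokens: int = 16
    eos_id: Optional[int] = None


class TextRunner:
    """Hypothetical high-level runner composing a tokenizer and a model."""

    def __init__(self, tokenizer: Tokenizer, model: DecoderModel) -> None:
        self.tokenizer = tokenizer
        self.model = model

    def generate(
        self,
        prompt: str,
        config: GenerationConfig,
        on_token: Optional[Callable[[str], None]] = None,
    ) -> str:
        ids = self.tokenizer.encode(prompt)
        new_ids: List[int] = []
        for _ in range(config.max_new_tokens):
            logits = self.model.forward(ids)
            next_id = max(range(len(logits)), key=logits.__getitem__)  # greedy
            if next_id == config.eos_id:
                break
            ids.append(next_id)
            new_ids.append(next_id)
            if on_token is not None:
                on_token(self.tokenizer.decode([next_id]))  # stream each piece
        return self.tokenizer.decode(new_ids)


class CharTokenizer:
    """Toy tokenizer over the characters a-d (ids 0-3); id 4 is reserved as EOS."""
    vocab = "abcd"

    def encode(self, text: str) -> List[int]:
        return [self.vocab.index(c) for c in text]

    def decode(self, ids: Sequence[int]) -> str:
        return "".join(self.vocab[i] for i in ids)


class CycleModel:
    """Toy model: always favors (last id + 1) modulo 5."""
    def forward(self, token_ids: Sequence[int]) -> List[float]:
        target = (token_ids[-1] + 1) % 5
        return [1.0 if i == target else 0.0 for i in range(5)]
```

With these toy pieces, `TextRunner(CharTokenizer(), CycleModel()).generate("a", GenerationConfig(max_new_tokens=10, eos_id=4))` continues the prompt until the toy model emits the end-of-sequence id. Keeping the callback and config on the runner, rather than on the model, is one design choice that would let each language binding expose the same two or three calls.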

CC @guangy10 @larryliu0820 for core APIs, @kirklandsign @shoumikhin for language bindings, @byjlw @mergennachin for usability workstream

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @mergennachin @cccclai @helunwencser @jackzhxng @byjlw

Metadata

Assignees

No one assigned

    Labels

    module: llm (Issues related to LLM examples and apps, and to the extensions/llm/ code)
    module: user experience (Issues related to reducing friction for users)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

    Type

    No type

    Projects

    Status

    To triage

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
