Add GenerateFromPoS in Android LLAMA API #8290

Open
kirklandsign opened this issue Feb 6, 2025 · 4 comments
Labels
module: android (Issues related to Android code, build, and execution)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@kirklandsign
Contributor

🚀 The feature, motivation and pitch

Context: #8218 (comment)

Synced with @cmodi-meta in December.

  1. The ET team would like to use GenerateFromPoS() because it will be faster than Generate() for the second prompt in a conversation. That makes sense, since it helps with prefill.
  2. We're unclear whether Generate() will be completely removed, when the change will take place, what the impact will be, etc.
  3. @cmodi-meta shared our thoughts on keeping Generate() and just having it call GenerateFromPoS(), so there aren't two different logic paths. We agreed to funnel this question to the rest of the ET team.
  4. @kirklandsign to keep us in the loop. Timing-wise it won't happen this month. We suspect the next release will be in Feb (maybe?).
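
Points 1 and 3 can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: `ToyRunner`, its one-token-per-character "tokenizer", and the `prefilled_tokens()` counter are hypothetical and are not the ExecuTorch implementation. Decoding is left out, so only the prefill cost is modeled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy runner, NOT the ExecuTorch runner.
// Assumption: one token per character, and the KV cache is just a vector.
class ToyRunner {
 public:
  // Prefills `prompt` into the cache starting at `start_pos`.
  // Cached entries at or beyond `start_pos` are discarded first.
  // Returns the cache position right after the prompt.
  std::size_t GenerateFromPoS(const std::string& prompt, std::size_t start_pos) {
    if (start_pos > cache_.size()) {
      start_pos = cache_.size();  // clamp: cannot start past the cached context
    }
    cache_.resize(start_pos);
    for (char token : prompt) {
      cache_.push_back(token);
      ++prefilled_;  // count every token that had to be processed
    }
    return cache_.size();
  }

  // Kept for existing callers; delegates so there is a single logic path.
  std::size_t Generate(const std::string& prompt) {
    return GenerateFromPoS(prompt, 0);
  }

  // Total number of tokens prefilled so far, across all calls.
  std::size_t prefilled_tokens() const { return prefilled_; }

 private:
  std::vector<char> cache_;
  std::size_t prefilled_ = 0;
};
```

In this toy model, a second turn sent through `GenerateFromPoS()` with the position returned by the first call only processes the new tokens, while calling `Generate()` again with the full conversation processes the first turn a second time.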

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

@kirklandsign added the module: android and triaged labels Feb 6, 2025
@larryliu0820
Contributor

> still keeping the Generate() there and just have it call GenerateFromPoS() so there's not 2 different logic paths

Makes sense to me

@cmodi-meta
Contributor

@kirklandsign Do you have a timeline for when this change is planned to happen?

@kirklandsign
Contributor Author

My understanding is we need to update irunner first? https://github.com/pytorch/executorch/blob/main/extension/llm/runner/irunner.h#L118

@cmodi-meta
Contributor

And that would probably trickle down to other runners like https://github.com/pytorch/executorch/blob/main/examples/models/llama/runner/runner.cpp
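
One way the interface-first ordering discussed above could look is sketched below. This is only an illustration under stated assumptions: `RunnerInterface`, `EchoRunner`, and the simplified two-argument signature are hypothetical and do not reproduce the actual declarations in irunner.h or runner.cpp.

```cpp
#include <cstdint>
#include <string>

// Hypothetical interface for illustration; the real one in irunner.h may differ.
class RunnerInterface {
 public:
  virtual ~RunnerInterface() = default;

  // The one method every concrete runner would have to implement.
  // Returns the position after the prompt has been consumed.
  virtual std::int64_t generate_from_pos(const std::string& prompt,
                                         std::int64_t start_pos) = 0;

  // Non-virtual wrapper: existing generate() callers keep working,
  // and runners cannot diverge into a second logic path.
  std::int64_t generate(const std::string& prompt) {
    return generate_from_pos(prompt, 0);
  }
};

// Hypothetical concrete runner standing in for the ones the change would reach.
class EchoRunner final : public RunnerInterface {
 public:
  std::int64_t generate_from_pos(const std::string& prompt,
                                 std::int64_t start_pos) override {
    last_start_pos = start_pos;  // recorded so callers can inspect it
    return start_pos + static_cast<std::int64_t>(prompt.size());
  }

  std::int64_t last_start_pos = -1;
};
```

With this shape, adding the method to the interface forces each concrete runner to provide it, which is the trickle-down effect mentioned in the thread, while `generate()` stays available without any per-runner code.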

Projects
Status: Blocked
Development

No branches or pull requests

3 participants