Kotlin-first llama.cpp integration for on-device and remote LLM inference.
Kotlin. LLMs. On your terms.
- ✅ Kotlin Multiplatform: shared code across Android, iOS, and desktop
- ✅ Offline inference via llama.cpp (compiled with Kotlin/Native bindings)
- ✅ Remote inference via optional HTTP client (e.g. llamatik-server)
- ✅ Embeddings support for vector search & retrieval
- ✅ Text generation (non-streaming and streaming)
- ✅ Context-aware generation (system + conversation history)
- ✅ Works with GGUF models (e.g. Mistral, Phi, LLaMA)
- ✅ Lightweight and dependency-free runtime
- 🧠 On-device chatbots
- 📚 Local RAG systems
- 🛰️ Hybrid AI apps with fallback to remote LLMs
- 🎮 Game AI, assistants, and dialogue generators
Llamatik provides three core modules:
- llamatik-core: Native C++ llama.cpp integration via Kotlin/Native
- llamatik-client: Lightweight HTTP client to connect to remote llama.cpp-compatible backends
- llamatik-backend: Lightweight llama.cpp HTTP server
All backed by a shared Kotlin API so you can switch between local and remote seamlessly.
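Because local and remote inference share the same Kotlin surface, an app can hide the choice behind a small facade. The sketch below is illustrative only: `LlamaBridge.generate()` is the real on-device call, while `RemoteLlamaClient` is a hypothetical placeholder for whatever llamatik-client actually exposes.

```kotlin
// Illustrative sketch only: a tiny app-level facade over local vs. remote inference.
// LlamaBridge.generate() is Llamatik's real local API; RemoteLlamaClient below is a
// hypothetical placeholder, not llamatik-client's actual interface.
interface TextGenerator {
    fun generate(prompt: String): String
}

// Hypothetical remote client; swap in the real llamatik-client type here.
class RemoteLlamaClient(private val baseUrl: String) {
    fun generate(prompt: String): String = TODO("POST the prompt to $baseUrl")
}

class LocalGenerator : TextGenerator {
    override fun generate(prompt: String): String =
        LlamaBridge.generate(prompt) // on-device llama.cpp
}

class RemoteGenerator(private val client: RemoteLlamaClient) : TextGenerator {
    override fun generate(prompt: String): String = client.generate(prompt)
}

// Callers depend only on TextGenerator, so switching local/remote is a one-line change.
fun makeGenerator(offline: Boolean): TextGenerator =
    if (offline) LocalGenerator()
    else RemoteGenerator(RemoteLlamaClient("https://your-backend.example"))
```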
- iOS Deployment Target 16.6
Llamatik is published on Maven Central.
- Add to your `settings.gradle.kts`:

```kotlin
dependencyResolutionManagement {
repositories {
google()
mavenCentral()
}
}
```

- Add to your `build.gradle.kts`:

```kotlin
commonMain.dependencies {
implementation("com.llamatik:library:0.8.1")
}
```

The public Kotlin API is defined in `LlamaBridge` (an `expect object` with platform-specific `actual` implementations).

```kotlin
@Suppress("EXPECT_ACTUAL_CLASSIFIERS_ARE_IN_BETA_WARNING")
expect object LlamaBridge {
// Utilities
@Composable
fun getModelPath(modelFileName: String): String // copy asset/bundle model to app files dir and return absolute path
fun shutdown() // free native resources
// Embeddings
fun initModel(modelPath: String): Boolean // load embeddings model
fun embed(input: String): FloatArray // return embedding vector
// Text generation (non-streaming)
fun initGenerateModel(modelPath: String): Boolean // load generation model
fun generate(prompt: String): String
fun generateWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String
): String
// Text generation (streaming)
fun generateStream(prompt: String, callback: GenStream)
fun generateStreamWithContext(
systemPrompt: String,
contextBlock: String,
userPrompt: String,
callback: GenStream
)
// Convenience streaming overload (callbacks)
fun generateStreamWithContext(
system: String,
context: String,
user: String,
onDelta: (String) -> Unit,
onDone: () -> Unit,
onError: (String) -> Unit
)
}
interface GenStream {
fun onDelta(text: String)
fun onComplete()
fun onError(message: String)
}
```

```kotlin
// 1) Resolve model paths (place GGUF in androidMain/assets)
val embPath = LlamaBridge.getModelPath("mistral-embed.Q4_0.gguf")
val genPath = LlamaBridge.getModelPath("phi-2.Q4_0.gguf")
// 2) Load models
LlamaBridge.initModel(embPath)
LlamaBridge.initGenerateModel(genPath)
// 3a) Embeddings
val vec: FloatArray = LlamaBridge.embed("Kotlin ❤️ llama.cpp")
// 3b) Non-streamed generation
val reply: String = LlamaBridge.generate("Write a haiku about Kotlin.")
// 3c) Streaming generation with callbacks
LlamaBridge.generateStreamWithContext(
system = "You are a concise assistant.",
context = "Project: Llamatik readme refresh.",
user = "List 3 key features.",
onDelta = { delta -> /* append to UI */ },
onDone = { /* enable send */ },
onError = { err -> /* show error */ }
)
```

- Call `shutdown()` on app teardown to release native resources.
- `getModelPath()` is `@Composable` to allow platform-specific asset access where needed.
- Use GGUF models compatible with your build of llama.cpp (quantization, context size, etc.).
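As a worked example of the embeddings API, here is a minimal retrieval sketch on top of `LlamaBridge.embed()`. The cosine-similarity scoring and brute-force ranking are illustrative choices rather than anything Llamatik prescribes; in a real app you would precompute and cache the document embeddings.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two embedding vectors.
fun cosine(a: FloatArray, b: FloatArray): Float {
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}

// Rank a small document set against a query using LlamaBridge.embed().
// Brute force for clarity; cache document embeddings in practice.
fun topMatches(query: String, docs: List<String>, k: Int = 3): List<String> {
    val queryVec = LlamaBridge.embed(query)
    return docs
        .map { doc -> doc to cosine(queryVec, LlamaBridge.embed(doc)) }
        .sortedByDescending { (_, score) -> score }
        .take(k)
        .map { (doc, _) -> doc }
}
```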
Please go to the Backend README.md for more information.
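For a rough idea of what talking to the backend over HTTP can look like, here is a minimal JVM-only sketch. It assumes the backend exposes llama.cpp's standard `POST /completion` endpoint (JSON body with `prompt` and `n_predict`); check the Backend README for the routes and payloads llamatik-backend actually serves.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Assumption: the backend speaks llama.cpp's /completion protocol. Adjust to match
// the real llamatik-backend API as documented in its README.
fun completeRemotely(baseUrl: String, prompt: String): String {
    val escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"") // naive JSON escaping
    val json = """{"prompt": "$escaped", "n_predict": 128}"""
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$baseUrl/completion"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(json))
        .build()
    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    return response.body() // raw JSON; parse the "content" field with your JSON library of choice
}
```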
If you want to try Llamatik, you can download the app from the App Store or Google Play Store.
The following is a list of some public apps that use Llamatik, published on the Google Play Store and App Store.
Want to add your app? Found an app that no longer works or no longer uses Llamatik? Please submit a pull request on GitHub to update this page!
Llamatik is 100% open-source and actively developed.
Contributions, bug reports, and feature suggestions are welcome!
Built with ❤️ for the Kotlin community.
