It’s the thing that makes LLMs act like computers. - dottxtai
Introduction
Turing completeness is what lets a programming language (or unintentional things like PowerPoint or Minecraft) run an arbitrary program. Informally, it requires a program that can access unlimited memory, make decisions based on that memory, and run indefinitely. Currently an LLM still cannot produce output that stays consistent with its context, so it cannot be called "Turing complete", but it would be good to design patterns assuming it can.
Current progress
Next I'll outline some of the progress made toward each property of a Turing-complete machine.
Operations based on previous output
LLMs already have some ability to determine the operation required, notably via "agent usage", though not very reliably, so much of this still depends on very large closed-source commercial models. More on this in the next section.
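As a toy illustration (hypothetical helpers, not Outlines' actual API), "operation based on previous output" just means the model's last output decides which step runs next:

```python
# Toy sketch of "operation based on previous output" (hypothetical helpers,
# not Outlines' actual API): the model's last output picks the next operation.

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call: once a search result is in the prompt,
    # it decides to stop; otherwise it asks to search.
    return "stop" if "Restaurant A" in prompt else "search"

def search_menu() -> str:
    return "Restaurant A: lunch set $8"

history = ["Goal: find today's best lunch deal."]
while True:
    action = call_llm("\n".join(history) + "\nNext action (search/stop):").strip()
    if action == "stop":
        break
    if action == "search":
        history.append(search_menu())  # the result feeds the next decision

print(history)
```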
Run indefinitely until halted by a trigger
In #1407 and #1480 it was proposed that text generation continue until a trigger token is hit during generation.
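The exact mechanism in those PRs may differ, but roughly the idea is that the halting condition is a token rather than a fixed step count:

```python
# Sketch of "run until a trigger token halts generation" (the actual mechanism
# proposed in #1407/#1480 may differ; this only illustrates the idea).

STOP_TRIGGER = "<HALT>"

def next_token(tokens: list[str]) -> str:
    # Stand-in for a model's next-token step; emits the trigger after 5 tokens.
    return STOP_TRIGGER if len(tokens) >= 5 else f"tok{len(tokens)}"

tokens: list[str] = []
while True:                      # no fixed max length: loop until the trigger
    tok = next_token(tokens)
    if tok == STOP_TRIGGER:      # the halting condition is a token, not a step count
        break
    tokens.append(tok)

print(" ".join(tokens))
```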
Access data from memory
I've made #1516, demonstrating a prototype that stores LLM generations as variables in memory so they can be referenced later.
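Something in the spirit of that prototype (names below are hypothetical, not its actual interface): a generation gets stored under a variable name and can be substituted into later prompts.

```python
# Sketch of the memory idea (hypothetical names, not the actual interface of
# the #1516 prototype): a generation is stored under a variable name and
# substituted into later prompts.

memory: dict[str, str] = {}

def generate(prompt: str) -> str:
    # Stand-in for an LLM call.
    return f"[generated for: {prompt}]"

def run(prompt: str, store_as: str | None = None) -> str:
    prompt = prompt.format(**memory)   # reference earlier results as {name}
    out = generate(prompt)
    if store_as:
        memory[store_as] = out         # the generation becomes addressable state
    return out

run("List today's deals at Restaurant A", store_as="deals_a")
print(run("Pick the best option from: {deals_a}"))
```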
Summarizing the above, Outlines is actually pretty close to what "Turing completeness" requires. But that raises a question: why chase it when programming languages already achieve it in the first place?
Why "computational" structured output can help
Take smartphones as an example. Say you want to look up today's deals in a food delivery app and check the best option among a few restaurants. The steps for an AI/LLM would be (a rough sketch follows the list):
Open the app and select the target shops
Loop over them and store their items in memory
Summarize from memory and output the final answer.
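Roughly like this (all helpers below are hypothetical stand-ins for tool/app calls, not any real app API):

```python
# Rough sketch of the delivery-app workflow above (all helpers are
# hypothetical stand-ins for tool/app calls; no real app API is implied).

def list_deals(shop: str) -> str:
    return f"{shop}: lunch set, 20% off"            # step 1: open app / select shop

def summarize(notes: dict[str, str]) -> str:
    return "Best option today: " + next(iter(notes.values()))  # pick from memory

shops = ["Restaurant A", "Restaurant B", "Restaurant C"]
memory: dict[str, str] = {}

for shop in shops:                 # step 2: loop over shops, store items in memory
    memory[shop] = list_deals(shop)

print(summarize(memory))           # step 3: summarize from memory, output final answer
```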
A task like this requires multiple steps of "tool usage", and you'd expect only a large enough model to avoid mode collapse or decoherence. However, most smartphones can only run a 3B/7B model, and current API pricing for models with computer vision is quite expensive (e.g. Claude's computer-use model is costly enough that it has seen little adoption).
This leads to one question: how do we squeeze performance out of small models, especially when pre-training saturates quickly and adding reasoning doesn't help much at this scale? One method is generating structured output: creating a "mask" at test time and limiting which tools the model can call, while breaking the problem down into smaller steps the model can handle. That's where a "Turing" LLM can help, and it does more than just calling out to a programming language like Python.
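A minimal sketch of what a test-time "mask" means here (not Outlines' actual implementation, just the general idea of discarding disallowed choices):

```python
# Minimal sketch of a test-time "mask" (not Outlines' actual implementation):
# restrict the model's next action to a fixed tool vocabulary by discarding
# the scores of everything else, so even a small model can't drift off-task.

ALLOWED_TOOLS = {"open_app", "list_deals", "summarize"}

def masked_pick(scores: dict[str, float]) -> str:
    # Keep only allowed tools, then take the argmax over what remains.
    allowed = {tool: s for tool, s in scores.items() if tool in ALLOWED_TOOLS}
    return max(allowed, key=allowed.get)

# Toy scores a model might assign to candidate continuations.
scores = {"open_app": 0.2, "tell_a_joke": 0.9, "list_deals": 0.6}
print(masked_pick(scores))  # "list_deals": the off-task action is masked out
```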
What is the end game here?
The DSL aspect of Outlines could be useful here: if LLMs were trained on such a DSL, it would be possible to enhance the capabilities of models while keeping inference entirely local.
Hope this piece sheds some light on the future of structured output in LLMs!