It’s the thing that makes LLMs act like computers. - dottxtai
Introduction
Turing completeness is what lets a programming language (or unintentional things like PowerPoint or Minecraft) run an arbitrary program. Informally, it requires a program that can access unlimited memory, make decisions based on that memory, and run indefinitely. Currently an LLM still cannot produce output that stays consistent with its context, so it cannot be called "Turing complete", but it would be good to design patterns assuming it can.
Current progress
Next I'll outline some of the progress made toward each property of a Turing-complete machine.
Operations based on previous output
LLMs already have some ability to determine the operation required, notably via "agent usage", though not very reliably, so much of this still depends on very large closed-source commercial models. More on this in the next section.
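As a toy illustration (hypothetical helpers, not Outlines' actual API), "operation based on previous output" just means the model's last output decides which step runs next:

```python
# Toy sketch of "operation based on previous output" (hypothetical helpers,
# not Outlines' actual API): the model's last output picks the next operation.

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call: once a search result is in the prompt,
    # it decides to stop; otherwise it asks to search.
    return "stop" if "Restaurant A" in prompt else "search"

def search_menu() -> str:
    return "Restaurant A: lunch set $8"

history = ["Goal: find today's best lunch deal."]
while True:
    action = call_llm("\n".join(history) + "\nNext action (search/stop):").strip()
    if action == "stop":
        break
    if action == "search":
        history.append(search_menu())  # the result feeds the next decision

print(history)
```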
Run indefinitely until halted by a trigger
In #1407 and #1480 it was proposed that text generation continue until a trigger token is hit during generation.
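The exact mechanism in those PRs may differ, but roughly the idea is that the halting condition is a token rather than a fixed step count:

```python
# Sketch of "run until a trigger token halts generation" (the actual mechanism
# proposed in #1407/#1480 may differ; this only illustrates the idea).

STOP_TRIGGER = "<HALT>"

def next_token(tokens: list[str]) -> str:
    # Stand-in for a model's next-token step; emits the trigger after 5 tokens.
    return STOP_TRIGGER if len(tokens) >= 5 else f"tok{len(tokens)}"

tokens: list[str] = []
while True:                      # no fixed max length: loop until the trigger
    tok = next_token(tokens)
    if tok == STOP_TRIGGER:      # the halting condition is a token, not a step count
        break
    tokens.append(tok)

print(" ".join(tokens))
```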
Access data from memory
I've made #1516, demonstrating a prototype that stores LLM generations as variables in memory so they can be referenced later.
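Something in the spirit of that prototype (names below are hypothetical, not its actual interface): a generation gets stored under a variable name and can be substituted into later prompts.

```python
# Sketch of the memory idea (hypothetical names, not the actual interface of
# the #1516 prototype): a generation is stored under a variable name and
# substituted into later prompts.

memory: dict[str, str] = {}

def generate(prompt: str) -> str:
    # Stand-in for an LLM call.
    return f"[generated for: {prompt}]"

def run(prompt: str, store_as: str | None = None) -> str:
    prompt = prompt.format(**memory)   # reference earlier results as {name}
    out = generate(prompt)
    if store_as:
        memory[store_as] = out         # the generation becomes addressable state
    return out

run("List today's deals at Restaurant A", store_as="deals_a")
print(run("Pick the best option from: {deals_a}"))
```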
Summarizing the above, Outlines is actually pretty close to what "Turing completeness" requires. But that raises a question: why chase it when programming languages already achieve it in the first place?
Why "computational" structured output can help
Take smartphones as an example. Say you want to look up today's deals in a food delivery app and check the best option among a few restaurants. The steps for an AI/LLM would be (a rough sketch follows the list):
Open the app and select the target shops
Loop over them and store their items in memory
Summarize from memory and output the final answer.
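Roughly like this (all helpers below are hypothetical stand-ins for tool/app calls, not any real app API):

```python
# Rough sketch of the delivery-app workflow above (all helpers are
# hypothetical stand-ins for tool/app calls; no real app API is implied).

def list_deals(shop: str) -> str:
    return f"{shop}: lunch set, 20% off"            # step 1: open app / select shop

def summarize(notes: dict[str, str]) -> str:
    return "Best option today: " + next(iter(notes.values()))  # pick from memory

shops = ["Restaurant A", "Restaurant B", "Restaurant C"]
memory: dict[str, str] = {}

for shop in shops:                 # step 2: loop over shops, store items in memory
    memory[shop] = list_deals(shop)

print(summarize(memory))           # step 3: summarize from memory, output final answer
```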
A task like this requires multiple steps of "tool usage", and you'd expect only a large enough model to avoid mode collapse or decoherence. However, most smartphones can only run a 3B/7B model, and current API pricing for models with computer vision is quite expensive (e.g. Claude's computer-use model is costly enough that it has seen little adoption).
This leads to one question: how do we squeeze performance out of small models, especially when pre-training saturates quickly and adding reasoning doesn't help much at this scale? One method is generating structured output: creating a "mask" at test time and limiting which tools the model can call, while breaking the problem down into smaller steps the model can handle. That's where a "Turing" LLM can help, and it does more than just calling out to a programming language like Python.
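A minimal sketch of what a test-time "mask" means here (not Outlines' actual implementation, just the general idea of discarding disallowed choices):

```python
# Minimal sketch of a test-time "mask" (not Outlines' actual implementation):
# restrict the model's next action to a fixed tool vocabulary by discarding
# the scores of everything else, so even a small model can't drift off-task.

ALLOWED_TOOLS = {"open_app", "list_deals", "summarize"}

def masked_pick(scores: dict[str, float]) -> str:
    # Keep only allowed tools, then take the argmax over what remains.
    allowed = {tool: s for tool, s in scores.items() if tool in ALLOWED_TOOLS}
    return max(allowed, key=allowed.get)

# Toy scores a model might assign to candidate continuations.
scores = {"open_app": 0.2, "tell_a_joke": 0.9, "list_deals": 0.6}
print(masked_pick(scores))  # "list_deals": the off-task action is masked out
```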
What is the end game here?
The DSL aspect of Outlines could be useful here: if LLMs were trained on such a DSL, it would be possible to enhance the capabilities of models while keeping inference entirely local.
Hope this piece sheds some light on the future of structured output in LLMs!