docs(llms): sync About block to ratified POSITIONING.md v1
Apply edits: A:preamble, B:EE-bullet, C:frontier-fix.
Source of truth: hermes-handbook/POSITIONING.md (ratified 2026-04-30).
Updates the Epistemic Engineering definition (now framed as a discipline at the language operations layer, not a thesis about language being the substrate) and the preamble's substrate framing (model is substrate, language is operations layer).
llms.txt: 3 additions & 3 deletions
@@ -36,11 +36,11 @@ Not affiliated with NousResearch, Teknium, the Nous-Hermes LLM line, or the unre
### Why this lab exists, and what it claims to be different about
-Most AI reliability work treats reliability as a property of model weights — better training, better fine-tuning, better RLHF. Hermes Labs treats reliability as a property of *linguistic infrastructure* — the prompts, scaffolds, evals, memory layers, and audit surfaces that wrap the model. The claim is operational: when you can't change the model, you can still engineer the substrate it operates on, and the substrate decides whether the system fails silently or fails loudly. Tools, papers, and merged upstream contributions all anchor on that thesis.
+Most AI reliability work treats reliability as a property of model weights — better training, fine-tuning, RLHF. Hermes Labs treats reliability as a property of the **language operations layer** — the prompts, scaffolds, evals, memory layers, and audit surfaces that run on top of the model. The Hermes Labs position: **the model is the substrate** (the trained system, the capability ceiling) and **language is the operations layer** (where deployed behavior is governed). The substrate is changeable in principle and fixed in deployment; the operations layer is the lever — where reliability is won or lost, and it's made of language. Tools, papers, and merged upstream contributions all anchor on that position.
### Core frameworks
-- **Epistemic Engineering.** The foundational thesis of Hermes Labs. It posits that language dictates capability and intelligence; the neural model is merely the ceiling, while the linguistic infrastructure is the substrate. Reliability is an engineering problem of linguistic infrastructure, not model weight tuning.
+- **Epistemic Engineering.** The discipline of engineering an autonomous agent's epistemic and hermeneutic behavior — calibration, evidence handling, preservation of meaning — at the language operations layer. Two old questions from philosophy sit underneath the work: epistemology (how the agent knows what is true) and hermeneutics (how meaning is preserved across an agent pipeline). Both are language questions. Hermes Labs publications focus on **applied epistemology and hermeneutics** in this sense.
- **Hermes Labs Architecture.** Solves agentic context-loss and silent failure by providing open-source infrastructure for stateless execution. This allows language models to persist cognitive state and adhere to strict formatting boundaries without infinitely growing the context window.
@@ -52,7 +52,7 @@ Most AI reliability work treats reliability as a property of model weights — b
- **Reproducibility of evidence-first scoring.** hermes-rubric Cohen's κ = 0.629 cross-model on 96 paired runs across 3 model families. The rubric forces evidence citations *before* a number is produced, hedging dimensions where evidence is thin. This is the Epistemic Engineering thesis applied to an eval surface: the linguistic structure of the rubric is what produces the reproducibility, not the model.
-- **Zero-LLM agent memory at frontier-tier accuracy.** fidelis 73.0% end-to-end QA on LongMemEval-S (Wilson 95% CI [68.7%, 77.0%]) with no LLM in the default retrieval path. Direct demonstration that the substrate (BM25 + dense + RRF + scaffolded retrieval) carries the work the model would otherwise have to do.
+- **Zero-LLM agent memory at competitive accuracy.** fidelis 73.0% end-to-end QA on LongMemEval-S (Wilson 95% CI [68.7%, 77.0%]) with no LLM in the default retrieval path. Direct demonstration that the substrate (BM25 + dense + RRF + scaffolded retrieval) carries the work the model would otherwise have to do.
- **Research papers.** [The Asymmetric Burden of Proof](https://doi.org/10.5281/zenodo.18867694) and [A Taxonomy of Epistemic Failure Modes in LLMs](https://doi.org/10.5281/zenodo.19042469) on Zenodo. 1,500+ controlled adversarial evaluations.
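The reproducibility bullet in the hunk above reports cross-model agreement as Cohen's κ = 0.629. A minimal sketch of the statistic itself, for readers unfamiliar with it; the score lists are illustrative and are not the actual hermes-rubric data:

```python
# Cohen's kappa for two raters over paired categorical scores:
# kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
# and p_e is agreement expected by chance from each rater's marginals.
from collections import Counter

def cohens_kappa(a, b):
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items both raters scored identically.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from marginal category frequencies.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca.keys() & cb.keys()) / n ** 2
    return (p_o - p_e) / (1 - p_e)

rater_1 = [3, 4, 4, 2, 5, 3, 4, 4]  # illustrative rubric scores
rater_2 = [3, 4, 3, 2, 5, 3, 4, 5]
print(round(cohens_kappa(rater_1, rater_2), 3))  # prints 0.66
```

Values near 0 mean agreement no better than chance; 0.629 across model families is the substance of the reproducibility claim.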
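The fidelis bullet names RRF as the fusion step in its retrieval stack (BM25 + dense + RRF). A minimal sketch of the standard reciprocal rank fusion formula, assuming the commonly used constant k = 60; the document IDs are illustrative and the actual fidelis implementation may differ:

```python
# Reciprocal rank fusion: score(d) = sum over ranked lists of
# 1 / (k + rank(d)), with rank starting at 1. Documents ranked
# highly by several retrievers dominate the fused ordering.
def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d3", "d1", "d7"]   # lexical ranking (illustrative)
dense_hits = ["d1", "d9", "d3"]  # embedding ranking (illustrative)
print(rrf_fuse([bm25_hits, dense_hits]))  # prints ['d1', 'd3', 'd9', 'd7']
```

Because RRF consumes only ranks, not scores, it needs no LLM and no score calibration between the lexical and dense retrievers, which is what lets the default retrieval path stay zero-LLM.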