aMELia Qt6 v9.19.8

Amelia is a local-first Qt6/C++ coding and cloud assistant that talks to a local Ollama server, stores its state under ~/.amelia_qt6, indexes a local knowledge base, and can optionally use sanitized external web search through SearXNG.

This build rolls forward the existing bootstrap, indexing, transcript, Prompt Lab, notification, and progress-bar work, and adds:

  • a Knowledge Base collection model with preserved folder structure and a tree-view browser
  • a hard-locked Knowledge Base root and safer workspace-jail boundaries under ~/.amelia_qt6
  • stronger transcript code-block handling, first-run service prompts, and a full JSON configuration editor
  • a context-aware document-study budget policy that now respects Ollama num_ctx end-to-end
  • a generic one-shot fallback retry when Ollama reports that the model runner stopped unexpectedly during a large grounded request

Version 9.19.8 keeps the earlier indexing RAM fixes and, on top of that:

  • hard-disables Knowledge Base interaction while a prompt or reindex is in flight
  • fixes the document-study num_ctx reserve bug
  • keeps numbered procedure leads attached to following command/config lines across semantic block building and PDF page breaks, and strengthens section-preview stitching
  • adds an exact-extraction retrieval mode for exhaustive scraper-style prompts so Amelia can emit ordered raw chunk windows instead of only lossy section summaries
  • follows the active Qt/system palette far more closely across widgets, labels, transcript cards, diagnostics, and in-app notifications
  • broadens snippet extraction for large documents and external search results
  • fixes the palette-helper compile regression in streamed assistant rendering and repairs stray literal \n\n layout artifacts in final markdown output

aMELia is also, allegorically, a MEL: a Model Enhancement Lab.

NOTE: prompt transcripts stream in as raw markdown first; once the response finishes, the transcript is re-rendered with its final formatting.

What changed in v9.19.8

  • fixes the mainwindow.cpp streamed-assistant compile regression by passing the active palette into the new palette-aware transcript color helpers
  • improves final transcript sanitization so stray literal \n, \n\n, \t, and fence-adjacent escaped layout tokens render as real spacing instead of leaking into the visible answer
  • preserves quoted string escapes inside code blocks and inline code while normalizing display-only escaped layout outside those quoted regions

What changed in v9.19.7

  • removes most fixed widget/label colors and reworks the transcript, diagnostics, and toast rendering to derive colors from the active Qt/system palette instead of assuming a dark theme
  • rebuilds transcript and diagnostics rendering on palette/style changes so Amelia follows light/dark or accent changes more naturally while running
  • broadens external search snippet parsing to accept more result fields (content, snippet, description, text, summary, and descriptions[]) and keeps longer sanitized excerpts
  • improves large-document exact extraction by preferring near-fit full-file coverage more often, adding intrinsic actionability scoring for commands/YAML/config/procedure chunks, and widening raw-window sampling across long files
  • improves hit excerpts so matches can show both the first and later matching regions of the same chunk instead of truncating too aggressively around the first hit

What changed in v9.19.6

  • fixes the ragindexer.cpp build break caused by malformed multiline QStringLiteral(...) insertion in the procedural-lead helpers
  • rewrites those helpers to use valid single-literal regex construction and proper QLatin1Char('\n') splitting
  • preserves the earlier exact-extraction and chunk-boundary behavior without requiring further logic changes

What changed in v9.19.5

  • fixed the document-study num_ctx reserve calculation so large grounded requests no longer fall back to the old safeNumCtx / 8 floor in common 32768-context setups
  • added an exact-extraction retrieval mode for scraper-style prompts such as "extract all", "gather all actionable snippets", "preserve YAML", and similar exhaustive requests
  • exact-extraction mode emits ordered raw chunk windows from the selected file, biased toward actionable hits plus evenly spaced spans across the document, instead of relying only on outline/section summaries
  • improved semantic chunk building so a procedural lead like "4. Run ...:" stays attached to the following command/config block instead of being split just because the next line looks code-like
  • softened PDF page-break boundaries so [[PAGE N]] markers no longer force a mid-procedure semantic split by default
  • improved section preview stitching so procedure headers can pull in more following chunks before balanced trimming, reducing missing command lines after page breaks
  • Knowledge Base controls remain locked while prompt generation or indexing is active

Reindex note

  • the new exact-extraction retrieval path is a runtime-only change and works immediately after upgrade
  • the semantic block/page-break fixes improve how new chunks are built, so reindex once after upgrading if you want existing cached documents to benefit from the better chunk boundaries

Ubuntu packages

Required to build Amelia

sudo apt update
sudo apt install -y \
  build-essential \
  cmake \
  qt6-base-dev \
  qt6-tools-dev \
  qt6-tools-dev-tools \
  qt6-svg-dev \
  qt6-imageformats-plugins \
  poppler-utils \
  curl \
  git

Why these matter:

  • qt6-base-dev -> Qt Core / Widgets / Network / Concurrent / tray integration
  • qt6-tools-dev and qt6-tools-dev-tools -> standard Qt6 dev tooling on Ubuntu
  • qt6-svg-dev / qt6-imageformats-plugins -> SVG logo rendering and runtime image support
  • poppler-utils -> provides pdftotext, which Amelia uses to ingest PDFs
  • curl -> convenient for testing Ollama and SearXNG endpoints

Build

mkdir -p build
cd build
cmake ..
cmake --build . -j$(nproc)
cmake --install .
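
With the default /usr/local prefix, the install step typically needs root:

sudo cmake --install .

Alternatively, configure a user-writable prefix up front (the $HOME/.local prefix here is just one common choice):

cmake -DCMAKE_INSTALL_PREFIX="$HOME/.local" ..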

Desktop install

cmake --install . installs:

  • desktop entry: ${CMAKE_INSTALL_PREFIX}/share/applications/amelia_qt6.desktop
  • icon: ${CMAKE_INSTALL_PREFIX}/share/icons/hicolor/scalable/apps/amelia_qt6.svg
  • example config: ${CMAKE_INSTALL_PREFIX}/share/amelia_qt6/config/config.example.json
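
If your application launcher does not pick up the new entry right away, refreshing the desktop database sometimes helps (update-desktop-database ships in desktop-file-utils; adjust the path to your install prefix):

sudo update-desktop-database /usr/local/share/applications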

Starting Ollama

Native install on Ubuntu/Linux

curl -fsSL https://ollama.com/install.sh | sh
sudo systemctl enable --now ollama
sudo systemctl status ollama

Pull the recommended default generation model and the dedicated embedding model:

ollama pull gpt-oss:20b
ollama pull embeddinggemma:latest

gpt-oss:20b is the recommended default in Amelia because it is available directly in the Ollama library and is designed for powerful reasoning and developer use cases. On Windows, Amelia pairs well with Ollama's Vulkan GPU path when your driver / hardware stack supports it.

If your machine is CPU-only in practice, or if ollama ps shows the generation model staying on 100% CPU, smaller reasoning-capable alternatives such as qwen3:8b or deepseek-r1:8b are often a better fit for document-study mode than forcing very large grounded prompts through a 20B model on CPU.
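
To see where a loaded model actually runs, check the PROCESSOR column of:

ollama ps

A reading like 100% CPU means the generation model is not offloaded to GPU at all, which is exactly the case where the smaller models above tend to pay off.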

Quick API tests:

curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "hello"
}'
curl http://localhost:11434/api/embed -d '{
  "model": "embeddinggemma:latest",
  "input": "hello"
}'

If your Ollama runtime is older and responds with 404 on /api/embed, Amelia automatically retries the legacy /api/embeddings route.
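
To probe the legacy route by hand, note that older runtimes expect a prompt field rather than input:

curl http://localhost:11434/api/embeddings -d '{
  "model": "embeddinggemma:latest",
  "prompt": "hello"
}'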

Large document study / big PDFs

For document-study prompts against very large manuals or PDFs, Amelia (since v9.16.2) does more than scale budgets from source size: it also derives a safe retrieved-context budget from Ollama's configured num_ctx, then applies that cap all the way through document-packet assembly. This avoids the previous failure mode where ChatController computed a reasonable target but the packet formatter quietly expanded it again for very large books.

The effective policy is now:

  • estimate document size from indexed characters and chunk counts
  • compute a safe retrieved-context ceiling from ollamaNumCtx
  • keep a reserve for system/developer text, history, and the model's answer
  • scale representative coverage and section sweep density to the available budget
  • hard-trim each document-study packet so the formatter cannot outgrow the runtime budget
  • preserve both the beginning and end of oversized prompt sections instead of left-trimming away the tail of the document
  • supplement heading-based section anchors with evenly distributed document spans so late sections and appendixes still receive explicit coverage even when heading extraction is sparse
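
As a rough sketch of the reserve math (the numbers below are illustrative, not Amelia's exact constants):

# Illustrative only: deriving a retrieved-context ceiling from num_ctx
NUM_CTX=32768                  # Ollama's configured context window
RESERVE=$((NUM_CTX / 4))       # held back for system text, history, and the answer
BUDGET=$((NUM_CTX - RESERVE))  # ceiling for retrieved document context
echo "retrieved-context budget: ~${BUDGET} of ${NUM_CTX} tokens"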

By default Amelia uses an auto runtime profile for these limits. If your setup is CPU-only, if Vulkan is unstable or unavailable on a Windows host, or if Ollama misbehaves under heavy load, force a more conservative policy:

export AMELIA_OLLAMA_RUNTIME_PROFILE=cpu

If ollama ps confirms the chat model is genuinely offloaded to GPU and the runtime is stable, you can let Amelia spend a little more of num_ctx on retrieved context:

export AMELIA_OLLAMA_RUNTIME_PROFILE=gpu

Recommended Ollama-side tuning for large document prompts:

OLLAMA_CONTEXT_LENGTH=65536 \
OLLAMA_NUM_PARALLEL=1 \
OLLAMA_MAX_LOADED_MODELS=1 \
OLLAMA_FLASH_ATTENTION=1 \
OLLAMA_KV_CACHE_TYPE=q8_0 \
OLLAMA_KEEP_ALIVE=30m \
ollama serve
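
If Ollama runs as a systemd service rather than in the foreground, the same settings belong in a service drop-in instead (this mirrors the upstream Ollama Linux guidance):

sudo systemctl edit ollama
# add under [Service]:
#   Environment="OLLAMA_CONTEXT_LENGTH=65536"
#   Environment="OLLAMA_NUM_PARALLEL=1"
sudo systemctl restart ollama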

Notes:

  • If Ollama accepts a large grounded request but later returns "model runner has unexpectedly stopped", Amelia (since v9.17.5) retries once with think=false, a lower request num_ctx, and a smaller balanced local-context packet. This fallback is generic and applies to any large document-heavy request; it does not hardcode subject-specific knowledge.
  • OLLAMA_CONTEXT_LENGTH is the main capacity knob for large grounded prompts.
  • OLLAMA_NUM_PARALLEL=1 is important for big prompts because parallel request handling multiplies KV/context memory pressure.
  • OLLAMA_MAX_LOADED_MODELS=1 keeps other models from competing for VRAM or RAM while a large prompt is running.
  • OLLAMA_FLASH_ATTENTION=1 reduces memory pressure at larger context sizes on supported backends.
  • OLLAMA_KV_CACHE_TYPE=q8_0 is a good first compromise when you need more room. If you are desperate for headroom, q4_0 saves more memory but may reduce answer fidelity.
  • OLLAMA_KEEP_ALIVE helps amortize reload cost, but it does not increase prompt capacity.
  • AMELIA_OLLAMA_RUNTIME_PROFILE=cpu tells Amelia to spend a smaller fraction of num_ctx on retrieved context, which is usually the safer choice when ollama ps shows the chat model on CPU.
  • AMELIA_OLLAMA_RUNTIME_PROFILE=gpu lets Amelia be less conservative only when Ollama is truly GPU-backed.

For truly massive corpora (for example, thousands of pages), no single prompt budget is enough to preserve the entire source verbatim. The correct approach is hierarchical coverage: outline extraction, representative section sweeps, and grounded answer synthesis over staged context packets. Amelia now leans further in that direction automatically.

Ollama in Docker

CPU-only quick start:

docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
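
If you have an NVIDIA GPU with the NVIDIA Container Toolkit installed, the same image also runs GPU-backed (this follows Ollama's own Docker instructions; other GPU stacks differ):

docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama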

Then pull the recommended chat model and embedding model inside the container:

docker exec -it ollama ollama pull gpt-oss:20b
docker exec -it ollama ollama pull embeddinggemma:latest

Starting SearXNG search container

Quick container setup:

mkdir -p ./searxng/config ./searxng/data

docker pull docker.io/searxng/searxng:latest

docker run --name searxng -d \
  -p 8080:8080 \
  -v "$(pwd)/searxng/config:/etc/searxng" \
  -v "$(pwd)/searxng/data:/var/cache/searxng" \
  docker.io/searxng/searxng:latest

Amelia expects, by default:

"searxngUrl": "http://127.0.0.1:8080/search"

If you prefer another host port, update Amelia's config accordingly.
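
One common snag: stock SearXNG only serves HTML results, so JSON API queries are rejected with HTTP 403. If Amelia's external searches fail that way, enable the json output format in ./searxng/config/settings.yml and restart the container:

search:
  formats:
    - html
    - json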

Runtime layout

Amelia stores runtime data in:

  • ~/.amelia_qt6/config.json
  • ~/.amelia_qt6/conversations/
  • ~/.amelia_qt6/conversations_index.json
  • ~/.amelia_qt6/memories.json
  • ~/.amelia_qt6/state.json
  • ~/.amelia_qt6/rag_cache.json
  • ~/.amelia_qt6/knowledge/
  • ~/.amelia_qt6/knowledge/collections/
  • ~/.amelia_qt6/knowledge/.amelia_kb_manifest.json
  • ~/.amelia_qt6/workspace/
  • ~/.amelia_qt6/workspace/runtime/

Preferred user config path:

  • ~/.amelia_qt6/config.json

Notes about existing configs

Changing defaults in source files does not overwrite an existing user config.

If you already have:

  • ~/.amelia_qt6/config.json

then its values still win. Update that file manually if you want the new defaults on an existing installation.
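
For a quick single-key edit from the shell (jq required; enableDesktopNotifications is one of the real keys referenced under Troubleshooting below):

jq '.enableDesktopNotifications = true' ~/.amelia_qt6/config.json > /tmp/amelia_config.json \
  && mv /tmp/amelia_config.json ~/.amelia_qt6/config.json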

Note: knowledgeRoot is now normalized under Amelia's active dataRoot, so it can no longer relocate the Knowledge Base outside Amelia's own storage jail.

Knowledge-base behavior

Amelia behaves better with large KBs because:

  • cached KB state can load first
  • stale-cache detection uses a lighter source-level comparison
  • incremental refresh rebuilds only changed/new files
  • prompt preparation no longer blocks the UI thread while retrieval/outline prep runs

Prompt Lab and transcript helpers still present

This build keeps the existing Prompt Lab and transcript helpers, including richer presets, KB-asset references, browse helpers, recipe copy, colored transcript rendering, fenced code formatting, answer copy, and code-block copy actions.

Troubleshooting

I do not receive desktop notifications

Check:

  • enableDesktopNotifications in ~/.amelia_qt6/config.json
  • whether your desktop environment exposes a system tray / notification service
  • whether tray popups are blocked by the shell or Do Not Disturb mode

Amelia falls back to QApplication::alert() when native tray popups are not available, but that fallback is less visible than a real notification balloon.
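
To confirm the notification service itself works, independent of Amelia (notify-send ships in libnotify-bin on Ubuntu):

notify-send 'Amelia test' 'notification service is reachable'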

PDFs do not index

Make sure pdftotext exists:

which pdftotext

If not:

sudo apt install poppler-utils
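
To verify extraction against a specific document (the path here is just an example):

pdftotext /path/to/manual.pdf - | head -n 20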

New defaults did not take effect

Your existing user config is overriding the source defaults. Edit:

~/.amelia_qt6/config.json

Amelia still feels slow with a huge KB

Main things to check:

  • model size in Ollama
  • number of indexed files and chunk count
  • whether the KB is currently refreshing in the background
  • whether your local disk is slow
  • whether Ollama is CPU-only instead of GPU-backed

All aMELia Qt6 features

  • Local-first desktop app built with C++ and Qt6
  • Local Ollama integration for model generation, model refresh, backend probing, and model selection
  • Persistent local state under ~/.amelia_qt6 for config, conversations, memories, summaries, KB cache, collection manifests, and workspace jail data
  • Session management with create, restore, list, and delete conversation workflows
  • Rich transcript view with colored role cards, Markdown rendering, fenced-code rendering, clickable code-copy links, and clipboard copy of the last answer
  • Transcript sanitization that neutralizes raw HTML-like tags before Markdown rendering to avoid broken layouts
  • Exact code-block transcript handling with stable copy links and stronger indentation preservation
  • Manual Memory capture plus persisted memory storage / clearing
  • Prompt-safe memory reuse for stored memories that you save manually, so reused memory text is trimmed and filtered before it is re-injected into later prompts
  • Per-memory deletion UI from the structured Memory tab
  • Memory details panel with description, confidence, pin state, and timestamps for the selected memory
  • Auto-memory disabled by default in this build to avoid prompt-loop feedback; use Manual Memory when you want to persist something intentionally
  • Knowledge Base ingestion from files and folders with preserved collection structure
  • Knowledge Base collections with immutable IDs, user-facing unique labels, rename support, manifest-backed grouping, and a KB root locked under Amelia's data root
  • Knowledge Base inspection with source summary, searchable tree view, collection/folder expanders, sorting by name or file type, remove-selected, and clear-KB actions
  • Knowledge Base prioritization with Use once and Pin actions plus an active-priority panel near the prompt box
  • Incremental indexing so changed assets can be refreshed without rebuilding the entire cache
  • Content-hash reuse so touched-but-unchanged assets can skip reparsing and re-embedding
  • Shared chunk embedding reuse so duplicate chunk text across assets can borrow cached embeddings instead of calling Ollama again
  • Partial-safe cancellation so user-canceled reindexes keep finished work and discard only the in-flight file
  • Tree-view asset moves so Knowledge Base files can be dragged to another collection or folder without re-importing them
  • Asynchronous PDF ingestion and non-blocking KB analysis
  • Semantic retrieval with a real Ollama embedding path plus automatic local fallback
  • Structure-aware chunking that preserves headings, code fences, page markers, and list regions more faithfully
  • Grounded local-source panel showing local evidence used for answers
  • Sanitized external search through SearXNG, with an explicit per-prompt allow checkbox
  • External-source panel showing sanitized external evidence
  • Privacy preview panel showing what context is being shared with the backend
  • Outline planning and outline-first document / procedure generation support
  • Prompt Lab with presets, local asset helpers, KB-asset references, notes / constraints, recipe composition, clipboard copy, and input injection
  • Backend summary panel for runtime/backend/config visibility
  • Diagnostics panel for operational logs and optional reasoning-trace capture
  • Reasoning trace toggle for backend thinking streams when exposed by the selected model/backend
  • Desktop notifications for meaningful task lifecycle events, excluding model refresh/change toasts
  • System tray controls with Show / Hide / Exit actions
  • Busy indicator and response progress bar for long-running operations and streamed answer progress
  • Bootstrap dialog shown immediately at startup while initialization completes
  • Tooltips across the UI for buttons, tabs, lists, and major controls
  • Config-driven behavior with user-overridable defaults in ~/.amelia_qt6/config.json
  • Optional external grounding controls including domain allowlist and timeout configuration
  • Operational diagnostics for backend, search, RAG, startup, planner, memory, and related categories

Cache / index regeneration notes

  • Most code changes do not require a manual forced cache wipe, but Amelia will automatically invalidate older KB caches when the chunking strategy changes.
  • This build upgrades the KB cache format to amelia-rag-cache-v3 and stores per-file content hashes plus per-chunk fingerprints for faster reuse on later reindexes.
  • Moving or renaming assets inside the Knowledge Base does change their stored path / collection metadata, so Amelia refreshes the KB index after those operations.
  • Cancel-index support remains backward-compatible with the partial-safe cache write path.

Recent UI additions

  • The Memory tab now shows persisted entries in a structured table and supports Delete selected for one-at-a-time cleanup.
  • Knowledge Base tab supports live filename/path filtering for indexed assets.
  • Diagnostics includes an optional Capture reasoning trace toggle. When enabled, Amelia asks Ollama for backend thinking streams when supported and also records explicit tagged reasoning notes if the model emits them. This remains intentionally separate from any hidden internal chain-of-thought.
  • Session list includes Delete selected to remove an individual saved conversation from history.
  • Knowledge Base supports Use once and Pin actions so indexed assets can be prioritized for retrieval. One-shot priorities are consumed by the next prompt; pinned assets stay active until cleared. Active priorities are shown in a dedicated panel near the prompt box.
  • Knowledge Base is now the second inspection tab for a faster review workflow.
  • The external-search checkbox now defaults to off on fresh installs/configs.
  • The transcript renderer now sanitizes raw HTML-like fragments before Markdown rendering.
  • For large document-study prompts, Amelia now omits FULL_DOCUMENT_TEXT entirely and relies on the DOCUMENT_OUTLINE_MAP plus SECTION_COVERAGE_PACKET, which prevents huge PDFs from crowding out late sections and overloading Ollama.
  • Document-study payloads are now slimmer overall: fewer coverage hits, a much smaller retrieved-hit sidecar, and a lower local-context budget tuned for stability instead of giant front-loaded packets.
  • Heavy document-study requests now force think=false for the active Ollama call, reducing backend load and avoiding runner crashes on large HLD/manual summaries.
  • Document-study prompts now build a SECTION_COVERAGE_PACKET instead of spending most of the budget on a single front-trimmed full-document blob.
  • Major top-level sections are mapped to chunk anchors and the prompt budget is distributed across those sections, so late chapters survive much more reliably.
  • For document-study requests, the ordinary retrieved-hit appendix is now trimmed much harder so it does not crowd out the section sweep.
  • Prompt diagnostics now also report section_packets so you can verify the new path in one run.
