
feat: parallelize text and multimodal processing in process_document_complete #227

Closed
ndcorder wants to merge 2 commits into HKUDS:main from ndcorder:feat/parallel-text-multimodal

Conversation


@ndcorder ndcorder commented Mar 21, 2026

Text insertion and multimodal processing in process_document_complete currently run sequentially but don't share any state. This PR runs them concurrently with asyncio.gather, so total time is max(text, multimodal) instead of text + multimodal.

Uses return_exceptions=True so if one branch fails the other still completes.
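A minimal sketch of the change described above. The two stage functions here are illustrative stand-ins, not the actual raganything/processor.py internals:

```python
import asyncio

# Hypothetical stand-ins for the two pipeline stages; the real code lives
# in raganything/processor.py and has different signatures.
async def insert_text(doc_id: str) -> str:
    await asyncio.sleep(0.02)  # simulate text insertion latency
    return f"text:{doc_id}"

async def process_multimodal(doc_id: str) -> str:
    await asyncio.sleep(0.03)  # simulate multimodal processing latency
    return f"multimodal:{doc_id}"

async def process_document_complete(doc_id: str) -> list:
    # Run both branches concurrently: wall time is ~max of the two stages,
    # not their sum. return_exceptions=True keeps one branch's failure
    # from cancelling the sibling (see the review below for the caveat
    # this introduces).
    return await asyncio.gather(
        insert_text(doc_id),
        process_multimodal(doc_id),
        return_exceptions=True,
    )

results = asyncio.run(process_document_complete("doc-1"))
```

Note that asyncio.gather returns results in the order the awaitables were passed, regardless of which finishes first.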

@ndcorder ndcorder force-pushed the feat/parallel-text-multimodal branch from eede248 to ab1167c on March 21, 2026 at 09:31
@LarFii LarFii (Collaborator) commented Mar 24, 2026

Thanks for your contribution!

I found one P1 issue. In raganything/processor.py, the new parallel branch uses asyncio.gather(..., return_exceptions=True) and only logs per-task failures. That means process_document_complete() no longer propagates text/multimodal failures, so the outer error handler is skipped and the method can still emit on_document_complete / "processing complete" even when one branch failed. This is a behavioral regression from main and can silently report partially ingested documents as successful. I think the PR should preserve fail-fast semantics here, or at minimum aggregate task errors and raise after both tasks finish. The new test test_one_branch_failing_does_not_block_the_other currently locks in the broken behavior rather than detecting it.

ndcorder and others added 2 commits April 21, 2026 17:30
These two steps are independent so there's no reason to wait for text
insertion before starting multimodal. Uses asyncio.gather with
return_exceptions so one failing doesn't kill the other.
…owed

asyncio.gather(..., return_exceptions=True) made process_document_complete
silently report partially ingested documents as successful: the outer
try/except never saw the failure, so on_document_error was skipped and
on_document_complete still fired.

Keep return_exceptions=True so a failure in one branch does not cancel
the sibling (preserves the parallelism win), but after gather collect
all task exceptions and re-raise. If both branches fail, raise a single
RuntimeError that names the secondary failures and chains the first
exception via __cause__.

Update the regression test to assert that:
- the non-failing branch still completes
- the failure propagates out of process_document_complete
- on_document_error fires and on_document_complete does not

Add a new test covering the both-branches-failing aggregation path.

Made-with: Cursor
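The collect-and-reraise pattern in the commit message above can be sketched as follows; the task names are hypothetical, and only the error-handling shape reflects the described fix:

```python
import asyncio

# Illustrative tasks standing in for the text and multimodal branches.
async def ok_branch() -> str:
    return "done"

async def failing_branch() -> str:
    raise ValueError("text insertion failed")

async def run_both() -> str:
    # return_exceptions=True lets both branches run to completion, so a
    # failure in one does not cancel the sibling...
    results = await asyncio.gather(
        ok_branch(), failing_branch(), return_exceptions=True
    )
    # ...but instead of only logging, collect the exceptions and re-raise
    # so the caller's error handling still fires.
    errors = [r for r in results if isinstance(r, BaseException)]
    if len(errors) == 1:
        raise errors[0]
    if errors:
        # Both branches failed: raise one error naming the secondary
        # failures and chaining the first via __cause__.
        raise RuntimeError(
            f"multiple branches failed: {[repr(e) for e in errors[1:]]}"
        ) from errors[0]
    return "all branches succeeded"

try:
    asyncio.run(run_both())
    outcome = "no error"
except ValueError as e:
    outcome = str(e)
```

Here the single-failure path re-raises the original exception unchanged, which preserves the exception type the outer handler expects.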
@LarFii LarFii force-pushed the feat/parallel-text-multimodal branch from ab1167c to 5a703a3 on April 21, 2026 at 09:38
@LarFii LarFii (Collaborator) commented Apr 21, 2026

@ndcorder Thanks again for putting time into this. After a closer review (and another pass by Codex on the latest commit) we found that the parallelization as implemented has correctness issues that go deeper than just the swallowed-exception fix:

  1. text-only documents lose multimodal_processed=True — the else branch calls _mark_multimodal_processing_complete(doc_id) before asyncio.gather(...) runs the text insertion. At that point LightRAG hasn't created doc_status yet, so the mark call silently no-ops.

  2. mixed text + multimodal documents race on doc_status — _update_doc_status_with_chunks_type_aware does a read/append/write on chunks_list, while LightRAG's ainsert does a full overwrite of the same record. In main this works because multimodal runs strictly after text insertion; under asyncio.gather any interleaving (multimodal reads None and skips, or text overwrites multimodal's chunks_list/multimodal_processed) is observable. This is not fixable with a lock because LightRAG's internal upsert is not partial-merge-aware.

The "text and multimodal are independent" assumption in the PR description doesn't hold for the integration step — only the description-generation phase is truly independent.

Separately, RAG-Anything's multimodal pipeline is now being merged into LightRAG itself (HKUDS/LightRAG#2830 merged into the dev branch on 2026-03-25, with parser-alignment notes in docs/RAGAnythingParserAlignment.md). A lot of process_document_complete's shape will change once the upstream-side integration lands here, so it doesn't make sense to ship a partial parallelization of the current code.

Closing this PR for now. If you'd like to revisit parallelism after the upstream integration settles, the right shape is probably:

  • split _process_multimodal_content into generate_multimodal_descriptions (pure LLM, parallelizable) and integrate_multimodal_into_lightrag (must be serialized after text insertion)
  • gather(text_insert, generate_multimodal_descriptions) → then await integrate_multimodal_into_lightrag
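The two-phase shape suggested above could look roughly like this. All function names are taken from the review comment or invented for illustration — none of this is an existing API:

```python
import asyncio

async def text_insert(doc_id: str) -> str:
    return f"inserted:{doc_id}"

async def generate_multimodal_descriptions(doc_id: str) -> list:
    # Pure LLM work: no shared state, safe to run alongside text insertion.
    return [f"description-of-image-in-{doc_id}"]

async def integrate_multimodal_into_lightrag(doc_id: str, descriptions: list) -> str:
    # Touches doc_status, so it must run strictly after text insertion.
    return f"integrated {len(descriptions)} item(s) into {doc_id}"

async def process_document_complete(doc_id: str) -> str:
    # Phase 1: parallelize the genuinely independent work.
    _, descriptions = await asyncio.gather(
        text_insert(doc_id),
        generate_multimodal_descriptions(doc_id),
    )
    # Phase 2: serialize the integration step behind text insertion.
    return await integrate_multimodal_into_lightrag(doc_id, descriptions)

result = asyncio.run(process_document_complete("doc-1"))
```

This keeps the parallelism win for the expensive LLM phase while avoiding the doc_status race entirely, since the only writer after text insertion is the serialized integration step.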

Really appreciate the contribution and the clean test suite — sorry this one isn't going in.

@LarFii LarFii closed this Apr 21, 2026