Feature request: support high-speed import by merging multiple .dump files
#845
Replies: 2 comments
-
I built a small Rust tool, meilisearch-dumper, that does exactly this: it generates dump files from JSON while keeping all the index settings intact.

./meilisearch-dumper --index aaa --files aaa.json --index bbb --files bbb.json

This works around the default 100MB HTTP request size limit in Meilisearch (the limit can be raised, but doing so still incurs performance overhead during ingestion) and improves data import efficiency. However, it comes with the limitation of requiring a brand-new Meilisearch instance.
-
Thanks for sharing your use case @adysec. Unfortunately, it's a bit niche and not a current priority for us. We'll keep the issue open so others can upvote and share their own use cases too.
-
I need to import a very large number of documents, so I spun up several self-hosted Meilisearch instances (each one ingests different indexes) to improve write throughput.
Now I’m trying to merge all those indexes into a single self-hosted Meilisearch instance.
Importing via the HTTP API is far too slow; there’s no performance gain compared with writing the data directly.
After reading the docs and running experiments, I found that only the /dumps endpoint exports fast enough, but Meilisearch doesn’t let me import multiple .dump files at once.

What I tried
1. Unpack the .dump files (they’re just tar.gz archives), e.g. dumps/aaa.dump & dumps/bbb.dump to dumps/aaa/index/aaa & dumps/bbb/index/bbb.
2. Merge the contents of their index/ directories, e.g. so that dumps/aaa/index/ contains both dumps/aaa/index/aaa and dumps/aaa/index/bbb.
3. Inside dumps/aaa, re-pack the contents to create a merged test.dump.
4. Import the merged test.dump into the target instance.
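The unpack, merge, and re-pack steps above can be sketched in Python roughly as follows. This is only an illustration, not the commands originally used: the function name `merge_dumps` is hypothetical, and the `index/` directory name is taken from the paths in this post, so it may need adjusting for the layout of a real dump. Any other metadata inside the dumps is left untouched.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def merge_dumps(base_dump: Path, extra_dumps: list[Path], out_dump: Path,
                index_dir: str = "index") -> list[str]:
    """Unpack every dump, copy the extra index folders into the base, re-pack.

    Assumes each dump is a tar.gz archive with one sub-directory per index
    under `index_dir` (an assumption based on the paths quoted above).
    """
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)

        # Step 1: unpack the dump that will receive the other indexes.
        base_root = work / "base"
        with tarfile.open(base_dump, "r:gz") as tar:
            tar.extractall(base_root)

        # Step 2: unpack each extra dump and copy its index folders across.
        for n, dump in enumerate(extra_dumps):
            extra_root = work / f"extra{n}"
            with tarfile.open(dump, "r:gz") as tar:
                tar.extractall(extra_root)
            for src in sorted((extra_root / index_dir).iterdir()):
                dst = base_root / index_dir / src.name
                if dst.exists():
                    # Refuse to overwrite an index that already exists.
                    raise FileExistsError(f"index {src.name!r} is in more than one dump")
                shutil.copytree(src, dst)

        # Step 3: re-pack the merged tree, keeping paths relative to the root.
        with tarfile.open(out_dump, "w:gz") as tar:
            for entry in sorted(base_root.iterdir()):
                tar.add(entry, arcname=entry.name)

        return sorted(p.name for p in (base_root / index_dir).iterdir())
```

For example, `merge_dumps(Path("dumps/aaa.dump"), [Path("dumps/bbb.dump")], Path("dumps/test.dump"))` would write a merged archive and return the index names it contains. It only copies files, so it does nothing about the settings loss described in the result.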
Result: both aaa and bbb indexes appear in the target instance, but the searchableAttributes and displayedAttributes settings of bbb are lost and must be re-set manually. After resetting them I haven’t noticed any problems, but I don’t know whether hidden integrity issues remain.
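Re-applying the lost settings by hand can be scripted. Below is a minimal sketch, assuming a Meilisearch v1.x instance where index settings are updated with `PATCH /indexes/{index_uid}/settings`. The helper name `build_settings_request`, the host, and the attribute names are placeholders, not values from this thread.

```python
import json
import urllib.request


def build_settings_request(base_url, index_uid, searchable, displayed, api_key=None):
    """Build (but do not send) a settings update request for one index."""
    body = json.dumps({
        "searchableAttributes": searchable,
        "displayedAttributes": displayed,
    }).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the instance is started with a master key.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/indexes/{index_uid}/settings",
        data=body,
        headers=headers,
        method="PATCH",
    )


# Hypothetical values: restore the two settings on the "bbb" index.
req = build_settings_request(
    "http://localhost:7700", "bbb",
    searchable=["title", "body"],
    displayed=["*"],
)
# To actually send it: urllib.request.urlopen(req)
```

The settings update is processed asynchronously as a task, so it may take a moment to show up on the index.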
Why this matters
This “unpack-merge-re-pack” workflow is extremely fast, orders of magnitude faster than the HTTP API, yet clearly unintended and a bit error-prone.
Feature request
Make this high-speed dump-merge workflow an officially supported method so users can efficiently
Ideally, add a second mode to meilisearch-importer: