diff --git a/.vuepress/config.js b/.vuepress/config.js
index 07995dc801..d2ffea0dc1 100644
--- a/.vuepress/config.js
+++ b/.vuepress/config.js
@@ -271,21 +271,6 @@ module.exports = {
       },
     ],
   },
-  {
-    title: '🧪 Experimental',
-    collapsable: false,
-    path: '/learn/experimental/overview.html',
-    children: [
-      {
-        title: 'Overview',
-        path: '/learn/experimental/overview',
-      },
-      {
-        title: 'Auto-batching',
-        path: '/learn/experimental/auto-batching',
-      },
-    ],
-  },
   {
     title: '๐Ÿ‘ Contributing',
     path: '/learn/contributing/overview.html',
diff --git a/.vuepress/public/_redirects b/.vuepress/public/_redirects
index 7a1e7f8256..72e8755ddc 100644
--- a/.vuepress/public/_redirects
+++ b/.vuepress/public/_redirects
@@ -196,3 +196,6 @@

 # Rename indexation to indexing
 /learn/advanced/indexation.html /learn/advanced/indexing.html
+
+# Remove autobatching
+/learn/experimental/auto-batching.html /learn/core_concepts/documents.html
diff --git a/learn/advanced/indexing.md b/learn/advanced/indexing.md
index fba3c1796f..1f057a45bb 100644
--- a/learn/advanced/indexing.md
+++ b/learn/advanced/indexing.md
@@ -28,16 +28,14 @@ Multi-threading is unfortunately not possible in machines with only one processo

 ## Improving indexing performance

-If you encounter performance issues during the indexing we recommend trying the following points:
+If you encounter performance issues during indexing, we recommend trying the following:

 - Make sure you are using the latest [stable version of Meilisearch](https://github.com/meilisearch/meilisearch/releases). New releases often include performance improvements that can significantly increase indexing speed.

-- indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.
+- Indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.

 - **Bigger HTTP payloads are processed more quickly than smaller payloads**. For example, adding the same 100,000 documents in two batches of 50,000 documents will be quicker than adding them in four batches of 25,000 documents. By default, Meilisearch sets the maximum payload size to 100MB, but [you can change this value if necessary](/learn/configuration/instance_options.md#payload-limit-size). That said, **the bigger the payload is, the higher the memory consumption will be**. An instance may crash if it requires more RAM than is currently available in a machine.

-  - If you want to speed up indexing but don't wish to batch documents manually, consider giving our [experimental auto-batcher](/learn/experimental/auto-batching.md) a try.
-
 - **Meilisearch should not be your main database**. The more documents you add, the longer will indexing and search take, so you should only index documents you want to retrieve when searching.

 - By default, all document fields are searchable. We strongly recommend changing this by [updating the `searchableAttributes` list](/reference/api/settings.md#update-searchable-attributes) so it only contains fields you want to search in. The fewer fields Meilisearch needs to index, the faster is the indexing process.
diff --git a/learn/configuration/instance_options.md b/learn/configuration/instance_options.md
index 4f52c9c50c..01dea3ea05 100644
--- a/learn/configuration/instance_options.md
+++ b/learn/configuration/instance_options.md
@@ -106,6 +106,19 @@ If no master key is provided in a `development` environment, all routes will be

 [Learn more about Meilisearch's use of security keys.](/learn/security/master_api_keys.md)

+### Disable auto-batching
+
+::: warning
+🚩 This is a CLI flag and does not take any values. Assigning a value will throw an error. 🚩
+:::
+
+**Environment variable**: `MEILI_DISABLE_AUTO_BATCHING`
+**CLI option**: `--disable-auto-batching`
+
+Deactivates auto-batching when provided.
+
+[Learn more about auto-batching.](/learn/core_concepts/documents.md#auto-batching)
+
 ### Disable analytics

 ::: warning
diff --git a/learn/core_concepts/documents.md b/learn/core_concepts/documents.md
index 9dffe422f4..92ecce9383 100644
--- a/learn/core_concepts/documents.md
+++ b/learn/core_concepts/documents.md
@@ -124,3 +124,19 @@ Since CSV does not support arrays or nested objects, `cast` cannot be converted
 ::: note
 If you don't specify the data type for an attribute, it will default to `:string`.
 :::
+
+### Auto-batching
+
+Auto-batching combines consecutive document addition requests into a single batch and processes them together. This significantly speeds up the indexing process.
+
+Meilisearch batches document addition requests when they:
+
+- Target the same index
+- Have the same update method (i.e., [POST](/reference/api/documents.md#add-or-replace-documents) or [PUT](/reference/api/documents.md#add-or-update-documents))
+- Are immediately consecutive
+
+Tasks within the same batch share the same values for `startedAt`, `finishedAt`, and `duration`.
+
+If a task fails due to an invalid document, it will be removed from the batch. The rest of the batch will still process normally.
If an [`internal`](/reference/api/overview.md#errors) error occurs, the whole batch will fail and all tasks within it will share the same `error` object.
+
+You can deactivate auto-batching using the `--disable-auto-batching` command-line flag or the `MEILI_DISABLE_AUTO_BATCHING` environment variable. This is useful in cases where you want to avoid any potential bugs in the feature or reduce visibility latency. When auto-batching is disabled, the whole queue takes longer to process, but each individual task will be processed earlier (until a certain number of processed tasks).
diff --git a/learn/experimental/auto-batching.md b/learn/experimental/auto-batching.md
deleted file mode 100644
index 469a2f9a3c..0000000000
--- a/learn/experimental/auto-batching.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# Auto-batching
-
-::: warning
-
-🚨 This is an experimental feature 🚨
-Using it may result in unexpected crashes, bugs, or holes in the space-time continuum.
-You have been warned.
-
-:::
-
-Auto-batching is an experimental feature designed to improve indexing speed.
-
-When auto-batching is enabled, consecutive document addition requests may be automatically combined into a batch and processed together, significantly speeding up the indexing process.
-
-We would appreciate your feedback on this feature. [Join the discussion](https://github.com/meilisearch/meilisearch/discussions/2070).
-
-## Enable auto-batching
-
-To enable auto-batching, start Meilisearch while supplying the `--enable-auto-batching` CLI flag:
-
-```
-./meilisearch --enable-auto-batching
-```
-
-For document addition requests to be added to the same batch, they need to:
-
-- Target the same index
-- Have the same update method (i.e., [POST](/reference/api/documents.md#add-or-replace-documents) or [PUT](/reference/api/documents.md#add-or-update-documents))
-- Be immediately consecutive
-
-By default, **auto-batching will not delay processing a request in order to batch multiple requests together.** If it can process the request immediately, it will. [This behavior can be altered using a command-line option](#customization-options).
-
-After enabling autobatching, the field `batchUid` will appear in all [task API](/reference/api/tasks.md) responses.
-
-::: warning
-
-If even a single task in a batch fails, the entire batch will fail.
-
-:::
-
-## Customization options
-
-There are three command-line options allowing you to customize auto-batching behavior:
-
-- `--debounce-duration-sec`: the number of seconds to wait between receiving a document addition task and beginning the batching and indexation process. **Default: `0`**
-- `--max-batch-size`: the maximum number of tasks per batch. **Default: unlimited**
-- `--max-documents-per-batch`: the maximum number of documents in a batch. **Default: unlimited**
-
-::: tip
-
-- Giving a smaller number for `max-documents-per-batch` will reduce memory use, but slow down indexation
-- Giving a larger number for `max-documents-per-batch` will increase memory use, but speed up indexation
-
-:::
diff --git a/learn/experimental/overview.md b/learn/experimental/overview.md
deleted file mode 100644
index f805d9e18d..0000000000
--- a/learn/experimental/overview.md
+++ /dev/null
@@ -1,15 +0,0 @@
-# Experimental features
-
-## What is an experimental feature?
-
-Meilisearch maintains a high standard for new features.
The process of adding a new feature to our search engine typically begins with a specification, centers around copious unit testing, and ends with the feature's release, barring some minor iteration in subsequent versions.
-
-This is not necessarily the case for experimental features. **Experimental features are features that may not have been tested to our usual standards, and which may be reworked or removed entirely in future versions.** They come with an increased risk of bugs and unforeseen behavior, but some users may find the extra functionality to be worth the risk.
-
-## Giving feedback
-
-**Each experimental feature added to Meilisearch has an associated discussion on our [core GitHub repository](https://github.com/meilisearch/meilisearch/discussions/categories/general?discussions_q=category%3AGeneral+experimental).**
-
-The feedback from these discussions helps us evaluate the success or failure of experimental features, as well as potentially evolve them into a stable, non-experimental state.
-
-Whether you want to share a bug report, an opinion, or a story about your experience with the feature, we are grateful for your contribution!
diff --git a/learn/getting_started/quick_start.md b/learn/getting_started/quick_start.md
index 2ea2caaa1c..7a12014000 100644
--- a/learn/getting_started/quick_start.md
+++ b/learn/getting_started/quick_start.md
@@ -185,7 +185,9 @@ Meilisearch stores data in the form of discrete records, called [documents](/lea
 Currently, Meilisearch only supports [JSON, CSV, and NDJSON formats](/learn/core_concepts/documents.md#dataset-format).
 :::

-The previous command added documents from `movies.json` to a new index called `movies`. After adding documents, you should receive a response like this:
+The previous command added documents from `movies.json` to a new index called `movies`.
+
+By default, Meilisearch combines consecutive document requests into a single batch and processes them together.
This process is called auto-batching, and it significantly speeds up indexing. After adding documents, you should receive a response like this: ```json {