V0.29: updates to auto-batching #1864

Merged
merged 14 commits into from
Sep 29, 2022
Changes from 6 commits
15 changes: 0 additions & 15 deletions .vuepress/config.js
@@ -271,21 +271,6 @@ module.exports = {
},
],
},
{
title: '🧪 Experimental',
collapsable: false,
path: '/learn/experimental/overview.html',
children: [
{
title: 'Overview',
path: '/learn/experimental/overview',
},
{
title: 'Auto-batching',
path: '/learn/experimental/auto-batching',
},
],
},
{
title: '👐 Contributing',
path: '/learn/contributing/overview.html',
6 changes: 2 additions & 4 deletions learn/advanced/indexing.md
@@ -28,16 +28,14 @@ Multi-threading is unfortunately not possible in machines with only one processo

## Improving indexing performance

If you encounter performance issues during the indexing we recommend trying the following points:
If you encounter performance issues during indexing, we recommend trying the following:

- Make sure you are using the latest [stable version of Meilisearch](https://github.com/meilisearch/meilisearch/releases). New releases often include performance improvements that can significantly increase indexing speed.

- indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.
- Indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.

- **Bigger HTTP payloads are processed more quickly than smaller payloads**. For example, adding the same 100,000 documents in two batches of 50,000 documents will be quicker than adding them in four batches of 25,000 documents. By default, Meilisearch sets the maximum payload size to 100MB, but [you can change this value if necessary](/learn/configuration/instance_options.md#payload-limit-size). That said, **the bigger the payload is, the higher the memory consumption will be**. An instance may crash if it requires more RAM than is currently available in a machine.

- If you want to speed up indexing but don't wish to batch documents manually, consider giving our [experimental auto-batcher](/learn/experimental/auto-batching.md) a try.

- **Meilisearch should not be your main database**. The more documents you add, the longer indexing and search will take, so you should only index documents you want to retrieve when searching.

- By default, all document fields are searchable. We strongly recommend changing this by [updating the `searchableAttributes` list](/reference/api/settings.md#update-searchable-attributes) so it only contains fields you want to search in. The fewer fields Meilisearch needs to index, the faster the indexing process.
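
The payload-size advice in the list above can be sketched as a small helper that splits a dataset into a few large batches rather than many small ones. This is only an illustration and not part of Meilisearch: the function name `split_into_batches` and the sample dataset are assumptions chosen to mirror the 100,000-document example.

```python
from typing import Dict, List


def split_into_batches(documents: List[Dict], batch_size: int) -> List[List[Dict]]:
    """Split documents into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [documents[i:i + batch_size] for i in range(0, len(documents), batch_size)]


# Hypothetical dataset mirroring the example in the text.
documents = [{"id": i} for i in range(100_000)]

# Two payloads of 50,000 documents: fewer, larger requests.
large_batches = split_into_batches(documents, 50_000)

# Four payloads of 25,000 documents: more, smaller requests.
small_batches = split_into_batches(documents, 25_000)

print(len(large_batches), len(small_batches))
```

Each batch would then be serialized and sent as one document addition request. The serialized payload still has to fit within the configured payload limit and the RAM available on the machine, so the batch size is a trade-off rather than "as large as possible".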
13 changes: 13 additions & 0 deletions learn/configuration/instance_options.md
@@ -106,6 +106,19 @@ If no master key is provided in a `development` environment, all routes will be

[Learn more about Meilisearch's use of security keys.](/learn/security/master_api_keys.md)

### Disable auto-batching

::: warning
🚩 This is a CLI flag and does not take any values. Assigning a value will throw an error. 🚩
:::

**Environment variable**: `MEILI_DISABLE_AUTO_BATCHING`
**CLI option**: `--disable-auto-batching`

Deactivates auto-batching when provided.

[Learn more about auto-batching.](/learn/core_concepts/documents.md#auto-batching)
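
As a rough sketch of how the flag behaves, the helper below assembles a launch command and appends `--disable-auto-batching` without a value, as the warning above requires. The helper name `build_launch_command` and the binary path `./meilisearch` are assumptions for illustration only; adjust them to your installation.

```python
from typing import List


def build_launch_command(
    binary: str = "./meilisearch",  # assumed path to the binary
    disable_auto_batching: bool = False,
) -> List[str]:
    """Build an argument list for launching a Meilisearch instance."""
    command = [binary]
    if disable_auto_batching:
        # The flag takes no value, so it is appended on its own.
        command.append("--disable-auto-batching")
    return command


print(build_launch_command(disable_auto_batching=True))
```

The resulting list could be handed to `subprocess.run` to start the instance.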

### Disable analytics

::: warning
14 changes: 14 additions & 0 deletions learn/core_concepts/documents.md
@@ -124,3 +124,17 @@ Since CSV does not support arrays or nested objects, `cast` cannot be converted
::: note
If you don't specify the data type for an attribute, it will default to `:string`.
:::

### Auto-batching

Auto-batching combines consecutive document addition requests into a batch to be processed together. This significantly speeds up the indexing process.

For document addition requests to be added to the same batch, they need to:

- Target the same index
- Have the same update method (i.e., [POST](/reference/api/documents.md#add-or-replace-documents) or [PUT](/reference/api/documents.md#add-or-update-documents))
- Be immediately consecutive

Tasks within the same batch share the same values for `startedAt`, `finishedAt`, and `duration`. If an [internal](/reference/api/overview.md#errors) error occurs, they also share the same `error` object. If a task fails due to an invalid document, it will not be processed with the batch and will have its own error message.

You can deactivate auto-batching using the `--disable-auto-batching` command-line flag or the `MEILI_DISABLE_AUTO_BATCHING` environment variable.
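
The three batching conditions above can be illustrated with a short sketch. It is not Meilisearch's actual scheduler: the `DocumentAdditionTask` class, its fields, and `group_into_batches` are hypothetical names, and the sketch ignores invalid documents and any other limits the engine may apply.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentAdditionTask:
    """Hypothetical stand-in for an enqueued document addition request."""
    uid: int
    index_uid: str
    method: str  # "POST" (add or replace) or "PUT" (add or update)


def group_into_batches(tasks: List[DocumentAdditionTask]) -> List[List[DocumentAdditionTask]]:
    """Group immediately consecutive tasks sharing an index and update method."""
    batches: List[List[DocumentAdditionTask]] = []
    for task in tasks:
        previous = batches[-1][-1] if batches else None
        if (
            previous is not None
            and previous.index_uid == task.index_uid
            and previous.method == task.method
        ):
            # Same index, same method, directly after the previous task.
            batches[-1].append(task)
        else:
            # Any difference starts a new batch.
            batches.append([task])
    return batches


queue = [
    DocumentAdditionTask(0, "movies", "POST"),
    DocumentAdditionTask(1, "movies", "POST"),
    DocumentAdditionTask(2, "movies", "PUT"),
    DocumentAdditionTask(3, "books", "POST"),
    DocumentAdditionTask(4, "movies", "POST"),
]

batched_uids = [[task.uid for task in batch] for batch in group_into_batches(queue)]
print(batched_uids)
```

In this sketch, task 4 targets the same index with the same method as tasks 0 and 1, but it ends up in its own batch because other tasks sit between them in the queue.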
54 changes: 0 additions & 54 deletions learn/experimental/auto-batching.md

This file was deleted.

15 changes: 0 additions & 15 deletions learn/experimental/overview.md

This file was deleted.

4 changes: 3 additions & 1 deletion learn/getting_started/quick_start.md
@@ -185,7 +185,9 @@ Meilisearch stores data in the form of discrete records, called [documents](/lea
Currently, Meilisearch only supports [JSON, CSV, and NDJSON formats](/learn/core_concepts/documents.md#dataset-format).
:::

The previous command added documents from `movies.json` to a new index called `movies`. After adding documents, you should receive a response like this:
The previous command added documents from `movies.json` to a new index called `movies`.

By default, Meilisearch combines consecutive document addition requests into a batch to be processed together. This process is called auto-batching, and it significantly speeds up indexing. After adding documents, you should receive a response like this:

```json
{