Commit f2d994a

bors[bot], guimachiavelli, maryamsulemani97, and curquiza authored
Merge #1859
1859: v0.29 r=maryamsulemani97 a=guimachiavelli

This is a staging PR for all changes related to Meilisearch v0.29. Please avoid making changes directly to this PR; instead, create new child branches based off this one.

Closes meilisearch/integration-guides#213, #1854, #1853, #1852, #1851, #1840, #1839, #1838, #1837, #1846

Co-authored-by: gui machiavelli <[email protected]>
Co-authored-by: Maryam Sulemani <[email protected]>
Co-authored-by: gui machiavelli <[email protected]>
Co-authored-by: Maryam <[email protected]>
Co-authored-by: Clémentine Urquizar <[email protected]>
2 parents 16a5e33 + c434cd4 commit f2d994a

20 files changed: +174 -152 lines changed

.code-samples.meilisearch.yaml

Lines changed: 16 additions & 0 deletions
@@ -986,3 +986,19 @@ getting_started_pagination: |-
 --data-binary '{
 "maxTotalHits": 500
 }'
+search_parameter_guide_matching_strategy_1: |
+curl \
+-X POST 'http://localhost:7700/indexes/movies/search' \
+-H 'Content-Type: application/json' \
+--data-binary '{
+"q": "big fat liar",
+"matchingStrategy": "last"
+}'
+search_parameter_guide_matching_strategy_2: |
+curl \
+-X POST 'http://localhost:7700/indexes/movies/search' \
+-H 'Content-Type: application/json' \
+--data-binary '{
+"q": "big fat liar",
+"matchingStrategy": "all"
+}'
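
The two new samples differ only in the `matchingStrategy` value. As a minimal Python sketch of the same request, assuming a local instance at the URL used in the samples: the helper name `build_search_request` and the `MATCHING_STRATEGIES` tuple are hypothetical and not part of any Meilisearch SDK. The helper only constructs the request; it does not send it.

```python
import json
from urllib.request import Request

# Only the two values that appear in the code samples are accepted here.
MATCHING_STRATEGIES = ("last", "all")


def build_search_request(query, matching_strategy,
                         host="http://localhost:7700", index="movies"):
    """Build (but do not send) a POST search request with a matching strategy."""
    if matching_strategy not in MATCHING_STRATEGIES:
        raise ValueError(f"unsupported matchingStrategy: {matching_strategy!r}")
    body = json.dumps({"q": query, "matchingStrategy": matching_strategy})
    return Request(
        f"{host}/indexes/{index}/search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("big fat liar", "last")
# Pass `request` to urllib.request.urlopen() to run it against a live instance.
```

Rejecting unknown values on the client side is a design choice of this sketch, not documented server behavior.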

.vuepress/config.js

Lines changed: 1 addition & 16 deletions
@@ -1,6 +1,6 @@
 const ogprefix = 'og: http://ogp.me/ns#'
 module.exports = {
-title: 'Meilisearch Documentation v0.28',
+title: 'Meilisearch Documentation v0.29',
 description: 'Open source Instant Search Engine',
 theme: 'default-prefers-color-scheme',
 themeConfig: {
@@ -271,21 +271,6 @@ module.exports = {
 },
 ],
 },
-{
-title: '🧪 Experimental',
-collapsable: false,
-path: '/learn/experimental/overview.html',
-children: [
-{
-title: 'Overview',
-path: '/learn/experimental/overview',
-},
-{
-title: 'Auto-batching',
-path: '/learn/experimental/auto-batching',
-},
-],
-},
 {
 title: '👐 Contributing',
 path: '/learn/contributing/overview.html',

.vuepress/public/_redirects

Lines changed: 3 additions & 0 deletions
@@ -196,3 +196,6 @@

 # Rename indexation to indexing
 /learn/advanced/indexation.html /learn/advanced/indexing.html
+
+# Remove autobatching
+/learn/experimental/auto-batching.html /learn/core_concepts/documents.html

.vuepress/public/postman/meilisearch-collection.json

Lines changed: 8 additions & 3 deletions
@@ -1,7 +1,7 @@
 {
 "info": {
-"_postman_id": "453f3754-4b37-4654-8fe3-b2b3aef24048",
-"name": "Meilisearch v0.28",
+"_postman_id": "5ad97bb3-840b-40dc-9012-482a0c2c52a5",
+"name": "Meilisearch v0.29",
 "schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
 "_exporter_id": "8898306"
 },
@@ -79,7 +79,7 @@
 "header": [],
 "body": {
 "mode": "raw",
-"raw": "[\n { \"id\": 2, \"title\": \"Pride and Prejudice\", \"author\": \"Jane Austin\", \"genre\": \"romance\", \"price\": 3.5 },\n { \"id\": 456, \"title\": \"Le Petit Prince\", \"author\": \"Antoine de Saint-Exupéry\", \"genre\": \"adventure\" , \"price\": 10.0 },\n { \"id\": 1, \"title\": \"Alice In Wonderland\", \"author\": \"Lewis Carroll\", \"genre\": \"fantasy\", \"price\": 25.99 },\n { \"id\": 1344, \"title\": \"The Hobbit\", \"author\": \"J. R. R. Tolkien\", \"genre\": \"fantasy\" },\n { \"id\": 4, \"title\": \"Harry Potter and the Half-Blood Prince\", \"author\": \"J. K. Rowling\", \"genre\": \"fantasy\" },\n { \"id\": 42, \"title\": \"The Hitchhiker's Guide to the Galaxy\", \"author\": \"Douglas Adams\" }\n]",
+"raw": "[\n { \"id\": 2, \"title\": \"Pride and Prejudice\", \"author\": \"Jane Austen\", \"genre\": \"romance\", \"price\": 3.5 },\n { \"id\": 456, \"title\": \"Le Petit Prince\", \"author\": \"Antoine de Saint-Exupéry\", \"genre\": \"adventure\" , \"price\": 10.0 },\n { \"id\": 1, \"title\": \"Alice In Wonderland\", \"author\": \"Lewis Carroll\", \"genre\": \"fantasy\", \"price\": 25.99 },\n { \"id\": 1344, \"title\": \"The Hobbit\", \"author\": \"J. R. R. Tolkien\", \"genre\": \"fantasy\" },\n { \"id\": 4, \"title\": \"Harry Potter and the Half-Blood Prince\", \"author\": \"J. K. Rowling\", \"genre\": \"fantasy\" },\n { \"id\": 42, \"title\": \"The Hitchhiker's Guide to the Galaxy\", \"author\": \"Douglas Adams\" }\n]",
 "options": {
 "raw": {
 "language": "json"
@@ -309,6 +309,11 @@
 "key": "highlightPostTag",
 "value": "</mark>",
 "disabled": true
+},
+{
+"key": "matchingStrategy",
+"value": "all",
+"disabled": true
 }
 ]
 }

.vuepress/public/sample-template.yaml

Lines changed: 2 additions & 0 deletions
@@ -152,3 +152,5 @@ synonyms_guide_1: |-
 getting_started_faceting: |-
 getting_started_pagination: |-
 getting_started_front_end_integration_md: |-
+search_parameter_guide_matching_strategy_1: |-
+search_parameter_guide_matching_strategy_2: |-

learn/advanced/filtering_and_faceted_search.md

Lines changed: 11 additions & 1 deletion
@@ -102,10 +102,14 @@ The [`GET` route of the search endpoint](/reference/api/search.md#search-in-an-i

 String expressions combine conditions using the following filter operators and parentheses:

-- `NOT` only returns documents that do not satisfy a condition : `NOT genres = horror`
+- `NOT` returns all documents that do not satisfy a condition. The expression `NOT genres = horror` returns all documents whose `genres` do not contain `horror` and all documents missing a `genres` field
 - `AND` operates by connecting two conditions and only returns documents that satisfy both of them: `genres = horror AND director = 'Jordan Peele'`
 - `OR` connects two conditions and returns results that satisfy at least one of them: `genres = horror OR genres = comedy`
 - `TO` is equivalent to `>= AND <=`. The expression `release_date 795484800 TO 972129600` translates to `release_date >= 795484800 AND release_date <= 972129600`
+- `IN [valueA, valueB, …, valueN]` selects all documents whose chosen field contains at least one of the specified values. The expression `genres IN [horror, comedy]` returns all documents whose `genres` includes either `horror`, `comedy`, or both
+- `EXISTS` checks for the existence of a field. Fields with empty or null values still count as existing. The expression `release_date NOT EXISTS` returns all documents without a `release_date`
+
+When creating an expression with a field name or value identical to a filter operator such as `AND` or `NOT`, you must wrap it in quotation marks: `title = "NOT" OR title = "AND"`.

 ::: tip
 String expressions are read left to right. `NOT` takes precedence over `AND` and `AND` takes precedence over `OR`. You can use parentheses to ensure expressions are correctly parsed.
@@ -220,6 +224,12 @@ You can use this filter when searching for `Planet of the Apes`:

 <CodeSamples id="filtering_guide_3" />

+`NOT director = "Tim Burton"` will include both documents that do not contain `"Tim Burton"` in its `director` field and documents without a `director` field. To return only documents that have a `director` field, expand the filter expression with the `EXISTS` operator:
+
+```SQL
+rating >= 3 AND (NOT director = "Tim Burton" AND director EXISTS)
+```
+
 ## Filtering with `_geoRadius`

 If your documents contain `_geo` data, you can use the `_geoRadius` built-in filter rule to filter results according to their geographic position.
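
The `NOT`, `IN`, and `EXISTS` semantics added in this file can be sketched over plain Python dictionaries. This is only a toy model under stated assumptions: the helper names (`exists`, `equals`, `is_in`) and the sample documents are hypothetical, and the code does not parse filter strings or call Meilisearch.

```python
def field_values(doc, field):
    """Return a field's values as a list; a scalar counts as a one-item list."""
    value = doc[field]
    return value if isinstance(value, list) else [value]


def exists(doc, field):
    # A field with an empty or null value still counts as existing.
    return field in doc


def equals(doc, field, target):
    # `field = target` can only match when the field is present.
    return exists(doc, field) and target in field_values(doc, field)


def is_in(doc, field, targets):
    # `field IN [a, b]` matches when the field contains at least one target.
    return any(equals(doc, field, t) for t in targets)


docs = [
    {"id": 1, "director": "Tim Burton", "genres": ["fantasy"]},
    {"id": 2, "director": "Jordan Peele", "genres": ["horror", "comedy"]},
    {"id": 3, "genres": ["horror"]},   # no `director` field
    {"id": 4, "director": None},       # null value, but the field exists
]

# NOT director = "Tim Burton"
not_burton = [d["id"] for d in docs if not equals(d, "director", "Tim Burton")]

# NOT director = "Tim Burton" AND director EXISTS
not_burton_with_director = [
    d["id"] for d in docs
    if not equals(d, "director", "Tim Burton") and exists(d, "director")
]

# genres IN [horror, comedy]
horror_or_comedy = [d["id"] for d in docs if is_in(d, "genres", ["horror", "comedy"])]

print(not_burton, not_burton_with_director, horror_or_comedy)
# prints: [2, 3, 4] [2, 4] [2, 3]
```

Document 3 shows why the extra `EXISTS` clause matters: it has no `director` field, so it passes the bare `NOT` filter but is dropped once `director EXISTS` is required.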

learn/advanced/indexing.md

Lines changed: 2 additions & 4 deletions
@@ -28,16 +28,14 @@ Multi-threading is unfortunately not possible in machines with only one processo

 ## Improving indexing performance

-If you encounter performance issues during the indexing we recommend trying the following points:
+If you encounter performance issues during indexing, we recommend trying the following:

 - Make sure you are using the latest [stable version of Meilisearch](https://github.com/meilisearch/meilisearch/releases). New releases often include performance improvements that can significantly increase indexing speed.

-- indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.
+- Indexing is a memory-intensive and multi-threaded operation. This means **the more memory and processor cores available, the faster Meilisearch will index new documents**. When trying to improve indexing speed, using a machine with more processor cores is more effective than increasing RAM.

 - **Bigger HTTP payloads are processed more quickly than smaller payloads**. For example, adding the same 100,000 documents in two batches of 50,000 documents will be quicker than adding them in four batches of 25,000 documents. By default, Meilisearch sets the maximum payload size to 100MB, but [you can change this value if necessary](/learn/configuration/instance_options.md#payload-limit-size). That said, **the bigger the payload is, the higher the memory consumption will be**. An instance may crash if it requires more RAM than is currently available in a machine.

-- If you want to speed up indexing but don't wish to batch documents manually, consider giving our [experimental auto-batcher](/learn/experimental/auto-batching.md) a try.
-
 - **Meilisearch should not be your main database**. The more documents you add, the longer will indexing and search take, so you should only index documents you want to retrieve when searching.

 - By default, all document fields are searchable. We strongly recommend changing this by [updating the `searchableAttributes` list](/reference/api/settings.md#update-searchable-attributes) so it only contains fields you want to search in. The fewer fields Meilisearch needs to index, the faster is the indexing process.

learn/advanced/storage.md

Lines changed: 6 additions & 0 deletions
@@ -71,3 +71,9 @@ These metrics are highly dependent on the machine that is running Meilisearch. R
 It is important to note that **there is no reliable way to predict the final size of a database**. This is true for just about any search engine on the market—we're just the only ones saying it out loud.

 Database size is affected by a large number of criteria, including settings, relevancy rules, use of facets, the number of different languages present, and more.
+
+## Soft deletion
+
+Meilisearch renders deleted documents inaccessible to all users but does not immediately remove them from the database. This is a common optimization technique called soft deletion. Soft deleted documents are permanently deleted during a later update, depending on your index size and the available disk space. It might be important to check how soft deletion interacts with data retention legislation relevant to your application.
+
+Soft deletion also affects document updates: when you update a document, Meilisearch removes the current record and creates a new document with updated data.
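
The soft-deletion behavior described in this new section can be illustrated with a toy in-memory store. This is a sketch of the general technique only: the `SoftDeleteStore` class and its methods are hypothetical and do not reflect Meilisearch's actual storage engine, and `purge` is a stand-in for whatever later update triggers permanent removal.

```python
class SoftDeleteStore:
    """Toy in-memory store; not Meilisearch's actual storage engine."""

    def __init__(self):
        self._records = {}          # internal id -> document
        self._live = {}             # user-facing id -> internal id
        self._soft_deleted = set()  # internal ids awaiting permanent removal
        self._next_internal_id = 0

    def add_or_replace(self, doc):
        # An update soft-deletes the current record and writes a new one.
        if doc["id"] in self._live:
            self._soft_deleted.add(self._live[doc["id"]])
        internal_id = self._next_internal_id
        self._next_internal_id += 1
        self._records[internal_id] = doc
        self._live[doc["id"]] = internal_id

    def delete(self, doc_id):
        # The record becomes unreachable but still occupies space.
        self._soft_deleted.add(self._live.pop(doc_id))

    def get(self, doc_id):
        internal_id = self._live.get(doc_id)
        return None if internal_id is None else self._records[internal_id]

    def stored_count(self):
        return len(self._records)

    def purge(self):
        # Stand-in for the later update that permanently removes records.
        for internal_id in self._soft_deleted:
            del self._records[internal_id]
        self._soft_deleted.clear()


store = SoftDeleteStore()
store.add_or_replace({"id": 1, "title": "Draft"})
store.add_or_replace({"id": 1, "title": "Final"})  # update: old record is soft-deleted
store.delete(1)                                    # inaccessible, but still stored
```

In this model, the store still holds two records after `delete(1)` even though `get(1)` returns `None`; they only disappear once `purge()` runs.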

learn/advanced/tokenization.md

Lines changed: 1 addition & 0 deletions
@@ -21,5 +21,6 @@ Pipelines include many language-specific operations. Currently, we have four pip
 2. A specialized Chinese pipeline using [Jieba](https://github.com/messense/jieba-rs)
 3. A specialized Japanese pipeline using [Lindera](https://github.com/lindera-morphology/lindera)
 4. A specialized Hebrew pipeline based off the default Meilisearch pipeline. Uses [Niqqud](https://docs.rs/niqqud/latest/niqqud/) for normalization
+5. A specialized Thai pipeline using [dictionary-based](https://github.com/PyThaiNLP/nlpo3) segmentation

 For more details, check out the [tokenizer contribution guide](https://github.com/meilisearch/charabia/blob/main/CONTRIBUTING.md).

learn/configuration/instance_options.md

Lines changed: 13 additions & 0 deletions
@@ -106,6 +106,19 @@ If no master key is provided in a `development` environment, all routes will be

 [Learn more about Meilisearch's use of security keys.](/learn/security/master_api_keys.md)

+### Disable auto-batching
+
+::: warning
+🚩 This is a CLI flag and does not take any values. Assigning a value will throw an error. 🚩
+:::
+
+**Environment variable**: `MEILI_DISABLE_AUTO_BATCHING`
+**CLI option**: `--disable-auto-batching`
+
+Deactivates auto-batching when provided.
+
+[Learn more about auto-batching.](/learn/core_concepts/documents.md#auto-batching)
+
 ### Disable analytics

 ::: warning

learn/core_concepts/documents.md

Lines changed: 16 additions & 0 deletions
@@ -124,3 +124,19 @@ Since CSV does not support arrays or nested objects, `cast` cannot be converted
 ::: note
 If you don't specify the data type for an attribute, it will default to `:string`.
 :::
+
+### Auto-batching
+
+Auto-batching combines consecutive document addition requests into a single batch and processes them together. This significantly speeds up the indexing process.
+
+Meilisearch batches document addition requests when they:
+
+- Target the same index
+- Have the same update method (i.e., [POST](/reference/api/documents.md#add-or-replace-documents) or [PUT](/reference/api/documents.md#add-or-update-documents))
+- Are immediately consecutive
+
+Tasks within the same batch share the same values for `startedAt`, `finishedAt`, and `duration`.
+
+If a task fails due to an invalid document, it will be removed from the batch. The rest of the batch will still process normally. If an [`internal`](/reference/api/overview.md#errors) error occurs, the whole batch will fail and all tasks within it will share the same `error` object.
+
+You can deactivate auto-batching using the `--disable-auto-batching` command-line flag or the `MEILI_DISABLE_AUTO_BATCHING` environment variable. This is useful in cases where you want to avoid any potential bugs in the feature or reduce visibility latency. When auto-batching is disabled, the whole queue takes longer to process, but each individual task will be processed earlier (until a certain number of processed tasks).
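
The three batching conditions listed in this section (same index, same update method, immediately consecutive) map naturally onto grouping adjacent items by a key. The sketch below is a simplified model, assuming tasks are plain dictionaries with hypothetical `uid`, `index`, and `method` fields; the real scheduler may apply further constraints that the text does not describe.

```python
from itertools import groupby


def auto_batch(tasks):
    """Group immediately consecutive tasks sharing an index and update method."""
    return [
        list(group)
        for _, group in groupby(tasks, key=lambda t: (t["index"], t["method"]))
    ]


queue = [
    {"uid": 0, "index": "movies", "method": "POST"},
    {"uid": 1, "index": "movies", "method": "POST"},
    {"uid": 2, "index": "movies", "method": "PUT"},
    {"uid": 3, "index": "books", "method": "PUT"},
    {"uid": 4, "index": "movies", "method": "POST"},
]

batches = [[task["uid"] for task in batch] for batch in auto_batch(queue)]
print(batches)
# prints: [[0, 1], [2], [3], [4]]
```

Task 4 targets the same index with the same method as tasks 0 and 1, but it is not batched with them because other tasks sit in between.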

learn/experimental/auto-batching.md

Lines changed: 0 additions & 54 deletions
This file was deleted.

learn/experimental/overview.md

Lines changed: 0 additions & 15 deletions
This file was deleted.

learn/getting_started/quick_start.md

Lines changed: 4 additions & 2 deletions
@@ -51,7 +51,7 @@ These commands launch the **latest stable release** of Meilisearch.

 ```bash
 # Fetch the latest version of Meilisearch image from DockerHub
-docker pull getmeili/meilisearch:v0.28
+docker pull getmeili/meilisearch:v0.29

 # Launch Meilisearch in development mode with a master key
 docker run -it --rm \
@@ -181,7 +181,9 @@ Meilisearch stores data in the form of discrete records, called [documents](/lea
 Meilisearch currently only accepts data in JSON, NDJSON, and CSV formats. You can read more about this in our [documents guide](/learn/core_concepts/documents.md#dataset-format).
 :::

-The previous command added documents from `movies.json` to a new index called `movies`. After adding documents, you should receive a response like this:
+The previous command added documents from `movies.json` to a new index called `movies`.
+
+By default, Meilisearch combines consecutive document requests into a single batch and processes them together. This process is called [auto-batching](/learn/core_concepts/documents.md#auto-batching), and it significantly speeds up indexing. After adding documents, you should receive a response like this:

 ```json
 {

learn/what_is_meilisearch/language.md

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,7 @@ Meilisearch is multilingual, featuring optimized support for:
 - Chinese
 - Japanese
 - Hebrew
+- Thai

 We aim to provide global language support, and your feedback helps us move closer to that goal. If you notice inconsistencies in your search results or the way your documents are processed, please [open an issue in the Meilisearch repository](https://github.com/meilisearch/meilisearch/issues/new/choose).

@@ -29,6 +30,7 @@ Under the hood, Meilisearch relies on tokenizers that identify the most importan
 - A pipeline specifically tailored for Chinese
 - A pipeline specifically tailored for Japanese
 - A pipeline specifically tailored for Hebrew
+- A pipeline specifically tailored for Thai

 ### My language does not use whitespace to separate words. Can I still use Meilisearch?

learn/what_is_meilisearch/overview.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ _Meilisearch helps the Rust community find crates on [crates.meilisearch.com](ht
 - **Blazing fast** (answers < 50 milliseconds): Priority is given to fast answers for a smooth search experience.
 - [Search as you type](/learn/what_is_meilisearch/features.md#search-as-you-type): Results are updated on each keystroke. To make this possible, we use [prefix-search](/learn/advanced/prefix.md#prefix-search).
 - [Typo tolerance](/learn/what_is_meilisearch/features.md#typo-tolerant): Understands typos and misspellings.
-- [Tokenization](/learn/advanced/tokenization.md) in **English**, **Chinese**, and **all languages that uses space as a word divider**.
+- [Tokenization](/learn/advanced/tokenization.md) in **English**, **Chinese**, and **all languages that use space as a word divider**.
 - **Return the whole document**: The entire document is returned upon search.
 - **Highly customizable search and indexing**:
 - [Custom ranking](/learn/core_concepts/relevancy.md): Customize the relevancy of the search engine and the ranking of the search results.
