
Doc Scraper removing old index on 2nd run #236

@gmourier

Description

Initially created by @munim
2 days ago

Dear team,

I am trying out Meilisearch and indexing our site using the docs-scraper project from Meilisearch. It worked to some extent, but when I ran the scraper again with the same command, it deleted all the existing documents and started indexing from scratch. Here's what I did:

  1. Created a Docker network and started Meilisearch with Docker:

$ docker run -it --rm \
    -p 7700:7700 \
    -e MEILI_MASTER_KEY='123' \
    -v $(pwd)/meili_data:/meili_data \
    --network="meilisearch-test-01" \
    getmeili/meilisearch:v0.28 \
    meilisearch --env="development"

  2. Created a scraper config file as described in the project README (a minimal sketch of it is shown after this list).

  3. Started the scraper with the following command:

$ docker run -t --rm \
    -e MEILISEARCH_HOST_URL=http://exciting_banach:7700 \
    -e MEILISEARCH_API_KEY=123 \
    --network="meilisearch-test-01" \
    -v `pwd`/test-scraper.config.json:/docs-scraper/config.json \
    getmeili/docs-scraper:latest pipenv run ./docs_scraper config.json

  4. It took around 30 minutes to scrape 50K pages.
  5. I reran the scraper after making some changes to the config.
  6. Now I see that all my previous entries have been removed from Meilisearch and new entries are being added.
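For reference, the config file followed the structure from the docs-scraper README. A minimal sketch of what it contained (the index_uid, start URL, and selectors below are illustrative placeholders, not my exact values):

{
  "index_uid": "docs",
  "start_urls": ["https://docs.example.com/"],
  "selectors": {
    "lvl0": ".navbar h1",
    "lvl1": "article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "text": "article p, article li"
  }
}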

My question is: how can I update the entries on subsequent runs rather than removing the old entries and recreating everything from scratch?
