Some batches never get deleted #5827

@Mubelotix

Describe the bug

The batch deletion task misses some batches, which then become phantom batches: they still exist in the db and can probably be retrieved, but they are NOT included in the list of batches.

To Reproduce

  1. Register 2 tasks that get batched
  2. Delete the first
  3. Delete the second
  4. The batch still exists!

Reproducing code:

    // 1. We're going to autobatch 2 document additions
    // 2. We will delete the first task
    // 3. We will delete the second task
    // 4. The batch should be gone

    let (index_scheduler, mut handle) = IndexScheduler::test(true, vec![]);

    let mut tasks = Vec::new();
    for i in 0..2 {
        let content = format!(
            r#"{{
                    "id": {},
                    "doggo": "bob {}"
                }}"#,
            i, i
        );

        let (uuid, mut file) = index_scheduler.queue.create_update_file_with_uuid(i).unwrap();
        let documents_count = read_json(content.as_bytes(), &mut file).unwrap();
        file.persist().unwrap();
        let task = index_scheduler
            .register(
                KindWithContent::DocumentAdditionOrUpdate {
                    index_uid: S("doggos"),
                    primary_key: Some(S("id")),
                    method: ReplaceDocuments,
                    content_file: uuid,
                    documents_count,
                    allow_index_creation: true,
                },
                None,
                false,
            )
            .unwrap();
        tasks.push(task);
        index_scheduler.assert_internally_consistent();
    }

    handle.advance_one_successful_batch();
    let rtxn = index_scheduler.read_txn().unwrap();
    let batches = index_scheduler.queue.batches.all_batch_ids(&rtxn).unwrap();
    assert_eq!(batches.into_iter().collect::<Vec<_>>().as_slice(), &[0]);

    index_scheduler
        .register(
            KindWithContent::TaskDeletion {
                query: String::from("whatever"),
                tasks: RoaringBitmap::from_iter([tasks[0].uid]),
            },
            None,
            false,
        )
        .unwrap();
    handle.advance_one_successful_batch();
    let rtxn = index_scheduler.read_txn().unwrap();
    let batches = index_scheduler.queue.batches.all_batch_ids(&rtxn).unwrap();
    assert_eq!(batches.into_iter().collect::<Vec<_>>().as_slice(), &[0, 1]);

    index_scheduler
        .register(
            KindWithContent::TaskDeletion {
                query: String::from("whatever"),
                tasks: RoaringBitmap::from_iter([tasks[1].uid]),
            },
            None,
            false,
        )
        .unwrap();
    handle.advance_one_successful_batch();
    let rtxn = index_scheduler.read_txn().unwrap();
    let batches = index_scheduler.queue.batches.all_batch_ids(&rtxn).unwrap();
    assert_eq!(batches.into_iter().collect::<Vec<_>>().as_slice(), &[1, 2]);

    let batch0 = index_scheduler.queue.batches.get_batch(&rtxn, 0).unwrap();
    assert!(batch0.is_none(), "Batch 0 should have been deleted");

Expected behavior

A batch should be deleted once all of its tasks have been deleted.
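
To make the expected invariant concrete, here is a minimal, self-contained sketch. It is not the Meilisearch implementation: `Queue`, `add_batch`, `delete_tasks`, and `batch_exists` are hypothetical names, and the in-memory maps only stand in for the real database. It assumes each task belongs to at most one batch.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified model of the queue; not Meilisearch's real types.
#[derive(Default)]
struct Queue {
    /// batch id -> ids of the tasks still attached to that batch
    batch_tasks: BTreeMap<u32, BTreeSet<u32>>,
    /// task id -> id of the batch that processed it
    task_batch: BTreeMap<u32, u32>,
}

impl Queue {
    /// Record that `batch` processed the given tasks.
    fn add_batch(&mut self, batch: u32, tasks: &[u32]) {
        for &task in tasks {
            self.task_batch.insert(task, batch);
        }
        self.batch_tasks.entry(batch).or_default().extend(tasks.iter().copied());
    }

    /// Delete tasks one by one, dropping a batch as soon as it has no task left.
    /// The check runs on every deletion, so the batch is removed even when its
    /// tasks are deleted across several separate calls.
    fn delete_tasks(&mut self, tasks: &[u32]) {
        for task in tasks {
            let Some(batch) = self.task_batch.remove(task) else { continue };
            let now_empty = match self.batch_tasks.get_mut(&batch) {
                Some(remaining) => {
                    remaining.remove(task);
                    remaining.is_empty()
                }
                None => false,
            };
            if now_empty {
                self.batch_tasks.remove(&batch);
            }
        }
    }

    /// Whether the batch is still stored.
    fn batch_exists(&self, batch: u32) -> bool {
        self.batch_tasks.contains_key(&batch)
    }
}

fn main() {
    let mut queue = Queue::default();
    queue.add_batch(0, &[0, 1]);

    // Deleting the first task must keep the batch alive.
    queue.delete_tasks(&[0]);
    assert!(queue.batch_exists(0));

    // Deleting the second (last) task must remove the batch.
    queue.delete_tasks(&[1]);
    assert!(!queue.batch_exists(0));
}
```

The point of the sketch is only that the emptiness check has to look at the tasks remaining in the batch after each deletion, not just at the tasks deleted in the current operation.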

Meilisearch version:

branch main

Additional context

I'm already working on a fix; this issue is just to keep track.
