Indexing scheduling unbalanced for Kafka source #5747

Open
@rdettai

Description

Describe the bug
When using a SourceToScheduleType::NonSharded source (e.g. Kafka), the current implementation of the indexing scheduler seems to systematically colocate all pipelines of a given source on the same indexer. For the Kafka source, this prevents distributing the indexing load of a given topic across indexers.

Note that this problem was already reported here. The proposed solution of setting a small cpu_capacity does not work because the scheduler scales the capacities to fit the workload before assigning the pipelines to the nodes.
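
To illustrate why a small cpu_capacity does not help, here is a toy model. It is not Quickwit's actual scheduler code: the proportional scaling rule, the first-fit placement, and all function and node names are assumptions made for this sketch. Only the 4 CPU per Kafka pipeline figure comes from this issue.

```python
PIPELINE_LOAD_MCPU = 4000  # 4 CPUs per Kafka pipeline, the hardcoded value cited in this issue


def scale_capacities(capacities_mcpu, total_load_mcpu):
    """Scale node capacities up proportionally until they cover the total load.

    Assumed behavior for this toy model, not the real scaling rule.
    """
    total_capacity = sum(capacities_mcpu.values())
    if total_capacity >= total_load_mcpu:
        return dict(capacities_mcpu)
    # Integer ceiling division keeps the arithmetic exact.
    return {
        node: -(-cap * total_load_mcpu // total_capacity)
        for node, cap in capacities_mcpu.items()
    }


def place(sources, capacities_mcpu):
    """Place each source's pipelines on the first node that still has room.

    A stand-in for the real placement logic (hypothetical).
    """
    remaining = dict(capacities_mcpu)
    assignment = {}
    for source, num_pipelines in sources.items():
        per_node = {node: 0 for node in capacities_mcpu}
        for _ in range(num_pipelines):
            for node in remaining:
                if remaining[node] >= PIPELINE_LOAD_MCPU:
                    remaining[node] -= PIPELINE_LOAD_MCPU
                    per_node[node] += 1
                    break
            else:
                raise RuntimeError(f"no room left for a pipeline of {source}")
        assignment[source] = per_node
    return assignment


# Two indexers configured with a deliberately small cpu_capacity (1 CPU each).
capacities = {"indexer-1": 1000, "indexer-2": 1000}
sources = {"topic-a": 2, "topic-b": 2}

total_load = sum(sources.values()) * PIPELINE_LOAD_MCPU
scaled = scale_capacities(capacities, total_load)
print(scaled)
print(place(sources, scaled))
```

In this model the small capacities are inflated to 8 CPUs per node before placement, so each node has room for both pipelines of one topic, and each topic ends up entirely on a single indexer.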

Steps to reproduce (if applicable)
See test in comments.

Expected behavior
Pipelines with high throughput should be more or less evenly distributed across indexers.

Possible solutions

  1. Measure the actual load for each Kafka source (currently hardcoded to 4 CPUs) and use that for scheduling. This increases the risk of a rebalancing ping-pong between the control plane and the Kafka rebalancing protocol.
  2. For each source, first try to limit the maximum number of pipelines that can be assigned to each node according to its unscaled original capacity.
  3. (variant of 2) Re-introduce a source parameter like max_num_pipelines_per_indexer so that users can at least manually force the distribution of the load for given sources/topics across nodes. This parameter would be pretty hard to configure properly (and hard to maintain for fluctuating workloads).
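
A rough sketch of what option 2 could look like, under stated assumptions. The function names, the "at least one pipeline per node" rule, and the round-robin fallback for overflow are all hypothetical choices made for illustration; only the 4 CPU per pipeline figure comes from this issue.

```python
PIPELINE_LOAD_MCPU = 4000  # 4 CPUs per Kafka pipeline, the hardcoded value cited in this issue


def max_pipelines_per_node(original_capacities_mcpu):
    """Per-node cap for one source, derived from the unscaled capacity.

    Assumption: every node may take at least one pipeline, so that a node
    smaller than one pipeline is not excluded entirely.
    """
    return {
        node: max(1, cap // PIPELINE_LOAD_MCPU)
        for node, cap in original_capacities_mcpu.items()
    }


def spread_source(num_pipelines, original_capacities_mcpu):
    """Assign one source's pipelines while respecting the per-node cap first."""
    if not original_capacities_mcpu:
        raise ValueError("at least one indexer is required")
    limits = max_pipelines_per_node(original_capacities_mcpu)
    assignment = {}
    remaining = num_pipelines
    # First pass: fill nodes in order, but never beyond their cap.
    for node, limit in limits.items():
        take = min(limit, remaining)
        assignment[node] = take
        remaining -= take
    # Fallback: pipelines that exceed all caps are spread round-robin.
    # What to do here is a design choice; this is only one possibility.
    nodes = list(assignment)
    for i in range(remaining):
        assignment[nodes[i % len(nodes)]] += 1
    return assignment


indexers = {"indexer-1": 4000, "indexer-2": 4000, "indexer-3": 4000}
print(spread_source(3, indexers))
print(spread_source(5, indexers))
```

With three 4 CPU indexers, three pipelines land on three different nodes instead of being packed together, and the cap is only exceeded once every node already holds its share.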

EDIT:
4) Add a "num cpu per pipeline" parameter to the source, to make it possible to inform Quickwit that some Kafka topics do not require such a large amount of CPU.
5) (variant of 4) Add an "average data rate" parameter to the source, which would have the same effect as "num cpu per pipeline" but would be easier for the user to configure (Quickwit would internally convert the bandwidth to CPUs).
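
A minimal sketch of the conversion that option 5 implies, assuming a simple linear relationship between data rate and CPU. The reference throughput of 20 MB/s for a full 4 CPU pipeline, the minimum load, and the function name are assumptions made for this example, not values confirmed by this issue.

```python
import math

FULL_PIPELINE_MCPU = 4000              # 4 CPUs, the hardcoded value cited in this issue
ASSUMED_FULL_PIPELINE_MB_PER_SEC = 20  # hypothetical throughput of a fully loaded pipeline
MIN_PIPELINE_MCPU = 100                # hypothetical floor so an idle topic still counts


def pipeline_load_mcpu(average_mb_per_sec):
    """Convert a user-provided average data rate into a per-pipeline CPU load."""
    if average_mb_per_sec < 0:
        raise ValueError("average data rate must be non-negative")
    load = math.ceil(
        average_mb_per_sec * FULL_PIPELINE_MCPU / ASSUMED_FULL_PIPELINE_MB_PER_SEC
    )
    # Clamp: a single pipeline is never counted above the full pipeline load.
    return min(FULL_PIPELINE_MCPU, max(MIN_PIPELINE_MCPU, load))


print(pipeline_load_mcpu(5))   # a topic at a quarter of the reference rate
print(pipeline_load_mcpu(40))  # a topic above the reference rate is clamped
```

Under these assumptions a topic averaging 5 MB/s would be scheduled as a 1 CPU pipeline rather than a 4 CPU one, which lets the scheduler place more such pipelines per node without scaling capacities.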

Configuration:
Main (but same behavior in 0.8).

Labels: bug