Tail input with a wildcard path matching many files causes a backlog of data output to Loki #945

@Ants5

Bug Report

Describe the bug
Fluent Bit collects logs and sends them to Loki. The tail input plugin uses the wildcard pattern *_charging-us-prod_*.log for the log file path. When the pattern matches many log files (about 20), output to Loki becomes very slow and data accumulates on the input side.

To Reproduce
Setting Path to /var/log/containers/*_charging-us-prod_*.log causes the data accumulation.

My collection configuration is as follows:

    [INPUT]
        Name                tail
        Tag                 charging-us-prod-loki.*
        Exclude_Path        /var/log/containers/apisix-dashboard*
        Path                /var/log/containers/*_charging-us-prod_*.log
        Parser              cri
        DB                  /var/fluent-bit/state/loki-charging-us-prod.db
        Mem_Buf_Limit       100MB
        Skip_Long_Lines     On
        Skip_Empty_Lines    On
        Refresh_Interval    20
        Rotate_Wait         20
        storage.type        filesystem
        #Read_from_Head      False

    [FILTER] 
        Name                grep
        Match               charging-us-prod.*
        Exclude             level TRACE
    [FILTER]
        Name                kubernetes
        Match               charging-us-prod-loki.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     charging-us-prod-loki.var.log.containers.
        Merge_Log           On
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
    [FILTER]
        Name                nest
        Match               charging-us-prod-loki.*
        Operation           lift
        Nested_under        kubernetes
        Add_prefix          kubernetes.
    [FILTER]
        Name                record_modifier
        Match               charging-us-prod-loki.*
        Record              area   usa
        Remove_key          kubernetes.pod_id
        Remove_key          kubernetes.docker_id
        Remove_key          kubernetes.container_hash
        Remove_key          kubernetes.container_image
        #Remove_key          kubernetes.pod_name
        #Remove_key          kubernetes.namespace_name
    [FILTER]
        Name                nest
        Match               charging-us-prod-loki.*
        Operation           nest
        Wildcard            kubernetes.*
        Nest_under          kubernetes
        Remove_prefix       kubernetes.
        
    [OUTPUT]
        Name                loki
        Match               charging-us-prod-loki.*
        host                url
        port                80
        http_user           username
        http_passwd         password
        tenant_id           loki
        Labels              project=Charging,cluster=us-prod,service_name=$kubernetes['container_name'],namespace=$kubernetes['namespace_name'],pod=$kubernetes['pod_name'],container=$kubernetes['container_name'],nodename=$kubernetes['host']
        auto_kubernetes_labels on
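
To confirm how many files a single input ends up tailing, the Path and Exclude_Path values from the configuration above can be checked against a file listing. The following is a minimal sketch that only approximates Fluent Bit's wildcard matching with Python's fnmatch; the helper name `tailed_files` and the sample file names are hypothetical and are not taken from the affected cluster.

```python
from fnmatch import fnmatchcase

# Patterns copied from the [INPUT] section above.
PATH = "/var/log/containers/*_charging-us-prod_*.log"
EXCLUDE_PATH = "/var/log/containers/apisix-dashboard*"


def tailed_files(candidates, path=PATH, exclude=EXCLUDE_PATH):
    """Return candidates that match `path` and do not match `exclude`.

    This is an approximation: fnmatch lets `*` match `/`, which a
    shell-style glob does not, so keep all candidates in one directory.
    """
    return [
        name
        for name in candidates
        if fnmatchcase(name, path) and not fnmatchcase(name, exclude)
    ]


# Hypothetical file names, for illustration only.
sample = [
    "/var/log/containers/billing-7d9f_charging-us-prod_billing-0a1b.log",
    "/var/log/containers/apisix-dashboard-5c6d_charging-us-prod_dashboard-2c3d.log",
    "/var/log/containers/coredns-1234_kube-system_coredns-4e5f.log",
]

matched = tailed_files(sample)
print(len(matched), "file(s) would be tailed by this input")
```

On a node, the candidate list could instead come from `glob.glob("/var/log/containers/*.log")`, which gives the actual number of files behind this one input.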

If each input's Path matches only one or two files, the accumulation does not occur. However, that requires a separate input for each log file, which makes the configuration too long.

Your Environment
fluent-bit version: aws-for-fluent-bit:2.32.5.20250212
kubernetes version: 1.31
loki version: 3.4.2
