Bytes fastfield codec mismatch #1278

@PSeitz

A bytes fast field consists of two parts: the raw bytes data, and a fast field index used to find the bytes for a doc.
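To make the two parts concrete, here is a minimal sketch of that layout. `BytesColumn`, `from_values` and `get_bytes` are hypothetical names for illustration, not tantivy's actual types, and the offsets are held in a plain `Vec<u64>` rather than an encoded fast field.

```rust
/// Hypothetical, simplified model of a bytes fast field (not tantivy's real types).
struct BytesColumn {
    /// Index part: start offset of each doc's bytes, plus one final end offset.
    offsets: Vec<u64>,
    /// Data part: all byte values concatenated in doc order.
    data: Vec<u8>,
}

impl BytesColumn {
    /// Builds the column from one byte value per doc.
    fn from_values(values: &[&[u8]]) -> Self {
        let mut offsets = Vec::with_capacity(values.len() + 1);
        let mut data = Vec::new();
        for value in values {
            offsets.push(data.len() as u64);
            data.extend_from_slice(value);
        }
        // The trailing offset marks the end of the last doc's bytes.
        offsets.push(data.len() as u64);
        BytesColumn { offsets, data }
    }

    /// Returns the bytes of `doc` by looking up its start and end offsets.
    /// Panics if `doc` is out of range.
    fn get_bytes(&self, doc: usize) -> &[u8] {
        let start = self.offsets[doc] as usize;
        let end = self.offsets[doc + 1] as usize;
        &self.data[start..end]
    }
}
```

In this sketch the `offsets` vector plays the role of the fast field index, which is the part affected by the codec mismatch described below.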

When a bytes fast field is written for the first time, the index is always written with the bitpack codec:

let mut doc_index_serializer =
    serializer.new_u64_fast_field_with_idx(self.field, 0, self.vals.len() as u64, 0)?;

When merging indices, generic code is used, which auto-detects the best codec:

fn write_1_n_fast_field_idx_generic<T: MultiValueLength>(
    field: Field,
    fast_field_serializer: &mut CompositeFastFieldSerializer,
    doc_id_mapping: &SegmentDocIdMapping,
    reader_and_field_accessors: &[(&SegmentReader, T)],

The index part of the bytes field is always read as a bitpacked serialized index, even though the merge may have written it with a different codec. This can lead to the following error after a merge:

thread 'merge_thread0' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `2`: Tried to open fast field as bitpacked encoded (id=1), but got serializer with different id', C:\Users\ChillFish8\.cargo\git\checkouts\tantivy-f70b7ea03dadae9a\13a4473\src\fastfield\reader.rs:160:9
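The mismatch can be reproduced in a toy model. The sketch below is not tantivy's reader: it assumes a hypothetical layout where the first byte of the serialized index is the codec id, and `open_as_bitpacked` is a made-up function standing in for a reader that hard-codes the bitpacked codec. Only the id values (1 for bitpacked, 2 for the codec found after the merge) are taken from the panic message above.

```rust
/// Codec id of the bitpacked codec, as reported in the panic message (id=1).
const BITPACKED_ID: u8 = 1;
/// Id of the codec found after the merge (2 in the panic message).
const OTHER_CODEC_ID: u8 = 2;

/// Hypothetical reader that only accepts a bitpacked index.
/// Assumes the first byte of `serialized` is the codec id.
fn open_as_bitpacked(serialized: &[u8]) -> Result<&[u8], String> {
    match serialized.split_first() {
        Some((&id, payload)) if id == BITPACKED_ID => Ok(payload),
        Some((&id, _)) => Err(format!(
            "tried to open as bitpacked (id={}), but got id={}",
            BITPACKED_ID, id
        )),
        None => Err("empty index".to_string()),
    }
}

/// Hypothetical reader that inspects the codec id instead of assuming one.
fn detect_codec_id(serialized: &[u8]) -> Option<u8> {
    serialized.first().copied()
}
```

With this model, an index written on first serialization (`[BITPACKED_ID, ..]`) opens fine, while an index rewritten by the merge with another codec (`[OTHER_CODEC_ID, ..]`) is rejected by `open_as_bitpacked`. A reader that checks the stored id first, as `detect_codec_id` does, could dispatch to the matching codec instead of asserting.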
