lj_str_new hash collisions are severe when string length is larger than 128 #60

@stone-wind

Description

bad

In lj_str_hash_128_above:

 lj_str_hash_128_above(const char* str,  uint32_t len)
  chunk_num = 16;
  chunk_sz = len / chunk_num;
  chunk_sz_log2 = log2_floor(chunk_sz);

  pos1 = get_random_pos_unsafe(chunk_sz_log2, 0);
  pos2 = get_random_pos_unsafe(chunk_sz_log2, 1);
 /* loop over 14 chunks, 2 chunks at a time */
  for (i = 0, chunk_ptr = str; i < (chunk_num / 2 - 1);
       chunk_ptr += chunk_sz, i++) {  /* <-- the increment in question */

    v = *cast_uint64p(chunk_ptr + pos1);
    h1 = _mm_crc32_u64(h1, v);

    v = *cast_uint64p(chunk_ptr + chunk_sz + pos2);
    h2 = _mm_crc32_u64(h2, v);
  }
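
1. When len is the same, chunk_sz, chunk_sz_log2, pos1, and pos2 are always the same, so every string of that length is sampled at the same offsets.
2. In the loop, should chunk_ptr advance by 2*chunk_sz instead? As written, each iteration reads two chunks but advances by only one, so the loop covers only the first half of the string.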
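
To make point 2 concrete, here is a minimal sketch that replays only the loop's index arithmetic and counts how many of the 16 chunks it touches. `chunks_visited` and `stride_chunks` are hypothetical names introduced for illustration; the sketch ignores pos1/pos2 and does no hashing, so it only shows which chunks serve as a base for the two reads in each iteration.

```c
#include <stdint.h>

#define CHUNK_NUM 16

/* Count the distinct chunks used as a read base by the sampling loop.
 * stride_chunks = 1 mirrors the quoted loop (chunk_ptr += chunk_sz);
 * stride_chunks = 2 mirrors the suggested chunk_ptr += 2*chunk_sz.
 * Hypothetical helper, not part of LuaJIT. */
static uint32_t chunks_visited(uint32_t stride_chunks)
{
  uint8_t seen[CHUNK_NUM] = {0};
  uint32_t i, chunk_idx, count = 0;

  /* Same trip count as the original loop: chunk_num / 2 - 1 = 7 iterations. */
  for (i = 0, chunk_idx = 0; i < (CHUNK_NUM / 2 - 1);
       chunk_idx += stride_chunks, i++) {
    seen[chunk_idx] = 1;      /* read at chunk_ptr + pos1, feeds h1 */
    seen[chunk_idx + 1] = 1;  /* read at chunk_ptr + chunk_sz + pos2, feeds h2 */
  }

  for (i = 0; i < CHUNK_NUM; i++)
    count += seen[i];
  return count;
}
```

Under these assumptions, `chunks_visited(1)` gives 8 (chunks 0 to 7, the first half of the string), while `chunks_visited(2)` gives 14, which matches the "loop over 14 chunks" comment in the source. For len = 527, chunk_sz is 32, so with a stride of one chunk the loop bases stay within the first 8 * 32 = 256 bytes.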

When I use sock:receive(len) to receive strings of the same length (527 bytes) that are similar (only 36 bytes differ, all in the second half), hash collisions are very frequent and CPU usage can reach 100%.

The effect in my test is severe: it stalls everything else (because recv is non-blocking) and I can only receive about 200 msg/s. Too much time is spent in str_fastcmp, called from this check in lj_str_new:
if (sx->len == len && sx->hash == h && str_fastcmp(str, strdata(sx), len) == 0) {
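
The cost of that check when many strings share a length and a hash can be sketched as follows. This is a simplified model, not LuaJIT's actual string table: the flat array `chain`, the functions `intern` and `compares_for`, and the fixed hash value are all hypothetical, and `memcmp` stands in for str_fastcmp. It counts how many full comparisons are needed to intern n distinct strings that all collide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_STRS 64
#define STR_LEN  527

/* One interned string; stands in for a collision-chain entry. */
typedef struct Entry {
  uint32_t len;
  uint32_t hash;
  char data[STR_LEN];
} Entry;

static Entry chain[MAX_STRS];
static size_t chain_len = 0;
static size_t full_compares = 0;

/* Look the string up; append it if no existing entry matches. */
static size_t intern(const char *str, uint32_t len, uint32_t hash)
{
  size_t i;
  for (i = 0; i < chain_len; i++) {
    Entry *sx = &chain[i];
    if (sx->len == len && sx->hash == hash) {
      full_compares++;  /* length and hash matched: full compare needed */
      if (memcmp(str, sx->data, len) == 0)
        return i;
    }
  }
  chain[chain_len].len = len;
  chain[chain_len].hash = hash;
  memcpy(chain[chain_len].data, str, len);
  return chain_len++;
}

/* Intern n (at most MAX_STRS) distinct strings of equal length that all
 * carry the same hash value, and return the number of full comparisons. */
static size_t compares_for(size_t n)
{
  char buf[STR_LEN];
  size_t k;

  chain_len = 0;
  full_compares = 0;
  memset(buf, 'a', sizeof buf);
  for (k = 0; k < n && k < MAX_STRS; k++) {
    buf[STR_LEN - 1] = (char)k;  /* strings differ only in the last byte */
    intern(buf, STR_LEN, 0x1234u);
  }
  return full_compares;
}
```

In this model the k-th new string is compared against all k earlier ones, so n colliding strings cost n(n-1)/2 full comparisons: `compares_for(10)` gives 45 and `compares_for(64)` gives 2016. Each comparison may scan most of the 527 bytes when the strings share a long common prefix, which is consistent with the CPU saturation described above.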
