Error converting moonshotai/Kimi-Dev-72B #52

@artus-dev

Description

Overview

  • Model Arch: Qwen2ForCausalLM
  • Exllamav3 Version: v0.0.3 and v0.0.4

Conversion of moonshotai/Kimi-Dev-72B fails for several bpw variants (6.0_H6, 4.25_H6, 4.0_H6, 3.0_H6) with the following error:
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).

Other bpw variants (8.0_H8, 8.0_H6, 5.0_H6, 3.5_H6) converted successfully. The model itself runs correctly, so there is no issue with raw-weight inference.
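For context on the failure mode: `torch.linalg.cholesky` requires a strictly positive-definite input, and a captured Hessian proxy can easily end up only positive *semi*-definite (or NaN-contaminated, per the warning later in the log). A common generic workaround in quantizers is to retry with increasing diagonal damping. The sketch below is illustrative only, not exllamav3's actual code; the function name, damping schedule, and use of NumPy (which mirrors `torch.linalg.cholesky`'s behavior) are all assumptions.

```python
import numpy as np

def cholesky_with_damping(H, max_tries=8, eps=1e-6):
    """Try Cholesky; on failure, retry with growing diagonal damping.

    Hypothetical sketch of the usual fix for a Hessian that is only
    positive semi-definite. Not exllamav3's actual API.
    """
    # NaN/inf entries make every retry pointless; surface that early.
    if not np.isfinite(H).all():
        raise ValueError("H contains NaN/inf - check the capture pass")
    damp = eps * np.mean(np.diag(H))
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(H)
        except np.linalg.LinAlgError:
            H = H + damp * np.eye(H.shape[0])  # nudge eigenvalues up
            damp *= 10.0
    raise np.linalg.LinAlgError("not positive-definite even after damping")

# A rank-deficient (PSD but not PD) matrix triggers the fallback path:
v = np.array([[1.0], [2.0]])
H = v @ v.T                       # rank 1, smallest eigenvalue is exactly 0
L = cholesky_with_damping(H)      # plain Cholesky fails; damped retry works
```

Damping cannot help here, though, if the matrix already contains NaN values — which the warning at the end of the log suggests is the actual trigger.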

Trace
Traceback (most recent call last):
File "/opt/exl/exllamav3/convert.py", line 11, in <module>
  main(_in_args, _job_state)
File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
  return func(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^
File "/opt/exl/exllamav3/exllamav3/conversion/convert_model.py", line 417, in main
  proxy_err = linear.convert_exl3(
              ^^^^^^^^^^^^^^^^^^^^
File "/opt/exl/exllamav3/exllamav3/modules/linear.py", line 235, in convert_exl3
  weight_q, proxy_err, out_tensors = quantize_exl3(
                                     ^^^^^^^^^^^^^^
File "/opt/exl/exllamav3/exllamav3/modules/quant/exl3_lib/quantize.py", line 781, in quantize_exl3
  H, L, su, H_diag = finalize_capture_H(H_data, quant_args, verbose)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/exl/exllamav3/exllamav3/modules/quant/exl3_lib/quantize.py", line 480, in finalize_capture_H
  L, H = block_ldl(H, 16, verbose)
         ^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/exl/exllamav3/exllamav3/modules/quant/exl3_lib/quantize.py", line 287, in block_ldl
  raise e
File "/opt/exl/exllamav3/exllamav3/modules/quant/exl3_lib/quantize.py", line 274, in block_ldl
  L = torch.linalg.cholesky(H)
      ^^^^^^^^^^^^^^^^^^^^^^^^
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
Conversion log (restart)
                 - [4x] Linear
             - RMSNorm
             - GatedMLP
                 - [3x] Linear
         - RMSNorm
         - Linear
 -- Loaded tokenizer
    Vocab size: 151665
 -- Resuming at: model.layers.79
 -- Loading unquantized module: model.layers.79

 -- Captured: model.layers.79

 -- Quantized: model.layers.79.self_attn.q_proj                         bpw:  6.00  proxy_err: 0.000066  o  g_sc: 0.821332  [2.97 s]

 -- Quantized: model.layers.79.self_attn.k_proj                         bpw:  6.00  proxy_err: 0.000066  o  g_sc: 0.812287  [1.36 s]

 -- Quantized: model.layers.79.self_attn.v_proj                         bpw:  6.00  proxy_err: 0.000098  o  g_sc: 0.835967  [1.36 s]

 -- Quantized: model.layers.79.self_attn.o_proj                         bpw:  6.00  proxy_err: 0.000022  o  g_sc: 0.806696  [2.35 s]

 -- Quantized: model.layers.79.mlp.up_proj                              bpw:  6.00  proxy_err: 0.000023  o  g_sc: 0.812287  [7.09 s]

 -- Quantized: model.layers.79.mlp.gate_proj                            bpw:  6.00  proxy_err: 0.000027  o  g_sc: 0.815741  [6.20 s]

 -- Quantized: model.layers.79.mlp.down_proj                            bpw:  6.00  proxy_err: 0.000008  .  g_sc: 0.821332  [14.25 s]

 -- Quantized: model.layers.79                                          bpw:  6.00  rfn: 0.003657  cos: 0.000007  sqnr: 48.787139  [68.06 s]
 -- Estimated remaining time: 3 minutes
 -- Loading unquantized module: model.norm
 -- Quantized: model.norm                                               bpw: 16.00  rfn: 0.000000  cos: 0.000000  sqnr: 0.000000  [3.01 s]
 -- Estimated remaining time: 1 minute
 -- Loading unquantized module: lm_head

 -- Captured: lm_head
 !! Warning: block state has 0 inf values and 1 NaN values (out of 1,677,721,600)
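That single NaN in the captured block state is likely the root cause: if the quantizer accumulates a Hessian proxy of the form H = XᵀX from captured activations, one NaN poisons an entire row and column of H, and Cholesky then rejects the matrix. A minimal repro of the propagation, under that assumption (the accumulation shape and variable names are illustrative, not exllamav3's actual capture code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # 4 calibration samples, 3 features
X[0, 0] = np.nan                 # one bad value, like the warning reports
H = X.T @ X                      # Hessian-style accumulation
# The NaN spreads to every entry of H that touches feature 0,
# i.e. all of row 0 and column 0:
bad = np.isnan(H)
```

This would explain why only some bpw variants fault: the NaN corrupts H for every variant, but whether the downstream factorization trips over it can vary with quantization settings.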
