save_load_state example segfaulting after adding Metal inference #1737

Closed
@JohannesGaessler

Expected Behavior

The example saves and loads a state.

Current Behavior

The example crashes with a segmentation fault.

Environment and Context

According to git bisect, the first commit that causes the segmentation fault is master-ecb217d, the commit that added Metal inference.

Hardware:

  • Physical (or virtual) hardware you are using, e.g. for Linux:

$ lscpu

  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         43 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 7 3700X 8-Core Processor
    CPU family:          23
    Model:               113
    Thread(s) per core:  2
    Core(s) per socket:  8
    Socket(s):           1
    Stepping:            0
    Frequency boost:     enabled
    CPU(s) scaling MHz:  77%
    CPU max MHz:         4935.9370
    CPU min MHz:         2200.0000
    BogoMIPS:            7202.09
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
Virtualization features: 
  Virtualization:        AMD-V
Caches (sum of all):     
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    4 MiB (8 instances)
  L3:                    32 MiB (2 instances)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-15
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Mitigation; untrained return thunk; SMT enabled with STIBP protection
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
  • Operating System, e.g. for Linux:

$ uname -a
Linux johannes-pc 6.3.0-1-MANJARO #1 SMP PREEMPT_DYNAMIC Mon Apr 3 10:46:56 UTC 2023 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
Python 3.10.10
GNU Make 4.4.1
g++ (GCC) 12.2.1 20230201

Steps to Reproduce

git checkout master-ecb217d
make clean && make save-load-state
./save-load-state --model path/to/model.bin

Failure Logs

The GDB output for the segfault:

Thread 1 "save-load-state" received signal SIGSEGV, Segmentation fault.
0x000055555556e5fd in ggml_view_3d (ctx=0x55555569da68 <g_state+200>, a=0x7ffb83bff030, ne0=6656, ne1=6, ne2=60, nb1=13312, nb2=6815744, offset=0)
    at ggml.c:5901
5901        memcpy(offs->data, &offset, 2*sizeof(int32_t));
(gdb) bt
#0  0x000055555556e5fd in ggml_view_3d (ctx=0x55555569da68 <g_state+200>, a=0x7ffb83bff030, ne0=6656, ne1=6, ne2=60, nb1=13312, nb2=6815744, 
    offset=0) at ggml.c:5901
#1  0x000055555559b73e in llama_copy_state_data (ctx=0x5555556b22c0, dst=0x7ffac15c9010 ":\032") at llama.cpp:2751
#2  0x000055555555afa7 in main (argc=3, argv=0x7fffffffd778) at examples/save-load-state/save-load-state.cpp:59
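
The faulting line in the trace above is a `memcpy` into `offs->data`. One scenario that would produce a SIGSEGV at such a line is a destination pointer that was never backed by memory; this report does not confirm that this is the cause here. As a minimal sketch of guarding that kind of copy, the following uses a hypothetical `demo_tensor` struct and `demo_store_offset` helper, neither of which is part of ggml:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a tensor header; NOT ggml's struct ggml_tensor. */
struct demo_tensor {
    void  *data;   /* NULL when no backing buffer was allocated */
    size_t nbytes; /* capacity of data in bytes */
};

/* Copy a view offset into t->data. Instead of crashing, return -1 when the
 * buffer is missing or too small; return 0 on success.
 * Note: this copies sizeof(offset) bytes, whereas the line in the trace
 * copies 2*sizeof(int32_t); the sizes are an assumption of this sketch. */
static int demo_store_offset(struct demo_tensor *t, size_t offset) {
    assert(t != NULL);
    if (t->data == NULL || t->nbytes < sizeof(offset)) {
        return -1;
    }
    memcpy(t->data, &offset, sizeof(offset));
    return 0;
}
```

In GDB, printing `offs->data` in frame #0 would show whether the destination pointer is actually null in this crash.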

Labels: bug