Refactor: update ggml library? #133

@Nexesenex

Background Description

Hey IK,

It's getting harder and harder to merge your work into my fork of KoboldCPP. I'm making good progress, but I've now hit the ggml library barrier.

For example, to merge
https://github.com/ikawrakow/ik_llama.cpp/pull/9/files#diff-f028a352a33ee20b42faca7dcc389e8f0f9c9a55e016cccffed45fe90bcc13f8R5907
into a current version of KoboldCPP, I need
ggml-org/ggml#988
because "grad" is no longer a member of ggml_tensor:

"static struct ggml_tensor * ggml_softcap_impl(
        struct ggml_context * ctx,
        struct ggml_tensor  * a,
        float                 s_before,
        float                 s_after,
        bool inplace) {
    GGML_ASSERT(ggml_is_padded_1d(a));

    bool is_node = false;

    if (a->grad) {   // <---------------------------
        is_node = true;
    }"
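To illustrate the kind of mismatch involved, here is a minimal sketch of how a port might isolate the `a->grad` check behind a compile-time switch so the same op code builds against both tensor layouts. Everything in it is hypothetical: `mock_tensor`, `mock_needs_grad`, and the `MOCK_TENSOR_HAS_GRAD` flag are stand-ins invented for this example, not the real ggml struct, API, or build options, and returning `false` on the newer layout is an assumption about what a port would want, not a statement of how ggml#988 behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical build flag (not a real ggml option): set to 1 when building
   against an older tensor layout that still carries a `grad` member. */
#ifndef MOCK_TENSOR_HAS_GRAD
#define MOCK_TENSOR_HAS_GRAD 0
#endif

/* Mock stand-in for a tensor struct; NOT the real ggml_tensor. */
struct mock_tensor {
    float value;
#if MOCK_TENSOR_HAS_GRAD
    struct mock_tensor * grad; /* only present in the older layout */
#endif
};

/* Reports whether the op result should be flagged as a node that needs a
   gradient. On the newer layout there is no per-tensor field to inspect,
   so this sketch simply returns false (an assumption, see above). */
static bool mock_needs_grad(const struct mock_tensor * a) {
    assert(a != NULL);
#if MOCK_TENSOR_HAS_GRAD
    return a->grad != NULL;
#else
    (void) a; /* nothing to check on the newer layout */
    return false;
#endif
}
```

With a helper like this, the `if (a->grad)` test in the op would become a single call, and the member access would live in one place instead of in every op implementation.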

I merged your first batch of IK quants (2, 3, 4, 5, 6) into my KCPP fork and got them working on CUDA, but I'm also having trouble properly refactoring the CUDA side of things for your more recent quants, specifically the dot product template modification. I might be able to handle that one by myself with more digging into the refactoring, but I'm not sure.

Anyway, do you have plans to update IK_Llama's GGML library, or even the whole of Llama.CPP (I'm not asking for the latter, though)? I'd love to keep using your work, and integrating it into my KCPP fork is a very good learning exercise for me. But integrating your work into KCPP without the current ggml library is just too much for me to handle, and so is rebasing everything on IK_Llama, since KCPP mainline follows the development of Llama.CPP, and thus of the ggml library.

Possible Refactor Approaches

For you to decide!
