Integrate Custom KV Cache Compression Method

I want to integrate my custom KV cache compression method into the lm-evaluation-harness project. I have already tried inserting the method into the _model_generate and _model_call functions. This changed the KV cache size, but the model's accuracy remained unchanged.
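
One possible reading of this symptom, sketched as a toy and not taken from the harness code: if a call scores the whole sequence in a single forward pass, the outputs are computed before the returned cache is compressed, so shrinking the cache afterwards cannot change them. Only a later decoding step that actually reads the compressed cache would see a difference. The snippet below illustrates this with plain-Python single-head attention; `attend`, `keep_last`, and all the numbers are hypothetical and stand in for the real model and the real compression method.

```python
import math

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

def keep_last(cache, n):
    """Hypothetical compression: keep only the n most recent cache entries."""
    return cache[-n:]

# Toy keys/values for a 4-token prompt (arbitrary illustrative numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
query = [0.3, 0.7]

# Case 1: one full forward pass. The output is computed from the full
# keys/values, so compressing the cache afterwards has no effect on it.
full_pass_output = attend(query, keys, values)
compressed_keys = keep_last(keys, 2)
compressed_values = keep_last(values, 2)

# Case 2: a later decoding step that attends over the compressed cache.
decode_output = attend(query, compressed_keys, compressed_values)

cache_shrunk = len(compressed_keys) < len(keys)
decode_changed = any(abs(a - b) > 1e-9
                     for a, b in zip(full_pass_output, decode_output))
print("cache shrunk:", cache_shrunk)
print("output changed when the compressed cache is read:", decode_changed)
```

If this reading applies, the cache size would shrink while single-pass scores stay the same, which matches what is described above. Whether it applies depends on how the two functions use the cache in this setup, which the toy does not establish.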