CyCle1024

6 followers · 19 following

Shanghai
UTC +08:00

Popular repositories

AdaptiveGEMM Public

Forked from deepseek-ai/DeepGEMM

AdaptiveGEMM: FP8 GEMM with Adaptation to Various Lengths of Group M

Cuda 1

accelerate Public

Forked from huggingface/accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision

Python

dlinfer Public

Forked from DeepLink-org/dlinfer

Python

lmdeploy Public

Forked from InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python

DeepEP Public

Forked from deepseek-ai/DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda

GroupedGEMM Public

Forked from fanshiqing/grouped_gemm

PyTorch bindings for CUTLASS and CUBLAS Grouped GEMM.

Cuda