xlite-dev

53 repositories

diffusers
Public
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Python
•
Apache License 2.0
•6.7k•0•0•0•Updated Jan 21, 2026
sglang
Public
SGLang is a fast serving framework for large language models and vision language models.
Python
•
Apache License 2.0
•4.1k•0•0•0•Updated Jan 20, 2026
vllm-omni
Public
A framework for efficient model inference with omni-modality models
Python
•
Apache License 2.0
•308•0•0•0•Updated Jan 20, 2026
ffpa-attn
Public
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
cuda attention sdpa mla mlsys tensor-cores flash-attention deepseek deepseek-v3 deepseek-r1
Cuda
•
GNU General Public License v3.0
•13•246•1•0•Updated Jan 20, 2026
cache-dit
Public
A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗DiTs.
Python
•
Apache License 2.0
•53•4•0•0•Updated Jan 19, 2026
Awesome-DiT-Inference
Public
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
flux wan diffusion dit sora stable-diffusion sdxl sd15 deepcache open-sora-plan
Python
•
GNU General Public License v3.0
•25•500•0•0•Updated Jan 18, 2026
Awesome-LLM-Inference
Public
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
mla vllm llm-inference awesome-llm flash-attention tensorrt-llm paged-attention deepseek flash-attention-3 deepseek-v3
Python
•
GNU General Public License v3.0
•336•4.9k•0•0•Updated Jan 18, 2026
lite.ai.toolkit
Public
🛠A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc.🎉
tensorrt mnn ncnn onnx onnxruntime yolov5 tnn mnn-model yolox robustvideomatting
C++
•
GNU General Public License v3.0
•770•4.4k•1•0•Updated Jan 18, 2026
LeetCUDA
Public
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
cuda cuda-kernels cuda-demo cuda-toolkit cuda-library cuda-kernel learn-cuda cuda-cpp hgemm flash-attention
Cuda
•
GNU General Public License v3.0
•928•9.4k•1•0•Updated Jan 18, 2026
flux-fast
Public
A forked version of flux-fast that makes flux-fast even faster with cache-dit.
Python
•16•4•0•0•Updated Jan 5, 2026
Qwen-Image
Public
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Python
•
Apache License 2.0
•410•1•0•0•Updated Jan 1, 2026
Z-Image
Public
Python
•
Apache License 2.0
•565•1•0•0•Updated Dec 25, 2025
SageAttention
Public
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Cuda
•
Apache License 2.0
•320•0•0•0•Updated Dec 3, 2025
.github
Public
•0•1•0•0•Updated Nov 25, 2025
ImageReward
Public
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Python
•
Apache License 2.0
•83•0•0•0•Updated Oct 30, 2025
longcat-video-fast
Public
🔥LongCat-Video 1.7x🎉 speedup: cache acceleration and 4/8-bit weight-only quantization.
longcat longcat-video
Python
•0•7•0•0•Updated Oct 28, 2025
LongCat-Video
Public
Python
•
MIT License
•270•0•0•0•Updated Oct 28, 2025
ComfyUI
Public
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Python
•
GNU General Public License v3.0
•11k•0•0•0•Updated Oct 27, 2025
qwen-image-fast
Public
⚡️Qwen-Image 4.8x🎉 speedup with Hybrid Acceleration for low VRAM GPUs
qwen-image qwen-image-lightning qwen-image-edit qwen-image-api qwen-image-lora
Python
•
Apache License 2.0
•0•17•4•0•Updated Oct 24, 2025
Kandinsky-5
Public
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Python
•
Apache License 2.0
•49•0•0•0•Updated Oct 22, 2025
Wan2.1
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python
•
Apache License 2.0
•2.3k•1•0•0•Updated Oct 17, 2025
Wan2.2
Public
Wan: Open and Advanced Large-Scale Video Generative Models
Python
•
Apache License 2.0
•1.6k•0•0•0•Updated Oct 17, 2025
nunchaku
Public
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Python
•
Apache License 2.0
•216•2•0•0•Updated Oct 15, 2025
DiffSynth-Studio
Public
Enjoy the magic of Diffusion models!
Python
•
Apache License 2.0
•1.1k•0•0•0•Updated Oct 13, 2025
HunyuanImage-3.0
Public
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
Python
•
Other
•128•1•0•0•Updated Oct 4, 2025
comfyui-cache-dit
Public
cache-dit for ComfyUI
Python
•4•31•3•0•Updated Sep 27, 2025
HunyuanImage-2.1
Public
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
Python
•
Other
•53•1•0•0•Updated Sep 10, 2025
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python
•
Apache License 2.0
•43•0•0•0•Updated Sep 9, 2025
deepcompressor
Public
Model Compression Toolbox for Large Language Models and Diffusion Models
Python
•
Apache License 2.0
•86•0•0•0•Updated Aug 14, 2025
SpargeAttn
Public
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
Cuda
•
Apache License 2.0
•82•6•0•0•Updated Aug 7, 2025