sumanth-dhanya/llm-toolkit

👨🏻‍💻 LLM Engineer Toolkit

This repository contains a curated list of 180+ LLM libraries, organized by category.

Quick links

🚀 LLM Training 🧱 LLM Application Development 🩸LLM RAG
🟩 LLM Inference 🚧 LLM Serving 📤 LLM Data Extraction
🌠 LLM Data Generation 💎 LLM Agents ⚖️ LLM Evaluation
🔍 LLM Monitoring 📅 LLM Prompts 📝 LLM Structured Outputs
🛑 LLM Safety and Security 💠 LLM Embedding Models 🔖 LLM Scraping
🏁 LLM Leaderboard 📈 LLM Scaling 🛠️ LLM Tools
❇️ Others 🔓 LLM MCP ⚖️ LLM Evaluation Harness
📖 LLM Learning 🏞️ Image Tools

LLM Leaderboard

Web Page Description Link
Chatbot Arena An open platform for crowdsourced AI benchmarking Link
SEAL Evals of different models on a wide variety of tasks. Link
Open Router shows the top open source models on their platform, ranked by their usage (token volume). This can help you evaluate open source models by popularity. Link
HF-Open-llm-leaderboard Open source model performance view Link
Berkeley function calling leaderboard evaluates the LLM's ability to call functions (aka tools) accurately. Link

LLM Training and Fine-Tuning

Training Frameworks

Library Description Link Stars
unsloth Fine-tune LLMs faster with less memory. Link GitHub Repo stars
PEFT State-of-the-art Parameter-Efficient Fine-Tuning library. Link GitHub Repo stars
TRL Train transformer language models with reinforcement learning. Link GitHub Repo stars
Transformers Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio. Link GitHub Repo stars
Axolotl Tool designed to streamline post-training for various AI models. Link GitHub Repo stars
LLMBox A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation. Link GitHub Repo stars
LitGPT Train and fine-tune LLM lightning fast. Link GitHub Repo stars
Mergoo A library for easily merging multiple LLM experts and efficiently training the merged LLM. Link GitHub Repo stars
Llama-Factory Easy and efficient LLM fine-tuning. Link GitHub Repo stars
Ludwig Low-code framework for building custom LLMs, neural networks, and other AI models. Link GitHub Repo stars
Txtinstruct A framework for training instruction-tuned models. Link GitHub Repo stars
Lamini An integrated LLM inference and tuning platform. Link GitHub Repo stars
XTuring xTuring provides fast, efficient and simple fine-tuning of open-source LLMs, such as Mistral, LLaMA, GPT-J, and more. Link GitHub Repo stars
RL4LMs A modular RL library to fine-tune language models to human preferences. Link GitHub Repo stars
DeepSpeed DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. Link GitHub Repo stars
torchtune A PyTorch-native library specifically designed for fine-tuning LLMs. Link GitHub Repo stars
PyTorch Lightning A library that offers a high-level interface for pretraining and fine-tuning LLMs. Link GitHub Repo stars

Training Datasets

Library Description Link Stars
llm-datasets Curated list of datasets and tools for post-training. Link GitHub Repo stars
llmDataHub Trending instruction finetuning datasets Link GitHub Repo stars

LLM Scraping

Web Page Description Link Stars
Firecrawl Empower your AI apps with clean data from any website. Featuring advanced scraping, crawling, and data extraction capabilities. Link GitHub Repo stars
SerpAPI Scrape and parse search results from Google, Bing, Baidu, Yandex, Yahoo, Home Depot, eBay, and more. Link GitHub Repo stars
BrightData API to scrape data. Link

LLM MCP

Library Description Link GitHub Stars
awesome-mcp Awesome collection of MCP servers. Link GitHub Repo stars
mcp-containers Containerized versions of hundreds of MCP Servers Link GitHub Repo stars
TaskMaster A task management system for AI-driven development with Claude, designed to work seamlessly with Cursor AI. Link GitHub Repo stars
DeepMCPAgent Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE. Link GitHub Repo stars

LLM Application Development

Frameworks

Library Description Link GitHub Stars
LangChain LangChain is a framework for developing applications powered by large language models (LLMs). Link GitHub Repo stars
Llama Index LlamaIndex is a data framework for your LLM applications. Link GitHub Repo stars
HayStack Haystack is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search, etc. Link GitHub Repo stars
Prompt flow A suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications. Link GitHub Repo stars
Griptape A modular Python framework for building AI-powered applications. Link GitHub Repo stars
Weave Weave is a toolkit for developing Generative AI applications. Link GitHub Repo stars
Llama Stack Build Llama Apps. Link GitHub Repo stars
awesome-llm-apps Curated collection of awesome LLM apps. Link GitHub Repo stars
LLM-from-scratch Implement a ChatGPT-like LLM in PyTorch from scratch, step by step. Official code repository for the book Build a Large Language Model (From Scratch). Link GitHub Repo stars

Multi API Access

Note: ToDo: An abstraction layer, such as a tool gateway, can also be useful for accessing a wide range of tools.
Library Description Link GitHub Stars
LiteLLM Library to call 100+ LLM APIs in OpenAI format. Link GitHub Repo stars
AI Gateway A Blazing Fast AI Gateway with integrated Guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. Link GitHub Repo stars
llm-gateway tracks data sent and received from these providers in a postgres database and runs PII scrubbing heuristics prior to sending. Link GitHub Repo stars
MLFlow AI Gateway high-level interface that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests. Link
TrueFoundry It is an enterprise-grade platform that enables users to access 1000+ LLMs using a unified interface while taking care of observability and governance. Link
KongHQ Simplify AI governance and ensure compliant AI innovation across your entire organization. Link
Cloudflare Cloudflare's AI Gateway Link

Routers

Library Description Link GitHub Stars
RouteLLM Framework for serving and evaluating LLM routers - save LLM costs without compromising quality. Drop-in replacement for OpenAI's client to route simpler queries to cheaper models. Link GitHub Repo stars

Memory

Library Description Link GitHub Stars
mem0 The Memory layer for your AI apps. Link GitHub Repo stars
Memoripy An AI memory layer with short- and long-term storage, semantic clustering, and optional memory decay for context-aware applications. Link GitHub Repo stars

Interface

Library Description Link GitHub Stars
Streamlit A faster way to build and share data apps. Streamlit lets you transform Python scripts into interactive web apps in minutes Link GitHub Repo stars
Gradio Build and share delightful machine learning apps, all in Python. Link GitHub Repo stars
AI SDK UI Build chat and generative user interfaces. Link
AI-Gradio Create AI apps powered by various AI providers. Link GitHub Repo stars
Simpleaichat Python package for easily interfacing with chat apps, with robust features and minimal code complexity. Link GitHub Repo stars
Chainlit Build production-ready Conversational AI applications in minutes. Link GitHub Repo stars
Reflex-dev Best frontend stack (the list author is also a contributor): easy FastAPI-backed Next.js web applications. Link GitHub Repo stars

Low Code

Library Description Link GitHub Stars
LangFlow LangFlow is a low-code app builder for RAG and multi-agent AI applications. It's Python-based and agnostic to any model, API, or database. Link GitHub Repo stars

Cache

Library Description Link GitHub Stars
GPTCache A Library for Creating Semantic Cache for LLM Queries. Slash Your LLM API Costs by 10x 💰, Boost Speed by 100x. Fully integrated with LangChain and LlamaIndex. Link GitHub Repo stars

LLM RAG

Library Description Link GitHub Stars
FastGraph RAG Streamlined and promptable Fast GraphRAG framework designed for interpretable, high-precision, agent-driven retrieval workflows. Link GitHub Repo stars
Chonkie RAG chunking library that is lightweight, lightning-fast, and easy to use. Link GitHub Repo stars
Graphiti Graphiti is a framework for building and querying temporally-aware knowledge graphs. The framework supports incremental data updates, efficient retrieval, and precise historical queries without requiring complete graph recomputation, making it suitable for developing interactive, context-aware AI applications. Link GitHub Repo stars
RAGChecker A Fine-grained Framework For Diagnosing RAG. Link GitHub Repo stars
RAG to Riches Build, scale, and deploy state-of-the-art Retrieval-Augmented Generation applications. Link GitHub Repo stars
BeyondLLM Beyond LLM offers an all-in-one toolkit for experimentation, evaluation, and deployment of RAG systems. Link GitHub Repo stars
SQLite-Vec A vector search SQLite extension that runs anywhere! Link GitHub Repo stars
fastRAG fastRAG is a research framework for efficient and optimized retrieval-augmented generative pipelines. Link GitHub Repo stars
FlashRAG A Python Toolkit for Efficient RAG Research. Link GitHub Repo stars
Llmware Unified framework for building enterprise RAG pipelines with small, specialized models. Link GitHub Repo stars
Rerankers A lightweight unified API for various reranking models. Link GitHub Repo stars
RAGatouille allows us to easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Link GitHub Repo stars
ColPali method leverages Vision-Language Models (VLMs) and the late interaction mechanism from ColBERT to enable precise cross-modal retrieval. Link GitHub Repo stars
Vectara Build Agentic RAG applications. Link
Morphik Technical data extraction. Ex: What is the height of screw 14-A in the instructions? Link GitHub Repo stars
Memvid Memvid compresses your knowledge base into compact video files while maintaining instant access to any piece of information. Sub-second retrieval. Link GitHub Repo stars
Superlinked Compute layer between data and vector storage that adds metadata to unstructured data. For high-performance search and recommendation applications that combine structured and unstructured data. Link GitHub Repo stars
ragflow Extraction from unstructured data with complicated formats. Visualization of text chunking to allow human intervention. Link GitHub Repo stars
codeRag Built on top of mem graph. RAG for code understanding. Link GitHub Repo stars
litepali Minimal ColPali for image retrieval and indexing, optimized for cloud deployments. Link GitHub Repo stars
LightRAG Graph-driven, simple and fast Retrieval-Augmented Generation. Link GitHub Repo stars
RAG-Anything Graph-driven; handles different modalities well, including tables and equations. Built on LightRAG by members of the LightRAG team. Link GitHub Repo stars

LLM Inference

Library Description Link GitHub Stars
LLM Compressor Transformers-compatible library for applying various compression algorithms to LLMs. Link GitHub Repo stars
LightLLM Python-based LLM inference and serving framework with lightweight design. Link GitHub Repo stars
vLLM High-throughput and memory-efficient inference and serving engine for LLMs. Link GitHub Repo stars
torchchat Run PyTorch LLMs locally on servers, desktop, and mobile. Link GitHub Repo stars
TensorRT-LLM TensorRT-LLM is a library for optimizing Large Language Model (LLM) inference. Link GitHub Repo stars
WebLLM High-performance In-browser LLM Inference Engine. Link GitHub Repo stars

LLM Serving

Library Description Link GitHub Stars
Langcorn Serving LangChain LLM apps and agents automagically with FastAPI. Link GitHub Repo stars
LitServe Lightning-fast serving engine for any AI model of any size. Link GitHub Repo stars
Lorax serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency. Link GitHub Repo stars
NVIDIA Dynamo high-throughput, low-latency inference framework for serving reasoning models like DeepSeek-R1! Link GitHub Repo stars

LLM Data Extraction

Library Description Link Stars
Crawl4AI Open-source LLM Friendly Web Crawler & Scraper. Link GitHub Repo stars
ScrapeGraphAI A web scraping Python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Link GitHub Repo stars
Docling Docling parses documents and exports them to the desired format with ease and speed. Link GitHub Repo stars
Dolphin Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting Link GitHub Repo stars
Llama Parse GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Link GitHub Repo stars
Crawlee A web scraping and browser automation library. Link GitHub Repo stars
MegaParse Parser for every type of document. Link GitHub Repo stars
ExtractThinker Document Intelligence library for LLMs. Link GitHub Repo stars
PyMuPDF4LLM PyMuPDF4LLM library makes it easier to extract PDF content in the format you need for LLM & RAG environments. Link
dots.ocr Single Vision-Language model. dots.ocr achieves SOTA performance for text, tables, and reading order. A compact 1.7B LLM Link GitHub Repo stars
tensorlake Tensorlake is a Document Ingestion API and a serverless platform for building data processing and orchestration APIs Link GitHub Repo stars
langextract Optimized for long documents, Structured data extraction Link GitHub Repo stars
MinerU MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models Link GitHub Repo stars

LLM Data Generation

Library Description Link Stars
DataDreamer DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. Link GitHub Repo stars
fabricator A flexible open-source framework to generate datasets with large language models. Link GitHub Repo stars
Promptwright Synthetic Dataset Generation Library. Link GitHub Repo stars
EasyInstruct An Easy-to-use Instruction Processing Framework for Large Language Models. Link GitHub Repo stars
Synthetic Data Vault Python library designed to be your one-stop shop for creating tabular synthetic data. Link GitHub Repo stars
Dupeguru Tool for data deduplication. Link GitHub Repo stars
Dedupe A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution Link GitHub Repo stars
Datasketch datasketch gives you probabilistic data structures that can process and search very large amounts of data super fast, with little loss of accuracy. Link GitHub Repo stars
TextDistance 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. Link GitHub Repo stars
TheFuzz Fuzzy string matching in python Link GitHub Repo stars
Google-deduplicate Repository containing code to deduplicate language model datasets as described in the paper "Deduplicating Training Data Makes Language Models Better". Link GitHub Repo stars

LLM Agents

An agent is a system that leverages an AI model to interact with its environment in order to achieve a user-defined goal. To evaluate an agent, identify its failure modes and measure how often each of them occurs.
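Measuring how often each failure mode occurs can be sketched in a few lines of Python. This is only an illustration: the failure-mode labels and the `runs` list below are hypothetical, and in practice each run would be labeled by a human reviewer or an automated judge.

```python
from collections import Counter

# Hypothetical labels: each agent run is tagged with one failure mode,
# or None when the run succeeded.
runs = [
    None, "wrong_tool", None, "invalid_arguments",
    "wrong_tool", "goal_not_met", None, "wrong_tool",
]

def failure_mode_rates(labels):
    """Return {failure_mode: fraction of all runs that ended in that mode}."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(label for label in labels if label is not None)
    return {mode: count / total for mode, count in counts.items()}

rates = failure_mode_rates(runs)

# Print the most frequent failure modes first.
for mode, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{mode}: {rate:.1%}")
```

Dividing by the total number of runs (rather than by the number of failed runs) keeps the rates comparable across agents with different overall success rates.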

Library Description Link Stars
CrewAI Framework for orchestrating role-playing, autonomous AI agents. Link GitHub Repo stars
LangGraph Build resilient language agents as graphs. Link GitHub Repo stars
Smolagents Library to build powerful agents in a few lines of code. Link GitHub Repo stars
Agno Build AI Agents with memory, knowledge, tools, and reasoning. Chat with them using a beautiful Agent UI. Link GitHub Repo stars
AutoGen An open-source framework for building AI agent systems. Link GitHub Repo stars
gradio-tools A Python library for converting Gradio apps into tools that can be leveraged by an LLM-based agent to complete its task. Link GitHub Repo stars
Composio Production Ready Toolset for AI Agents. Link GitHub Repo stars
Atomic Agents Building AI agents, atomically. Link GitHub Repo stars
Memary Open Source Memory Layer For Autonomous Agents. Link GitHub Repo stars
Browser Use Make websites accessible for AI agents. Link GitHub Repo stars
OpenWebAgent An Open Toolkit to Enable Web Agents on Large Language Models. Link GitHub Repo stars
Lagent A lightweight framework for building LLM-based agents. Link GitHub Repo stars
LazyLLM A Low-code Development Tool For Building Multi-agent LLMs Applications. Link GitHub Repo stars
Swarms The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Link GitHub Repo stars
ChatArena ChatArena is a library that provides multi-agent language game environments and facilitates research about autonomous LLM agents and their social interactions. Link GitHub Repo stars
Swarm Educational framework exploring ergonomic, lightweight multi-agent orchestration. Link GitHub Repo stars
AgentStack The fastest way to build robust AI agents. Link GitHub Repo stars
Archgw Intelligent gateway for Agents. Link GitHub Repo stars
Flow A lightweight task engine for building AI agents. Link GitHub Repo stars
Langroid Multi-Agent framework. Link GitHub Repo stars
Agentarium Framework for creating and managing simulations populated with AI-powered agents. Link GitHub Repo stars
CoAgents Framework for agents with human in the loop workflow. Link GitHub Repo stars
Camel-owl 🦉 OWL is a cutting-edge framework for multi-agent collaboration that pushes the boundaries of task automation, built on top of the CAMEL-AI Framework. Link GitHub Repo stars
langmem Memory management tool that extracts important information from conversations and maintains long-term memory. Link GitHub Repo stars
Parlant Parlant gives you all the structure you need to build customer-facing agents that behave exactly as your business requires Link GitHub Repo stars
AgentScope Designed for multi-agent, explicit message passing and workflow orchestration, NO deep encapsulation. Link GitHub Repo stars

LLM Evaluation

Library Description Link GitHub Stars
Ragas Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Link GitHub Repo stars
Giskard Open-Source Evaluation & Testing for ML & LLM systems. Link GitHub Repo stars
DeepEval LLM Evaluation Framework Link GitHub Repo stars
G-BigBench Evaluate performance on the more than 200 tasks included in BIG-bench. Link GitHub Repo stars
Lighteval All-in-one toolkit for evaluating LLMs. Link GitHub Repo stars
Trulens Evaluation and Tracking for LLM Experiments Link GitHub Repo stars
PromptBench A unified evaluation framework for large language models. Link GitHub Repo stars
LangTest Deliver Safe & Effective Language Models. 60+ Test Types for Comparing LLM & NLP Models on Accuracy, Bias, Fairness, Robustness & More. Link GitHub Repo stars
EvalPlus A rigorous evaluation framework for LLM4Code. Link GitHub Repo stars
FastChat An open platform for training, serving, and evaluating large language model-based chatbots. Link GitHub Repo stars
judges A small library of LLM judges. Link GitHub Repo stars
Evals Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks. Link GitHub Repo stars
AgentEvals Evaluators and utilities for evaluating the performance of your agents. Link GitHub Repo stars
LLMBox A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation. Link GitHub Repo stars
Opik An open-source end-to-end LLM Development Platform which also includes LLM evaluation. Link GitHub Repo stars
Camel-ai Simulate up to 1M agents to study emergent behaviors and scaling laws in complex, multi-agent environments. Link GitHub Repo stars
Evidently An open-source framework to evaluate, test and monitor ML and LLM-powered systems. Link GitHub Repo stars
NannyML Non-LLM. Allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Link GitHub Repo stars
HELM Holistic Evaluation of Language Models (HELM). Metrics for measuring various aspects beyond accuracy (e.g. efficiency, bias, toxicity). Link GitHub Repo stars
TravelPlanner TravelPlanner is a benchmark crafted for evaluating language agents in tool-use and complex planning within multiple constraints. Link GitHub Repo stars
OmniDocBench A comprehensive benchmark for document parsing and evaluation Link GitHub Repo stars

LLM Evaluation Harness

An evaluation harness is a tool that helps you evaluate a model on multiple benchmarks.
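As a rough sketch of the idea (not how any of the libraries below are implemented), a harness is a loop that runs one model over several benchmarks and reports a score for each. The `run_harness` function, the toy model, the toy benchmarks, and the exact-match metric are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A benchmark is modeled here as a list of (prompt, expected answer) pairs.
Benchmark = List[Tuple[str, str]]

def run_harness(
    model: Callable[[str], str],
    benchmarks: Dict[str, Benchmark],
) -> Dict[str, float]:
    """Score one model on every benchmark using exact-match accuracy."""
    scores = {}
    for name, examples in benchmarks.items():
        correct = sum(
            model(prompt).strip() == expected for prompt, expected in examples
        )
        scores[name] = correct / len(examples) if examples else 0.0
    return scores

# Toy stand-in for a real model call (hypothetical).
def toy_model(prompt: str) -> str:
    answers = {"2 + 2 =": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")

# Toy stand-ins for real benchmark datasets (hypothetical).
toy_benchmarks = {
    "arithmetic": [("2 + 2 =", "4"), ("3 + 5 =", "8")],
    "geography": [("Capital of France?", "Paris")],
}

scores = run_harness(toy_model, toy_benchmarks)
print(scores)
```

Real harnesses add per-benchmark prompt formatting, task-specific metrics, and batching, but the shape of the loop stays the same.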

Library Description Link GitHub Stars
EleutherAI Supports over 400 benchmarks. Link GitHub Repo stars
Openai-evals Run any of the approximately 500 existing benchmarks, or register a new benchmark, to evaluate OpenAI models. Benchmarks cover math, problem solving, puzzles, and identifying ASCII. Link GitHub Repo stars
AgentOps Python SDK for AI agent monitoring and evaluation. Link GitHub Repo stars
AIVerify-moonshot A simple and modular tool to evaluate and red-team any LLM application. Link GitHub Repo stars

LLM Monitoring

Library Description Link Stars
Opik An open-source end-to-end LLM Development Platform which also includes LLM monitoring. Link GitHub Repo stars
LangSmith Provides tools for logging, monitoring, and improving your LLM applications. Link GitHub Repo stars
Weights & Biases (W&B) W&B provides features for tracking LLM performance. Link GitHub Repo stars
Helicone Open source LLM-Observability Platform for Developers. One-line integration for monitoring, metrics, evals, agent tracing, prompt management, playground, etc. Link GitHub Repo stars
Evidently An open-source ML and LLM observability framework. Link GitHub Repo stars
Phoenix An open-source AI observability platform designed for experimentation, evaluation, and troubleshooting. Link GitHub Repo stars
Observers A Lightweight Library for AI Observability. Link GitHub Repo stars

LLM Prompts

Library Description Link Stars
PCToolkit A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models. Link GitHub Repo stars
LLMLingua Library for compressing prompts to accelerate LLM inference. Link GitHub Repo stars
betterprompt Test suite for LLM prompts before pushing them to production. Link GitHub Repo stars
Promptify Solve NLP Problems with LLMs & easily generate different NLP Task prompts for popular generative models like GPT, PaLM, and more with Promptify. Link GitHub Repo stars
DSPy DSPy is the open-source framework for programming—rather than prompting—language models. Link GitHub Repo stars
Py-priompt Prompt design library. Link GitHub Repo stars
Promptimizer Prompt optimization library. Link GitHub Repo stars
OpenPrompt Various prompting methods, including templating, verbalizing, and optimization strategies, under a unified standard. You can easily call and understand these methods and design your own prompt-learning work. Link GitHub Repo stars
TextGrad Optimize prompts with AI Link GitHub Repo stars
PromptWizard framework that employs a self-evolving mechanism where the LLM generates, critiques, and refines its own prompts and examples, continuously improving through iterative feedback and synthesis. Link GitHub Repo stars
AutoPrompt Effectively addresses common issues such as prompt sensitivity and inherent prompt ambiguity. Link GitHub Repo stars

LLM Structured Outputs

Library Description Link Stars
Instructor Python library for working with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API. Link GitHub Repo stars
XGrammar An open-source library for efficient, flexible, and portable structured generation. Link GitHub Repo stars
Outlines Robust (structured) text generation Link GitHub Repo stars
Guidance Guidance is an efficient programming paradigm for steering language models. Link GitHub Repo stars
LMQL A language for constraint-guided and efficient LLM programming. Link GitHub Repo stars
Jsonformer A Bulletproof Way to Generate Structured JSON from Language Models. Link GitHub Repo stars

LLM Safety and Security

Metrics for evaluating a system's robustness to prompt attacks:

Violation Rate: the percentage of attack attempts that succeed.

False Refusal Rate: how often a model refuses a query that it could have answered safely.
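Both metrics are simple ratios. A minimal sketch, assuming each attack attempt and each safe query has already been labeled as a boolean (the function names and sample data are hypothetical):

```python
def violation_rate(attack_succeeded):
    """Fraction of attack attempts that succeeded."""
    if not attack_succeeded:
        return 0.0
    return sum(attack_succeeded) / len(attack_succeeded)

def false_refusal_rate(refused_safe_query):
    """Fraction of safely answerable queries that the model refused."""
    if not refused_safe_query:
        return 0.0
    return sum(refused_safe_query) / len(refused_safe_query)

# Hypothetical labels: True means the attack got through.
attack_results = [True, False, False, True, False]
# Hypothetical labels: True means the model refused a safe query.
benign_refusals = [False, False, True, False]

print(f"Violation rate: {violation_rate(attack_results):.0%}")
print(f"False refusal rate: {false_refusal_rate(benign_refusals):.0%}")
```

The two rates pull in opposite directions: a model that refuses everything has a violation rate of zero but a very high false refusal rate, so it helps to report them together.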

Library Description Link Stars
JailbreakEval A collection of automated evaluators for assessing jailbreak attempts. Link GitHub Repo stars
EasyJailbreak An easy-to-use Python framework to generate adversarial jailbreak prompts. Link GitHub Repo stars
Guardrails Adding guardrails to large language models. Link GitHub Repo stars
LLM Guard The Security Toolkit for LLM Interactions. Link GitHub Repo stars
AuditNLG AuditNLG is an open-source library that can help reduce the risks associated with using generative AI systems for language. Link GitHub Repo stars
NeMo Guardrails NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Link GitHub Repo stars
Garak LLM vulnerability scanner Link GitHub Repo stars
Azure-PyRIT Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems. Link GitHub Repo stars
greshake-llm-security new class of vulnerabilities and impacts stemming from "indirect prompt injection" Link GitHub Repo stars
persuasive_jailbreaker Persuasive Adversarial Prompts are human-readable, achieving a 92% Attack Success Rate on aligned LLMs, without specialized optimization. Link GitHub Repo stars
DeepTeam simulate adversarial attacks using SOTA techniques such as jailbreaking and prompt injections, to catch vulnerabilities like bias and PII Leakage that you might not otherwise be aware of. Link GitHub Repo stars
PurpleLlama Set of tools to assess and improve LLM security. Link GitHub Repo stars
Azure AI content filter Detect and mitigate harmful content in user-generated and AI-generated inputs and outputs, including text, images, and mixed media. Link
G-PerspectiveAPI Conversation toxicity detections. Link
OpenAI-content moderation Use the moderations endpoint to check whether text or images are potentially harmful. Link

LLM Security Evaluation

Library Description Link Stars
AdvBench Code and data of the EMNLP 2022 paper Link GitHub Repo stars

LLM Embedding Models

Library Description Link Stars
Sentence-Transformers State-of-the-Art Text Embeddings Link GitHub Repo stars
Model2Vec Fast State-of-the-Art Static Embeddings Link GitHub Repo stars
Text Embedding Inference A blazing fast inference solution for text embeddings models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5. Link GitHub Repo stars
EmbedAnything A super-fast, Rust-based library that makes it easy to generate embeddings from multiple sources such as image, video, or audio. Link GitHub Repo stars

Others

Library Description Link Stars
Text Machina A modular and extensible Python framework, designed to aid in the creation of high-quality, unbiased datasets to build robust models for MGT-related tasks such as detection, attribution, and boundary detection. Link GitHub Repo stars
LLM Reasoners A library for advanced large language model reasoning. Link GitHub Repo stars
EasyEdit An Easy-to-use Knowledge Editing Framework for Large Language Models. Link GitHub Repo stars
CodeTF CodeTF: One-stop Transformer Library for State-of-the-art Code LLM. Link GitHub Repo stars
spacy-llm This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various NLP tasks. Link GitHub Repo stars
pandas-ai Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). Link GitHub Repo stars
LLM Transparency Tool An open-source interactive toolkit for analyzing internal workings of Transformer-based language models. Link GitHub Repo stars
Vanna Chat with your SQL database. Accurate Text-to-SQL Generation via LLMs using RAG. Link GitHub Repo stars
mergekit Tools for merging pretrained large language models. Link GitHub Repo stars
MarkLLM An Open-Source Toolkit for LLM Watermarking. Link GitHub Repo stars
LLMSanitize An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs). Link GitHub Repo stars
Annotateai Automatically annotate papers using LLMs. Link GitHub Repo stars
LLM Reasoner Make any LLM think like OpenAI o1 and DeepSeek R1. Link GitHub Repo stars
Claude-code-templates CLI tool for configuring and monitoring Claude Code. Link GitHub Repo stars
Awesome-LLM An awesome curated collection of open-source and other LLM tools. Link GitHub Repo stars
Awesome-multimodal-LLM Latest advances on multi modal LLMs Link GitHub Repo stars
Awesome-ai-agents List of AI autonomous agents Link GitHub Repo stars
DDODS-ai-agents tutorials on LLMs, RAGs and real-world AI agent applications Link GitHub Repo stars
G-adk-samples A collection of sample agents built with Agent Development Kit Link GitHub Repo stars
open-llms List of open LLMs. Link GitHub Repo stars

LLM tools

Webpage Description Link Docs
Paper2code PaperCoder is a multi-agent LLM system that transforms paper into a code repository. It follows a three-stage pipeline: planning, analysis, and code generation, each handled by specialized agents. Link GitHub Repo stars
Flora Fauna Creative tool to make images into animation and other creative prototypes Link Docs
Story Doc Easily create stunning, interactive slide decks that increase engagement. Cool pitch or slide making. Link
Superblocks SaaS app builder; a competitor to reflex-build. Link
OpenSearch OpenSearch is a distributed search and analytics engine based on Apache Lucene. Link
Milvus A high performance and scalable vector database for ANN search Link GitHub Repo stars
LEANN Vector database, lightweight and less memory usage. Chat with your browser history. Perfect for local files. Link GitHub Repo stars
Rasa Open source machine learning framework to automate text- and voice-based conversations Link GitHub Repo stars

LLM Learning

Library Description Link Stars
Hands-On-Large-Language-Models Official code repo for the O'Reilly Book - "Hands-On Large Language Models" Link GitHub Repo stars

Image Tools

Library Description Link Stars
Hunyuan3D High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. Link GitHub Repo stars
Supervision Reusable computer vision tools. Link GitHub Repo stars
PartCrafter Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers. Link GitHub Repo stars

Organization engineering blogs

I enjoy reading good technical blogs. Here are some of my frequent go-to engineering blogs.

  1. LinkedIn Engineering Blog
  2. Engineering Blog - DoorDash
  3. Engineering | Uber Blog
  4. The Unofficial Google Data Science Blog
  5. Pinterest Engineering Blog – Medium
  6. Netflix TechBlog
  7. Blog | LMSYS Org
  8. Blog | Anyscale
  9. Data Science and ML | Databricks Blog
  10. Together Blog
  11. Duolingo Engineering
  12. ML Technique
  13. Nvidia Research lab
  14. Karpathy's blog
  15. The G AI Blog
  16. Pytorch Sebastian
  17. Lilian Weng
  18. Hamel for MLOps
  19. ai by hand - Tom Yeh

llm-toolkit

⭐️

Please consider giving a star, if you find this repository useful.
