Commit a9643a9

Add optimize-for-gpu skill
1 parent 37a2b70 commit a9643a9

16 files changed, +8198 -10 lines changed

.claude-plugin/marketplace.json

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,7 @@
 },
 "metadata": {
 "description": "Claude scientific skills from K-Dense Inc",
-"version": "2.33.0"
+"version": "2.34.0"
 },
 "plugins": [
 {
@@ -82,6 +82,7 @@
 "./scientific-skills/neuropixels-analysis",
 "./scientific-skills/omero-integration",
 "./scientific-skills/open-notebook",
+"./scientific-skills/optimize-for-gpu",
 "./scientific-skills/opentrons-integration",
 "./scientific-skills/paper-2-web",
 "./scientific-skills/paper-lookup",

README.md

Lines changed: 10 additions & 9 deletions
@@ -1,17 +1,17 @@
 # Claude Scientific Skills

-> **New: [K-Dense BYOK](https://github.com/K-Dense-AI/k-dense-byok)** — A free, open-source AI co-scientist that runs on your desktop, powered by Claude Scientific Skills. Bring your own API keys, pick from 40+ models, and get a full research workspace with web search, file handling, 100+ scientific databases, and access to all 136 skills in this repo. Your data stays on your computer, and you can optionally scale to cloud compute via [Modal](https://modal.com/) for heavy workloads. [Get started here.](https://github.com/K-Dense-AI/k-dense-byok)
+> **New: [K-Dense BYOK](https://github.com/K-Dense-AI/k-dense-byok)** — A free, open-source AI co-scientist that runs on your desktop, powered by Claude Scientific Skills. Bring your own API keys, pick from 40+ models, and get a full research workspace with web search, file handling, 100+ scientific databases, and access to all 137 skills in this repo. Your data stays on your computer, and you can optionally scale to cloud compute via [Modal](https://modal.com/) for heavy workloads. [Get started here.](https://github.com/K-Dense-AI/k-dense-byok)

 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE.md)
-[![Skills](https://img.shields.io/badge/Skills-136-brightgreen.svg)](#whats-included)
+[![Skills](https://img.shields.io/badge/Skills-137-brightgreen.svg)](#whats-included)
 [![Databases](https://img.shields.io/badge/Databases-100%2B-orange.svg)](#whats-included)
 [![Agent Skills](https://img.shields.io/badge/Standard-Agent_Skills-blueviolet.svg)](https://agentskills.io/)
 [![Works with](https://img.shields.io/badge/Works_with-Cursor_|_Claude_Code_|_Codex-blue.svg)](#getting-started)
 [![X](https://img.shields.io/badge/Follow_on_X-%40k__dense__ai-000000?logo=x)](https://x.com/k_dense_ai)
 [![LinkedIn](https://img.shields.io/badge/LinkedIn-K--Dense_Inc.-0A66C2?logo=linkedin)](https://www.linkedin.com/company/k-dense-inc)
 [![YouTube](https://img.shields.io/badge/YouTube-K--Dense_Inc.-FF0000?logo=youtube)](https://www.youtube.com/@K-Dense-Inc)

-A comprehensive collection of **136 ready-to-use scientific and research skills** (covering cancer genomics, drug-target binding, molecular dynamics, RNA velocity, geospatial science, time series forecasting, 78+ scientific databases, and more) for any AI agent that supports the open [Agent Skills](https://agentskills.io/) standard, created by [K-Dense](https://k-dense.ai). Works with **Cursor, Claude Code, Codex, and more**. Transform your AI agent into a research assistant capable of executing complex multi-step scientific workflows across biology, chemistry, medicine, and beyond.
+A comprehensive collection of **137 ready-to-use scientific and research skills** (covering cancer genomics, drug-target binding, molecular dynamics, RNA velocity, geospatial science, time series forecasting, 78+ scientific databases, and more) for any AI agent that supports the open [Agent Skills](https://agentskills.io/) standard, created by [K-Dense](https://k-dense.ai). Works with **Cursor, Claude Code, Codex, and more**. Transform your AI agent into a research assistant capable of executing complex multi-step scientific workflows across biology, chemistry, medicine, and beyond.

 <p align="center">
 <a href="https://k-dense.ai">
@@ -52,7 +52,7 @@ These skills enable your AI agent to seamlessly work with specialized scientific

 ## 📦 What's Included

-This repository provides **136 scientific and research skills** organized into the following categories:
+This repository provides **137 scientific and research skills** organized into the following categories:

 - **100+ Scientific & Financial Databases** - A unified database-lookup skill provides direct access to 78 public databases (PubChem, ChEMBL, UniProt, COSMIC, ClinicalTrials.gov, FRED, USPTO, and more), plus dedicated skills for DepMap, Imaging Data Commons, PrimeKG, and U.S. Treasury Fiscal Data. Multi-database packages like BioServices (~40 bioinformatics services), BioPython (38 NCBI sub-databases via Entrez), and gget (20+ genomics databases) add further coverage
 - **70+ Optimized Python Package Skills** - Explicitly defined skills for RDKit, Scanpy, PyTorch Lightning, scikit-learn, BioPython, pyzotero, BioServices, PennyLane, Qiskit, OpenMM, MDAnalysis, scVelo, TimesFM, and others — with curated documentation, examples, and best practices. Note: the agent can write code using *any* Python package, not just these; these skills simply provide stronger, more reliable performance for the packages listed
@@ -98,7 +98,7 @@ Each skill includes:
 - **Multi-Step Workflows** - Execute complex pipelines with a single prompt

 ### 🎯 **Comprehensive Coverage**
-- **136 Skills** - Extensive coverage across all major scientific domains
+- **137 Skills** - Extensive coverage across all major scientific domains
 - **100+ Databases** - Unified access to 78+ databases via database-lookup, plus dedicated data access skills and multi-database packages like BioServices, BioPython, and gget
 - **70+ Optimized Python Package Skills** - RDKit, Scanpy, PyTorch Lightning, scikit-learn, BioServices, PennyLane, Qiskit, OpenMM, scVelo, TimesFM, and others (the agent can use any Python package; these are the pre-documented, higher-performing paths)

@@ -369,7 +369,7 @@ If so, **[K-Dense Web](https://k-dense.ai)** was built for you. It is the full A

 | Feature | This Repo | K-Dense Web |
 |---------|-----------|-------------|
-| Scientific Skills | 136 skills | **200+ skills** (exclusive access) |
+| Scientific Skills | 137 skills | **200+ skills** (exclusive access) |
 | Setup | Manual installation | **Zero setup, works instantly** |
 | Compute | Your machine | **Cloud GPUs and HPC included** |
 | Workflows | Prompt and code | **End-to-end research pipelines** |
@@ -432,7 +432,7 @@ If so, **[K-Dense Web](https://k-dense.ai)** was built for you. It is the full A

 ## 📚 Available Skills

-This repository contains **136 scientific and research skills** organized across multiple domains. Each skill provides comprehensive documentation, code examples, and best practices for working with scientific libraries, databases, and tools.
+This repository contains **137 scientific and research skills** organized across multiple domains. Each skill provides comprehensive documentation, code examples, and best practices for working with scientific libraries, databases, and tools.

 ### Skill Categories

@@ -542,8 +542,9 @@ This repository contains **136 scientific and research skills** organized across
 - Knowledge graph: PrimeKG (precision medicine knowledge graph — genes, drugs, diseases, phenotypes)
 - Fiscal data: U.S. Treasury Fiscal Data (national debt, Treasury statements, auctions, exchange rates)

-#### 🔧 **Infrastructure & Platforms** (6+ skills)
+#### 🔧 **Infrastructure & Platforms** (7+ skills)
 - Cloud compute: Modal
+- GPU acceleration: Optimize for GPU (CuPy, Numba CUDA, Warp, cuDF, cuML, cuGraph, KvikIO, cuCIM, cuxfilter, cuVS, cuSpatial, RAFT)
 - Genomics platforms: DNAnexus, LatchBio
 - Microscopy: OMERO
 - Automation: Opentrons
@@ -745,7 +746,7 @@ If you use Claude Scientific Skills in your research or project, please cite it
 title = {Claude Scientific Skills: A Comprehensive Collection of Scientific Tools for Claude AI},
 year = {2026},
 url = {https://github.com/K-Dense-AI/claude-scientific-skills},
-note = {136 skills covering databases, packages, integrations, and analysis tools}
+note = {137 skills covering databases, packages, integrations, and analysis tools}
 }
 ```

docs/scientific-skills.md

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@
 ### Data Management & Infrastructure
 - **LaminDB** - Open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR (Findable, Accessible, Interoperable, Reusable). Provides unified platform combining lakehouse architecture, lineage tracking, feature stores, biological ontologies (via Bionty plugin with 20+ ontologies: genes, proteins, cell types, tissues, diseases, pathways), LIMS, and ELN capabilities through a single Python API. Key features include: automatic data lineage tracking (code, inputs, outputs, environment), versioned artifacts (DataFrame, AnnData, SpatialData, Parquet, Zarr), schema validation and data curation with standardization/synonym mapping, queryable metadata with feature-based filtering, cross-registry traversal, and streaming for large datasets. Supports integrations with workflow managers (Nextflow, Snakemake, Redun), MLOps platforms (Weights & Biases, MLflow, HuggingFace, scVI-tools), cloud storage (S3, GCS, S3-compatible), array stores (TileDB-SOMA, DuckDB), and visualization (Vitessce). Deployment options: local SQLite, cloud storage with SQLite, or cloud storage with PostgreSQL for production. Use cases: scRNA-seq standardization and analysis, flow cytometry/spatial data management, multi-modal dataset integration, computational workflow tracking with reproducibility, biological ontology-based annotation, data lakehouse construction for unified queries, ML pipeline integration with experiment tracking, and FAIR-compliant dataset publishing
 - **Modal** - Serverless cloud platform for running Python code with minimal configuration, specialized for AI/ML workloads and scientific computing. Execute functions on powerful GPUs (T4, L4, A10, A100, L40S, H100, H200, B200, B200+), scale automatically from zero to thousands of containers, and pay only for compute used. Key features include: declarative container image building with uv (recommended)/pip/apt package management, automatic autoscaling with configurable limits and buffer containers, GPU acceleration with multi-GPU support (up to 8 GPUs per container, up to 1,536 GB VRAM), persistent storage via Volumes (v1 and v2) for model weights and datasets, secret management for API keys and credentials, scheduled jobs with cron expressions, web endpoints for deploying serverless APIs (FastAPI, ASGI, WSGI, WebSockets), parallel execution with `.map()` for batch processing, input concurrency and dynamic batching for I/O-bound workloads, and resource configuration (CPU cores, memory, ephemeral disk up to 3 TiB). Supports custom Docker images, Micromamba/Conda environments, integration with Hugging Face/Weights & Biases, and distributed multi-GPU training. Free tier includes $30/month credits. Use cases: ML model deployment and inference (LLMs, image generation, speech, embeddings), GPU-accelerated training and fine-tuning, batch processing large datasets in parallel, scheduled compute-intensive jobs, serverless API deployment with autoscaling, protein folding and computational biology, scientific computing requiring distributed compute or specialized hardware, and data pipeline automation
+- **Optimize for GPU** - GPU-accelerate Python code using the NVIDIA RAPIDS ecosystem and related libraries for 10x–1000x speedups on suitable workloads. Covers 12 GPU libraries with decision framework for choosing the right tool: CuPy (drop-in NumPy/SciPy replacement for array operations, FFT, linear algebra), Numba CUDA (custom GPU kernels with fine-grained thread/block/shared-memory control), Warp (JIT-compiled simulation kernels with built-in spatial types for physics, mesh ray casting, differentiable programming, and robotics), cuDF (drop-in pandas replacement for dataframe ETL, groupby, joins), cuML (drop-in scikit-learn replacement for classification, regression, clustering, dimensionality reduction, preprocessing), cuGraph (drop-in NetworkX replacement for PageRank, centrality, community detection, shortest paths), KvikIO (GPUDirect Storage for high-performance file IO bypassing CPU memory, S3/HTTP direct-to-GPU reads, Zarr GPU backend), cuxfilter (GPU-accelerated interactive cross-filtering dashboards with Bokeh, Datashader, and Deck.gl), cuCIM (drop-in scikit-image replacement for image filtering, morphology, segmentation, plus fast whole-slide image reading for digital pathology), cuVS (GPU-accelerated vector/similarity search with CAGRA, IVF-Flat, IVF-PQ for RAG and recommender systems), cuSpatial (GPU-accelerated GeoPandas replacement for spatial joins, point-in-polygon, trajectory analysis), and RAFT/pylibraft (sparse eigensolvers, device memory management, multi-GPU communication). All libraries interoperate via CUDA Array Interface for zero-copy data sharing. Includes optimization workflow (profile first, assess GPU suitability, start with drop-in replacements, minimize host-device transfers), code transformation patterns for each library, memory management principles, and common pitfalls. Use cases: accelerating NumPy/pandas/scikit-learn/NetworkX/scikit-image/GeoPandas/Faiss workloads, physics simulation, differentiable rendering, particle systems, vector search for RAG pipelines, GPUDirect Storage file IO, interactive data exploration dashboards, geospatial analysis, medical imaging, and sparse eigenvalue problems

 ### Cheminformatics & Drug Discovery
 - **Datamol** - Python library for molecular manipulation and featurization built on RDKit with enhanced workflows and performance optimizations. Provides utilities for molecular I/O (reading/writing SMILES, SDF, MOL files), molecular standardization and sanitization, molecular transformations (tautomer enumeration, stereoisomer generation), molecular featurization (descriptors, fingerprints, graph representations), parallel processing for large datasets, and integration with machine learning pipelines. Features include: optimized RDKit operations, caching for repeated computations, molecular filtering and preprocessing, and seamless integration with pandas DataFrames. Designed for drug discovery and cheminformatics workflows requiring efficient processing of large compound libraries. Use cases: molecular preprocessing for ML models, compound library management, molecular similarity searches, and cheminformatics data pipelines
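
Reviewer note: the new `docs/scientific-skills.md` entry summarizes a workflow of starting with drop-in replacements and minimizing host-device transfers. The following is a minimal sketch of that idea using CuPy as a NumPy stand-in. It is not taken from the skill's files in this commit. The helper names `get_array_backend`, `normalized_power_spectrum`, and `to_host` are hypothetical, and the code assumes it should fall back to NumPy when CuPy or a CUDA device is unavailable.

```python
import numpy as np


def get_array_backend():
    """Return (array_module, on_gpu); hypothetical helper, not part of the skill.

    Uses CuPy only if it imports and reports at least one CUDA device,
    otherwise falls back to NumPy so the same code runs on CPU-only machines.
    """
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp, True
    except Exception:
        # CuPy missing, or CUDA driver/runtime not usable on this machine.
        pass
    return np, False


def normalized_power_spectrum(signal, xp):
    """Compute |rFFT|^2 normalized to sum to 1, entirely on the chosen backend."""
    x = xp.asarray(signal, dtype=xp.float64)  # single host-to-device copy under CuPy
    spectrum = xp.abs(xp.fft.rfft(x)) ** 2    # stays on the device, no round trips
    return spectrum / spectrum.sum()


def to_host(array, on_gpu):
    """Copy the final result back to a NumPy array exactly once."""
    if on_gpu:
        import cupy as cp
        return cp.asnumpy(array)
    return np.asarray(array)


xp, on_gpu = get_array_backend()

# 1024 samples over one second of a 50 Hz sine wave.
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
signal = np.sin(2 * np.pi * 50 * t)

power = to_host(normalized_power_spectrum(signal, xp), on_gpu)
peak_bin = int(power.argmax())
print(f"backend={'cupy' if on_gpu else 'numpy'} peak_bin={peak_bin}")
```

The array module is passed in as `xp`, so the numerical code is written once, and data crosses the host-device boundary only at the start and at the end. Whether this yields a speedup depends on the workload and should be checked by profiling first, as the entry itself advises.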
