Ferrolearn - High-performance machine learning library

Ferrolearn brings Rust's performance to Python's machine learning ecosystem. By implementing compute-intensive algorithms in Rust, we achieve significant speedups while maintaining the familiar scikit-learn API.

Key Features

🚀 2-10x faster than pure Python implementations on medium to large datasets (see Performance Notes)
🔧 Scikit-learn compatible API - drop-in replacement
🦀 Rust-powered - memory safe and blazingly fast
📊 Zero-copy operations - efficient NumPy integration
⚡ Automatic parallelization - scales with your CPU cores

Installation

Prerequisites

Python 3.8+
Rust 1.70+ (only required when building from source)
pip

From PyPI

The easiest way to install Ferrolearn is via pip from the Python Package Index (PyPI):

pip install ferrolearn

This will download and install the pre-built wheel for your platform (if available) or build from source if necessary. Note that building from source requires a Rust compiler.

After installation, you can verify it by importing in Python:

import ferrolearn
print(ferrolearn.__version__)  # Should print '0.1.0' or your current version
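If the import fails, it can help to check whether the package is visible to the current interpreter at all. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of Ferrolearn.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("ferrolearn")
print(found if found else "ferrolearn is not installed in this environment")
```

A `None` result usually means pip installed into a different environment than the one running Python.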

Quick Start

from ferrolearn import KMeans
import numpy as np

# Generate sample data
X = np.random.rand(10000, 50)

# Create and fit model - same API as scikit-learn
kmeans = KMeans(n_clusters=5, random_state=42)
kmeans.fit(X)

# Get predictions
labels = kmeans.predict(X)
print(f"Cluster centers shape: {kmeans.cluster_centers_.shape}")
print(f"Iterations: {kmeans.n_iter_}")

API Reference

KMeans

class KMeans(n_clusters=8, max_iters=300, tol=1e-4, random_state=None)

Parameters:

n_clusters: Number of clusters (default: 8)
max_iters: Maximum iterations (default: 300)
tol: Convergence tolerance (default: 1e-4)
random_state: Random seed for reproducibility (default: None)

Methods:

fit(X): Fit the model
predict(X): Predict cluster labels
fit_predict(X): Fit and predict in one call

Attributes:

cluster_centers_: Cluster centroids
n_iter_: Number of iterations run
inertia_: Sum of squared distances from each sample to its nearest cluster center
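To make the parameters and attributes above concrete, here is a plain NumPy sketch of Lloyd's algorithm. It is a reference illustration only, not Ferrolearn's Rust implementation: the function name `kmeans_reference`, the random-sample initialisation, the absolute center-shift stopping rule, and the handling of empty clusters are all assumptions made for this example.

```python
import numpy as np


def kmeans_reference(X, n_clusters=8, max_iters=300, tol=1e-4, random_state=None):
    """Plain NumPy Lloyd's algorithm; returns (centers, labels, n_iter, inertia)."""
    rng = np.random.default_rng(random_state)
    # Assumption: start from distinct random samples (the README does not
    # say which initialisation Ferrolearn uses).
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)

    for n_iter in range(1, max_iters + 1):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Move each center to the mean of its samples; empty clusters stay put.
        new_centers = centers.copy()
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members):
                new_centers[k] = members.mean(axis=0)

        # Assumption: stop when the total center movement falls below tol.
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift <= tol:
            break

    # Final assignment and inertia against the converged centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = float(d2[np.arange(len(X)), labels].sum())
    return centers, labels, n_iter, inertia


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
centers, labels, n_iter, inertia = kmeans_reference(X, n_clusters=2, random_state=42)
print(centers.shape, labels.shape, n_iter, inertia)
```

Here `centers`, `n_iter`, and `inertia` play the roles of `cluster_centers_`, `n_iter_`, and `inertia_`, while `max_iters` and `tol` bound how long the loop runs.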

Architecture

ferrolearn leverages Rust's strengths where they matter most:

Python (API Layer)          Rust (Compute Layer)
    │                              │
    ├─ KMeans.fit() ─────────────► │ Parallel distance computation
    │                              │ SIMD-ready operations
    ├─ NumPy arrays ◄────────────► │ Zero-copy array views
    │                              │ Cache-efficient algorithms
    └─ Results ◄───────────────────┘
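The "zero-copy array views" idea in the diagram can be seen from the Python side with NumPy alone. This sketch shows which operations share a buffer and which allocate a new one. Which dtype and memory layout Ferrolearn accepts without copying is not stated here; treating C-contiguous float64 arrays as the zero-copy path is an assumption.

```python
import numpy as np

X = np.arange(12, dtype=np.float64).reshape(4, 3)

# Slicing returns a view: no data is copied, both names share one buffer.
view = X[1:3]
assert np.shares_memory(X, view)

# Writing through the view is visible in the original array.
view[0, 0] = -1.0
assert X[1, 0] == -1.0

# Fancy indexing and dtype conversion allocate new buffers (copies).
picked = X[[0, 2]]
as_f32 = X.astype(np.float32)
assert not np.shares_memory(X, picked)
assert not np.shares_memory(X, as_f32)

# A transpose is also a view, but it is no longer C-contiguous.
# np.ascontiguousarray copies only when the layout requires it.
Xt = X.T
print(Xt.flags["C_CONTIGUOUS"])                       # False
print(np.shares_memory(X, np.ascontiguousarray(Xt)))  # False (copied)
print(np.shares_memory(X, np.ascontiguousarray(X)))   # True (no copy)
```

In practice this means that passing a sliced, transposed, or non-float64 array to a native extension may trigger a hidden copy before the compute layer sees the data.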

Development

Setup Development Environment

# Clone and setup
git clone https://github.com/Rafa-Gu98/ferrolearn.git
cd ferrolearn

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install in development mode
make dev-install

Running Tests

# All tests
make test

# Only Rust tests
cargo test

# Only Python tests
pytest tests/

Project Structure

ferrolearn/
├── src/                # Rust source code
│   ├── lib.rs          # PyO3 bindings
│   └── kmeans.rs       # K-Means implementation
├── python/             # Python package
├── tests/              # Test suite
├── Cargo.toml          # Rust dependencies
└── pyproject.toml      # Python packaging

Roadmap

Current (v0.1.0)

✅ K-Means clustering
✅ Scikit-learn compatible API
✅ Comprehensive benchmarks

Upcoming

DBSCAN clustering
Mini-batch K-Means
Random Forest
Gradient Boosting

Future

GPU acceleration
Distributed computing
More algorithms based on user feedback

Contributing

We welcome contributions! Rust implementations pay off most for:

Algorithms with many iterations
Embarrassingly parallel computations
Memory-intensive operations
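As an illustration of an embarrassingly parallel step, the K-Means assignment phase can be split by rows, because each sample's nearest center is independent of every other sample. The sketch below shows the idea in Python with a thread pool. It only demonstrates the decomposition, not how Ferrolearn parallelizes in Rust, and the function names and `n_workers` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def assign_chunk(chunk, centers):
    """Nearest-center index for each row of `chunk`."""
    d2 = ((chunk[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def assign_labels_parallel(X, centers, n_workers=4):
    """Split X by rows, label each chunk independently, then stitch results."""
    chunks = np.array_split(X, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: assign_chunk(c, centers), chunks))
    return np.concatenate(parts)


rng = np.random.default_rng(0)
X = rng.random((103, 5))
centers = rng.random((4, 5))
labels = assign_labels_parallel(X, centers)
print(labels.shape)  # (103,)
```

Because the chunks share no state, the result matches a single-threaded pass over the whole array; the same property lets a work-stealing scheduler distribute rows across cores.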

Performance Notes

When ferrolearn shines:

Medium to large datasets (>10k samples)
Moderate dimensionality (20-100 features)
Multiple iterations or clusters

Current limitations:

Small datasets may not see significant speedup due to overhead
Not all algorithms benefit equally from Rust implementation
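Since the speedup depends on dataset size and algorithm, it is worth measuring on your own data. The following is a minimal timing sketch using only the standard library; `time_call` and `speedup` are hypothetical helper names, and the commented usage assumes scikit-learn is installed alongside Ferrolearn.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock seconds of calling fn() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_fn, candidate_fn, repeats=5):
    """Ratio above 1 means candidate_fn ran faster than baseline_fn."""
    return time_call(baseline_fn, repeats) / time_call(candidate_fn, repeats)


# Hypothetical usage, given a NumPy array X:
#   from sklearn.cluster import KMeans as SkKMeans
#   from ferrolearn import KMeans as FlKMeans
#   ratio = speedup(
#       lambda: SkKMeans(n_clusters=5, random_state=42).fit(X),
#       lambda: FlKMeans(n_clusters=5, random_state=42).fit(X),
#   )
demo = time_call(lambda: sum(range(10_000)), repeats=3)
print(f"median: {demo:.6f} s")
```

Taking the median over several repeats reduces the effect of one-off pauses, and running the comparison at a few sample counts shows where the fixed overhead stops dominating.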

License

MIT License - see LICENSE file for details.

Author

Rafa_PyRs.dev

Email: rafagr98.dev@gmail.com
GitHub: @rafagr98

Acknowledgments

Built with PyO3 - Rust bindings for Python
Inspired by scikit-learn - API design
Powered by ndarray and rayon

ferrolearn: Where Python meets Rust for machine learning performance

Made with 🐍 and 🦀
