    Repositories list

    • heros

      Public
      The Heuristic Evolutionary Rule Optimization System (HEROS) is a supervised rule-based machine learning algorithm designed to agnostically model diverse 'structured' data problems and yield compact human interpretable solutions. This implementation is scikit-learn compatible.
      Jupyter Notebook
      0700 Updated Aug 5, 2025
    • A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
      Python
      73100 Updated Jun 25, 2025
    • Python
      1000 Updated Jun 10, 2025
    • A scikit-learn implementation based on ExSTraCS 2.0
      Jupyter Notebook
      2902 Updated Jun 3, 2025
    • Simple Transparent End-To-End Automated Machine Learning Pipeline for Supervised Learning in Tabular Binary Classification Data
      Jupyter Notebook
      117820 Updated Apr 30, 2025
    • scikit-FIBERS (Feature Inclusion Bin Evolver for Risk Stratification) is a scikit-learn compatible machine learning algorithm for modeling or feature learning in survival analyses where feature 'burden' may be predictive of risk strata. Originally designed to identify amino-acid positions where mismatch burden predicts kidney graft failure risk.
      Jupyter Notebook
      2600 Updated Mar 29, 2025
    • LCS Discovery and Visualization Environment (LCS-DIVE)
      Python
      2400 Updated Jan 17, 2025
    • A scikit-learn-compatible Python implementation of eLCS, a supervised learning variant of Learning Classifier Systems
      Jupyter Notebook
      91912 Updated Jun 17, 2024
    • A scikit-learn-compatible implementation of XCS, the most popular and best-studied learning classifier system algorithm to date.
      Jupyter Notebook
      91203 Updated Jun 17, 2024
    • GAMETES

      Public
      Source code for the Genetic Architecture Model Emulator for Testing and Evaluating Software (GAMETES), an algorithm for generating complex single nucleotide polymorphism (SNP) models for simulated association studies.
      Java
      1500 Updated Jun 11, 2024
    • Documentation and informational resources for LPC use
      Python
      9300 Updated Nov 27, 2023
    • FIBERS

      Public
      Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) is an evolutionary algorithm that constructs bins of features, seeking to optimize the bins' stratification of event risk over time.
      Python
      0100 Updated May 5, 2023
    • scikit-RARE is a scikit-learn-compatible PyPI package for the RARE (Relevant Association Rare-variant-bin Evolver) evolutionary algorithm.
      Python
      0000 Updated Mar 3, 2023
    • RARE

      Public
      RARE: Relevant Association Rare-variant-bin Evolver (under development), an evolutionary algorithm approach to binning rare variants as a rare variant association analysis tool. Applications, visualizations, and modifications are currently in the works.
      Python
      1000 Updated Jun 30, 2022
    • Experimental variation of scikit-ExSTraCS that lets the user import an initial rule population, which is evaluated and assigned fitness values before learning iterations begin. This allows the import of manually curated rules derived from expert knowledge, or rules derived from other sources.
      Jupyter Notebook
      4000 Updated May 11, 2022
    • An automated, rigorous, and largely scikit-learn based machine learning analysis pipeline for binary classification. Adopts current best practices to avoid bias, optimize performance, ensure replicability, capture complex associations (e.g. interactions and heterogeneity), and enhance interpretability. Includes (1) exploratory analysis, (2) da…
      Jupyter Notebook
      1700 Updated May 7, 2022
    • A set of Python-based Jupyter notebooks illustrating a documented example of a semi-automated term harmonization pipeline applied to harmonizing medical history terms across 28 clinical trials of pulmonary arterial hypertension
      Jupyter Notebook
      0000 Updated Oct 6, 2021
    • An (updated and expanded) rigorous, well-documented machine learning analysis pipeline for binary classification datasets assembled as a Jupyter Notebook. Includes exploratory analysis, data processing, feature processing, ML modeling (13 algorithms) with hyperparameter sweeps, visualizations, and statistical analysis. A comprehensive starting p…
      Jupyter Notebook
      61100 Updated Jun 16, 2021
    • Example PyKE code and Jupyter Notebook for a simple backwards chaining expert system as described in this lecture on YouTube: https://www.youtube.com/watch?v=mzsk5_EmZq8
      Jupyter Notebook
      82400 Updated May 24, 2021
    • A rigorous machine learning analysis pipeline for binary classification datasets assembled as parallelizable command line modules. Includes exploratory analysis, data processing, feature processing, ML modeling (11 algorithms) with hyperparameter sweeps, visualizations, and statistical analysis. A comprehensive starting point to adapt to your …
      Python
      0000 Updated Apr 23, 2021
    • Python scripts to generate a diverse archive of simulated SNP datasets using GAMETES
      Python
      0000 Updated Dec 3, 2020
    • GP-LCS

      Public
      Supplemental materials and code for our GP-LCS project, adapting ExSTraCS to evolve GP trees rather than rules for comparison to other stand-alone GP algorithms
      Python
      0000 Updated Sep 15, 2020
    • A rigorous, well-documented machine learning analysis pipeline for binary classification datasets assembled as a Jupyter Notebook. Includes exploratory analysis, data processing, feature processing, ML modeling (9 algorithms, including the original ExSTraCS algorithm) with hyperparameter sweeps, visualizations, and statistical analysis. A compr…
      Python
      3910 Updated Sep 1, 2020
    • Code and results for an investigation of pancreatic cancer datasets applying our binary classification machine learning analysis pipeline notebook. Includes analysis and comparison of three pancreatic cancer datasets.
      Jupyter Notebook
      3200 Updated Aug 26, 2020
    • This repository includes educational materials on machine learning and a basic example machine learning analysis pipeline. These materials were originally developed for a workshop series at the University of Pennsylvania.
      HTML
      7900 Updated May 21, 2020
    • Assembly of Jupyter notebooks comprising basic machine learning pipeline tasks. This student-driven, independent study project will eventually evolve into a user-friendly starting point for ML pipeline example notebooks.
      Jupyter Notebook
      1100 Updated Oct 15, 2018