Epistasis Lab

Identifying the complex genetic architectures of disease

This is the documentation page for the Epistasis Lab, a research group in the Institute for Biomedical Informatics at the University of Pennsylvania.

In a nutshell, our lab develops computational methods to identify the complex genetic architectures that lead to human disease. Our methods address challenges such as epistasis, heterogeneity, and scalability.

ellyn: A scikit-learn-compatible linear genetic programming system for symbolic regression and classification.

FEW: A feature engineering wrapper for scikit-learn.

scikit-mdr: A scikit-learn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.

scikit-rebate: A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for machine learning.

tpot: A Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.
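
To give a feel for the kind of problem these tools target, here is a minimal pure-Python sketch of the core idea behind MDR-style feature construction. It is an illustration only: it does not use or reproduce the scikit-mdr API, and the function names `mdr_fit` and `mdr_transform` and the toy data are hypothetical. Each combination of feature values is labeled high-risk (1) when its case-to-control ratio exceeds the ratio in the whole dataset, and low-risk (0) otherwise. The toy data follow an XOR pattern, a simple stand-in for epistasis in which neither feature is informative on its own.

```python
from collections import Counter


def mdr_fit(rows, labels):
    """Label each observed feature-value combination as high-risk (1) or low-risk (0).

    Assumes binary labels: 1 for cases, 0 for controls.
    """
    cases = Counter()
    controls = Counter()
    for row, label in zip(rows, labels):
        cell = tuple(row)
        if label == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())

    high_risk = {}
    for cell in set(cases) | set(controls):
        # Cross-multiplied form of cases/controls > n_cases/n_controls,
        # which avoids dividing by zero when a cell has no controls.
        high_risk[cell] = int(cases[cell] * n_controls > controls[cell] * n_cases)
    return high_risk


def mdr_transform(rows, high_risk, default=0):
    """Map each row to its constructed binary feature (default for unseen cells)."""
    return [high_risk.get(tuple(row), default) for row in rows]


# Toy XOR-like data: the label depends only on the interaction of the two features.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
labels = [a ^ b for a, b in rows]

model = mdr_fit(rows, labels)
constructed = mdr_transform(rows, model)
```

On this toy data the single constructed feature separates cases from controls, even though each original feature alone carries no signal. A real analysis would evaluate many feature combinations with cross-validation, which this sketch omits.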

Useful Collections

ClinicalDataSources: Open or easy-access clinical data sources for biomedical research.

Penn ML Benchmarks (PMLB): A large, curated repository of benchmarks for evaluating supervised machine learning algorithms.