Epistasis Lab

Identifying the complex genetic architectures of disease

This is the documentation page for the Epistasis Lab, a research group in the Institute for Biomedical Informatics at the University of Pennsylvania (UPenn).

In a nutshell, our lab develops computational methods to identify the complex genetic architectures that lead to human disease. Our methods address challenges such as epistasis, heterogeneity, and scalability.

You can browse our projects below for more information.

Methods

EVE: Ensembl VEP (Variant Effect Predictor) on Amazon EC2

ellyn: A scikit-learn-compatible linear genetic programming system for symbolic regression and classification.

ExSTraCS: Extended Supervised Tracking and Classifying System

FEW: A feature engineering wrapper for scikit-learn.

scikit-mdr: A scikit-learn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.

scikit-rebate: A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for machine learning.

tpot: A Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.
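
To give a flavor of the feature-construction idea behind MDR (see scikit-mdr above), here is a minimal pure-Python sketch. It does not use or reproduce scikit-mdr's API: the function name `mdr_construct`, the toy data, and the tie-handling rule (ties count as low risk) are all assumptions made for illustration. Each genotype combination is labeled high risk when its case:control ratio exceeds the overall ratio in the data, and the combination is then collapsed into one binary feature.

```python
from collections import defaultdict


def mdr_construct(rows, labels):
    """Collapse multi-locus genotype combinations into one binary feature.

    rows   -- list of tuples of discrete genotype values (hypothetical input format)
    labels -- list of 0/1 class labels (1 = case, 0 = control)
    Returns (new_feature, high_risk_cells).
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    # Count controls and cases for every observed genotype combination.
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(rows, labels):
        counts[tuple(row)][y] += 1

    # A cell is high risk if cases/controls > n_cases/n_controls.
    # Cross-multiplying avoids division by zero for cells with no controls.
    # Assumption: ties are treated as low risk.
    high_risk = {
        cell
        for cell, (controls, cases) in counts.items()
        if cases * n_controls > controls * n_cases
    }

    new_feature = [1 if tuple(row) in high_risk else 0 for row in rows]
    return new_feature, high_risk


# Toy XOR-like data: neither locus is informative alone, but the pair is.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0, 0, 1, 1, 0]
feature, cells = mdr_construct(rows, labels)
```

On this toy data the constructed feature separates cases from controls even though each locus on its own carries no signal, which is the kind of epistatic interaction MDR is designed to expose.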

Useful Collections

ClinicalDataSources: Open or Easy Access Clinical Data Sources for Biomedical Research

Penn ML Benchmarks (PMLB): A large, curated repository of benchmarks for evaluating supervised machine learning algorithms.