Epistasis Lab

Identifying the complex genetic architectures of disease


This is the documentation page for the Epistasis Lab, a research group at Cedars-Sinai Medical Center in Los Angeles, CA (USA).

In a nutshell, our lab develops computational methods to identify the complex genetic and environmental interactions that lead to human disease. Our methods address challenges such as epistasis, heterogeneity, and scalability.

You can browse our projects below for more information.

Methods

EVE: Ensembl Variant Effect Predictor (VEP) on Amazon EC2

ellyn: A scikit-learn-compatible linear genetic programming system for symbolic regression and classification.

ExSTraCS: Extended Supervised Tracking and Classifying System

FEAT: A feature engineering automation tool for regression and classification.

FEW: A feature engineering wrapper for scikit-learn.

scikit-mdr: A scikit-learn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.

scikit-rebate: A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for machine learning.

stir: ReliefF on steroids

tpot: A Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.

treeheatr: Heatmap-integrated decision tree visualizations

Useful Collections

ClinicalDataSources: Open or Easy Access Clinical Data Sources for Biomedical Research

Penn ML Benchmarks (PMLB): A large, curated repository of benchmarks for evaluating supervised machine learning algorithms.

pmlbr: An R interface to PMLB.