In a nutshell, our lab develops computational methods to identify the complex genetic and environmental interactions that lead to human disease. Our methods address challenges such as epistasis, heterogeneity, and scalability.
You can browse our projects below for more information.
EVE: Ensembl VEP (Variant Effect Predictor) on Amazon EC2
ellyn: A scikit-learn-compatible linear genetic programming system for symbolic regression and classification.
ExSTraCS: Extended Supervised Tracking and Classifying System
FEAT: A feature engineering automation tool for regression and classification.
FEW: A feature engineering wrapper for scikit-learn.
scikit-mdr: A scikit-learn-compatible Python implementation of Multifactor Dimensionality Reduction (MDR) for feature construction.
scikit-rebate: A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for machine learning.
stir: ReliefF on steroids
tpot: A Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.
treeheatr: Heatmap-integrated decision tree visualizations
ClinicalDataSources: Open or Easy Access Clinical Data Sources for Biomedical Research
Penn ML Benchmarks (PMLB): A large, curated repository of benchmarks for evaluating supervised machine learning algorithms.
pmlbr: An R interface to PMLB