Genetic Epidemiology

Dr Gustavo de los Campos & Dr Ana Vazquez
- Dr Gustavo de los Campos Faculty page
- Dr Ana Vazquez Faculty page
Methods and Software for Large Scale Genomic Analysis

The QuantGen group, led by Drs. de los Campos and Vazquez, has developed several methods and software packages for the analysis of large-scale genomic data sets. The packages developed and maintained by the group include:
- BGLR software for high-dimensional Bayesian regression implements various types of shrinkage and variable selection methods and is optimized for analysis of very large genomic data sets. The development and maintenance of the package has been funded by R01 GM101219 (PI de los Campos, 2 cycles). [ R | GitHub | publication ]
- BGData is a suite of R packages for handling extremely large genomic data sets (millions of samples and millions of features). The suite implements memory mapping of bed files, linked arrays, and many functions for genomic data analysis, including kinship computation and GWAS. The development and maintenance of the software was supported by R01 GM101219 (PI de los Campos). [ R | GitHub | publication ]
- MTM is software for multi-trait analysis of complex traits. The functionality of this package has been incorporated (with a much more efficient and complete implementation) into the BGLR R package. [ GitHub ]
- pedigreemm enables fitting mixed effects models including pedigree information. The package pedigreeTools implements various operations on pedigrees, including pruning, editing, computation of additive relationships and functions thereof (e.g., Cholesky decomposition). [ R | GitHub ]
- pleiotest is an R package for multi-trait GWAS and pleiotropy analysis. The package is written largely in C++ and is highly optimized for the analysis of very large genomic data sets. [ GitHub ]
- pedigreeTools: a suite of functions for pedigree analyses, including sorting and editing pedigree data and computing inbreeding, additive relationships, and functions thereof.
- MOSS (multi-omic integration with Sparse Singular Value Decomposition) integrates multiple large, high-dimensional omic data layers. [ R | GitHub ]
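
As an illustration of the kinship computation mentioned above, the following is a minimal Python sketch of one widely used estimator of the genomic relationship matrix (centered genotypes scaled by twice the sum of p(1 - p) across markers). It is not the BGData implementation: the function name `genomic_relationship` and the toy genotype matrix are assumptions made for this example, and the pure-Python loops are meant for readability, not for data sets of the size the packages above target.

```python
def genomic_relationship(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: list of rows, one per sample, each a list of allele
    counts (0, 1, or 2), one per marker. Hypothetical helper for
    illustration only.
    """
    n = len(genotypes)
    p = len(genotypes[0])

    # Allele frequency of each marker, estimated from the sample.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(p)]

    # Scaling constant: 2 * sum_j p_j * (1 - p_j).
    denom = 2 * sum(f * (1 - f) for f in freqs)
    if denom == 0:
        raise ValueError("all markers are monomorphic; cannot scale")

    # Center each marker by its expected allele count, 2 * p_j.
    centered = [[row[j] - 2 * freqs[j] for j in range(p)] for row in genotypes]

    # G = Z Z' / denom, where Z is the centered genotype matrix.
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / denom
         for k in range(n)]
        for i in range(n)
    ]


# Toy data (made up): 4 samples, 3 markers.
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
]
G = genomic_relationship(toy_genotypes)
for row in G:
    print([round(value, 3) for value in row])
```

Because every marker is centered on its sample mean, each row of the resulting matrix sums to zero, which is a quick sanity check on any implementation of this estimator.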
Analysis and prediction of complex human traits with biobank data

The QuantGen group, led by Drs. de los Campos and Vazquez, has developed and implemented methods for complex-trait prediction using DNA information. This research has been funded by NIH grants R01GM0999992 and R01GM101219 (PI de los Campos) and spans the development of parametric and semi-parametric methods, algorithms, and applications involving biobank-sized data. Selected publications: de los Campos et al. (2010), Makowsky et al. (2011), de los Campos et al. (2013), de los Campos et al. (2015), Kim et al. (2017), Bellot et al. (2018), Lello et al. (2019).

Incorporating sex and ethnic differences in genomic models

Classical genomic models assume that the effects of genes on disease risk are homogeneous across the subjects of a population. However, genomic research has shown that sex and ethnic differences can modulate the effects of genes. Drs. de los Campos and Vazquez have developed whole-genome regression models that incorporate sex and ethnic differences as well as high-dimensional environmental information. Selected publications: Jarquin et al. (2015), Veturi et al. (2018), Funkhouser et al. (2020).
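
One common way to write a whole-genome regression with group-specific effects is sketched below. This is an illustrative form only and not necessarily the exact specification used in the publications above. The notation is assumed for this example: y_i is the phenotype of subject i, g(i) is that subject's group (for example, sex or ethnic group), x_ij is the genotype at marker j, b_j is a marker effect shared by all groups, d_{j,g} is a group-specific deviation, and ε_i is a residual.

```latex
y_i = \mu_{g(i)} + \sum_{j=1}^{p} x_{ij}\,\bigl(b_j + d_{j,g(i)}\bigr) + \varepsilon_i
```

Setting every d_{j,g} to zero recovers the classical model with homogeneous effects, so the size of the deviations relative to the shared effects indicates how much marker effects differ between groups.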

ORIGINS (PI Rebecca Knickmeyer)

The QuantGen group contributes statistical and genetic expertise to the ORIGINs working group of the ENIGMA consortium. The prenatal and early postnatal period represents the foundational phase of human brain development. This working group focuses on (1) identifying genetic factors contributing to early brain development, (2) developing predictive models for cognitive ability and emotional functioning using genetic variation, environmental risk factors, and neuroimaging phenotypes, and (3) clarifying how genetic risk for psychiatric disease manifests in infancy and early childhood.
Dr Chenxi Li

Faculty page

Genetic/genomic survival association and risk prediction

Most genetic association studies of human diseases use case-control phenotypes (diseased vs. non-diseased). However, time-to-disease traits are more informative about gene-disease associations and are better suited to building risk prediction models. We are developing robust and efficient statistical methods to detect genetic associations and predict disease risks with omics data for various types of survival outcomes and models. This project is led by Dr. Chenxi Li.
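
To illustrate what a time-to-disease phenotype records beyond case/control status, the following is a minimal Python sketch of the Kaplan-Meier survival estimator. It is a textbook estimator shown for orientation, not one of the methods under development in this project. The function name `kaplan_meier` and the toy follow-up data are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the disease-free probability.

    times:  follow-up time for each subject.
    events: 1 if disease onset was observed at that time,
            0 if the subject was censored (still disease-free).
    Returns a list of (time, survival) pairs, one per distinct
    event time. Hypothetical helper for illustration only.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        onsets = 0
        leaving = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            onsets += data[i][1]
            leaving += 1
            i += 1
        if onsets > 0:
            # Multiply by the conditional probability of staying
            # disease-free through time t.
            survival *= 1 - onsets / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without an event.
        at_risk -= leaving
    return curve


# Toy data (made up): five subjects, two of them censored.
follow_up = [2, 3, 3, 5, 8]
onset_observed = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(follow_up, onset_observed):
    print(f"t={t}: S(t)={s:.2f}")
```

A case-control coding of the same data would keep only the `onset_observed` column. The estimator above also uses when each onset occurred and how long censored subjects were followed, which is the extra information that survival models exploit.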
Dr Xiaoyu Liang

Faculty page

DNA Methylation-Based Biomarker for Predicting Risk of Substance Use Disorder

Substance use and polysubstance use have significant public health implications due to their widespread prevalence and associated mortality, morbidity, and economic costs. These issues have a profound impact not only on individuals who use substances but also on their families and broader communities. One limitation of previous studies of substance use disorder is that substance use was assessed by self-report, which may be inaccurate and introduce bias. We have identified DNA methylation signatures as robust biomarkers for predicting the use of different substances. Our results demonstrate that an objective measure of substance use is a more informative phenotype than self-reported data for revealing epigenetic mechanisms.

Department of EPIDEMIOLOGY AND BIOSTATISTICS
