The QuantGen group, led by Drs. de los Campos and Vazquez, has developed several methods and software packages for the analysis of large-scale genomic data sets. The packages developed and maintained by the group include:
The QuantGen group, led by Drs. de los Campos and Vazquez, has developed and implemented methods for complex-trait prediction using DNA information. This research has been funded by NIH grants R01GM0999992 and R01GM101219 (PI de los Campos) and spans the development of parametric and semi-parametric methods, algorithms, and applications involving biobank-sized data. Selected publications: de los Campos et al. (2010), Makowsky et al. (2011), de los Campos et al. (2013), de los Campos et al. (2015), Kim et al. (2017), Bellot et al. (2018), Lello et al. (2019).
Classical genomic models assume that the effects of genes on disease risk are homogeneous across the subjects of a population. However, genomic research has shown that sex and ethnic differences can modulate the effects of genes. Drs. de los Campos and Vazquez have developed whole-genome regression models that incorporate sex and ethnic differences as well as high-dimensional environmental information. Selected publications: Jarquin et al. (2015), Veturi et al. (2018), Funkhouser et al. (2020).
The QuantGen group contributes statistical and genetic expertise to the ORIGINs working group of the Enigma consortium. The prenatal and early postnatal period represents the foundational phase of human brain development. This working group focuses on (1) identifying genetic factors contributing to early brain development, (2) developing predictive models for cognitive ability and emotional functioning using genetic variation, environmental risk factors, and neuroimaging phenotypes, and (3) clarifying how genetic risk for psychiatric disease manifests in infancy and early childhood.
Genetic/genomic survival association and risk prediction
Most genetic association studies of human diseases use case-control phenotypes (diseased vs. non-diseased). However, time-to-disease traits are more informative about gene-disease associations and are better suited to building risk prediction models. We are developing robust and efficient statistical methods to detect genetic associations and predict disease risk from omics data for various types of survival outcomes and models. This project is led by Dr. Chenxi Li.
To identify genetic variants associated with complex traits or diseases, researchers need larger sample sizes and more powerful tests, especially when causal genetic variants have weak effects. However, acquiring data of sufficient scale can be challenging. We have developed gene-based association tests that leverage publicly available GWAS summary statistics. These methods have been successfully employed to pinpoint genes associated with traits and conditions such as schizophrenia and fasting glucose levels.
The association between genetic variants and a single phenotype is usually weak. It is increasingly recognized that joint analysis of multiple phenotypes can be more powerful than univariate analysis and can shed new light on the underlying biological mechanisms of complex diseases. Our research focuses primarily on developing and applying innovative methods for jointly analyzing multiple phenotypes in genome-wide and epigenome-wide association studies, with the aim of uncovering the genetic underpinnings of complex diseases such as chronic obstructive pulmonary disease, schizophrenia, and rheumatoid arthritis.
Many real-world interventions are not randomly assigned. Even in randomized controlled trials, intercurrent events may prevent valid inference of efficacy. Non-compliance and non-random drop-out may lead to selection bias and compromise the internal validity of a study. Dr. Zhehui Luo collaborates with epidemiologists, sociologists, clinicians, and other scientists to find appropriate methods for dealing with the complications of causal inference using observational data. Currently she is exploring the different impacts of non-compliance in placebo-controlled versus active-controlled trials and examining the potential of using spatial-temporal data to mimic randomized trials. Applications include trials for reproductive health and birth outcomes.
Using large administrative claims data in research has advantages and disadvantages, and it demands a thorough understanding of the structure and limitations of such databases. Working closely with experts in generating, modifying, and storing these data, and drawing on her expertise in causal inference, Dr. Zhehui Luo provides insights into the design, analysis, and interpretation of several program evaluation studies that have policy implications for improving population health. Currently she is investigating the impact of extending Medicaid benefits to persons affected by the Flint water crisis on health service utilization, and the impact of implementing community health worker home visiting programs on birth outcomes.
Causal effects answer questions like "What would happen to the outcome if an intervention were implemented?" Without randomized trials, associations may differ from causal effects, and causal effect estimation requires careful reasoning about confounding bias, selection bias, and other sources of bias, while avoiding unrealistic assumptions about the population. Dr. Qiu works on developing optimal statistical procedures for causal inference under minimal assumptions. The methods apply to perinatal & pediatric epidemiology.
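As a toy illustration of how an association can differ from a causal effect when treatment is not randomized, the following sketch simulates a single binary confounder and compares a naive difference in means with a stratified (standardized) estimate. It is not a method from the group's work; the data-generating model, the effect sizes, and all function names are assumptions chosen only for illustration.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=1.0, seed=0):
    """Simulate (z, a, y) rows: confounder z, treatment a, outcome y.

    Hypothetical model: z raises both the chance of treatment and the outcome.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        p_treat = 0.8 if z == 1 else 0.2  # treatment is NOT randomized
        a = 1 if rng.random() < p_treat else 0
        y = true_effect * a + 2.0 * z + rng.gauss(0.0, 1.0)
        rows.append((z, a, y))
    return rows


def naive_difference(rows):
    """Difference in mean outcome, treated minus untreated, ignoring z."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return mean(treated) - mean(control)


def adjusted_difference(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    n = len(rows)
    total = 0.0
    for level in (0, 1):
        stratum = [(a, y) for z, a, y in rows if z == level]
        treated = [y for a, y in stratum if a == 1]
        control = [y for a, y in stratum if a == 0]
        total += (len(stratum) / n) * (mean(treated) - mean(control))
    return total


rows = simulate()
naive = naive_difference(rows)
adjusted = adjusted_difference(rows)
print(f"naive association:   {naive:.2f}")
print(f"adjusted for z:      {adjusted:.2f}")
print("true causal effect:  1.00")
```

In this simulation the naive contrast overstates the effect because treated subjects are more likely to have z = 1, while stratifying on z recovers an estimate close to the true value. The adjustment works here only because the single confounder is measured and the model is known by construction, which is rarely the case with real observational data.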
Data are often coarsened, for example, missing or censored. Researchers may also wish to leverage data from different sources to obtain more accurate estimators and better machine learning models. Dr. Qiu works on developing optimal statistical procedures to fully extract information from coarsened data and data from different sources. The methods apply to patient-reported outcomes, perinatal & pediatric epidemiology, vaccine trials, machine learning, and prediction sets.