Biostatistics & Bioinformatics

Biostatistics & Bioinformatics Research & Practice

Research Overview

In the Department of Biostatistics and Bioinformatics, we value collaboration in our research and practice above all. Our faculty work across disciplines and groups at Rollins, throughout the Woodruff Health Sciences Center, and across Emory.

Our Research Topics and Types

Our department’s methodological research involves developing and applying statistical methodology in search of answers to medical and public health questions.

Research Topics

  • Cancer
  • Cardiovascular health
  • Environmental health
  • Epidemiology
  • HIV/AIDS
  • Infectious diseases
  • Mental health
  • Microbiome 

Research Types

  • Agreement studies
  • Bayesian methodology
  • Bioinformatics
  • Causal inference
  • Clinical trials
  • Latent class analysis
  • Machine learning
  • Spatial statistics
  • Survival analysis 

Research Spotlight

BIOS Research at the Forefront

The Data Dilemma

Rollins’ new certificate program teaches students to navigate the complexities of data science.

Focus Areas

Our Areas of Expertise

Agreement studies have wide and important applications in biomedical research and clinical practice. For example, when a new assay or instrument is developed, it is important to assess whether the new method can reproduce the results of a traditional method or of a gold standard. Our department has established a strong research program in developing agreement methodology for complex outcomes in biomedical studies.  

Research conducted by our faculty includes:

  • Development of new statistical methods that extend the existing agreement paradigm to measurements on multiple scales (continuous/ordinal)
  • Development of nonparametric and parametric approaches to assess agreement for survival outcomes that involve censored or truncated observations
  • Development of agreement methods to investigate the alignment between traditional behavioral/clinical outcomes and emerging high-dimensional neuroimaging data
  • Development of agreement methods for assessing comparability among brain images acquired from multi-center neuroimaging studies
  • Agreement methods for data resulting from studies where each observer makes replicated or repeated readings on each study subject 
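
As a small illustration of the basic agreement question above (does a new method reproduce a reference method?), the sketch below computes Lin's concordance correlation coefficient for paired continuous readings. This is a standard textbook measure chosen for illustration, not a description of the department's own methods; the function name and the example readings are hypothetical.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired readings.

    x and y are equal-length sequences of readings on the same subjects,
    e.g. a new assay (x) and a reference method (y). Hypothetical helper.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired, with at least two readings")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Moments computed with divisor n.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("readings are constant and identical; CCC is undefined")
    # Unlike Pearson correlation, the (mean_x - mean_y)^2 term penalizes
    # a systematic shift between the two methods.
    return 2 * cov_xy / denom


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0]   # made-up readings
    new_assay = [2.0, 3.0, 4.0, 5.0]   # same pattern, shifted by one unit
    print(concordance_correlation(reference, new_assay))
```

A constant shift leaves the Pearson correlation at 1 but lowers this coefficient, which is why agreement is treated as a different question from correlation.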

Bioinformatics is an interdisciplinary field developing methods and software tools to analyze high-dimensional data generated from biomedical experiments. With biomedical data sets becoming larger, more diverse, and more complex, information science and biostatistics play a larger role in the biomedical sciences.

Bioinformatics is a very diverse field, with applications ranging across DNA sequence analysis, protein structure assessment, models of molecular evolution, and imaging analysis. Research in our department concentrates primarily on omics—genomics, epigenomics, and metabolomics.

Our faculty's research has resulted in novel statistical methodologies, as well as powerful and efficient algorithms and software developed for biomedical researchers. In addition, our faculty members collaborate extensively with biologists and clinicians at Emory and elsewhere to assist efforts to identify novel biological insights from these rapidly expanding experimental data. 

The goal of causal inference is to make inference on causation and treatment effects using data from observational studies as well as data from complex clinical trials. Our faculty members are actively engaged in methodological and collaborative research in this area.

Examples include:

  • Development of a general class of hybrid trial designs that combine features of treatment randomization and patient choice of treatments. Such trials are useful in behavioral intervention studies where treatment assignments cannot be blinded and strong motivation is often required to maintain compliance
  • Extension of propensity score approaches to non-binary treatment regimens and other approaches related to generalized propensity scores
  • Semiparametric methods for estimating the effect of non-binary treatment regimens
  • Development of efficiency theory for estimators of causal effects
  • Development of robust methods for drawing inference about causal effects
  • Incorporation of machine learning into estimation of causal effects 

Neuroimaging techniques have become an increasingly important tool in clinical research to help diagnose, treat, and prevent brain diseases. In recent years, imaging statistics has emerged as one of the fastest growing research areas in biostatistics.

The main goal of imaging statistics is to develop and apply state-of-the-art statistical methods to help extract the most relevant and accurate information from neuroimaging data, advancing scientific understanding of human brain function.  

Our department hosts one of the first imaging statistics research centers in the country, the Center for Biomedical Imaging Statistics (CBIS). CBIS currently develops statistical methods for data acquired from various imaging modalities including functional and structural magnetic resonance imaging, magnetic resonance spectroscopic imaging, and positron emission tomography.  

CBIS faculty and students have conducted statistical methodological research in:

  • Brain network analysis using analytical tools such as independent component analysis and graphical models to understand brain architecture and neural circuits
  • Imaging-based predictive modeling that aims to extract features in imaging data to predict individual disease status and treatment response
  • Reproducibility of imaging studies using analytical tools such as agreement methodologies and meta-analysis
  • Imaging genetics, which integrates neuroimaging and genetic data to investigate how genetic variation affects brain structure and function and, in turn, subjects' behavioral and psychiatric outcomes
  • Impacts of image acquisition, reconstruction, and preprocessing methods with applications to experimental design

In addition to methodological research, CBIS has collaborated with imaging researchers from Emory's Departments of Psychiatric and Behavioral Sciences, Radiology, and Biomedical Engineering, and the Winship Cancer Institute.

Machine learning and artificial intelligence describe techniques for automated recognition of patterns in data. From image recognition to prediction of clinical disease, these techniques have myriad applications in modern society. Biostatisticians play an important role in developing and understanding the theoretical properties of such techniques. Examples include:

  • Development of new machine learning algorithms for prediction of clinical disease and optimal allocation of treatments
  • Methods for estimating the predictive performance of machine learning algorithms
  • Methods for optimal selection of tuning parameters for machine learning algorithms
  • Methods for scalable machine learning for online data collection 

The human microbiome is the community of microbes in and on the human body. Recent developments in high-throughput sequencing have allowed the microbes in a community to be identified in a single experiment such as 16S rRNA gene sequencing or shotgun metagenomic sequencing.

Many of these 16S studies have produced headlines in the popular press and tantalizing hints in scientific literature that conditions as widely varied as obesity, rheumatoid arthritis, autism, preterm birth, and Alzheimer's disease may be related to the microbiome.  Further, interventions to change the microbiome and affect human health are easy to imagine. For example, fecal microbial transplant has recently been shown to be a highly effective, low-cost treatment for persistent Clostridium difficile infection.

Although research into the microbiome is generating exciting headlines, the statistical science required to understand and analyze these data has not kept pace, leaving even basic questions unanswered.

Challenges include:

  • Because there are typically hundreds or thousands of species cohabiting at a site, the data that represent the microbial community are high-dimensional.
  • Human microbiome studies frequently adopt complex study designs such as paired, clustered, or longitudinal schemes.
  • The presence of confounding variables (e.g., gender and ancestry) and more sophisticated outcomes (e.g., possibly censored survival time) are inherent issues in many observational studies of the human microbiome.
  • The complicated process of recruiting samples in medical contexts typically results in samples being sequenced in different batches, leading to strong batch effects.
  • Due to the high dimensionality of microbiome data, it is important to adjust findings for multiple comparisons.
  • Complex issues such as causal inference and mediation analysis (e.g., how much of the effect of baby aspirin on hazard of myocardial infarction is due to a change in the gut microbiome) have not been addressed in the context of analysis of 16S microbiome data. 
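
To make the multiple-comparisons challenge above concrete, the sketch below adjusts a list of per-taxon p-values with the Benjamini-Hochberg procedure. The text does not name a specific correction; Benjamini-Hochberg is used here only as one common choice, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Hypothetical helper: p_values holds one p-value per tested taxon.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values for four taxa
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.3f}, adjusted = {q:.3f}")
```

Taxa whose adjusted value falls below a chosen threshold would be reported while controlling the false discovery rate at that level, under the usual assumptions of the procedure.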

Public health data are increasingly being collected with geospatial information. The analysis of spatially referenced data provides opportunities for a wide variety of methodological and applied statistical research. These approaches often use spatially correlated random effects within generalized linear mixed models to estimate fixed effects accurately while accounting for spatial correlation.

Research typically uses geographic information systems to manage and visualize data and Bayesian hierarchical models to examine associations between outcomes and possible explanatory variables.  

The department's faculty members are involved in numerous research projects developing spatiotemporal models for a wide range of applications. Examples include:

  • Infectious disease (spatial dynamics of raccoon rabies, malaria, and schistosomiasis)
  • Ecology (spatial patterns in sea turtle nesting)
  • Epidemiology (measuring and mapping disparities in disease burdens and accessibility to health care/sanitation)
  • Exposure assessment (data assimilation of satellite imagery and ground monitor exposure data)
  • Environmental health (estimating the health impacts of air quality, extreme heat, and climate change) 

In addition to methodological work directly related to the health sciences, our faculty members also engage in fundamental research in statistical theory.

One area of theoretical research addresses the problem of “many nuisance parameters,” which arises when there is substantial heterogeneity in the population that is not of main scientific interest but that must be accounted for in order to arrive at valid inference and robust conclusions. The presence of many nuisance parameters is pathological: it invalidates standard methods of statistical inference.

Department faculty members have developed approaches to reduce or eliminate the harmful effects of nuisance parameters in either the full likelihood context (e.g., relaxed conditional likelihood under a rectangular array asymptotic setting) or the estimating function context (e.g., composite conditional score functions, orthogonal second-order locally ancillary estimating functions, and G-ancillary estimating functions). These methods are designed to be computationally feasible and robust while avoiding unnecessary modeling assumptions.

Another area of research in statistical theory concerns the application and adaptation of empirical process methods to provide accurate and reliable inference for complex data structures. Examples include the development of classification strategies, the study of global quantile regression in high-dimensional settings, and efficient estimation and robust inference for causal effects.

Epidemiology and environmental health represent two highly important traditional disciplines in the broad field of public health. Our faculty members are engaged in collaborative research on topics like:  

  • Air pollution and health
  • HIV and cancer epidemiology
  • Reproductive health outcomes

Key ongoing public health problems provide the motivation for much of the methodological research conducted by department faculty in this area. Areas of particular interest include but are not limited to:

  • Methods for causal inference in observational studies
  • Methods for handling missing and mismeasured data
  • Methods for assessing agreement between multiple biomarkers of exposure and/or disease
  • Spatial analysis and geographic information systems
  • Modeling the dynamics of outbreaks of infectious disease in space and time
  • Survival analysis and quantile regression relating to limit-of-detection of environmental exposures
  • High-dimensional measures of lifetime exposures (the exposome)
  • Statistical genetics and estimation of gene-environment interactions
  • Imaging statistics, including remote sensing images as markers of environmental exposure
  • Methods to account for pooled laboratory specimens and non-detectable measurements 

Survival analysis addresses time-to-event data, which arise routinely in clinical trials and observational follow-up studies. One distinguishing focus of survival analysis is the ability to draw information from incomplete observations of time-to-event responses in real data settings, addressing complications known as censoring, competing risks, and truncation.

Methodology has been well established for traditional types of survival data, where assumptions (such as independent censoring and independent truncation) are deemed reasonable. Techniques such as the Kaplan-Meier curve, log-rank test, and Cox's proportional hazards regression model have been well accepted and are widely used in many areas across biomedical research. Despite the success of these standard survival analysis techniques, there has been increasing attention to their limitations in practical scenarios where their underlying assumptions are considered unrealistic.
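
For readers unfamiliar with the Kaplan-Meier curve mentioned above, the following is a minimal sketch of the product-limit estimator under the independent-censoring assumption. The function name and the follow-up data are hypothetical; real analyses would normally rely on an established survival analysis package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], and 0 if the
    subject was censored at that time. Hypothetical helper.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events observed exactly at time t
        leaving = 0    # subjects leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Subjects censored at t still count as at risk at t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    follow_up = [1, 2, 2, 3, 4]   # made-up follow-up times
    observed = [1, 1, 0, 0, 1]    # 1 = event, 0 = censored
    for t, s in kaplan_meier(follow_up, observed):
        print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects contribute to the risk set until they drop out, which is how the estimator draws information from incomplete observations.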

There are also many interesting research problems arising from the rapid development of new, high-dimensional data structures applied to new investigative goals. Examples of such problems include assessment of dynamic survival processes, screening and selection of high-dimensional survival predictors, and delineation of fine-tuned or personalized treatment effects on survival. These challenges provide an exciting outlook for survival analysis methodological research in the future, requiring creative integration with other modern developments of statistical techniques.

Dynamic regression provides another research direction currently under active development by department faculty. Classical models, including the proportional hazards model and accelerated failure time model, presume constant effects of covariates. Such a constancy assumption, however, is not realistic in many applications where the effects of covariates may evolve over time. For instance, the effectiveness of an AIDS drug is typically eroded over time by drug resistance. To address this issue, quantile regression offers a popular and flexible approach that allows covariate effects to vary across data quantiles. Department faculty members are actively involved in developing quantile regression methods that can appropriately handle special features of survival data.
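
The idea behind quantile regression can be sketched with its loss function. The example below uses the standard check (pinball) loss to recover a sample quantile with no covariates and no censoring; it is a simplified illustration, not the survival-specific methods described above, and all names and data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for quantile level tau, with 0 < tau < 1."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def sample_quantile_by_check_loss(values, tau):
    """Return the observed value that minimizes the total check loss.

    With no covariates, minimizing the check loss recovers a sample
    tau-quantile. Hypothetical helper; ignores censoring entirely.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")

    def total_loss(q):
        return sum(check_loss(v - q, tau) for v in values)

    # Search only over observed values, smallest first.
    return min(sorted(values), key=total_loss)


if __name__ == "__main__":
    durations = [1, 2, 3, 4, 100]  # made-up data with one extreme value
    print(sample_quantile_by_check_loss(durations, 0.5))
    print(sample_quantile_by_check_loss(durations, 0.25))
```

Fitting a separate model at each quantile level is what lets covariate effects differ between, say, short and long survival times.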

Centers & Labs

Research Centers, Cores, & Labs

Biostatistics Collaboration Core (BCC)

The BCC provides biostatistical consultation services to students, faculty, and staff at Rollins, the Woodruff Health Sciences Center, and across Emory University.

Biostatistics, Epidemiology, & Research Design Program

The Biostatistics, Epidemiology & Research Design program supplies comprehensive biostatistical and epidemiological support to investigators through the Atlanta Clinical & Translational Science Institute.

Center for Biomedical Imaging Statistics (CBIS)

CBIS conducts research on statistical methods for analyzing data from biomedical imaging studies, such as brain, cardiac, breast, and prostate imaging.

CFAR Biostatistics and Bioinformatics Core

This core at Emory’s Center for AIDS Research works to assist AIDS researchers with data management, statistical analysis, data monitoring, and more.

Emory Network of Computational Omics Research

The Emory Network of Computational Omics Research conducts methodological research to develop new statistical methods and ML/AI algorithms capable of extracting new insights from large-scale, high-throughput multi-omics data.


Join Us

Ready for Your Next Step?

Whether you already know you want to pursue research at Rollins or want to learn more about admissions and costs, we’re here to help.