Current and Future Work
The Gilad Lab is now focused on moving beyond simple explorations of gene expression levels to studies of variation in regulatory mechanisms, response phenotypes, and ultimately complex traits (including disease). The following are ongoing projects in the lab:
Understanding genetic determinants of regulatory variation
Understanding how the genome encodes regulatory information is a central goal of our research. How does a single genome encode such exquisitely precise, yet highly distinctive programs of gene regulation across different cell types, time points, and conditions? We have taken two general perspectives to address this question. The first is a comparative genomics approach, in which we explore differences between species to identify possible connections between regulatory changes and adaptation. Using the second approach, we examine interindividual differences within species to connect regulatory variation to functional differences in complex human traits. We have applied a similar framework to study regulatory variation within and between different tissues and, more recently, different cell types, which can yield insight into development and differentiation in addition to evolution and disease. Combining these complementary strategies can provide us with insight into how the effects of genetic variation are propagated through changes in molecular mechanisms to differences in gene expression. With a better understanding of the cascade of regulatory events that connect genetic differences to variation in gene expression outputs, we can return to the persistent questions that motivate much of the research in evolutionary biology and disease susceptibility. What genetic changes are critical to the evolution of particular lineages? What are the regulatory changes that lead to phenotypic adaptation, and what mechanisms are affected? What are the molecular pathways that lead to disease?
Going beyond steady-state RNA levels to uncover regulatory mechanisms
Our lab uses multiple complementary approaches to characterize variation in genetic and epigenetic regulatory mechanisms within and between species. By integrating gene expression data from multiple tissues and cell types with epigenetic profiles, chromatin states, post-transcriptional modifications, and other molecular data, we can generate specific hypotheses about regulatory mechanisms and learn about their relative importance for different molecular and phenotypic outcomes. For example, we have used functional genomic data to explore the roles of splicing, transcription rate, RNA decay, and other mechanisms in gene regulation in human cell lines.
We are currently collaborating with Yang Li to characterize the genetic basis of alternative polyadenylation site usage (APA) in human cell lines. Over half of human genes contain multiple polyadenylation signal sites, resulting in vast mRNA isoform diversity. Changes in APA affect variation in mRNA stability and localization, miRNA binding site availability, translation efficiency, and protein interactions. Thus, APA represents a large substrate through which interindividual genetic variation can produce functional changes (potentially leading to variation in disease risk). Our current work is using 3’-RNA-seq to measure APA in the nuclear and total cell fractions of iPSC-derived cardiomyocytes, which will allow us to decouple the effects of RNA decay and nuclear export on steady-state mRNA levels.
Genetic and mechanistic basis of robustness
Robustness, or the ability to maintain a stable phenotype despite genetic changes and environmental perturbations, is an important property of multicellular development and evolution. Yet the dynamic fluctuations caused by genetic changes are the driving force of adaptation. How is a balance achieved between these two seemingly opposite forces? Robustness is typically studied by experimental evolution approaches, in which changes in phenotypes of entire organisms, tissues, or cell populations are measured across time under different conditions or selective pressures. However, such measurements focus on the final outcome – whether a trait is robust to perturbation or not – but often ignore the mechanisms, the genetic and molecular circuits that underlie robustness in single cells, and ultimately lead to a robust outcome. In order to understand how robustness is encoded, we need to measure and consider variation in phenotypes across individual cells.
Single cell RNA-sequencing (scRNA-seq) allows us to go beyond simply measuring the average level of gene expression in a tissue sample and directly measure cell-to-cell variance in gene expression. Recently, we used scRNA-seq to measure gene expression variance in single cells from multiple individuals, which allowed us to test the hypothesis that gene expression variance is under genetic control. Our results suggested that genetic effects on gene expression variance are smaller than genetic effects on mean expression levels, relative to how much each quantity varies between individuals.
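The distinction between mean expression and cell-to-cell variance can be illustrated with a minimal sketch. It assumes a toy input of (individual, expression) pairs for a single gene. The function name and data are hypothetical, and the code is not the analysis used in the study described above.

```python
from statistics import mean, variance


def per_individual_mean_and_variance(cells):
    """Summarize one gene across single cells, separately for each individual.

    cells: iterable of (individual_id, expression_value) pairs (toy format,
    assumed here for illustration only).
    Returns {individual_id: (mean, sample_variance)} for individuals
    with at least two cells.
    """
    by_individual = {}
    for individual, value in cells:
        by_individual.setdefault(individual, []).append(value)
    return {
        individual: (mean(values), variance(values))
        for individual, values in by_individual.items()
        if len(values) >= 2  # sample variance needs two or more cells
    }


# Two hypothetical individuals with the same mean but different variance.
toy_cells = [("A", 1.0), ("A", 3.0), ("B", 2.0), ("B", 2.0)]
summary = per_individual_mean_and_variance(toy_cells)
```

In this toy example both individuals have the same mean expression, so a bulk measurement would not distinguish them, while the per-individual variance does.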
scRNA-seq also provides us with the resolution to study transcriptional dynamics during differentiation. Through their dual capacity to differentiate and proliferate, stem cells regulate cell type and number during development. Cell-cell variability can affect how individual cell fates are determined in response to key stimuli, such as transcription factors. However, it is challenging to identify the causes of cell-cell variability in single cell data. Our lab has recently developed methods that allow us to disentangle the effects of genetic variation from other sources of variation in single cells, which is important for understanding how and why cells switch between states. For example, we developed an approach to quantify continuous cell cycle phase using gene expression data collected from iPSCs, which can be used to characterize and account for transcriptional heterogeneity related to cell cycle. We are currently using single cell sequencing to study the transcriptional dynamics of cells undergoing cardiomyocyte differentiation.
Analysis of regulatory variation during differentiation
Dynamic effects on gene regulation (including tissue-specific and environment-specific effects) are thought to be important for disease and development, yet few studies of gene regulation in humans have collected data from multiple time points, tissues, or cell types in the same individuals. To address this, we are characterizing interindividual variation in gene expression across regulatory trajectories during cardiomyocyte differentiation. Measuring gene expression at multiple time points will allow us to identify variants associated with transient, dynamic effects on gene expression, and may reveal mechanisms important for differentiation and disease. In addition, combining this information with data collected from single cells may provide insight into probabilistic differentiation strategies and uncover transient cell types that are important for differentiation.
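One common way to formalize a dynamic genetic effect is a linear model with a genotype-by-time interaction term. The sketch below is an illustration under that assumption, using NumPy's least-squares solver on noiseless simulated data. The function name, coefficients, and data are hypothetical, and this is not necessarily the model used in the lab's analyses.

```python
import numpy as np


def fit_genotype_by_time(genotype, time, expression):
    """Ordinary least squares fit of
    expression ~ intercept + genotype + time + genotype:time.

    A nonzero interaction coefficient means the genotype effect
    changes over the time course.
    """
    g = np.asarray(genotype, dtype=float)
    t = np.asarray(time, dtype=float)
    y = np.asarray(expression, dtype=float)
    design = np.column_stack([np.ones_like(g), g, t, g * t])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    names = ["intercept", "genotype", "time", "genotype_x_time"]
    return dict(zip(names, coef))


# Simulated toy data: three genotype dosages measured at four time points.
genotype = [g for g in (0, 1, 2) for _ in range(4)]
time = [t for _ in range(3) for t in (0, 1, 2, 3)]
expression = [1.0 + 0.5 * g + 0.2 * t + 0.3 * g * t
              for g, t in zip(genotype, time)]
fit = fit_genotype_by_time(genotype, time, expression)
```

A real analysis would also need noise, covariates, and multiple-testing control, all of which are omitted here.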
Embryoid bodies as a model for studying cell type-specific regulation
Embryoid bodies are stem cell aggregates that spontaneously and asynchronously differentiate into cell types originating from all three germ layers. Applying single cell RNA-sequencing to embryoid bodies allows us to study the transcriptomes of cells from all three germ layers, including pluripotent and intermediate cell types, in a single, controlled genetic environment. We are currently developing methods to identify and analyze eQTLs in embryoid bodies, which will yield insight into how genetic variation shapes gene expression across a wide variety of cell types and cell states.
Mapping eQTLs associated with cardiovascular disease risk
Integrating trait-associated variants with eQTLs has become a standard approach for identifying genes relevant to complex human traits and diseases because it allows us to interpret the effects of genetic variation within a biological context. We have generated a panel of Hutterite cell lines, which we are using to study cardiovascular disease and other complex traits in collaboration with Carole Ober. The Hutterites of South Dakota are a founder population of European descent that practices a communal, farming lifestyle. The small number of founding genomes results in reduced genetic heterogeneity in the Hutterite population, and their communal lifestyle reduces environmental heterogeneity between individuals, both of which facilitate identification of disease genes. We recently mapped eQTLs using gene expression data collected from differentiated Hutterite cardiomyocytes and integrated the results with chromatin accessibility data, showing that cardiomyocyte eQTLs are enriched in regulatory regions.
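Enrichment of eQTLs in regulatory regions is often summarized as an odds ratio from a 2x2 table. The sketch below is an illustration only: it assumes variants are given as positions on a single chromosome and regions as half-open (start, end) intervals, and it applies a 0.5 continuity correction to every cell. All names and numbers are hypothetical and do not reproduce the study's method or results.

```python
def in_any_region(position, regions):
    """True if position falls inside any half-open (start, end) interval."""
    return any(start <= position < end for start, end in regions)


def enrichment_odds_ratio(eqtl_positions, background_positions, regions):
    """Odds ratio for eQTL variants overlapping regions, relative to
    background variants, with a 0.5 correction added to each cell
    so that empty cells do not cause division by zero.
    """
    eqtl_in = sum(in_any_region(p, regions) for p in eqtl_positions)
    eqtl_out = len(eqtl_positions) - eqtl_in
    bg_in = sum(in_any_region(p, regions) for p in background_positions)
    bg_out = len(background_positions) - bg_in
    return ((eqtl_in + 0.5) * (bg_out + 0.5)) / ((eqtl_out + 0.5) * (bg_in + 0.5))


# Hypothetical toy data: one accessible region spanning positions 0 to 29.
accessible_regions = [(0, 30)]
eqtl_variants = [5, 15, 25, 100]
background_variants = [5, 200, 300, 400]
odds_ratio = enrichment_odds_ratio(
    eqtl_variants, background_variants, accessible_regions
)
```

An odds ratio above 1 indicates that eQTL variants overlap the regions more often than background variants do. A real analysis would match background variants on properties such as allele frequency and test significance formally.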
We have also used Hutterite cardiomyocytes to identify genetic variants that affect the transcriptomic response to doxorubicin. Sensitivity to doxorubicin and other anthracyclines is a key limiting factor in determining optimal chemotherapy regimens, as nearly half of patients who receive high doses of the drug develop congestive heart failure. Using a panel of cardiomyocytes from 45 Hutterite individuals, we measured gene expression levels after 24 hours of exposure to varying doxorubicin dosages. Most genes were differentially expressed, and over 6,000 genes showed evidence of differential splicing due to reduced splicing fidelity in the presence of doxorubicin. We found interindividual variation in transcriptional responses to be predictive of cell damage in vitro, which was in turn associated with risk of anthracycline-induced toxicity in vivo. This study identified hundreds of SNPs associated with the doxorubicin-induced transcriptional response, including SNPs associated with differential splicing.
Comparative functional genomic studies in primates
A key goal of our research is to understand how genetic differences lead to phenotypic differences between species. A long-standing hypothesis is that changes in gene regulation play an important role in adaptive evolution, particularly in primates. Consistent with this notion, the past decade of research has yielded an increasing number of cases where regulatory changes have been shown to contribute to species-specific adaptations and to reproductive isolation. Nonetheless, we still know of only a handful of examples of regulatory adaptations in primates, and even fewer cases where the underlying regulatory mechanisms have been resolved. In order to gain true insight into regulatory processes that underlie variation in complex phenotypes, we must have access to faithful model systems for a wide range of tissues and cell types. We believe that matched panels of induced pluripotent stem cells (iPSCs) from humans and non-human primates are the way forward.
Induced pluripotent stem cells (iPSCs) and differentiated cells as a model system to study human evolution, development, and disease
Comparative genomic studies in primates are extremely restricted because we only have access to a few types of cell lines from non-human apes, and to a limited collection of frozen tissues. To address this challenge, we have established a multi-species panel of iPSCs that includes humans, chimpanzees, and rhesus macaques. iPSCs are an excellent model for studying gene regulation, cellular differentiation, and development in vitro – they are self-renewing, amenable to freezing, and they can be differentiated into multiple cell types. Recent projects in our lab have used iPSCs to compare temporal patterns of gene regulation in humans and chimpanzees during endodermal differentiation, to explore the role of transposable element silencing in primate regulatory evolution, and to measure differences in 3D chromatin structure between humans and chimpanzees. We have also used this resource to study the genetic architecture of complex traits and human-specific adaptations, including adaptations relevant to disease.
Connecting regulatory changes to phenotypes with response QTLs
Another strategy for connecting regulatory changes to actual phenotypic differences between species is to perform comparative genomic studies of response phenotypes. Our past work examined the immune response to bacterial infection in primary monocytes from humans, chimpanzees, and rhesus macaques. We found that species-specific immune responses are enriched for genes involved in viral response pathways, as well as genes in apoptosis and cancer pathways. We also found chimpanzee-specific immune signaling pathways to be enriched for HIV-interacting genes, which could explain why HIV-infected chimpanzees exhibit relatively strong resistance to AIDS progression. In a similar study, we characterized interindividual variation in response to Mycobacterium tuberculosis infection in humans, and identified a novel set of candidate loci that may contribute to tuberculosis susceptibility.
We have also applied this approach to identify human-specific gene regulatory adaptations in differentiated cells. For example, in a recent study of the transcriptional response to hypoxia in iPSC-derived cardiomyocytes from humans and chimpanzees, we identified hundreds of genes with species-specific regulatory responses, many of which have been associated with cardiovascular disease. Current members of the lab are measuring responses to mechanical stress in chondrocytes and osteoblasts derived from human and chimpanzee iPSCs, which we are using as a model of osteoporosis-related phenotypes.
Combining functional genomic data to uncover regulatory mechanisms
We use multiple complementary approaches to characterize variation in genetic and epigenetic regulatory mechanisms in primates. For example, we performed a comparative epigenetic study of primate LCLs to explore the contribution of RNA polymerase II and four histone modifications associated with transcription initiation (H3K4me1, H3K4me3, H3K27ac, and H3K27me3) to interspecies variation in gene expression levels. We found a tendency for differentially expressed genes to associate with interspecies differences in histone mark enrichment at transcription start sites. In contrast, H3K9me3, which is associated with transposable element silencing, does not appear to drive interspecies differences. Our recent study comparing gene expression and methylation levels in livers, kidneys, hearts, and lungs from humans, chimpanzees, and rhesus macaques found that only 7-11% of interspecies differences in gene expression can be explained by corresponding differences in promoter DNA methylation. However, gene expression divergence in conserved tissue-specific genes can be explained by corresponding interspecies methylation changes much more often. Current projects in the lab are using primate iPSCs to characterize interspecies differences in 3D chromatin structure and species-specific responses to environmental perturbations.
Experimental design and analysis of genome-wide studies
Genome-wide studies of gene regulation must account for many potential confounding sources of variation, which can be biological or technical. Comparative genomic studies in primates are particularly sensitive to confounding due to the many physical, morphological, and environmental differences between species, as well as to the small sample sizes necessitated by limited access to primate tissues. However, even human genomic studies can be confounded by differences in sample processing, unbalanced study designs, and the techniques used to obtain and analyze genomic data. As confounding sources of variation can have a profound impact on how the results of genomic studies are interpreted, experimental design has remained a foremost concern of our lab. In particular, we are interested in experimental designs and practices that either minimize the effects of confounding variables or allow us to estimate and account for their effects in our analyses.
Throughout the last 10 years, we have used a variety of technologies, tissues, and cell types in our research, and our experimental designs continue to evolve with each new application. Our past work examined the effect of RNA integrity score on RNA-sequencing data in frozen tissue and measured the effects of epigenetic memory and reprogramming on gene expression variation in iPSCs. More recently, we have focused on developing effective experimental designs for single-cell RNA-sequencing studies, and are now exploring ways to estimate continuous cell cycle phase from gene expression patterns in single cells. We have also shown that a balanced study design and effective use of sample metadata can be used to strengthen comparative genomic analyses of different tissues and species.