Narrative overview of my research.

I am a behavioral geneticist with formal training in statistics and psychology. My research over the last fifteen years has spanned multiple disciplines. My graduate school and postdoctoral work concerned the heterogeneity of depression and whether patterns of depressive symptoms depended on what precipitated the episode. My later postdoctoral and early faculty work focused on the development of methods in extended twin family designs and on quantifying biases in estimates of genetic and environmental influences from these designs. My research over the last ten years has used whole-genome data to better understand trait genetic architecture - the number, rarity, and mode of action of genes that underlie complex traits and psychiatric disorders. My current work uses structural equation modeling in whole-genome biobank datasets to differentiate the genetic and environmental causes of parent-offspring similarity. This work has brought me full circle, back to my early extended twin family modeling roots while drawing heavily on what I learned in analyzing whole-genome data.

Mentors and mentees. I have been incredibly fortunate to have had amazing mentors and mentees throughout my career. My scientific outlook and skills in behavioral genetics were shaped by working with many titans in the field, including Nick Martin, Mike Neale, Lindon Eaves, Ken Kendler, Hermine Maes, and John Hewitt. I owe an enormous debt to my mentor in graduate school, Randolph Nesse, the best writer I have ever worked with. I have learned a tremendous amount about quantitative genetics working with Peter Visscher and Naomi Wray in extended visits to their lab in Brisbane, Australia. People in their lab, particularly Jian Yang, Hong Lee, and Loic Yengo, were incredibly generous with their time, and my collaborations with them are some of my proudest accomplishments. Finally, I continue to be amazed by the brilliant and resourceful mentees who have been through my lab: postdocs Christine Garver-Apgar, Doug Bjelland, Rasool Tahmasbi, Luke Evans, Meng Huang, Subrata Paul, Emmanuel Sapin, Yongkang Kim, Katerina Zorina, and Kristen Kelly; and graduate students Laramie Duncan, Dan Howrigan, Matt Simonson, Teresa de Candia, Emma Johnson, Richard Border, Jared Balbona, Spencer Moore, and Pamela Romero Villela. Working with them has been the best part of my career. They have taught me as much as I have taught them, and they have pushed me in directions that I never envisioned moving into. Any successes I have had are due at least as much to their hard work and insights as to my own.

Specific Areas of Research. My long-term goal is to understand, at a broad level, the causes of human complex trait variation. My lab uses measured genetic data, family data, and simulations to accomplish this. Below, I describe my lab's major areas of focus in behavioral genetics research.

1. Development of extended pedigree models to uncover the genetic and environmental architecture of traits. A central issue that my lab has grappled with is how to interpret estimates from traditional behavioral genetics designs. Twin-only designs are imprecise at uncovering the genetic and environmental architecture of traits (KELLER & COVENTRY, 2005). We developed several extended twin family models that are much better at dissecting traits' genetic and environmental architectures (KELLER, MEDLAND, & DUNCAN, 2010; KELLER et al., 2009) and used these to estimate genetic effects, familial environmental effects, and the influence of assortative mating on the variation of, and covariation between, traits (KELLER et al., 2013). To explore the biases and accuracies of methods that use genome-wide data to estimate variance components, we developed a fast forward-time simulator of whole-genome sequence data (GeneEvolve) that can incorporate several of the complexities, such as assortative mating and vertical transmission, that occur in real data (TAHMASBI & KELLER, 2017). We recently used the full spectrum of relatedness across 86 billion pairs of individuals to estimate heritability for multiple complex traits and to demonstrate the influence that rare alleles and shared environments have on similarity (KEMPER et al., 2021).

2. Development of models that use whole-genome data to uncover the genetic architecture of traits. SNP-heritability measures the extent to which genetic similarity at measured single nucleotide polymorphisms (SNPs) is related to phenotypic similarity between all pairs of individuals in a sample. It provides a sense of how much trait variance can eventually be predicted from SNPs. We have estimated that roughly one-third to one-half of additive genetic variation is due to common SNPs for many traits, including schizophrenia (LEE et al., 2012), personality (VERWEIJ et al., 2012), and cardiovascular disease (SIMONSON et al., 2011). We were the first to use these approaches to estimate the genetic correlation of the same trait between ethnicities, finding that the genetic correlation of schizophrenia risk between individuals of African and European descent was high (r~.70) at common SNPs (DE CANDIA et al., 2013). Postdoctoral trainee and current IBG faculty member Luke Evans and I built on advances from YANG et al. (2015) to develop an approach for estimating the full allelic spectra of complex traits using SNPs stratified by their frequency and linkage disequilibrium (EVANS et al., 2018a) and used this to estimate the allelic spectra of multiple traits in the UK Biobank (EVANS et al., 2018a; EVANS et al., 2021). This approach was used recently by Peter Visscher's lab to show that the full trait heritability can be recovered in samples of unrelated individuals (WAINSCHTEIN et al., 2021).

3. The effects of distal inbreeding on complex traits. Although inbreeding has traditionally been studied using pedigrees, we demonstrated that very distal inbreeding (e.g., from common ancestors up to ~50 generations in the past) can be reliably measured in ostensibly outbred samples using modern, dense SNP arrays (KELLER et al., 2011). Based on this work, we genomically measured inbreeding in samples of up to 400,000 individuals to demonstrate inbreeding depression on cognitive abilities (HOWRIGAN et al., 2016), schizophrenia (KELLER et al., 2012), and other fitness-related traits (JOHNSON et al., 2018), although we failed to replicate the schizophrenia finding in a subsequent analysis (JOHNSON et al., 2016). Recently, we proposed a novel method for partitioning inbreeding depression according to different genomic annotations and showed that inbreeding effects are more pronounced in conserved regions and within regulatory elements (YENGO et al., 2021).

4. Gene-by-environment interactions (GxE). GxE research tests the hypothesis that the effect of some environmental variable on some outcome measure depends on a particular genetic polymorphism. Previous graduate student Laramie Duncan and I argued that the false positive rate of published findings in the field may be much higher than the nominal type I error rate of .05 because sample sizes have typically been small, multiple testing corrections have been insufficient, and the unpublished "file drawer" of negative findings may be large (DUNCAN & KELLER, 2011). I later described why the typical way that GxE research attempts to statistically control for covariates is incorrect and described a simple way to do so correctly (KELLER, 2013). Previous graduate student Richard Border and I criticized common methodological practices in GxE research (BORDER & KELLER, 2017), and we found no support for commonly studied GxE or main effect hypotheses of major depression in a large sample with ~100% power to detect previously reported effects (BORDER et al., 2019). As written up elsewhere, these findings suggest that thousands of previous papers on this topic were wrong, which in my opinion raises uncomfortable questions about current scientific practices, and not just regarding candidate gene studies. Finally, I was a member of a working group convened by NIDA to present recommendations for conducting GxE research (DICK et al., 2015).

5. Utilization of whole-genome data to estimate assortative mating and vertical transmission. My lab has recently been interested in using measured genetic data to estimate vertical transmission (the environmental influence of parents on offspring) and the degree of assortative mating (the tendency for mating partners to be similar). We directly measured gametic phase disequilibrium, a consequence of assortative mating on heritable traits, by correlating height and educational polygenic scores of odd and even chromosomes, demonstrating genetic signatures of assortment for the first time (YENGO et al., 2018). On the other hand, we showed that assortative mating only trivially increases the overall genetic relatedness between mates (YENGO et al., 2020). We recently showed that assortative mating biases estimates of SNP-heritability (BORDER et al., 2022). In two recent papers that were solicited for a special issue in Behavior Genetics, we introduced a structural equation modeling approach that uses polygenic scores built from transmitted and non-transmitted alleles to estimate traits' full variation due to vertical transmission as well as the additive genetic variation and the passive gene-environment covariation (BALBONA et al., 2021; KIM et al., 2021). We believe that this has the potential to be an important new approach in our field.
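
As a rough illustration of the odd/even chromosome idea in area 5, the Python sketch below correlates polygenic scores summed over odd-numbered autosomes with scores summed over even-numbered autosomes. Under random mating these two sums should be approximately uncorrelated, because different chromosomes segregate independently; a positive correlation is the kind of signature that gametic phase disequilibrium from assortative mating would produce. This is a toy sketch, not the pipeline used in YENGO et al. (2018): the function name, the (n_individuals, 22) input layout, and the simulated data are assumptions made for illustration, and the shared component added in the simulation is only a stand-in for the effect of assortment, not a model of mating.

```python
import numpy as np


def odd_even_pgs_correlation(per_chrom_scores):
    """Correlate polygenic scores summed over odd vs. even autosomes.

    per_chrom_scores: array of shape (n_individuals, 22), where column j
    holds each person's score contribution from chromosome j + 1.
    (This layout is a hypothetical convention chosen for the sketch.)
    """
    scores = np.asarray(per_chrom_scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != 22:
        raise ValueError("expected an (n_individuals, 22) array")
    odd = scores[:, 0::2].sum(axis=1)   # chromosomes 1, 3, ..., 21
    even = scores[:, 1::2].sum(axis=1)  # chromosomes 2, 4, ..., 22
    return float(np.corrcoef(odd, even)[0, 1])


# Toy data: independent per-chromosome scores mimic random mating.
rng = np.random.default_rng(0)
n_individuals = 5000
independent = rng.normal(size=(n_individuals, 22))

# Adding one shared per-person component to every chromosome induces a
# cross-chromosome correlation (an assumed stand-in for the effect of
# assortment, not a simulation of mating itself).
shared = rng.normal(size=(n_individuals, 1))
correlated = independent + 0.3 * shared

r_random_mating = odd_even_pgs_correlation(independent)
r_with_shared = odd_even_pgs_correlation(correlated)
print(f"odd/even correlation, independent chromosomes: {r_random_mating:.3f}")
print(f"odd/even correlation, shared component added:  {r_with_shared:.3f}")
```

A real analysis would also have to rule out other sources of cross-chromosome correlation, such as population stratification, before attributing it to assortative mating.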

Impact of Research. I have authored or co-authored 124 journal articles, reviews, and chapters. I am first or last (senior) author on 64 of these. I have published papers in many of the top journals in my field, including Science (impact factor, IF=47.8), Nature Genetics (IF=35.2), Nature Human Behaviour (IF=24.3), American Journal of Psychiatry (IF=14.7), Biological Psychiatry (IF=9.8), PLoS Genetics (IF=8.7), American Journal of Human Genetics (IF=11.2), and Behavioral and Brain Sciences (IF=18.6). Sixty-one of my articles have been cited 61 or more times (h-index=61) and altogether my articles have been cited over 31,000 times. Several papers from my lab have impacted how behavioral genetics is conducted. For example, at least two journals (Behavior Genetics and Journal of Abnormal Child Psychology) changed their formal editorial policies following our publication (DUNCAN & KELLER, 2011) that suggested the false-positive rate in candidate gene-by-environment research was very high. A more recent publication (BORDER et al., 2019) that demonstrated a high false positive rate in depression candidate gene and candidate gene-by-environment interaction findings received press from major news outlets, including Nature, The Atlantic, NPR, and CBC. I demonstrated that gene-by-environment interaction studies had almost universally failed to properly correct for potential confounders, and the solution I discuss has become widely adopted (KELLER, 2014). We were the first lab to estimate the overlap in genetic effects for a trait between individuals of different ancestries (DE CANDIA et al., 2013), and this has since become its own entire area of inquiry for several labs worldwide. A modification of earlier GREML approaches (EVANS et al., 2018a) has been adopted in the community to estimate the allelic spectra and ~full narrow-sense heritability of traits (including both common and rare variants) in samples of unrelated people. Finally, my mentees and I have received several awards for behavioral genetics research, including the Fulker Award for best paper published in Behavior Genetics in 2011 (KELLER, MEDLAND, & DUNCAN, 2010), the Fuller/Scott Early Career Award from the Behavior Genetics Association in 2012, the Faculty Research Award in the Psychology & Neuroscience department in 2019, the Dozier/Muenzinger Award (Richard Border) in 2020, the Thompson Award (Jared Balbona) in 2020, and the Fulker Award again in 2021 (BALBONA, KIM, & KELLER, 2021).

Extramural Funding Record. I have been sole PI on five awarded NIH submissions (one K01, three R01s, and one R25) and co-PI on another R01. My first grant was a NIMH K01 in 2010 (MH085812), written to help me gain training in the analysis of whole-genome data. I then received funding for a NIMH R01 in 2013 (MH100141) that was aimed at developing methods for understanding the genetic architecture of complex traits using whole-genome data. This R01 was renewed in 2018 with additional focus on estimating environmental influences. I was originally co-PI with Scott Vrieze on an R01 to explore ways to validate genetic associations in humans using molecular experiments (in collaboration with Jerry Stitzel) that was first submitted in 2017 and funded in 2019 (DA044283). I became PI on a subcontract of this grant once Dr. Vrieze moved to the University of Minnesota. More recently, I became PI on an R01 that uses structural equation modeling in datasets with parents and offspring and genome-wide data to estimate the genetic and environmental causes of parent-offspring similarity in psychiatric traits. I was also co-PI (with John Hewitt) on an R25 (MH019918) submission in 2019 to fund a workshop on statistical genetics. I have since taken over as sole PI from John Hewitt, and this grant was renewed in 2022. This workshop is one of the main gateways into human behavioral genetics for many people in our field, bringing together ~25 leaders in behavioral genetics from across the world to train ~110 students each year. (It is where I got my start in the field in 2004.) Finally, I am co-I on three R01s submitted by colleagues at IBG (Rhee, Friedman, and Hopfer). In total, I am funded by seven NIH grants and am PI on four of these. Collectively, these grants bring ~$16.6M (direct cost) to CU, or ~$3.3M/year.

Vision for the Future of Behavioral Genetics Research. This is an extraordinary time in the history of behavioral genetics. Researchers have shared access to increasingly large whole-genome datasets, ushering in new ways to answer questions about genetic architecture and the causes of familial similarity that are difficult or impossible to answer using traditional approaches. Genomic data is also being combined with "big data" from brain imaging, transcription, and other biomarkers, presenting enormous computational and analytic challenges but also presenting enormous opportunities for gaining insight into endophenotypic pathways between genes and traits. At the same time, genome-wide association studies have continued apace, pointing to regions in the genome that are definitively associated with trait variation. However, pinpointing the specific causal genetic variants, and elucidating how these causal variants influence traits, will require translational collaborations between human geneticists and researchers using model organisms. IBG has expertise in human quantitative genetics and model organism research, and so is uniquely positioned to be at the forefront of a genetic revolution that will translate statistical associations into mechanistic insight. I am excited about the future of behavioral genetics and am eager to help shape how the field uses new data sources and novel approaches to unlock the mysteries of the human condition.
