Incorporating Functional Genomic Information in Genetic Association Studies Using an Empirical Bayes Approach

Authors

  • Amy V. Spencer,

    1. Advanced Analytics Centre, Global Medicines Development, AstraZeneca, Macclesfield, United Kingdom
    2. School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom
  • Angela Cox,

    1. Department of Oncology, Sheffield Cancer Research Centre, University of Sheffield Medical School, Sheffield, United Kingdom
  • Wei-Yu Lin,

    1. Department of Oncology, Sheffield Cancer Research Centre, University of Sheffield Medical School, Sheffield, United Kingdom
    2. Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
  • Douglas F. Easton,

    1. Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
    2. Department of Oncology, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
  • Kyriaki Michailidou,

    1. Department of Public Health and Primary Care, Centre for Cancer Genetic Epidemiology, University of Cambridge, Cambridge, United Kingdom
  • Kevin Walters

    Corresponding author
    1. School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom
    • Correspondence to: Kevin Walters, School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom. E-mail: k.walters@sheffield.ac.uk


ABSTRACT

A large amount of functional genetic data is available and can be used to inform fine-mapping association studies in diseases with well-characterised disease pathways. Single nucleotide polymorphism (SNP) prioritization via Bayes factors is attractive because prior information can inform the effect size or the prior probability of causal association. This approach requires the specification of a prior distribution for the effect size. If the information needed to estimate a priori the probability density of the effect sizes of causal SNPs in a genomic region is inconsistent or unavailable, then specifying a prior variance for the effect sizes is challenging. We propose both an empirical method to estimate this prior variance and a coherent approach to using SNP-level functional data to inform the prior probability of causal association. Through simulation we show that when SNPs are ranked by our empirical Bayes factor in a fine-mapping study, the causal SNP is generally ranked as high as or higher than it is when Bayes factors with other plausible values of the prior variance are used. Importantly, we also show that assigning SNP-specific prior probabilities of association based on expert prior functional knowledge of the disease mechanism can improve causal SNP ranks compared with ranking using identical prior probabilities of association. We demonstrate our methods by applying them to the fine mapping of the CASP8 region of chromosome 2 using genotype data from the Collaborative Oncological Gene-Environment Study (COGS) Consortium. The data we analysed included approximately 46,000 breast cancer case samples and 43,000 healthy control samples.
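To make the roles of the prior variance and the SNP-specific prior probabilities concrete, the following minimal Python sketch ranks SNPs by posterior probability of causal association. It is an illustration under stated assumptions, not the authors' implementation: it assumes a normal approximation for each SNP's estimated log odds ratio, a zero-mean normal prior on the effect size (the resulting Bayes factor has the familiar approximate form often attributed to Wakefield), and exactly one causal SNP in the region. The function names, the effect estimates, and the value `prior_var = 0.04` are hypothetical placeholders; the abstract does not describe how the empirical prior variance is estimated, so it is simply supplied by hand here.

```python
import math


def log_bayes_factor(beta_hat, se, prior_var):
    """Log Bayes factor for association versus no association.

    Assumes beta_hat ~ N(beta, se^2) and a N(0, prior_var) prior on beta
    under the alternative (an assumption of this sketch).
    """
    v = se ** 2
    z2 = (beta_hat / se) ** 2
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(v / (v + prior_var)) + 0.5 * z2 * shrink


def posterior_probs(log_bfs, priors):
    """Posterior probability that each SNP is causal.

    Assumes exactly one causal SNP in the region, so the posterior is
    proportional to prior probability times Bayes factor.
    """
    weights = [math.log(p) + lbf for p, lbf in zip(priors, log_bfs)]
    top = max(weights)  # subtract the maximum for numerical stability
    unnorm = [math.exp(w - top) for w in weights]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical summary statistics for three SNPs (log odds ratios and
# standard errors); these numbers are invented for illustration only.
beta_hats = [0.10, 0.12, 0.02]
ses = [0.03, 0.03, 0.03]
prior_var = 0.04  # placeholder; the paper estimates this empirically

log_bfs = [log_bayes_factor(b, s, prior_var) for b, s in zip(beta_hats, ses)]

# Identical prior probabilities versus priors informed by (hypothetical)
# functional annotation that strongly favours the first SNP.
flat = posterior_probs(log_bfs, [1 / 3, 1 / 3, 1 / 3])
informed = posterior_probs(log_bfs, [0.90, 0.05, 0.05])

rank_flat = sorted(range(3), key=lambda i: -flat[i])
rank_informed = sorted(range(3), key=lambda i: -informed[i])
```

With identical priors the ranking is driven by the Bayes factors alone, whereas a sufficiently strong functional prior can promote a SNP with weaker statistical evidence. Changing `prior_var` alters every Bayes factor, which is why the choice of prior variance matters for ranking.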
