Detailed GWAS using NARAC data to investigate genetic risk factors for RA. Key steps include genetic data cleaning, PCA, PRS and sex-specific GWAS.

N3ha-Rao/Genetic-Analysis-NARAC

Genetic Analysis of Rheumatoid Arthritis (RA) Using North American Rheumatoid Arthritis Consortium (NARAC) Data

This study was carried out as a final project for the course SPH BS 859.

Dataset

The NARAC dataset comprises a case-control study of Rheumatoid Arthritis with 868 cases and 1,194 controls. Cases were recruited across the United States, predominantly of Northern European descent, and met the American College of Rheumatology criteria for RA. Controls were sourced from the New York Cancer Project, with a slight enrichment for individuals of Southern European or Ashkenazi Jewish ancestry. All subjects are considered unrelated.

Acknowledgement: Genetic Analysis Workshop 16

Methods

  1. Genetic Data Cleaning:

    • SNP Filtering: Retained SNPs with minor allele frequency (MAF) > 0.01, genotype missingness < 5%, and a Hardy-Weinberg equilibrium test p-value > 1e-6.
    • Individual Filtering: Removed individuals with high missing genotype rates (>5%).
  2. Principal Component Analysis (PCA):

    • Performed PCA to identify and adjust for population stratification, using pruned SNPs to reduce linkage disequilibrium.
  3. Genome-Wide Association Studies (GWAS):

    • Conducted sex-specific GWAS to identify genetic associations with RA in males and females separately.
    • Combined results through meta-analysis to detect overall associations.
  4. Polygenic Risk Scores (PRS):

    • Calculated PRS using summary statistics from European and Asian populations to predict RA risk in the NARAC dataset.
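
As an illustration of the data cleaning thresholds in step 1, the following is a minimal Python sketch, not the code used in this project. It assumes genotypes are coded as allele counts (0, 1, 2) with `None` for missing calls, and it uses a 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium as a simplification (tools such as PLINK typically use an exact test). The function names `hwe_chi2_pvalue`, `snp_passes_qc` and `individual_passes_qc` are hypothetical.

```python
import math


def hwe_chi2_pvalue(called):
    """Chi-square (1 df) p-value for Hardy-Weinberg equilibrium.

    called: non-missing genotypes coded as allele counts 0, 1 or 2.
    Simplification: a goodness-of-fit test, not an exact test.
    """
    n = len(called)
    observed = [called.count(k) for k in (0, 1, 2)]
    p = (observed[1] + 2 * observed[2]) / (2 * n)  # frequency of the counted allele
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: nothing to test
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))


def snp_passes_qc(genotypes, maf_min=0.01, max_missing=0.05, hwe_p_min=1e-6):
    """Apply the three SNP-level filters to one SNP across all individuals."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(genotypes)
    if missing_rate >= max_missing:  # keep missingness < 5%
        return False
    p = sum(called) / (2 * len(called))
    if min(p, 1 - p) <= maf_min:  # keep MAF > 0.01
        return False
    return hwe_chi2_pvalue(called) > hwe_p_min  # keep HWE p > 1e-6


def individual_passes_qc(genotypes, max_missing=0.05):
    """Keep an individual unless more than 5% of their genotypes are missing."""
    missing_rate = sum(g is None for g in genotypes) / len(genotypes)
    return missing_rate <= max_missing
```

For example, a SNP where every individual is heterozygous fails the Hardy-Weinberg filter, while a SNP with genotype counts in 1:2:1 proportion passes it.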

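The meta-analysis in step 3 combines the male-only and female-only results for each SNP. The README does not state which meta-analysis model was used, so the sketch below assumes a fixed-effect, inverse-variance weighted combination of per-sex effect estimates (for example, log odds ratios) and their standard errors. The function name `inverse_variance_meta` and the input values in the usage lines are hypothetical, not results from this study.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance meta-analysis for a single SNP.

    estimates: list of (beta, se) tuples, one per stratum (e.g. males, females).
    Returns the pooled beta, its standard error, the z statistic and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return beta, se, z, p_value


# Hypothetical per-sex estimates for one SNP: (log odds ratio, standard error)
male = (0.2, 0.1)
female = (0.4, 0.1)
pooled_beta, pooled_se, z, p_value = inverse_variance_meta([male, female])
```

With equal standard errors the pooled estimate is the simple average of the two per-sex estimates; a stratum with a smaller standard error would receive proportionally more weight.
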
Citations

  • Purcell S, Neale B, Todd-Brown K, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559-575. doi:10.1086/519795
  • Price AL, Patterson NJ, Plenge RM, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904-909. doi:10.1038/ng1847
  • Anderson CA, Pettersson FH, Clarke GM, et al. Data quality control in genetic case-control association studies. Nat Protoc. 2010;5(9):1564-1573. doi:10.1038/nprot.2010.116
  • Wray NR, Goddard ME, Visscher PM. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 2007;17(10):1520-1528. doi:10.1101/gr.6665407
