There is a nice vignette in snpStats concerning linkage disequilibrium (LD) analysis as supported by software in that package. The purpose of this package is to simplify handling of existing population-level data on LD for the purpose of flexibly defining LD blocks.
The hmld
function imports gzipped tabular data from
hapmap’s repository .
suppressPackageStartupMessages({
library(ldblock)
library(GenomeInfoDb)
})
path = dir(system.file("hapmap", package="ldblock"), full=TRUE)
ceu17 = hmld(path, poptag="CEU", chrom="chr17")
## importing /tmp/RtmpusqS1z/Rinst185039fe2a72/ldblock/hapmap/ld_chr17_CEU.txt.gz
## done.
## ldstruct for population CEU, chrom chr17
## dimensions: 36621 x 36621 ; statistic is Dprime
## object structure:
## ldmat chrom genome allpos poptag statInUse
## "dsCMatrix" "character" "character" "numeric" "character" "character"
## allrs
## "character"
For some reason knitr/render will not display this image nicely.
##
## Attaching package: 'Matrix'
## The following object is masked from 'package:S4Vectors':
##
## expand
This ignores physical distance and MAF. The bright stripes are probably due to SNP with low MAF.
We’ll use ceu17
and the gwascat
package to
enumerate SNP that are in LD with GWAS hits.
## gwascat loaded. Use makeCurrentGwascat() to extract current image.
## from EBI. The data folder of this package has some legacy extracts.
load(system.file("legacy/ebicat37.rda", package="gwascat"))
#seqlevelsStyle(ebicat37) = "NCBI" # noop?
seqlevels(ebicat37) = gsub("chr", "", seqlevels(ebicat37))
e17 = ebicat37[ which(as.character(seqnames(ebicat37)) == "17") ]
Some dbSNP names for GWAS hits on chr17 are
## [1] "rs11078895" "rs11891" "rs7501939" "rs9905704" "rs4796793"
## [6] "rs78378222"
We will use expandSnpSet
to obtain names for SNP that
were found in HapMap CEU to have which D′ > .9 with any of these hits.
These names are added to the input set.
## [1] 530
## Warning in ldblock::expandSnpSet(rsh17, ldstruct = ceu17, lb = 0.9): dropping
## 149 rsn not matched in ld matrix
## [1] 13209
## [1] TRUE
Not all GWAS SNP are in the HapMap LD resource. You can use your own LD data as long as the format agrees with that of the HapMap distribution.