In this vignette, we demonstrate a procedure that helps SuSiE get out of local optimum.
We simulate phenotype using UK Biobank genotypes from 50,000 individuals. There are 1001 SNPs. It is simulated to have exactly 2 non-zero effects at 234, 287.
library(susieR)
library(curl)
# Using libcurl 8.7.1 with LibreSSL/3.3.6
data_file <- tempfile(fileext = ".RData")
data_url <- paste0("https://raw.githubusercontent.com/stephenslab/susieR/",
                   "master/inst/datafiles/FinemappingConvergence1k.RData")
curl_download(data_url,data_file)
load(data_file)
b <- FinemappingConvergence$true_coef
susie_plot(FinemappingConvergence$z, y = "z", b=b)
The strongest marginal association is a non-effect SNP.
Since the sample size is large, we use sufficient statistics (\(X^\intercal X, X^\intercal y, y^\intercal
y\) and sample size \(n\)) to
fit susie model. It identifies 2 Credible Sets, one of them is false
positive. This is because susieR get stuck around a local
minimum.
fitted <- with(FinemappingConvergence,
               susie_suff_stat(XtX = XtX, Xty = Xty, yty = yty, n = n))
susie_plot(fitted, y="PIP", b=b, main=paste0("ELBO = ", round(susie_get_objective(fitted),2)))
Our refine procedure to get out of local optimum is
fit a susie model, \(s\) (suppose it has \(K\) CSs).
for CS in \(s\), set SNPs in CS to have prior weight 0, fit susie model –> we have K susie models: \(t_1, \cdots, t_K\).
for each \(k = 1, \cdots, K\), fit susie with initialization at \(t_k\) (\(\alpha, \mu, \mu^2\)) –> \(s_k\)
if \(\max_k \text{elbo}(s_k) > \text{elbo}(s)\), set \(s = s_{kmax}\) where \(kmax = \arg_k \max \text{elbo}(s_k)\) and go to step 2; if no, break.
We fit susie model with above procedure by setting
refine = TRUE.
fitted_refine <- with(FinemappingConvergence,
                      susie_suff_stat(XtX = XtX, Xty = Xty, yty = yty,
                                      n = n, refine=TRUE))
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
# WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
susie_plot(fitted_refine, y="PIP", b=b, main=paste0("ELBO = ", round(susie_get_objective(fitted_refine),2)))
With the refine procedure, it identifies 2 CSs with the true signals, and the achieved evidence lower bound (ELBO) is higher.
Here are some details about the computing environment, including the versions of R, and the R packages, used to generate these results.
sessionInfo()
# R version 4.3.3 (2024-02-29)
# Platform: aarch64-apple-darwin20 (64-bit)
# Running under: macOS 15.4.1
# 
# Matrix products: default
# BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
# LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# time zone: America/Chicago
# tzcode source: internal
# 
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
# [1] curl_5.2.1    Matrix_1.6-5  susieR_0.14.2
# 
# loaded via a namespace (and not attached):
#  [1] gtable_0.3.4        jsonlite_1.8.8      highr_0.10         
#  [4] dplyr_1.1.4         compiler_4.3.3      crayon_1.5.2       
#  [7] tidyselect_1.2.1    Rcpp_1.0.12         parallel_4.3.3     
# [10] jquerylib_0.1.4     scales_1.3.0        yaml_2.3.8         
# [13] fastmap_1.1.1       lattice_0.22-5      ggplot2_3.5.0      
# [16] R6_2.5.1            plyr_1.8.9          generics_0.1.3     
# [19] mixsqp_0.3-54       knitr_1.45          tibble_3.2.1       
# [22] RcppZiggurat_0.1.6  munsell_0.5.0       bslib_0.6.1        
# [25] pillar_1.9.0        rlang_1.1.5         utf8_1.2.4         
# [28] cachem_1.0.8        reshape_0.8.9       xfun_0.42          
# [31] sass_0.4.9          RcppParallel_5.1.10 cli_3.6.4          
# [34] magrittr_2.0.3      digest_0.6.34       grid_4.3.3         
# [37] irlba_2.3.5.1       lifecycle_1.0.4     vctrs_0.6.5        
# [40] Rfast_2.1.0         evaluate_1.0.3      glue_1.8.0         
# [43] fansi_1.0.6         colorspace_2.1-0    rmarkdown_2.26     
# [46] matrixStats_1.2.0   tools_4.3.3         pkgconfig_2.0.3    
# [49] htmltools_0.5.8.1