Title: | Summary Statistics-Based Multivariate Meta-Analysis of Genome-Wide Association Studies Using Canonical Correlation Analysis |
---|---|
Description: | metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. |
Authors: | Anna Cichonska <[email protected]> |
Maintainer: | Anna Cichonska <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.35.0 |
Built: | 2024-11-29 06:39:12 UTC |
Source: | https://github.com/bioc/metaCCA |
This function computes the phenotypic correlation matrix S_YY based on univariate summary statistics S_XY.
estimateSyy( S_XY )
S_XY |
Univariate summary statistics. Data frame with row names corresponding to SNP IDs (e.g., position or rs_id) and the following columns: - - - then, two columns for each trait (phenotypic variable) to be included in the analysis; in turn: 1) 2) ("traitID" in the column name must be an ID of a trait specified by a user; do not use underscores "_" in trait IDs outside "_b"/"_se" in order for the IDs to be processed correctly). |
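The naming rule above (trait IDs followed by "_b" or "_se", with no other underscores inside the ID) can be checked before calling the package. The following is a minimal sketch, not part of metaCCA: the helper name `check_trait_columns` is hypothetical, and it only assumes that trait columns end in "_b" or "_se" as stated above.

```r
# Hypothetical helper (not part of metaCCA): validates trait column names.
check_trait_columns <- function(S_XY) {
  cols <- colnames(S_XY)
  # Columns that look like trait columns: they end in "_b" or "_se".
  trait_cols <- grep("_(b|se)$", cols, value = TRUE)
  # Strip the suffix to recover the trait IDs.
  ids <- unique(sub("_(b|se)$", "", trait_cols))
  # Trait IDs themselves must not contain underscores.
  bad <- ids[grepl("_", ids, fixed = TRUE)]
  if (length(bad) > 0) {
    stop("Trait IDs must not contain underscores: ",
         paste(bad, collapse = ", "))
  }
  # Every trait needs both a "_b" and a "_se" column.
  complete <- paste0(ids, "_b") %in% cols & paste0(ids, "_se") %in% cols
  if (!all(complete)) {
    stop("Missing '_b' or '_se' column for: ",
         paste(ids[!complete], collapse = ", "))
  }
  ids
}

# Toy input with two traits (values are made up).
toy <- data.frame( height_b = 0.1, height_se = 0.02,
                   bmi_b = -0.3, bmi_se = 0.05,
                   row.names = "rs1" )
print( check_trait_columns( toy ) )
```

A trait column such as "my_trait_b" would be rejected by this check, because the recovered ID "my_trait" contains an underscore.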
S_YY |
Matrix containing correlations between traits given as input. Row and column names correspond to trait IDs. |
In practice, summary statistics of at least one chromosome should be used in order to ensure good quality of the estimate of phenotypic correlation structure.
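To illustrate why many SNPs are needed, here is a rough sketch of one way such an estimate can be formed. It assumes the estimate is the Pearson correlation between the traits' regression-coefficient vectors across SNPs; the function name `sketch_syy` and the toy data are hypothetical, and the package's own estimateSyy may do additional processing.

```r
# Toy summary statistics (made-up values): 500 SNPs, 3 traits.
set.seed(1)
n_snps <- 500
shared <- rnorm(n_snps)   # effect component shared by trait1 and trait2
toy_S_XY <- data.frame(
  trait1_b  = shared + rnorm(n_snps, sd = 0.5),
  trait1_se = 0.1,
  trait2_b  = shared + rnorm(n_snps, sd = 0.5),
  trait2_se = 0.1,
  trait3_b  = rnorm(n_snps),
  trait3_se = 0.1,
  row.names = paste0("rs", seq_len(n_snps))
)

# Hypothetical sketch, NOT the package implementation:
# correlate the "_b" columns across SNPs.
sketch_syy <- function(S_XY) {
  b_cols <- grep("_b$", colnames(S_XY), value = TRUE)
  betas <- as.matrix(S_XY[, b_cols, drop = FALSE])
  S_YY <- cor(betas)
  trait_ids <- sub("_b$", "", b_cols)
  dimnames(S_YY) <- list(trait_ids, trait_ids)
  S_YY
}

toy_S_YY <- sketch_syy( toy_S_XY )
print( round( toy_S_YY, 3 ) )
```

Because each entry is a sample correlation computed across SNPs, its precision depends on how many SNPs enter the calculation, which is consistent with the recommendation above.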
Anna Cichonska
Cichonska et al. (2016) metaCCA: Summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics, 32(13):1981-1989.
# Estimating correlations between 10 traits given their
# univariate summary statistics across 1000 SNPs
S_YY = estimateSyy( S_XY = S_XY_full_study1 )

# Viewing the resulting phenotypic correlation matrix
print( S_YY, digits = 3 )
This function performs genotype-phenotype association analysis according to the metaCCA algorithm: univariate summary statistics-based analysis of a single or multiple genome-wide association studies (GWAS) that allows multivariate representation of both genotype and phenotype.
The function accepts a varying number of arguments, depending on the type of the analysis. By default, single-SNP–multi-trait association analysis is performed, where each given SNP is tested against all given phenotypic variables. Other options are to perform single-SNP–multi-trait analysis of one selected SNP, as well as multi-SNP–multi-trait analysis.
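For orientation, the core quantity in any canonical correlation analysis can be computed from covariance blocks alone, which is what makes a summary statistics-based variant possible. The sketch below is textbook CCA, not the package's code: the function name `leading_canonical_correlation` and the toy matrices are hypothetical, and it omits the shrinkage and significance testing that metaCCA adds.

```r
# Textbook CCA from covariance blocks (hypothetical helper, not metaCCA code).
# The squared canonical correlations are the eigenvalues of
#   S_XX^{-1} S_XY S_YY^{-1} S_YX.
leading_canonical_correlation <- function(S_XX, S_XY, S_YY) {
  M <- solve(S_XX, S_XY) %*% solve(S_YY, t(S_XY))
  # M is not symmetric, so eigen() may return complex values
  # with zero imaginary part; keep the real part.
  ev <- Re(eigen(M, only.values = TRUE)$values)
  sqrt(max(ev))
}

# Toy blocks (made-up values): 2 SNPs and 3 traits.
S_XX <- matrix(c(1.0, 0.3,
                 0.3, 1.0), 2, 2)
S_YY <- matrix(c(1.0, 0.5, 0.2,
                 0.5, 1.0, 0.4,
                 0.2, 0.4, 1.0), 3, 3)
S_XY <- matrix(c(0.10, 0.05, 0.02,
                 0.03, 0.08, 0.04), 2, 3, byrow = TRUE)

r1 <- leading_canonical_correlation( S_XX, S_XY, S_YY )
print( r1 )
```

With a single SNP, S_XX is 1 x 1 and the same formula covers the single-SNP–multi-trait case; with several SNPs, S_XX carries the correlations between them, which is why it is only needed for multi-SNP–multi-trait analysis.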
metaCcaGp( nr_studies, S_XY, std_info, S_YY, N, analysis_type, SNP_id, S_XX )
nr_studies |
Number of studies to be analysed. |
S_XY |
Univariate summary statistics of the variables to be analysed. A list of data frames (one for each study) with row names corresponding to SNP IDs (e.g., position or rs_id) and the following columns: - - - then, two columns for each trait (phenotypic variable) to be included in the analysis; in turn: 1) 2) ("traitID" in the column name must be an ID of a trait specified by a user; do not use underscores "_" in trait IDs outside "_b"/"_se" in order for the IDs to be processed correctly). |
std_info |
A vector with numerical values (most likely the data were not standardised - the genotypes were not standardised before univariate regression coefficients and standard errors were computed - option |
S_YY |
A list of phenotypic correlation matrices (one for each study) estimated using |
N |
A vector with numbers of individuals in each study. |
Arguments below are OPTIONAL and depend on the type of the analysis.
analysis_type |
Indicator of the analysis type. 1) Single-SNP–multi-trait analysis of one selected SNP: 2) Multi-SNP–multi-trait analysis: |
SNP_id |
1) Single-SNP–multi-trait analysis of one selected SNP: An ID of the SNP of interest. 2) Multi-SNP–multi-trait analysis: A vector with IDs of SNPs to be analysed jointly. |
S_XX |
A list of data frames (one for each study) containing correlations between SNPs. Row names (and, optionally, column names) must correspond to SNP IDs. This argument needs to be given only in case of multi-SNP–multi-trait analysis. |
result |
Data frame with row names corresponding to SNP IDs. Columns contain: 1) 2) 3) 4) |
Anna Cichonska
Cichonska et al. (2016) metaCCA: Summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics, 32(13):1981-1989.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Analysis of one study according to metaCCA algorithm.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

# Default single-SNP--multi-trait analysis.
# Here, we will test each of 10 SNPs for an association with a set of 10 traits.
result1 = metaCcaGp( nr_studies = 1,
                     S_XY = list( S_XY_study1 ),
                     std_info = 0,
                     S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                     N = N1 )
# Viewing association results
print( result1, digits = 3 )

# Single-SNP--multi-trait analysis of one selected SNP.
# Here, we will test one of 10 SNPs for an association with a set of 10 traits.
result2 = metaCcaGp( nr_studies = 1,
                     S_XY = list( S_XY_study1 ),
                     std_info = 0,
                     S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                     N = N1,
                     analysis_type = 1,
                     SNP_id = 'rs80' )
# Viewing association results
print( result2, digits = 3 )

# Multi-SNP--multi-trait analysis.
# Here, we will test a set of 5 SNPs for an association with a set of 10 traits.
result3 = metaCcaGp( nr_studies = 1,
                     S_XY = list( S_XY_study1 ),
                     std_info = 0,
                     S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                     N = N1,
                     analysis_type = 2,
                     SNP_id = c( 'rs10', 'rs80', 'rs140', 'rs170', 'rs172' ),
                     S_XX = list( S_XX_study1 ) )
# Viewing association results
print( result3, digits = 3 )

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Meta-analysis of two studies according to metaCCA algorithm.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

# Default single-SNP--multi-trait analysis.
# Here, we will test each of 10 SNPs for an association with a set of 10 traits.
meta_result1 = metaCcaGp( nr_studies = 2,
                          S_XY = list( S_XY_study1, S_XY_study2 ),
                          std_info = c( 0, 0 ),
                          S_YY = list( estimateSyy( S_XY_full_study1 ),
                                       estimateSyy( S_XY_full_study2 ) ),
                          N = c( N1, N2 ) )
# Viewing association results
print( meta_result1, digits = 3 )

# Single-SNP--multi-trait analysis of one selected SNP.
# Here, we will test one of 10 SNPs for an association with a set of 10 traits.
meta_result2 = metaCcaGp( nr_studies = 2,
                          S_XY = list( S_XY_study1, S_XY_study2 ),
                          std_info = c( 0, 0 ),
                          S_YY = list( estimateSyy( S_XY_full_study1 ),
                                       estimateSyy( S_XY_full_study2 ) ),
                          N = c( N1, N2 ),
                          analysis_type = 1,
                          SNP_id = 'rs80' )
# Viewing association results
print( meta_result2, digits = 3 )

# Multi-SNP--multi-trait analysis.
# Here, we will test a set of 5 SNPs for an association with a set of 10 traits.
meta_result3 = metaCcaGp( nr_studies = 2,
                          S_XY = list( S_XY_study1, S_XY_study2 ),
                          std_info = c( 0, 0 ),
                          S_YY = list( estimateSyy( S_XY_full_study1 ),
                                       estimateSyy( S_XY_full_study2 ) ),
                          N = c( N1, N2 ),
                          analysis_type = 2,
                          SNP_id = c( 'rs10', 'rs80', 'rs140', 'rs170', 'rs172' ),
                          S_XX = list( S_XX_study1, S_XX_study2 ) )
# Viewing association results
print( meta_result3, digits = 3 )
This function performs genotype-phenotype association analysis according to the metaCCA+ algorithm (a variant of metaCCA in which the full covariance matrix is shrunk beyond the level guaranteeing its positive semidefinite property).
metaCcaPlusGp requires exactly the same inputs as the metaCcaGp function, and it has the same output format.
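The idea of shrinking a covariance matrix until it is positive semidefinite, and then further, can be sketched as follows. This is a generic linear shrinkage of off-diagonal entries towards zero, used here purely as an illustration: the function name `shrink_to_psd`, the step size, and the `extra_steps` parameter are all assumptions, and the shrinkage scheme actually used by metaCCA and metaCCA+ may differ.

```r
# Generic illustration (NOT the package's algorithm): shrink off-diagonal
# entries towards zero in small steps until the matrix is positive
# semidefinite, then optionally keep shrinking for `extra_steps` more steps.
shrink_to_psd <- function(Sigma, step = 0.001, extra_steps = 0) {
  target <- diag(diag(Sigma))   # same variances, zero covariances
  min_eig <- function(M) {
    min(eigen(M, symmetric = TRUE, only.values = TRUE)$values)
  }
  lambda <- 0
  shrunk <- Sigma
  while (min_eig(shrunk) < 0 && lambda < 1) {
    lambda <- min(1, lambda + step)
    shrunk <- (1 - lambda) * Sigma + lambda * target
  }
  # Shrinkage beyond the level that first guarantees PSD.
  lambda <- min(1, lambda + extra_steps * step)
  shrunk <- (1 - lambda) * Sigma + lambda * target
  list(Sigma = shrunk, lambda = lambda, min_eigenvalue = min_eig(shrunk))
}

# Toy matrix (made-up values) that is symmetric but not PSD:
# its smallest eigenvalue is 1 - 2 * 0.9 = -0.8.
Sigma <- matrix(c( 1.0, 0.9, -0.9,
                   0.9, 1.0,  0.9,
                  -0.9, 0.9,  1.0), 3, 3)

just_psd  <- shrink_to_psd( Sigma )                     # stop once PSD
beyond    <- shrink_to_psd( Sigma, extra_steps = 100 )  # shrink further
print( c( just_psd$lambda, beyond$lambda ) )
```

In this sketch the first call corresponds loosely to stopping at the PSD boundary, and the second to continuing past it; the amount of extra shrinkage is an arbitrary choice made for the example.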
metaCcaPlusGp( nr_studies, S_XY, std_info, S_YY, N, analysis_type, SNP_id, S_XX )
nr_studies |
Number of studies to be analysed. |
S_XY |
Univariate summary statistics of the variables to be analysed. A list of data frames (one for each study) with row names corresponding to SNP IDs (e.g., position or rs_id) and the following columns: - - - then, two columns for each trait (phenotypic variable) to be included in the analysis; in turn: 1) 2) ("traitID" in the column name must be an ID of a trait specified by a user; do not use underscores "_" in trait IDs outside "_b"/"_se" in order for the IDs to be processed correctly). |
std_info |
A vector with numerical values (most likely the data were not standardised - the genotypes were not standardised before univariate regression coefficients and standard errors were computed - option |
S_YY |
A list of phenotypic correlation matrices (one for each study) estimated using |
N |
A vector with numbers of individuals in each study. |
Arguments below are OPTIONAL and depend on the type of the analysis.
analysis_type |
Indicator of the analysis type. 1) Single-SNP–multi-trait analysis of one selected SNP: 2) Multi-SNP–multi-trait analysis: |
SNP_id |
1) Single-SNP–multi-trait analysis of one selected SNP: An ID of the SNP of interest. 2) Multi-SNP–multi-trait analysis: A vector with IDs of SNPs to be analysed jointly. |
S_XX |
A list of data frames (one for each study) containing correlations between SNPs. Row names (and, optionally, column names) must correspond to SNP IDs. This argument needs to be given only in case of multi-SNP–multi-trait analysis. |
result |
Data frame with row names corresponding to SNP IDs. Columns contain: 1) 2) 3) 4) |
Anna Cichonska
Cichonska et al. (2016) metaCCA: Summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics, 32(13):1981-1989.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Analysis of one study according to metaCCA+ algorithm.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

# Default single-SNP--multi-trait analysis.
# Here, we will test each of 10 SNPs for an association with a set of 10 traits.
result1 = metaCcaPlusGp( nr_studies = 1,
                         S_XY = list( S_XY_study1 ),
                         std_info = 0,
                         S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                         N = N1 )
# Viewing association results
print( result1, digits = 3 )

# Single-SNP--multi-trait analysis of one selected SNP.
# Here, we will test one of 10 SNPs for an association with a set of 10 traits.
result2 = metaCcaPlusGp( nr_studies = 1,
                         S_XY = list( S_XY_study1 ),
                         std_info = 0,
                         S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                         N = N1,
                         analysis_type = 1,
                         SNP_id = 'rs80' )
# Viewing association results
print( result2, digits = 3 )

# Multi-SNP--multi-trait analysis.
# Here, we will test a set of 5 SNPs for an association with a set of 10 traits.
result3 = metaCcaPlusGp( nr_studies = 1,
                         S_XY = list( S_XY_study1 ),
                         std_info = 0,
                         S_YY = list( estimateSyy( S_XY_full_study1 ) ),
                         N = N1,
                         analysis_type = 2,
                         SNP_id = c( 'rs10', 'rs80', 'rs140', 'rs170', 'rs172' ),
                         S_XX = list( S_XX_study1 ) )
# Viewing association results
print( result3, digits = 3 )

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Meta-analysis of two studies according to metaCCA+ algorithm.
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

# Default single-SNP--multi-trait analysis.
# Here, we will test each of 10 SNPs for an association with a set of 10 traits.
meta_result1 = metaCcaPlusGp( nr_studies = 2,
                              S_XY = list( S_XY_study1, S_XY_study2 ),
                              std_info = c( 0, 0 ),
                              S_YY = list( estimateSyy( S_XY_full_study1 ),
                                           estimateSyy( S_XY_full_study2 ) ),
                              N = c( N1, N2 ) )
# Viewing association results
print( meta_result1, digits = 3 )

# Single-SNP--multi-trait analysis of one selected SNP.
# Here, we will test one of 10 SNPs for an association with a set of 10 traits.
meta_result2 = metaCcaPlusGp( nr_studies = 2,
                              S_XY = list( S_XY_study1, S_XY_study2 ),
                              std_info = c( 0, 0 ),
                              S_YY = list( estimateSyy( S_XY_full_study1 ),
                                           estimateSyy( S_XY_full_study2 ) ),
                              N = c( N1, N2 ),
                              analysis_type = 1,
                              SNP_id = 'rs80' )
# Viewing association results
print( meta_result2, digits = 3 )

# Multi-SNP--multi-trait analysis.
# Here, we will test a set of 5 SNPs for an association with a set of 10 traits.
meta_result3 = metaCcaPlusGp( nr_studies = 2,
                              S_XY = list( S_XY_study1, S_XY_study2 ),
                              std_info = c( 0, 0 ),
                              S_YY = list( estimateSyy( S_XY_full_study1 ),
                                           estimateSyy( S_XY_full_study2 ) ),
                              N = c( N1, N2 ),
                              analysis_type = 2,
                              SNP_id = c( 'rs10', 'rs80', 'rs140', 'rs170', 'rs172' ),
                              S_XX = list( S_XX_study1, S_XX_study2 ) )
# Viewing association results
print( meta_result3, digits = 3 )
Number of individuals in study 1.
Numeric value
Test data
Part of the simulated toy data set.
Number of individuals in study 2.
Numeric value
Test data
Part of the simulated toy data set.
Data frame containing correlations between SNPs estimated from a reference database matching the study 1 population, e.g., the 1000Genomes. Here, [10 SNPs x 10 SNPs].
Data frame
Test data
Part of the simulated toy data set.
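As a rough illustration of how a SNP correlation data frame of this shape could be obtained, the sketch below correlates the columns of a reference genotype matrix. Everything in it is assumed for the example: the simulated genotypes coded 0/1/2, the SNP IDs, and the object names; preparing real reference data (e.g., from the 1000Genomes) involves steps not shown here.

```r
# Hypothetical reference panel (simulated): rows are individuals,
# columns are SNPs coded as 0/1/2 copies of an allele.
set.seed(3)
n_ref <- 400
snp_a <- rbinom(n_ref, 2, 0.3)
# snp_b copies snp_a for about 80% of individuals, so the two are correlated.
snp_b <- ifelse(runif(n_ref) < 0.8, snp_a, rbinom(n_ref, 2, 0.3))
snp_c <- rbinom(n_ref, 2, 0.5)   # independent of the other two
ref_genotypes <- cbind(rs1 = snp_a, rs2 = snp_b, rs3 = snp_c)

# Pairwise correlations between SNPs; row and column names are the SNP IDs.
toy_S_XX <- as.data.frame( cor( ref_genotypes ) )
print( round( toy_S_XX, 3 ) )
```

The resulting data frame has SNP IDs as row names, matching the requirement stated for the S_XX argument.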
Data frame containing correlations between SNPs estimated from a reference database matching the study 2 population, e.g., the 1000Genomes. Here, [10 SNPs x 10 SNPs].
Data frame
Test data
Part of the simulated toy data set.
Data frame containing univariate summary statistics (regression coefficients and standard errors) of study 1 for 1000 SNPs and 10 traits. It will be used for estimating phenotypic correlation structure S_YY of study 1.
Data frame
Test data
Part of the simulated toy data set.
Data frame containing univariate summary statistics (regression coefficients and standard errors) of study 2 for 1000 SNPs and 10 traits. It will be used for estimating phenotypic correlation structure S_YY of study 2.
Data frame
Test data
Part of the simulated toy data set.
Data frame containing univariate summary statistics (regression coefficients and standard errors) of study 1 corresponding to the variables to be included in the association analysis: 10 SNPs and 10 traits.
Data frame
Test data
Part of the simulated toy data set.
Data frame containing univariate summary statistics (regression coefficients and standard errors) of study 2 corresponding to the variables to be included in the association analysis: 10 SNPs and 10 traits.
Data frame
Test data
Part of the simulated toy data set.