topdownr is free and open-source software. If you use it, please support the project by citing it in publications:
P.V. Shliaha, S. Gibb, V. Gorshkov, M.S. Jespersen, G.R. Andersen, D. Bailey, J. Schwartz, S. Eliuk, V. Schwämmle, and O.N. Jensen. 2018. Maximizing Sequence Coverage in Top-Down Proteomics By Automated Multi-modal Gas-phase Protein Fragmentation. Analytical Chemistry. DOI: 10.1021/acs.analchem.8b02344
For bugs, typos, suggestions or other questions, please file an issue in our tracking system (https://github.com/sgibb/topdownr/issues), providing as much information as possible, a reproducible example and the output of sessionInfo().
If you don’t have a GitHub account or wish to reach a broader audience for general questions about proteomics analysis using R, you may want to use the Bioconductor support site: https://support.bioconductor.org/.
topdownr Data Generation Workflow
To create methods, the user first has to install and modify the Orbitrap Fusion LUMOS workstation software:
1. Obtain TribridSeriesWorkstationSetup-v3.2.exe from Thermo Scientific.
2. Install TribridSeriesWorkstationSetup-v3.2.exe.
XMLMethodChanger is needed to convert the XML methods into .meth files. It can be found at https://github.com/thermofisherlsms/meth-modifications. Users have to download and compile it themselves (or request it from Thermo Scientific as well). At least the 3.2 beta version is required.
To use XMLMethodChanger, the operating system has to use the . (dot) as decimal mark and the , (comma) as digit group separator (one thousand point two should be formatted as 1,000.2).
In Windows 7 the settings are located at Windows Control Panel > Region and Language > Formats. Choose English (USA) there or use the Additional settings button to change it manually.
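As a quick illustration of the expected number format, the following base R call prints one thousand point two with a dot as decimal mark and a comma as digit group separator. It only shows what the target format looks like; it neither checks nor changes the Windows region settings.

```r
## Format 1000.2 with "," as digit group separator and "." as decimal mark.
x <- format(1000.2, big.mark=",", decimal.mark=".")
x
## [1] "1,000.2"
```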
After data acquisition, topdownr needs the header information from the .raw files. The ScanHeadsman software is used to extract it. It can be downloaded from https://bitbucket.org/caetera/scanheadsman
It requires Microsoft .NET 4.5 or later (it is often preinstalled on a typical modern Windows system or can be found in Microsoft’s Download Center, e.g. https://www.microsoft.com/en-us/download/details.aspx?id=30653). Additionally, you need Thermo’s MS File Reader, which can be downloaded free of charge (registration required) from the Thermo FlexNet website: https://thermo.flexnetoperations.com/
ScanHeadsman was created by Vladimir Gorshkov.
Importantly, XMLMethodChanger does not create methods de novo, but modifies pre-existing methods (supplied with XMLMethodChanger) using modifications described in XML files. Thus the whole process of creating user-specified methods consists of two parts:
1. Constructing XML files with the desired fragmentation parameters (see topdownr::createExperimentsFragmentOptimisation and topdownr::writeMethodXmls below).
2. Submitting the XML files together with a template .meth file to XMLMethodChanger.
We choose to use targeted MS2 scans (TMS2) as a way to store the fragmentation parameters. Each TMS2 is stored in a separate experiment. Experiments do not overlap.
Shown below is the process of creating XML files and using them to modify the TMS2IndependentTemplateForTD.meth template file.
library("topdownr")
## Create MS1 settings
ms1 <- expandMs1Conditions(
FirstMass=400,
LastMass=1200,
Microscans=as.integer(10)
)
## Set TargetMass
targetMz <- cbind(mz=c(560.6, 700.5, 933.7), z=rep(1, 3))
## Set common settings
common <- list(
OrbitrapResolution="R120K",
IsolationWindow=1,
MaxITTimeInMS=200,
Microscans=as.integer(40),
AgcTarget=c(1e5, 5e5, 1e6)
)
## Create settings for different fragmentation conditions
cid <- expandTms2Conditions(
MassList=targetMz,
common,
ActivationType="CID",
CIDCollisionEnergy=seq(7, 35, 7)
)
hcd <- expandTms2Conditions(
MassList=targetMz,
common,
ActivationType="HCD",
HCDCollisionEnergy=seq(7, 35, 7)
)
etd <- expandTms2Conditions(
MassList=targetMz,
common,
ActivationType="ETD",
ETDReactionTime=as.double(1:2)
)
etcid <- expandTms2Conditions(
MassList=targetMz,
common,
ActivationType="ETD",
ETDReactionTime=as.double(1:2),
ETDSupplementalActivation="ETciD",
ETDSupplementalActivationEnergy=as.double(1:2)
)
uvpd <- expandTms2Conditions(
MassList=targetMz,
common,
ActivationType="UVPD"
)
## Create experiments with all combinations of the above settings
## for fragment optimisation
exps <- createExperimentsFragmentOptimisation(
ms1=ms1, cid, hcd, etd, etcid, uvpd,
groupBy=c("AgcTarget", "replication"), nMs2perMs1=10, scanDuration=0.5,
replications=2, randomise=TRUE
)
## Write experiments to xml files
writeMethodXmls(exps=exps)
## Run XMLMethodChanger
runXmlMethodChanger(
modificationXml=list.files(pattern="^method.*\\.xml$"),
templateMeth="TMS2IndependentTemplateForTD.meth",
executable="path\\to\\XmlMethodChanger.exe"
)
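To get a feeling for how quickly the number of conditions grows, the CID settings above can be laid out as a full factorial grid with base R. This is only an illustration that assumes the target masses, AGC targets and collision energies are fully crossed; it uses expand.grid and does not call topdownr, so the exact number of scans written to the method files may differ.

```r
## Illustration only: full factorial grid of the CID settings used above,
## built with base R (not with topdownr).
cidGrid <- expand.grid(
    TargetMz=c(560.6, 700.5, 933.7),
    AgcTarget=c(1e5, 5e5, 1e6),
    CIDCollisionEnergy=seq(7, 35, 7)
)
nrow(cidGrid)
## [1] 45
```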
After setting up direct infusion, make sure that the MS1 spectrum produces the expected protein mass after deconvolution by Xtract. Shown below is a deconvoluted MS1 spectrum for myoglobin. The dominant mass corresponds to myoglobin with Met removed.
Prior to the R analysis of protein fragmentation data we have to convert the .raw files.
Some of the information (SpectrumId, Ion Injection Time (ms), Orbitrap Resolution, targeted Mz, ETD reaction time, CID activation and HCD activation) is stored in the scan headers, while other information (ETD reagent target and AGC target) is only available in the method table.
You can run ScanHeadsman from the command line (ScanHeadsman.exe --noMS --methods:CSV) or use the runScanHeadsman function provided by topdownr.
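A minimal sketch of the topdownr call, assuming runScanHeadsman() takes the directory containing the .raw files as path and the location of the executable as exe. Both paths are placeholders, and the call is skipped when topdownr or the executable is not available.

```r
## Placeholder paths; adjust them to your system.
rawDir <- "path\\to\\raw-files"
exe <- "path\\to\\ScanHeadsman.exe"

## Only run ScanHeadsman if topdownr is installed and the executable exists.
if (requireNamespace("topdownr", quietly=TRUE) && file.exists(exe)) {
    topdownr::runScanHeadsman(path=rawDir, exe=exe)
} else {
    message("topdownr or ScanHeadsman.exe not available; skipping.")
}
```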
ScanHeadsman will generate a .txt (scan header table) and a .csv (method table) file for each .raw file.
The spectra have to be charge state deconvoluted with the Xtract node in Proteome Discoverer 2.1. The software returns the deconvoluted spectra in mzML format.
Once a .csv, .txt, and .mzML file has been produced for each .raw file, we can start the analysis using topdownr. Please see the analysis vignette (vignette("analysis", package="topdownr")) for an example.
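Before starting the analysis it can help to verify that every .raw file has its companion files. The helper below is a hypothetical sketch in base R (checkInputFiles is not part of topdownr). It assumes that companion files share the base name of the .raw file followed by a dot, so both run.csv and a name like run.experiments.csv would count as the method table.

```r
## Hypothetical helper (not part of topdownr): for each .raw file in `path`
## report whether a .csv, .txt and .mzML file with the same base name exists.
checkInputFiles <- function(path=".") {
    files <- list.files(path)
    raw <- files[grepl("\\.raw$", files, ignore.case=TRUE)]
    base <- tools::file_path_sans_ext(raw)
    ext <- c(".csv", ".txt", ".mzML")
    present <- matrix(
        FALSE, nrow=length(base), ncol=length(ext),
        dimnames=list(base, ext)
    )
    for (i in seq_along(base)) {
        ## files that belong to the same run: "<base>." followed by anything
        sameRun <- startsWith(files, paste0(base[i], "."))
        for (j in seq_along(ext)) {
            present[i, j] <- any(sameRun & endsWith(files, ext[j]))
        }
    }
    present
}
```

Calling checkInputFiles("path/to/files") returns a logical matrix with one row per .raw file; any FALSE entry marks a file that still has to be generated.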
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggplot2_3.5.1 ranger_0.17.0 topdownrdata_1.28.0
## [4] topdownr_1.29.0 Biostrings_2.75.1 GenomeInfoDb_1.43.2
## [7] XVector_0.47.0 IRanges_2.41.1 S4Vectors_0.45.2
## [10] ProtGenerics_1.39.0 BiocGenerics_0.53.3 generics_0.1.3
## [13] BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] rlang_1.1.4 magrittr_2.0.3
## [3] clue_0.3-66 matrixStats_1.4.1
## [5] compiler_4.4.2 vctrs_0.6.5
## [7] reshape2_1.4.4 stringr_1.5.1
## [9] pkgconfig_2.0.3 crayon_1.5.3
## [11] fastmap_1.2.0 labeling_0.4.3
## [13] utf8_1.2.4 rmarkdown_2.29
## [15] UCSC.utils_1.3.0 preprocessCore_1.69.0
## [17] purrr_1.0.2 xfun_0.49
## [19] MultiAssayExperiment_1.33.1 zlibbioc_1.52.0
## [21] cachem_1.1.0 jsonlite_1.8.9
## [23] DelayedArray_0.33.2 BiocParallel_1.41.0
## [25] parallel_4.4.2 cluster_2.1.6
## [27] R6_2.5.1 bslib_0.8.0
## [29] stringi_1.8.4 limma_3.63.2
## [31] GenomicRanges_1.59.1 jquerylib_0.1.4
## [33] Rcpp_1.0.13-1 SummarizedExperiment_1.37.0
## [35] iterators_1.0.14 knitr_1.49
## [37] Matrix_1.7-1 igraph_2.1.1
## [39] tidyselect_1.2.1 abind_1.4-8
## [41] yaml_2.3.10 doParallel_1.0.17
## [43] codetools_0.2-20 affy_1.85.0
## [45] lattice_0.22-6 tibble_3.2.1
## [47] plyr_1.8.9 withr_3.0.2
## [49] Biobase_2.67.0 evaluate_1.0.1
## [51] pillar_1.9.0 affyio_1.77.0
## [53] BiocManager_1.30.25 MatrixGenerics_1.19.0
## [55] foreach_1.5.2 MSnbase_2.33.2
## [57] MALDIquant_1.22.3 ncdf4_1.23
## [59] munsell_0.5.1 scales_1.3.0
## [61] glue_1.8.0 lazyeval_0.2.2
## [63] maketools_1.3.1 tools_4.4.2
## [65] mzID_1.45.0 sys_3.4.3
## [67] QFeatures_1.17.0 vsn_3.75.0
## [69] mzR_2.41.1 buildtools_1.0.0
## [71] XML_3.99-0.17 grid_4.4.2
## [73] impute_1.81.0 tidyr_1.3.1
## [75] MsCoreUtils_1.19.0 colorspace_2.1-1
## [77] GenomeInfoDbData_1.2.13 PSMatch_1.11.0
## [79] cli_3.6.3 fansi_1.0.6
## [81] S4Arrays_1.7.1 dplyr_1.1.4
## [83] AnnotationFilter_1.31.0 pcaMethods_1.99.0
## [85] gtable_0.3.6 sass_0.4.9
## [87] digest_0.6.37 SparseArray_1.7.2
## [89] farver_2.1.2 htmltools_0.5.8.1
## [91] lifecycle_1.0.4 httr_1.4.7
## [93] statmod_1.5.0 MASS_7.3-61