---
title: "Introduction to LCMSQA"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to LCMSQA}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

The goal of '**LCMSQA**' is to make it easy to check the quality of liquid chromatography/mass spectrometry (LC/MS) experiments using a '**[shiny](https://cran.r-project.org/package=shiny)**' application. It uses the workflow of the R package '**[xcms](https://bioconductor.org/packages/xcms/)**' for data import, visualization, and feature detection of internal standards or known metabolites, which can be used to evaluate and adapt the peak detection settings.

## Installation

In an R session, type

```{r install}
## Install from CRAN
install.packages("LCMSQA")

## Load LCMSQA package
library(LCMSQA)
```

## Parallel Processing

Most methods in `xcms` support parallel processing via '**[BiocParallel](https://bioconductor.org/packages/BiocParallel/)**' to save time. We highly recommend setting up parallel processing explicitly before starting the app. To initiate `multicore`-based parallel evaluation on Unix-based systems (e.g., Linux, macOS):

```{r unix parallel}
## Unix-based systems
library(BiocParallel)
register(bpstart(MulticoreParam()))
```

On Windows, `MulticoreParam` results in serial evaluation. Please use `SnowParam` instead.

```{r windows parallel}
## Windows system
library(BiocParallel)
register(bpstart(SnowParam()))
```

For other options and details, please check the `BiocParallel` package vignettes.

## Shiny App

Launch the shiny app with

```{r shiny app}
runQA()
```

The system's default web browser will be launched automatically after the app is started.

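
The parallel setup and the app launch described above can be combined into a single platform-aware snippet. The following is a minimal sketch rather than part of the package: it assumes that base R's `.Platform$OS.type` is an adequate way to tell Windows from Unix-like systems, and the variable name `param` is only illustrative.

```{r parallel-launch}
library(BiocParallel)
library(LCMSQA)

## Choose a backend by operating system (assumption: OS.type is a
## sufficient test; it is "windows" on Windows and "unix" elsewhere)
param <- if (.Platform$OS.type == "windows") {
  SnowParam()       # socket-based workers, usable on Windows
} else {
  MulticoreParam()  # fork-based workers, Unix-like systems only
}

## Start the workers and make them the default backend
register(bpstart(param))

## Launch the shiny app
runQA()
```

Registering the backend before calling `runQA()` follows the recommendation above to set up parallel processing explicitly before starting the app.
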
## Example Data

The example data used in this vignette can be downloaded from the following links: [example1.mzML](https://github.com/HimesGroup/LCMSQA/raw/main/inst/examples/example1.mzML), [example2.mzML](https://github.com/HimesGroup/LCMSQA/raw/main/inst/examples/example2.mzML), and [IS_info.csv](https://github.com/HimesGroup/LCMSQA/raw/main/inst/examples/IS_info.csv) (right click -> save link as).

## File Input

The app needs the following input:

1. (required) mass-spectrometry data files of quality control (QC) samples in open formats: AIA/ANDI NetCDF, mzXML, mzData, and mzML

    Check '**[msconvert](https://proteowizard.sourceforge.io/tools/msconvert.html)**' if you have data in different formats. Multiple files can be selected.

    ![](msfile.png){width=80%}

2. (optional) internal standard information (or other known metabolites) in a CSV format with the columns:

    + compound: the name of the compound
    + adduct: adduct type (e.g., [M+H]+)
    + mode: must be either "positive" or "negative"
    + mz: a known mass-to-charge ratio (m/z) value

    You will see the menu to upload a CSV file after uploading mass-spectrometry files. You can skip this step and specify the mass-to-charge ratio (m/z) manually to explore metabolic features of interest.

    ![](ISfile.png){width=80%}

## Tuning Parameters

You can tune multiple parameters for metabolic feature detection (peak picking + grouping).

![](paramUI.png){width=80%}

### 1. Set m/z and retention time of interest

- compound (or m/z) with a ppm tolerance
- retention time in seconds (min, max)

### 2. Peak picking using the [centWave](https://sneumann.github.io/xcms/reference/findChromPeaks-centWave.html) method

- ppm: the maximal tolerated m/z deviation in consecutive scans in ppm for the initial region of interest (ROI) definition
- peak width: the expected approximate peak width in chromatographic space
- signal/noise cut: the signal-to-noise ratio cutoff
- m/z diff: the minimum difference in m/z dimension required for peaks with overlapping retention times
- noise: a minimum intensity required for centroids to be considered in the first analysis step
- prefilter (>= peaks, >= intensity): the prefilter step for the first analysis step (ROI detection)
- Gaussian fit: whether or not a Gaussian should be fitted to each peak
- m/z center: the function to calculate the m/z center of the chromatographic peaks
- integration: whether or not peak limits are found through descent on the Mexican Hat filtered data

### 3. Peak grouping using the [peak density](https://sneumann.github.io/xcms/reference/do_groupChromPeaks_density.html) method

- bandwidth: the bandwidth (standard deviation of the smoothing kernel) to be used
- min fraction: the minimum fraction of samples in which the peaks have to be detected to define a peak group
- bin size: the size of overlapping slices in m/z dimension

## Tabs

The application consists of four main tabs:

- Total Ion Current (TIC) Chromatogram
- Mass Spectrum
- Extracted Ion Chromatogram (XIC)
- Metabolic Feature Detection

### 1. TIC

This is the default tab that is opened once you upload the files. The TIC chromatogram shows the summed signals over the entire range of masses. Alternatively, a base peak chromatogram can be displayed to monitor the most intense signal in each spectrum. The **Collapse** checkbox is used to display the chromatograms of multiple files in one figure.

![](TIC.png){width=100%}

### 2. Mass Spectrum

In this tab, a mass spectrum is presented in which the most intense ion is re-scaled to an abundance of 100.
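
The re-scaling described above amounts to dividing every intensity by the largest one and multiplying by 100. The following is a minimal base R sketch of that arithmetic; the intensity values and the variable names `intensity` and `rel_abundance` are made up for illustration and are not taken from the app's code.

```{r rescale-sketch}
## Hypothetical raw intensities for the ions in one scan (made-up values)
intensity <- c(1200, 48000, 9600, 300)

## Scale so that the most intense ion (the base peak) has abundance 100
rel_abundance <- 100 * intensity / max(intensity)
rel_abundance
```
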
If you click any data point within the chromatogram at the top, the mass spectrum at that scan time is displayed automatically.

![](massspec.png){width=100%}

### 3. XIC

For a specific slice of m/z and retention time ranges, clicking **Generate XIC** in the sidebar panel generates plots in which each figure shows an XIC at the top and the m/z variation against retention time at the bottom. You can choose a subset of files to display from the dropdown menu.

![](XIC.png){width=100%}

### 4. Feature Detection

If you click **Detect Features** in the sidebar panel, chromatographic peak detection is performed using the centWave method, and the identified peaks are grouped into a feature. This tab shows the apex positions of m/z and retention time values within the feature on the left and the integrated peak areas as a bar plot on the right. The relative standard deviation (RSD) is calculated to measure reproducibility among QC samples. An RSD value is not calculated if any integrated peak area is missing. In that case, you can exclude the samples with missing values using the dropdown menu.

![](feature.png){width=100%}

## References

1. Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R., and Siuzdak, G.: XCMS: Processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching and identification. Analytical Chemistry, 78:779-787 (2006)

2. Tautenhahn, R., Boettcher, C., and Neumann, S.: Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics, 9:504 (2008)