Data Dictionary

Effective use of pgfinder requires an understanding of the inputs and outputs of the software.

Inputs

pgfinder takes data from mass spectrometry instruments, together with a “database” of expected masses and a set of user-specified modifications.

FTRS Files

FTRS files contain deconvolution data generated by the Byos® software. They have a .ftrs extension.

MaxQuant Files

MaxQuant files are output by the MaxQuant software. They are tab-separated values (TSV) files with a .txt extension.
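Because these files are plain tab-separated text, they can be inspected with standard tools before being passed to pgfinder. The following is a minimal sketch using only the Python standard library; the column names and values in `SAMPLE_TSV` are placeholders chosen for illustration and are not the real MaxQuant schema, and `read_tsv` is a hypothetical helper, not part of pgfinder.

```python
import csv
import io

# Stand-in for the contents of a MaxQuant .txt file. The column names and
# values are placeholders, not the real MaxQuant schema.
SAMPLE_TSV = (
    "Mass\tRetention time\tIntensity\n"
    "1000.5\t12.3\t45000\n"
    "2000.25\t15.1\t9000\n"
)


def read_tsv(handle):
    """Return the rows of a tab-separated file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tsv(io.StringIO(SAMPLE_TSV))
print(len(rows), "rows; first mass:", rows[0]["Mass"])  # values are strings
```

With a real file, `open(path, newline="")` would take the place of `io.StringIO(SAMPLE_TSV)`.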

Modifications

Any number of modifications can be selected to enrich the search space of the database of masses against which the input data is compared. Allowed modifications are:

Modification     Description
Sodium           Search for masses corresponding to sodium adducts
Potassium        Search for masses corresponding to potassium adducts
Anh              Search for anhydromuropeptides
DeAc             Search for deacetylated muropeptides
DeAc_Anh         Search for deacetylated anhydromuropeptides
Nude             Search for muropeptides with an extra GlcNAc-MurNAc disaccharide
Decay            Correct output taking into account in-source decay products
Amidation        Search for amidated muropeptides
Amidase          Search for peptides resulting from amidase cleavage (GlcNAc-MurNAc loss)
Double_Anh       Search for anhydromuropeptides (2 Anhydro groups)
Multimers        Search for multimers resulting from 3-3 and 4-3 crosslinks
Multimers Glyco  Search for multimers resulting from transglycosylation (no transpeptidation)
Multimer Lac     Search for lactyl-peptide multimers
O-Ac             Search for O-acetylated muropeptides
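To illustrate what "enriching the search space" means, the sketch below adds one shifted entry per selected adduct to a small mass list. It is an illustration only, not pgfinder's implementation: it assumes an adduct is modelled as a metal ion replacing one hydrogen atom (M - H + Na or M - H + K), the `expand_search_space` function and the "(Sodium)" label format are hypothetical, and the structure code and mass are invented.

```python
# Mass shift (monoisotopic, in Da) when one hydrogen atom is replaced by a
# metal atom. Assumption: adducts are modelled as M - H + Na and M - H + K;
# the values pgfinder uses internally may differ.
ADDUCT_SHIFTS = {
    "Sodium": 22.989769 - 1.007825,
    "Potassium": 38.963706 - 1.007825,
}


def expand_search_space(mass_list, modifications):
    """Return the original entries plus one shifted entry per modification."""
    expanded = list(mass_list)
    for name in modifications:
        shift = ADDUCT_SHIFTS[name]
        for structure, mass in mass_list:
            # The label format is hypothetical, chosen for readability.
            expanded.append((f"{structure} ({name})", mass + shift))
    return expanded


# Invented structure code and mass, for illustration only.
base = [("example-monomer", 1000.0)]
expanded = expand_search_space(base, ["Sodium", "Potassium"])
for structure, mass in expanded:
    print(f"{structure}: {mass:.4f}")
```

Each selected modification adds candidate masses, so selecting many modifications widens the set of possible matches for every observed mass.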

Mass Databases (Lists)

Mass databases are lists of structures and their associated masses. They are in CSV format with a .csv extension. pgfinder has built-in mass lists for Escherichia coli and Clostridium difficile, but can also take a different mass list as input.

Column            Description        Unit
Structure         Structure code     NA
Monoisotopicmass  Monoisotopic mass  atomic mass unit
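A custom mass list therefore needs only these two columns. As a minimal sketch, the snippet below reads such a file with the Python standard library; the two rows in `SAMPLE_CSV` are invented placeholders rather than entries from a built-in mass list, and `load_mass_list` is a hypothetical helper, not a pgfinder function.

```python
import csv
import io

# Invented placeholder rows; not taken from a built-in mass list.
SAMPLE_CSV = (
    "Structure,Monoisotopicmass\n"
    "example-A,500.25\n"
    "example-B,1000.5\n"
)


def load_mass_list(handle):
    """Map each structure code to its monoisotopic mass as a float."""
    masses = {}
    for row in csv.DictReader(handle):
        masses[row["Structure"]] = float(row["Monoisotopicmass"])
    return masses


mass_list = load_mass_list(io.StringIO(SAMPLE_CSV))
print(mass_list)
```

A missing or misspelled column header raises a `KeyError` here, which is a quick way to check a hand-made mass list before using it.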

Outputs

pgfinder outputs CSV (.csv) files. The columns in these files depend on the input file format.

Embedded Metadata

The first column of the output file contains the following metadata:

Data               Description
file               Input data file
masses_file        Mass list file
rt_window          Retention time window
modifications      List of modifications
ppm                ppm tolerance
consolidation_ppm  ppm tolerance for consolidation
version            PGFinder version used in analysis

PGFinder Output

Field                            Description
Metadata                         Metadata about input files.
ID                               Peak identifier.
RT (min)                         Retention time in minutes.
Charge                           Charge states at which the mass was observed. Can be used to work back from the monoisotopic mass to the recorded raw mass/charge ratios.
Obs (Da)                         Observed mass in daltons.
Theo (Da)                        Theoretical mass in daltons.
Delta ppm                        Difference between Obs (Da) and Theo (Da) in parts per million.
Inferred structure               Inferred peptidoglycan structure.
Intensity                        Intensity of the peak in relative units.
Inferred structure (best match)  Most likely inferred structure.
Intensity (best match)           Intensity of the most likely inferred structure.
Total Intensity                  Sum of intensities for consolidated structures. Can be used to compare how much material was injected/measured between different runs.
Structure                        Structure of the most likely inferred structure.
Abundance (%)                    Amount of the inferred structure as a percentage of total intensity.
Consolidated RT (min)            Consolidated retention time in minutes of the most likely inferred structure.
Consolidated Theo (Da)           Consolidated theoretical mass in daltons of the most likely inferred structure.
Consolidated Delta ppm           Consolidated difference in parts per million of the most likely inferred structure.
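Several of these fields are related by simple arithmetic. The sketch below shows the conventional formulas for a ppm difference, for converting a neutral monoisotopic mass back to a mass/charge ratio, and for a percentage abundance. It is not taken from the pgfinder source: the function names are hypothetical, the sign convention (observed minus theoretical) is an assumption, and the mass/charge conversion assumes positive-mode ions formed by adding one proton per charge.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def delta_ppm(observed, theoretical):
    """Difference between observed and theoretical mass, in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def mass_to_mz(mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral mass.

    Assumes positive-mode protonation, i.e. [M + zH]z+.
    """
    return (mass + charge * PROTON_MASS) / charge


def abundance_percent(intensity, total_intensity):
    """Share of the total intensity contributed by one structure, in percent."""
    return intensity / total_intensity * 100


# Invented values, for illustration only.
print(f"{delta_ppm(1000.001, 1000.0):.2f} ppm")
print(f"m/z at charge 2: {mass_to_mz(1000.0, 2):.6f}")
print(f"{abundance_percent(25.0, 200.0):.1f} %")
```

For example, a neutral mass observed at charge states 1 and 2 corresponds to two different raw m/z values, which is why the Charge column is needed to recover them.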