pgfinder.pgio

PG Finder I/O operations

Module Contents

Functions

ms_file_reader(→ pandas.DataFrame)

Read mass spec data.

ftrs_reader(→ pandas.DataFrame)

Reads Features file from Byos.

_select_and_order_columns(→ pandas.DataFrame)

Select (renamed) columns and order them.

theo_masses_reader(→ pandas.DataFrame)

Reads theoretical masses files (CSV), returning a pandas DataFrame.

maxquant_file_reader(file[, columns])

Reads MaxQuant files and returns the data as a DataFrame.

dataframe_to_csv_metadata(→ str | None)

Convert dataframe to CSV with metadata.

default_filename(→ str)

Generate a default filename based on the current date/time.

read_yaml(→ dict)

Read a YAML file.

Attributes

LOGGER

pgfinder.pgio.LOGGER
pgfinder.pgio.ms_file_reader(file: str | pathlib.Path) → pandas.DataFrame

Read mass spec data.

Parameters:

file (str | Path) – Path to the file to be loaded.

Returns:

File loaded as Pandas DataFrame.

Return type:

pd.DataFrame

pgfinder.pgio.ftrs_reader(file: str | pathlib.Path, columns: dict = COLUMNS) → pandas.DataFrame

Reads Features file from Byos.

Parameters:
  • file (str | Path) – Feature file to be read.

  • columns (dict) – Dictionary of columns. Defaults to the global COLUMNS, which is read from ‘config/columns.yaml’ and simplifies extension to new formats.

Returns:

Pandas DataFrame of features.

Return type:

pd.DataFrame

pgfinder.pgio._select_and_order_columns(df: pandas.DataFrame, columns: dict = COLUMNS) → pandas.DataFrame

Select (renamed) columns and order them.

Parameters:
  • df (pd.DataFrame) – Full dataframe from which a subset of variables is to be returned.

  • columns (dict) – Dictionary of columns. Defaults to the global COLUMNS, which is read from ‘config/columns.yaml’ and simplifies extension to new formats.

Returns:

Subset of data frame with selected columns in specified order.

Return type:

pd.DataFrame
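
The select-and-order step described above can be sketched with plain pandas calls. This is a minimal illustration, not the pgfinder implementation: the mapping below and its column names are hypothetical, and the real COLUMNS dictionary loaded from ‘config/columns.yaml’ may be structured differently.

```python
import pandas as pd

# Hypothetical mapping of source column names to output names. The order of
# the dictionary defines the order of the columns in the result.
columns = {"Source B": "first_output", "Source A": "second_output"}

# Hypothetical input with one column that is not wanted in the output.
raw = pd.DataFrame({"Source A": [1.5], "Unused": ["x"], "Source B": [2.5]})

# Rename first, then index with the list of new names to select and order.
renamed = raw.rename(columns=columns)
subset = renamed[list(columns.values())]
```

Indexing with a list of names both drops the columns that are not listed and fixes the column order in one step.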

pgfinder.pgio.theo_masses_reader(file: str | pathlib.Path) → pandas.DataFrame

Reads theoretical masses files (CSV), returning a pandas DataFrame.

Parameters:

file (str | Path) – Path to file to be loaded.

Returns:

Pandas DataFrame of theoretical masses.

Return type:

pd.DataFrame

pgfinder.pgio.maxquant_file_reader(file: str | pathlib.Path, columns: dict = COLUMNS)

Reads MaxQuant files and returns the data as a DataFrame.

Parameters:
  • file (str | Path) – Path to a text file.

  • columns (dict) – Dictionary of columns. Defaults to the global COLUMNS, which is read from ‘config/columns.yaml’ and simplifies extension to new formats.

Returns:

Pandas DataFrame.

Return type:

pd.DataFrame

pgfinder.pgio.dataframe_to_csv_metadata(output_dataframe: pandas.DataFrame, save_filepath: str | pathlib.Path = None, filename: str | pathlib.Path = None, float_format: str = '%.4f') → str | None

Convert dataframe to CSV with metadata.

If save_filepath is specified, return the relative path of the output file, including the filename; otherwise return the CSV content as a string.

Parameters:
  • output_dataframe (pd.DataFrame) – Dataframe to output.

  • save_filepath (str | Path) – Path to save to.

  • filename (str | Path) – Filename to save to.

  • float_format (str) – Format for floating point numbers (default 4 decimal places).

Returns:

Path of the output file (including the filename) if save_filepath is specified, otherwise the CSV content as a string.

Return type:

str | None
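
The two return modes described above can be illustrated with `pandas.DataFrame.to_csv`, which returns the CSV text when no target is given. This is a rough sketch under stated assumptions, not the pgfinder implementation: `to_csv_sketch`, the fallback name `results.csv`, and the column names are hypothetical, and metadata handling is omitted entirely.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

import pandas as pd


def to_csv_sketch(
    df: pd.DataFrame,
    save_filepath: str | Path | None = None,
    filename: str | Path | None = None,
    float_format: str = "%.4f",
) -> str:
    """Hypothetical stand-in: write to disk if a directory is given, else return CSV text."""
    if save_filepath is None:
        # With no path argument, to_csv returns the CSV content as a string.
        return df.to_csv(index=False, float_format=float_format)
    # "results.csv" is an assumed fallback, used only for this illustration.
    out_path = Path(save_filepath) / (filename or "results.csv")
    df.to_csv(out_path, index=False, float_format=float_format)
    return str(out_path)


# Hypothetical data; floats are written with four decimal places.
df = pd.DataFrame({"mz": [1.0, 2.123456], "intensity": [10, 20]})
csv_text = to_csv_sketch(df)

with tempfile.TemporaryDirectory() as tmp:
    written = to_csv_sketch(df, save_filepath=tmp, filename="demo.csv")
    written_text = Path(written).read_text()
```

The `float_format` string is applied to floating point columns only, so integer columns are written unchanged.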

pgfinder.pgio.default_filename(prefix: str = 'results_') → str

Generate a default filename based on the current date/time.

Parameters:

prefix (str) – String to use as a prefix, default is ‘results_’.

Returns:

Filename with format ‘<prefix>YYYY-MM-DD-hh-mm-ss.csv’, which is ‘results_YYYY-MM-DD-hh-mm-ss.csv’ with the default prefix.

Return type:

str
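
A filename in the documented format can be produced with the standard library alone. The sketch below assumes the timestamp is taken from the local clock and appended directly to the prefix; `default_filename_sketch` is a hypothetical name, not the pgfinder function.

```python
from datetime import datetime


def default_filename_sketch(prefix: str = "results_") -> str:
    """Hypothetical stand-in: build '<prefix>YYYY-MM-DD-hh-mm-ss.csv' from the current time."""
    # strftime codes: %Y year, %m month, %d day, %H hour, %M minute, %S second.
    timestamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
    return f"{prefix}{timestamp}.csv"


name = default_filename_sketch()
custom = default_filename_sketch(prefix="run_")
```

Because every field is zero-padded and ordered from year to second, filenames generated this way sort chronologically when sorted alphabetically.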

pgfinder.pgio.read_yaml(filename: str | pathlib.Path) → dict

Read a YAML file.

Parameters:

filename (str | Path) – YAML file to read.

Returns:

Dictionary of the file.

Return type:

dict