micom.workflows.build

Workflow to build models for several samples.

Module Contents

Functions

_reduce_group(df)

build_and_save(args)

Build a single community model.

build(taxonomy, model_db, out_folder, cutoff=0.0001, threads=1, solver=None)

Build a series of community models.

_summarize_models(args)

build_database(manifest, out_path, rank='genus', threads=1, compress=None, compresslevel=6, progress=True)

Create a model database from a set of SBML files.

Attributes

REQ_FIELDS

micom.workflows.build._reduce_group(df)

micom.workflows.build.build_and_save(args)

Build a single community model.

micom.workflows.build.build(taxonomy, model_db, out_folder, cutoff=0.0001, threads=1, solver=None)

Build a series of community models.

This is a best-practice implementation of building community models for several samples in parallel.

Parameters
  • taxonomy (pandas.DataFrame) – The taxonomy used for building the models. Must have at least the columns “id” and “sample_id”, plus a column named after the rank used in the model database. For a genus-level database, for example, you will need a column genus. Additional taxonomic ranks may also be included and will make taxa matching more stringent. Finally, the taxonomy should contain a column abundance, which is used to quantify each individual in the community. If it is absent, MICOM assumes all individuals are present in the same amount.

  • model_db (str) – A pre-built model database. If the path ends in .qza, it must be a QIIME 2 artifact of type MetabolicModels[JSON]. It can also be a folder or a zip file (must end in .zip), or None if the taxonomy contains a column file.

  • out_folder (str) – The built models and a manifest file will be written to this folder.

  • cutoff (float in [0.0, 1.0]) – Abundance cutoff. Taxa with a relative abundance smaller than this will not be included in the model.

  • threads (int >=1) – The number of parallel workers to use when building models. As a rule of thumb, you will need around 1 GB of RAM for each thread.

  • solver (str) – Name of the solver used for the linear and quadratic problems.
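To make the taxonomy requirements and the cutoff concrete, the sketch below assembles a small table with pandas and applies a relative-abundance filter. It illustrates the behavior described above and is not MICOM's internal code. The sample and taxon names are made up, and normalizing abundances within each sample is an assumption.

```python
import pandas as pd

# Hypothetical taxonomy for two samples, matching a genus-level database.
taxonomy = pd.DataFrame(
    {
        "id": ["Bacteroides", "Escherichia", "Bacteroides", "Prevotella"],
        "genus": ["Bacteroides", "Escherichia", "Bacteroides", "Prevotella"],
        "sample_id": ["sample_1", "sample_1", "sample_2", "sample_2"],
        "abundance": [990.0, 10.0, 99999.0, 1.0],
    }
)

# Columns the documentation lists as required, plus the database rank.
required = {"id", "sample_id", "genus"}
missing = required - set(taxonomy.columns)
if missing:
    raise ValueError(f"taxonomy is missing columns: {sorted(missing)}")

# Assumption: relative abundance is computed within each sample.
totals = taxonomy.groupby("sample_id")["abundance"].transform("sum")
taxonomy["relative"] = taxonomy["abundance"] / totals

# Taxa below the cutoff would not be included in the community model.
cutoff = 0.0001
kept = taxonomy[taxonomy["relative"] >= cutoff]
```

In this made-up table, Prevotella in sample_2 falls below the default cutoff and is filtered out, while all other rows remain.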

Returns

The manifest for the built models. Contains taxa abundances, build metrics and file basenames.

Return type

pandas.DataFrame
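A call could look roughly like the following sketch, which uses only the parameters documented above. The wrapper name build_communities, the database path, and the output folder are hypothetical placeholders, and the taxonomy is invented for illustration. The MICOM import sits inside the function so the snippet can be read and loaded without MICOM installed.

```python
import pandas as pd


def build_communities(taxonomy, model_db, out_folder, threads=2):
    """Build one community model per sample and return the manifest.

    Hypothetical convenience wrapper around the documented workflow.
    """
    # Imported lazily; requires MICOM to be installed when called.
    from micom.workflows.build import build

    return build(
        taxonomy,
        model_db,
        out_folder,
        cutoff=0.0001,  # documented default
        threads=threads,  # rule of thumb: about 1 GB of RAM per worker
    )


# Made-up taxonomy with the documented columns for a genus-level database.
taxonomy = pd.DataFrame(
    {
        "id": ["Bacteroides", "Escherichia"],
        "genus": ["Bacteroides", "Escherichia"],
        "sample_id": ["sample_1", "sample_1"],
        "abundance": [0.8, 0.2],
    }
)

# Placeholder paths; uncomment once a real model database is available.
# manifest = build_communities(taxonomy, "path/to/model_db.qza", "models")
# print(manifest.head())
```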

micom.workflows.build.REQ_FIELDS

micom.workflows.build._summarize_models(args)

micom.workflows.build.build_database(manifest, out_path, rank='genus', threads=1, compress=None, compresslevel=6, progress=True)

Create a model database from a set of SBML files.

Note

A manifest for the joined models will also be written to the output folder as “manifest.csv”. This may contain NA entries for additional columns that had different values within the summarized models.

Parameters
  • manifest (pandas.DataFrame) – A manifest of SBML files containing their filepath as well as taxonomy. Must contain the columns “file”, “kingdom”, “phylum”, “class”, “order”, “family”, “genus”, and “species”. May contain additional columns.

  • out_path (str) – The directory or zip file where the joined models will be written.

  • threads (int >=1) – The number of parallel workers to use when building models. As a rule of thumb, you will need around 1 GB of RAM for each thread.

  • compress (str (default None)) – Compression method to use. Must be “zlib”, “bz2”, “lzma” or None. This parameter is ignored if out_path does not end with “.zip”.

  • compresslevel (int [1-9] (default: 6)) – Level of compression. Only used if compress is not None. This parameter is ignored if out_path does not end with “.zip”.

  • progress (bool) – Whether to show a progress bar.
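The compression names above match the methods offered by Python's standard zipfile module. Whether MICOM maps them this way internally is an assumption; the sketch below only shows what compress and compresslevel mean for a zip archive in general. The helper write_archive and the file contents are hypothetical.

```python
import io
import zipfile

# Assumed mapping from the documented names to zipfile constants.
METHODS = {
    None: zipfile.ZIP_STORED,
    "zlib": zipfile.ZIP_DEFLATED,
    "bz2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}


def write_archive(files, compress=None, compresslevel=6):
    """Write a dict of {name: text} to an in-memory zip archive."""
    if compress not in METHODS:
        raise ValueError(f"unsupported compression method: {compress!r}")
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=METHODS[compress], compresslevel=compresslevel
    ) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


# Highly repetitive placeholder content compresses well.
payload = {"manifest.csv": "file,genus\n" * 500}
stored = write_archive(payload)
deflated = write_archive(payload, compress="zlib", compresslevel=6)
print(len(stored), len(deflated))
```

With compress=None the entries are stored uncompressed, so the compression level has no effect on the result.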

Returns

The manifest of the joined models. Will still contain information from the original metadata.

Return type

pandas.DataFrame
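As a rough sketch of the expected input, the snippet below assembles a manifest with the documented columns and wraps the call in a function. The SBML file paths, the output name genus_db.zip, and the wrapper make_database are hypothetical, and the MICOM import is deferred so the manifest check runs on its own.

```python
import pandas as pd

# Taxonomic ranks the manifest must provide, in addition to "file".
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

manifest = pd.DataFrame(
    {
        # Placeholder paths to SBML files.
        "file": ["models/ecoli.xml", "models/bfragilis.xml"],
        "kingdom": ["Bacteria", "Bacteria"],
        "phylum": ["Proteobacteria", "Bacteroidetes"],
        "class": ["Gammaproteobacteria", "Bacteroidia"],
        "order": ["Enterobacterales", "Bacteroidales"],
        "family": ["Enterobacteriaceae", "Bacteroidaceae"],
        "genus": ["Escherichia", "Bacteroides"],
        "species": ["Escherichia coli", "Bacteroides fragilis"],
    }
)

# Check the documented column requirements before starting a long build.
missing = [col for col in ["file", *RANKS] if col not in manifest.columns]
if missing:
    raise ValueError(f"manifest is missing columns: {missing}")


def make_database(manifest, out_path="genus_db.zip"):
    """Hypothetical wrapper that builds a compressed genus-level database."""
    # Imported lazily; requires MICOM to be installed when called.
    from micom.workflows.build import build_database

    return build_database(
        manifest,
        out_path,
        rank="genus",
        threads=1,
        compress="zlib",  # only honored because out_path ends in ".zip"
        compresslevel=6,
        progress=False,
    )


# db_manifest = make_database(manifest)
```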