:py:mod:`micom.workflows.build`
===============================

.. py:module:: micom.workflows.build

.. autoapi-nested-parse::

   Workflow to build models for several samples.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   micom.workflows.build._reduce_group
   micom.workflows.build.build_and_save
   micom.workflows.build.build
   micom.workflows.build._summarize_models
   micom.workflows.build.build_database


Attributes
~~~~~~~~~~

.. autoapisummary::

   micom.workflows.build.REQ_FIELDS


.. py:function:: _reduce_group(df)


.. py:function:: build_and_save(args)

   Build a single community model.


.. py:function:: build(taxonomy, model_db, out_folder, cutoff=0.0001, threads=1, solver=None)

   Build a series of community models.

   This is a best-practice implementation of building community models
   for several samples in parallel.

   :param taxonomy: The taxonomy used for building the model. Must have at least the
                    columns "id" and "sample_id". This must also
                    contain at least a column with the same name as the rank used in
                    the model database. Thus, for a genus-level database you will need
                    a column `genus`. Additional taxonomic ranks can also be specified
                    and will make taxa matching more stringent.
                    Finally, the taxonomy should contain a column `abundance`. It will
                    be used to quantify each individual in the community. If absent,
                    MICOM will assume all individuals are present in the same amount.
   :type taxonomy: pandas.DataFrame
   :param model_db: A pre-built model database. If the path ends in `.qza` it must be a
                    Qiime 2 artifact of type `MetabolicModels[JSON]`. Can also be a
                    folder, a zip file (must end in `.zip`), or None if the taxonomy
                    contains a column `file`.
   :type model_db: str
   :param out_folder: The built models and a manifest file will be written to this
                      folder. Will continue
   :type out_folder: str
   :param cutoff: Abundance cutoff. Taxa with a relative abundance smaller than this
                  will not be included in the model.
   :type cutoff: float in [0.0, 1.0]
   :param threads: The number of parallel workers to use when building models. As a
                   rule of thumb you will need around 1 GB of RAM for each thread.
   :type threads: int >=1
   :param solver: Name of the solver used for the linear and quadratic problems.
   :type solver: str

   :returns: The manifest for the built models. Contains taxa abundances,
             build metrics and file basenames.
   :rtype: pandas.DataFrame
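
The abundance handling described for ``taxonomy`` and ``cutoff`` can be sketched in plain Python. This is only an illustration of the documented semantics, not MICOM's implementation: the helper ``apply_cutoff``, the ``relative`` key, and the example rows are hypothetical, and the taxonomy is modelled as a list of dictionaries rather than a ``pandas.DataFrame``.

```python
from collections import defaultdict


def apply_cutoff(rows, rank="genus", cutoff=0.0001):
    """Hypothetical helper: drop taxa below a relative abundance cutoff."""
    # Columns the documentation lists as required, plus the database rank.
    required = {"id", "sample_id", rank}
    for row in rows:
        missing = required - set(row)
        if missing:
            raise ValueError(f"taxonomy row lacks columns: {sorted(missing)}")

    # Without an `abundance` column every taxon is assumed equally abundant.
    totals = defaultdict(float)
    for row in rows:
        totals[row["sample_id"]] += row.get("abundance", 1.0)

    kept = []
    for row in rows:
        relative = row.get("abundance", 1.0) / totals[row["sample_id"]]
        # Taxa with a relative abundance smaller than the cutoff are excluded.
        if relative >= cutoff:
            kept.append({**row, "relative": relative})
    return kept


# Made-up example data for a single sample.
taxonomy = [
    {"id": "bacteroides", "sample_id": "s1", "genus": "Bacteroides", "abundance": 900},
    {"id": "escherichia", "sample_id": "s1", "genus": "Escherichia", "abundance": 99},
    {"id": "akkermansia", "sample_id": "s1", "genus": "Akkermansia", "abundance": 1},
]
kept = apply_cutoff(taxonomy, cutoff=0.01)
```

With MICOM installed, the real call follows the signature above, for example ``build(taxonomy, model_db, out_folder, cutoff=0.0001, threads=2)``, where ``taxonomy`` is a ``pandas.DataFrame``.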


.. py:data:: REQ_FIELDS

   
.. py:function:: _summarize_models(args)


.. py:function:: build_database(manifest, out_path, rank='genus', threads=1, compress=None, compresslevel=6, progress=True)

   Create a model database from a set of SBML files.

   .. note::

      A manifest for the joined models will also be written to the output folder
      as "manifest.csv". This may contain NA entries for additional columns
      that had different values within the summarized models.

   :param manifest: A manifest of SBML files containing their filepath as well as taxonomy.
                    Must contain the columns "file", "kingdom", "phylum", "class",
                    "order", "family", "genus", and "species". May contain additional
                    columns.
   :type manifest: pandas.DataFrame
   :param out_path: The directory or zip file where the joined models will be written.
   :type out_path: str
   :param threads: The number of parallel workers to use when building models. As a
                   rule of thumb you will need around 1 GB of RAM for each thread.
   :type threads: int >=1
   :param compress: Compression method to use. Must be "zlib", "bz2", "lzma" or None.
                    This parameter is ignored if out_path does not end with ".zip".
   :type compress: str (default: None)
   :param compresslevel: Level of compression. Only used if compress is not None.
                         This parameter is ignored if out_path does not end with ".zip".
   :type compresslevel: int [1-9] (default: 6)
   :param progress: Whether to show a progress bar.
   :type progress: bool

   :returns: The manifest of the joined models. Will still contain information
             from the original metadata.
   :rtype: pandas.DataFrame
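
The ``compress`` and ``compresslevel`` options resemble the compression settings of Python's standard ``zipfile`` module. The sketch below assumes that the documented names correspond to the ``zipfile`` constants shown; this mapping, the helper ``write_archive``, and the example file content are illustrative assumptions and are not taken from MICOM's source.

```python
import os
import tempfile
import zipfile

# Assumed mapping from the documented `compress` names to zipfile constants.
COMPRESSION = {
    None: zipfile.ZIP_STORED,
    "zlib": zipfile.ZIP_DEFLATED,
    "bz2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}


def write_archive(out_path, files, compress=None, compresslevel=6):
    """Hypothetical helper: write a {name: text} mapping into a zip archive."""
    if not out_path.endswith(".zip"):
        raise ValueError("this sketch only handles paths ending in .zip")
    if compress not in COMPRESSION:
        raise ValueError(f"unknown compression method: {compress!r}")
    # zipfile ignores compresslevel for ZIP_STORED and ZIP_LZMA.
    with zipfile.ZipFile(
        out_path,
        "w",
        compression=COMPRESSION[compress],
        compresslevel=compresslevel,
    ) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return out_path


with tempfile.TemporaryDirectory() as folder:
    path = write_archive(
        os.path.join(folder, "db.zip"),
        {"manifest.csv": "file,genus\nBacteroides.json,Bacteroides\n"},
        compress="zlib",
    )
    # Read the archive back to inspect what was stored and how.
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        method = archive.getinfo("manifest.csv").compress_type
```

With MICOM installed, the corresponding call would be along the lines of ``build_database(manifest, "db.zip", rank="genus", compress="zlib", compresslevel=6)``.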