{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Single Model API\n", "\n", "## Building communities\n", "\n", "`micom` will construct communities from a specification via a Pandas DataFrame. Here, the DataFrame needs at least two columns: \"id\" and \"file\" which specify the ID of the organism/tissue and a file containing the actual individual model. \n", "\n", "To make more sense of that we can look at a small example. `micom` comes with a function that generates a simple example community specification consisting of several copies of a small *E. coli* model containing only the central carbon metabolism." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idgenusspeciesreactionsmetabolitesfile
0Escherichia_coli_1EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...
1Escherichia_coli_2EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...
2Escherichia_coli_3EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...
3Escherichia_coli_4EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...
\n", "
" ], "text/plain": [ " id genus species reactions metabolites \\\n", "0 Escherichia_coli_1 Escherichia Escherichia coli 95 72 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 95 72 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 95 72 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 95 72 \n", "\n", " file \n", "0 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "1 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "2 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "3 /home/cdiener/code/micom/micom/data/e_coli_cor... " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.data import test_taxonomy\n", "\n", "taxonomy = test_taxonomy()\n", "taxonomy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we see this specification contains the required fields and some more information. In fact the specification may contain any number of additional information which will be saved along with the community model. One special example is \"abundance\" which we will get to know soon :) The logic behind requiring the community information in this table format is that this table can be appended as supplement to your project or publication as is and describes your community composition without any doubt.\n", "\n", "In order to convert the specification in a community model we will use the `Community` class from `micom` which derives from the cobrapy `Model` class." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f8eff43e337d440fa048416040171fed", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Build a community with a total of 400 reactions.\n" ] } ], "source": [ "from micom import Community\n", "\n", "com = Community(taxonomy)\n", "print(\"Build a community with a total of {} reactions.\".format(len(com.reactions)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This includes the correctly scaled exchange reactions with the internal medium and initializes the external imports to the maximum found in all models. The original taxonomy is maintained in the `com.taxonomy` attribute.\n", "\n", "Note that `micom` can figure out how to read a variety of different file types based on the extension. It curently supports:\n", "\n", "- `.pickle` for pickled models\n", "- `.xml` or `.gz` for XML models\n", "- `.json` for JSON models\n", "- `.mat` for COBRAtoolbox models\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idgenusspeciesreactionsmetabolitesfileabundance
id
Escherichia_coli_1Escherichia_coli_1EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...0.25
Escherichia_coli_2Escherichia_coli_2EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...0.25
Escherichia_coli_3Escherichia_coli_3EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...0.25
Escherichia_coli_4Escherichia_coli_4EscherichiaEscherichia coli9572/home/cdiener/code/micom/micom/data/e_coli_cor...0.25
\n", "
" ], "text/plain": [ " id genus species \\\n", "id \n", "Escherichia_coli_1 Escherichia_coli_1 Escherichia Escherichia coli \n", "Escherichia_coli_2 Escherichia_coli_2 Escherichia Escherichia coli \n", "Escherichia_coli_3 Escherichia_coli_3 Escherichia Escherichia coli \n", "Escherichia_coli_4 Escherichia_coli_4 Escherichia Escherichia coli \n", "\n", " reactions metabolites \\\n", "id \n", "Escherichia_coli_1 95 72 \n", "Escherichia_coli_2 95 72 \n", "Escherichia_coli_3 95 72 \n", "Escherichia_coli_4 95 72 \n", "\n", " file \\\n", "id \n", "Escherichia_coli_1 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "Escherichia_coli_2 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "Escherichia_coli_3 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "Escherichia_coli_4 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "\n", " abundance \n", "id \n", "Escherichia_coli_1 0.25 \n", "Escherichia_coli_2 0.25 \n", "Escherichia_coli_3 0.25 \n", "Escherichia_coli_4 0.25 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "com.taxonomy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can notice we have gained a new column called `abundance`. This column quantifies the relative quantity of each individual in the community. Since we did not specify this in the original taxonomy `micom` assumes that all individuals are present in the same quantity." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The presented community here is pretty simplistic. For microbial communities `micom` includes a larger taxonomy for 773 microbial species from the [AGORA paper](https://doi.org/10.1038/nbt.3703). Here the \"file\" column only contains the base names for the files but you can easily prepend any path as demonstrated in the following:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
organismseedidkingdomphylummclassorderfamilygenusoxygenstatmetabolism...draftcreatedphenotypeimgidncbiidplatformkbaseidspeciesfileidtaxa_id
0Abiotrophia defectiva ATCC 49176Abiotrophia defectiva ATCC 49176 (592010.4)BacteriaFirmicutesBacilliLactobacillalesAerococcaceaeAbiotrophiaFacultative anaerobeSaccharolytic, fermentative or respiratory...07/01/141.02.562617e+09592010.0ModelSEEDNaNdefectivamodels/Abiotrophia_defectiva_ATCC_49176.xmlAbiotrophia_defectiva_ATCC_4917646125.0
1Achromobacter xylosoxidans A8Achromobacter xylosoxidans A8 (762376.5)BacteriaProteobacteriaBetaproteobacteriaBurkholderialesAlcaligenaceaeAchromobacterAerobeRespiratory...NaNNaNNaN762376.0ModelSEEDkb_g_3268_model_gfxylosoxidansmodels/Achromobacter_xylosoxidans_A8.xmlAchromobacter_xylosoxidans_A885698.0
2Achromobacter xylosoxidans NBRC 15126 = ATCC 2...NaNBacteriaProteobacteriaBetaproteobacteriaBurkholderialesAlcaligenaceaeAchromobacterAerobeRespiratory...NaNNaNNaN1216976.0Kbasekb|g.208127xylosoxidansmodels/Achromobacter_xylosoxidans_NBRC_15126.xmlAchromobacter_xylosoxidans_NBRC_1512685698.0
3Acidaminococcus fermentans DSM 20731Acidaminococcus fermentans DSM 20731 (591001.3)BacteriaFirmicutesNegativicutesAcidaminococcalesAcidaminococcaceaeAcidaminococcusObligate anaerobeFermentative...04/17/16NaN6.463119e+08591001.0Kbasekb|g.2555fermentansmodels/Acidaminococcus_fermentans_DSM_20731.xmlAcidaminococcus_fermentans_DSM_20731905.0
4Acidaminococcus intestini RyC-MR95Acidaminococcus intestini RyC-MR95 (568816.4)BacteriaFirmicutesNegativicutesSelenomonadalesAcidaminococcaceaeAcidaminococcusObligate anaerobeAsaccharolytic, glutamate is fermented...08/03/144.02.511231e+09568816.0KbaseNaNintestinimodels/Acidaminococcus_intestini_RyC_MR95.xmlAcidaminococcus_intestini_RyC_MR95187327.0
\n", "

5 rows × 26 columns

\n", "
" ], "text/plain": [ " organism \\\n", "0 Abiotrophia defectiva ATCC 49176 \n", "1 Achromobacter xylosoxidans A8 \n", "2 Achromobacter xylosoxidans NBRC 15126 = ATCC 2... \n", "3 Acidaminococcus fermentans DSM 20731 \n", "4 Acidaminococcus intestini RyC-MR95 \n", "\n", " seedid kingdom phylum \\\n", "0 Abiotrophia defectiva ATCC 49176 (592010.4) Bacteria Firmicutes \n", "1 Achromobacter xylosoxidans A8 (762376.5) Bacteria Proteobacteria \n", "2 NaN Bacteria Proteobacteria \n", "3 Acidaminococcus fermentans DSM 20731 (591001.3) Bacteria Firmicutes \n", "4 Acidaminococcus intestini RyC-MR95 (568816.4) Bacteria Firmicutes \n", "\n", " mclass order family genus \\\n", "0 Bacilli Lactobacillales Aerococcaceae Abiotrophia \n", "1 Betaproteobacteria Burkholderiales Alcaligenaceae Achromobacter \n", "2 Betaproteobacteria Burkholderiales Alcaligenaceae Achromobacter \n", "3 Negativicutes Acidaminococcales Acidaminococcaceae Acidaminococcus \n", "4 Negativicutes Selenomonadales Acidaminococcaceae Acidaminococcus \n", "\n", " oxygenstat metabolism ... \\\n", "0 Facultative anaerobe Saccharolytic, fermentative or respiratory ... \n", "1 Aerobe Respiratory ... \n", "2 Aerobe Respiratory ... \n", "3 Obligate anaerobe Fermentative ... \n", "4 Obligate anaerobe Asaccharolytic, glutamate is fermented ... \n", "\n", " draftcreated phenotype imgid ncbiid platform \\\n", "0 07/01/14 1.0 2.562617e+09 592010.0 ModelSEED \n", "1 NaN NaN NaN 762376.0 ModelSEED \n", "2 NaN NaN NaN 1216976.0 Kbase \n", "3 04/17/16 NaN 6.463119e+08 591001.0 Kbase \n", "4 08/03/14 4.0 2.511231e+09 568816.0 Kbase \n", "\n", " kbaseid species \\\n", "0 NaN defectiva \n", "1 kb_g_3268_model_gf xylosoxidans \n", "2 kb|g.208127 xylosoxidans \n", "3 kb|g.2555 fermentans \n", "4 NaN intestini \n", "\n", " file \\\n", "0 models/Abiotrophia_defectiva_ATCC_49176.xml \n", "1 models/Achromobacter_xylosoxidans_A8.xml \n", "2 models/Achromobacter_xylosoxidans_NBRC_15126.xml \n", "3 models/Acidaminococcus_fermentans_DSM_20731.xml \n", "4 models/Acidaminococcus_intestini_RyC_MR95.xml \n", "\n", " id taxa_id \n", "0 Abiotrophia_defectiva_ATCC_49176 46125.0 \n", "1 Achromobacter_xylosoxidans_A8 85698.0 \n", "2 Achromobacter_xylosoxidans_NBRC_15126 85698.0 \n", "3 Acidaminococcus_fermentans_DSM_20731 905.0 \n", "4 Acidaminococcus_intestini_RyC_MR95 187327.0 \n", "\n", "[5 rows x 26 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.data import agora\n", "\n", "tax = agora.copy()\n", "tax.file = \"models/\" + tax.file # assuming you have downloaded the AGORA models to the \"models\" folder\n", "tax.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving and loading communities\n", "\n", "Contructing large community models can be slow which is due to performance limitations of the solvers. In essence, adding a single variable/constraint to a model qih 10 variables is much faster than adding it to a model with 10 million variables. Thus, we recommend you save the constructed community in a serialized format afterwards which will be much faster in loading repetitively." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2a381ea543bd4803bdeb7eadd021cd2f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "CPU times: user 756 ms, sys: 12.6 ms, total: 768 ms\n", "Wall time: 776 ms\n" ] } ], "source": [ "%time com = Community(taxonomy)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 77.8 ms, sys: 3.77 ms, total: 81.6 ms\n", "Wall time: 81.4 ms\n" ] } ], "source": [ "%time com.to_pickle(\"community.pickle\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 36.4 ms, sys: 7.71 ms, total: 44.1 ms\n", "Wall time: 43.6 ms\n" ] } ], "source": [ "from micom import load_pickle\n", "%time com = load_pickle(\"community.pickle\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see loading the model from the pickle format is much faster than creating it *de novo*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }