{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MICOM workflows\n", "\n", "The MICOM workflow API provides prebuilt solutions for the most common MICOM analyses across several samples. It manages most of the workload for you and uses an efficient parallelization scheme to make use of several CPU cores. The workflow API is probably the best entry point for you if the following is true:\n", "\n", "1. You have at least 2 samples with taxonomy assignments and abundances for each\n", "2. You have chosen to use one of the prebuilt model databases or have already built your own\n", "3. You want to run a set of standard analyses and visualizations on the models\n", "\n", "In that case the prebuilt workflows will make your analysis much simpler and faster and will take care of parallelizing your analyses. The workflow API can be mixed with the MICOM API at any point, so you could do a few steps with the workflows and then run your own analyses downstream from that. Additionally, you can directly import your starting data from Qiime 2. See [Loading Qiime 2 data](qiime2.html).\n", "\n", "## Building models and simulating growth\n", "\n", "### Input formats\n", "\n", "To start building community models for all your samples you will need to provide your data to MICOM. MICOM prefers to have the taxonomy and abundances for all samples in a single [tidy DataFrame](https://vita.had.co.nz/papers/tidy-data.pdf). Here each taxon in each sample is a row which provides its taxonomy and abundance. This may sound a bit confusing but should become pretty clear when looking at an example. MICOM can generate a simple example DataFrame which we can use as guidance." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id genus species reactions \\\n", "0 Escherichia_coli_1 Escherichia Escherichia coli 0 95 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 1 95 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 2 95 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 3 95 \n", "0 Escherichia_coli_1 Escherichia Escherichia coli 0 95 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 1 95 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 2 95 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 3 95 \n", "0 Escherichia_coli_1 Escherichia Escherichia coli 0 95 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 1 95 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 2 95 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 3 95 \n", "0 Escherichia_coli_1 Escherichia Escherichia coli 0 95 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 1 95 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 2 95 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 3 95 \n", "\n", " metabolites file sample_id \\\n", "0 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_1 \n", "1 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_1 \n", "2 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_1 \n", "3 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_1 \n", "0 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_2 \n", "1 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_2 \n", "2 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_2 \n", "3 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_2 \n", "0 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_3 \n", "1 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_3 \n", "2 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_3 \n", "3 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_3 \n", "0 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 
sample_4 \n", "1 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_4 \n", "2 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_4 \n", "3 72 /home/cdiener/code/micom/micom/data/e_coli_cor... sample_4 \n", "\n", " abundance \n", "0 882 \n", "1 718 \n", "2 817 \n", "3 850 \n", "0 423 \n", "1 765 \n", "2 694 \n", "3 129 \n", "0 823 \n", "1 110 \n", "2 260 \n", "3 807 \n", "0 436 \n", "1 62 \n", "2 631 \n", "3 479 " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.data import test_data\n", "\n", "data = test_data()\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a very simple example where each sample contains 4 different *E. coli* species in random abundances. Thus, every sample has 4 rows in this DataFrame. The DataFrame also contains additional columns, but **the only required columns are \"id\", \"sample_id\", \"abundance\" and one column that provides the summary rank, here \"species\".**\n", "\n", "Note that we also have an additional column \"genus\" here. The minimal taxonomic information you have to provide is the column for the taxonomic rank matching the database you are using. So if you are using a genus-level database you will need a column \"genus\". In this case we will use a species-level database so we had to provide a column \"species\". If there are any additional columns from the set `{\"kingdom\", \"phylum\", \"class\", \"order\", \"family\", \"genus\", \"species\"}` those will be used to make the mapping with the database more stringent. For instance, here we provided a column \"genus\" which means models will only be counted as a \"match\" if the taxon has the same genus *and* species in the data and the model database. \n", "\n", "Thus, the more taxonomic rank columns you include in the data you pass to MICOM, the more stringently MICOM will match against the reference database. This can be used to circumvent poorly matching ranks as well. 
For instance, if you know your data matches well by genus and phylum names but families are named differently even for the same taxa you can omit the \"family\" column from your data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building community models\n", "\n", "To build a community model for each of your samples you will need the abundance table as provided above and a model database. We usually recommend using one of the prebuilt MICOM databases from https://doi.org/10.5281/zenodo.3755182. Additionally, you can also [create your own database]().\n", "\n", "For our example we have a custom species-level database that is bundled with MICOM. With the abundance table and database you can now start building your models by providing a folder where the assembled community models should be stored." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5636900736ac45b9b8a9e9ca7f9272e3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "from micom.data import test_db\n", "from micom.workflows import build\n", "\n", "manifest = build(data, out_folder=\"models\", model_db=test_db, cutoff=0.0001, threads=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `build` workflow also lets you specify a relative abundance cutoff for a taxon to be included via the `cutoff` argument. The default is to include only taxa that constitute at least 0.01% of the sample. Model building is automatically parallelized over multiple CPUs, and the number of cores/threads to use can be set with the `threads` argument. 
The workflows will also warn you if for any samples less than 50% of the abundance was matched to the database. Since our data was random this may have happened here.\n", "\n", "The `build` workflow will return a model manifest:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " reactions metabolites file sample_id found_taxa total_taxa \\\n", "0 95 72 sample_1.pickle sample_1 3.0 4.0 \n", "1 95 72 sample_2.pickle sample_2 3.0 4.0 \n", "2 95 72 sample_3.pickle sample_3 3.0 4.0 \n", "3 95 72 sample_4.pickle sample_4 3.0 4.0 \n", "\n", " found_fraction found_abundance_fraction \n", "0 0.75 0.730028 \n", "1 0.75 0.789657 \n", "2 0.75 0.588500 \n", "3 0.75 0.728856 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "manifest" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This will propagate information from your input table as well as give you metrics on how well the samples were matched to the database. Our database only include models for the first 3 *E. coli* species so you see the workflow could only match 3/4 taxa for each sample. Probably the most important column is the `found_abundance_fraction`. This one tells which fraction of the sample abundance was matched to the database. You usually want this column to be above 0.5 so the majority of the sample is matched. A value of 1.0 would be perfect but is usually hard to achieve. Values around 0.8 are usually pretty good.\n", "\n", "The `file` column denotes the filename for the built community within the folder specified as `out_folder` before. You can use the `load_pickle` function to read individual models and run custom analyses. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "305\n" ] } ], "source": [ "from micom import load_pickle\n", "\n", "com = load_pickle(\"models/sample_1.pickle\")\n", "print(len(com.reactions))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulating growth\n", "\n", "With our built models we can now advance to simulating growth with MICOMs `cooperative tradeoff` algorithm. This will use the manifest we just generated but will also require a growth medium to be specified. 
A growth medium specifies which metabolites are available to the microbes for consumption and also provides an upper bound on the flux at which each one is added to the system. This can be obtained from fluxomics data or approximated from growth media (for cultivation settings) or diet data (for gut microbiota models). Obtaining a correct media composition can be challenging. We will show some helper functions for that later on but for now we will use a prespecified medium saved in Qiime 2 format. We also provide a growth medium describing an average Western diet for the AGORA model database at https://doi.org/10.5281/zenodo.3755182. \n", "\n", "Growth media in MICOM are pretty simple DataFrames and can be read from a variety of formats. Here we will use the Qiime 2 Artifact format." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " reaction flux metabolite\n", "reaction \n", "EX_glc__D_m EX_glc__D_m 10.000000 glc__D_m\n", "EX_nh4_m EX_nh4_m 4.362240 nh4_m\n", "EX_o2_m EX_o2_m 18.579253 o2_m\n", "EX_pi_m EX_pi_m 2.942960 pi_m" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.data import test_medium\n", "from micom.qiime_formats import load_qiime_medium\n", "\n", "medium = load_qiime_medium(test_medium)\n", "medium" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see a medium is simply a DataFrame with columns `reaction` and `flux`. Where reaction is the name of external exchange reaction in the model and flux is the upper bound (usually in mmol/gDW/h). \n", "\n", "The last thing we need to choose is the `tradeoff` parameter for the growth simulation. This is explained in detail in the [Methods used by MICOM] section and expresses what fraction of maximum community growth is to be maintained while trying to maximize individual growth rates. The `tradeoff` takes values between 0 and 1 where zero denotes no community growth and and 1 denotes maximum community growth. We will use a vlue of 0.5 here. 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "941f83ab524e4bef94ea445b930f356a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "from micom.workflows import grow\n", "\n", "res = grow(manifest, model_folder=\"models\", medium=medium, tradeoff=0.5, threads=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This gives us a results tuple with three entries: `growth_rates`, `exchanges`, and `annotations` providing the growth rates, exchange fluxes and metabolite annotations, respectively. This could be passed on to the visualization workflows or you could run your own analyses on those DataFrames. But for now we will go back and look at some helper workflows to choose a tradeoff parameter and get media." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Choosing a tradeoff parameter\n", "\n", "Results depend strongly on the tradeoff parameter. Even though values between 0.3-0.6 usually work well we recommend to run a tradeoff analysis to choose the best parameters for your data set and protocol. If you have already analyzed many samples in your lab and found a particular value to work well in general you may just use that but you should at least run this analysis once. In [our paper](https://doi.org/10.1128/mSystems.00606-19) we found that tradeoff best reproducing *in vivo* growth rates is the largest tradeoff that allows the majority of the bacteria to grow. Thus, the best tradeoff value is the value providing the best compromise between individual and cooperative growth. \n", "\n", "The `tradeoff` workflow will run growth simulations with several tradeoff values and return the results. 
" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "2425d018188e4ef8b0772b4a7badd7c9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " abundance growth_rate reactions metabolites \\\n", "compartments \n", "Escherichia_coli_2 0.301048 0.000000 95 72 \n", "Escherichia_coli_3 0.342558 1.433863 95 72 \n", "Escherichia_coli_4 0.356394 0.866510 95 72 \n", "Escherichia_coli_2 0.301048 0.718937 95 72 \n", "Escherichia_coli_3 0.342558 0.818066 95 72 \n", "\n", " taxon tradeoff sample_id \n", "compartments \n", "Escherichia_coli_2 Escherichia_coli_2 NaN sample_1 \n", "Escherichia_coli_3 Escherichia_coli_3 NaN sample_1 \n", "Escherichia_coli_4 Escherichia_coli_4 NaN sample_1 \n", "Escherichia_coli_2 Escherichia_coli_2 1.0 sample_1 \n", "Escherichia_coli_3 Escherichia_coli_3 1.0 sample_1 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.workflows import tradeoff\n", "\n", "tradeoff_rates = tradeoff(manifest, model_folder=\"models\", medium=medium, threads=2)\n", "tradeoff_rates.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we see it returns growth rates for each taxon in each sample for various tradeoff values. There is also a tradeoff value of `NaN` which means optimization of the pure community growth rate without regularization which usually has very bad performance and is provided as a reference. To choose a good value we can count how many of the taxa can grow (growth rate > 1e-6) on average for each of the tradeoff values across all samples." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " tradeoff 0\n", "0 0.1 12\n", "1 0.2 12\n", "2 0.3 12\n", "3 0.4 12\n", "4 0.5 12\n", "5 0.6 12\n", "6 0.7 12\n", "7 0.8 12\n", "8 0.9 12\n", "9 1.0 12" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tradeoff_rates.groupby(\"tradeoff\").apply(\n", " lambda df: (df.growth_rate > 1e-6).sum()).reset_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In that case all taxa (3 taxa for 4 samples each) can grow for all tradeoff values since we provided an excess of nutrients in the medium. So a tradeoff of 1.0 would have been the best here. For real data you will usually see those numbers decline for larger tradeoff values. A more detailed analysis can be performed with the [`plot_tradeoff` visualization](viz.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fixing growth media\n", "\n", "Providing a growth medium may be complicated since you often only have some intuitions about a few components of the medium but lack information on others. Even when supplying putatively complete descriptions you will often observe that the models will predict the absence of growth since you are lacking an essential cofactor. To help with this MICOM provides a workflow that can complete any predefined growth medium with the minimal additional substrates to allow growth for all taxa in the database. \n", "\n", "For instance let us assume we know that our *E. coli* samples consume Glucose and Oxygen. The respective exchange reactions are `EX_glc__D_m` and `EX_o2_m`. So we can start by building our candidate medium assuming we can import twice as much oxygen as glucose." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " reaction flux\n", "0 EX_glc__D_m 10\n", "1 EX_o2_m 20" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "candidate_medium = pd.DataFrame({\"reaction\": [\"EX_glc__D_m\", \"EX_o2_m\"], \"flux\": [10, 20]})\n", "candidate_medium" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now ask MICOM to complete this medium by adding the smallest amount of overall flux so that all taxa in the database can grow with a growth rate off at least 0.1 1/h. " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "65a5e1aea1dc41ab9381a49c31dca619", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=4.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " reaction metabolite description flux\n", "0 EX_glc__D_m glc__D_m D-Glucose 10.00000\n", "1 EX_gln__L_m gln__L_m L-Glutamine 0.27264\n", "2 EX_o2_m o2_m O2 20.00000\n", "3 EX_pi_m pi_m Phosphate 0.36787" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.workflows import fix_medium\n", "\n", "medium = fix_medium(manifest, model_folder=\"models\", medium=candidate_medium, \n", " community_growth=0.1, min_growth=0.01, \n", " max_import=10, threads=2)\n", "medium" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we see that we can achieve growth by adding import for phosphate and glutamine." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1rc1" } }, "nbformat": 4, "nbformat_minor": 4 }