{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing your own workflows\n", "\n", "Micom was designed to create and analyze personalized metabolic models for microbial communities. This makes it necessary to run many of the analyses in micom for many samples. Because all of the methods currently implemented in micom can be run independently for each sample, this workload can be parallelized easily. To make this simple, micom provides a workflow module that lets you run analyses for many samples in parallel. It also integrates with the micom logger and includes workarounds for some memory leaks in optlang, which improves memory usage. As a rule of thumb, you will need one CPU and about 1 GB of RAM for each sample, so if you have a server with 16 cores and 16+ GB of RAM available you can run up to 16 samples in parallel.\n", "\n", "For a workflow you will need two things:\n", "\n", "1. A function that takes the arguments for a single sample and performs your analysis\n", "2. A list that contains the arguments for each sample\n", "\n", "Let us understand this better with a short example. Assume that we want to run the cooperative tradeoff method for our *E. coli* example with varying numbers of *E. coli* strains." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>id</th><th>genus</th><th>species</th><th>reactions</th><th>metabolites</th><th>file</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Escherichia_coli_1</td><td>Escherichia</td><td>Escherichia coli</td><td>95</td><td>72</td><td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td></tr>\n",
"    <tr><th>1</th><td>Escherichia_coli_2</td><td>Escherichia</td><td>Escherichia coli</td><td>95</td><td>72</td><td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td></tr>\n",
"    <tr><th>2</th><td>Escherichia_coli_3</td><td>Escherichia</td><td>Escherichia coli</td><td>95</td><td>72</td><td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td></tr>\n",
"    <tr><th>3</th><td>Escherichia_coli_4</td><td>Escherichia</td><td>Escherichia coli</td><td>95</td><td>72</td><td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " id genus species reactions metabolites \\\n", "0 Escherichia_coli_1 Escherichia Escherichia coli 95 72 \n", "1 Escherichia_coli_2 Escherichia Escherichia coli 95 72 \n", "2 Escherichia_coli_3 Escherichia Escherichia coli 95 72 \n", "3 Escherichia_coli_4 Escherichia Escherichia coli 95 72 \n", "\n", " file \n", "0 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "1 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "2 /home/cdiener/code/micom/micom/data/e_coli_cor... \n", "3 /home/cdiener/code/micom/micom/data/e_coli_cor... " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from micom.data import test_taxonomy\n", "\n", "taxonomies = [test_taxonomy(n=n) for n in range(2, 12)]\n", "taxonomies[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These will be our arguments. Each entry in `taxonomies` defines a single sample, so we have 10 samples in total. Now we need a function that takes a single sample's arguments as input (here, its taxonomy table) and runs the cooperative tradeoff method. So let us implement that:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from micom import Community\n", "\n", "def run_tradeoff(tax):\n", "    com = Community(tax, progress=False)\n", "    sol = com.cooperative_tradeoff()\n", "    return sol.members" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is all we need to run the analysis in parallel." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8ecb59a4c4ad44478e97a948503ecb36", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "from micom.workflows import workflow\n", "\n", "results = workflow(run_tradeoff, taxonomies, threads=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`results` is a list that contains one result per sample, in the same order as the input arguments." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>abundance</th><th>growth_rate</th><th>reactions</th><th>metabolites</th></tr>\n",
"    <tr><th>compartments</th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>Escherichia_coli_1</th><td>0.25</td><td>0.873922</td><td>95</td><td>72</td></tr>\n",
"    <tr><th>Escherichia_coli_2</th><td>0.25</td><td>0.873922</td><td>95</td><td>72</td></tr>\n",
"    <tr><th>Escherichia_coli_3</th><td>0.25</td><td>0.873922</td><td>95</td><td>72</td></tr>\n",
"    <tr><th>Escherichia_coli_4</th><td>0.25</td><td>0.873922</td><td>95</td><td>72</td></tr>\n",
"    <tr><th>medium</th><td>NaN</td><td>NaN</td><td>20</td><td>20</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " abundance growth_rate reactions metabolites\n", "compartments \n", "Escherichia_coli_1 0.25 0.873922 95 72\n", "Escherichia_coli_2 0.25 0.873922 95 72\n", "Escherichia_coli_3 0.25 0.873922 95 72\n", "Escherichia_coli_4 0.25 0.873922 95 72\n", "medium NaN NaN 20 20" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results[2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }