{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing your own workflows\n",
    "\n",
    "Micom was designed to create and analyze personalized metabolic models for microbial communities, so most analyses have to be run for many samples. Because all methods currently implemented in micom can be run independently for each sample, this workload is easy to parallelize. To make this simple, micom provides a workflow module that lets you run analyses for many samples in parallel. It also integrates with the micom logger and includes workarounds for some memory leaks in optlang, which reduces memory usage. As a rule of thumb, each sample needs one CPU and about 1 GB of RAM, so a server with 16 cores and at least 16 GB of RAM can run up to 16 samples in parallel.\n",
    "\n",
    "For a workflow you will need two things:\n",
    "\n",
    "1. A function that takes the arguments for a single sample and performs your analysis\n",
    "2. A list that holds the arguments for each sample, one entry per sample\n",
    "\n",
    "A short example makes this concrete. Assume that we want to run the cooperative tradeoff method for our *E. coli* example with varying numbers of *E. coli* strains."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>genus</th>\n",
       "      <th>species</th>\n",
       "      <th>reactions</th>\n",
       "      <th>metabolites</th>\n",
       "      <th>file</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Escherichia_coli_1</td>\n",
       "      <td>Escherichia</td>\n",
       "      <td>Escherichia coli</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "      <td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Escherichia_coli_2</td>\n",
       "      <td>Escherichia</td>\n",
       "      <td>Escherichia coli</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "      <td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Escherichia_coli_3</td>\n",
       "      <td>Escherichia</td>\n",
       "      <td>Escherichia coli</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "      <td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Escherichia_coli_4</td>\n",
       "      <td>Escherichia</td>\n",
       "      <td>Escherichia coli</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "      <td>/home/cdiener/code/micom/micom/data/e_coli_cor...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   id        genus           species  reactions  metabolites  \\\n",
       "0  Escherichia_coli_1  Escherichia  Escherichia coli         95           72   \n",
       "1  Escherichia_coli_2  Escherichia  Escherichia coli         95           72   \n",
       "2  Escherichia_coli_3  Escherichia  Escherichia coli         95           72   \n",
       "3  Escherichia_coli_4  Escherichia  Escherichia coli         95           72   \n",
       "\n",
       "                                                file  \n",
       "0  /home/cdiener/code/micom/micom/data/e_coli_cor...  \n",
       "1  /home/cdiener/code/micom/micom/data/e_coli_cor...  \n",
       "2  /home/cdiener/code/micom/micom/data/e_coli_cor...  \n",
       "3  /home/cdiener/code/micom/micom/data/e_coli_cor...  "
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from micom.data import test_taxonomy\n",
    "\n",
    "# One taxonomy table per sample, with 2 to 11 E. coli strains each.\n",
    "taxonomies = [test_taxonomy(n=n) for n in range(2, 12)]\n",
    "taxonomies[2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These will be our arguments. Each entry in `taxonomies` defines a single sample, so we have 10 samples in total. Now we need a function that takes a single sample's arguments as input (here a taxonomy table that defines the community members) and runs the cooperative tradeoff method. So let us implement that:"
   ]
  },
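  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from micom import Community\n",
    "\n",
    "def run_tradeoff(tax):\n",
    "    # Build the community model for one sample, without a progress bar.\n",
    "    com = Community(tax, progress=False)\n",
    "    sol = com.cooperative_tradeoff()\n",
    "    # Return the per-member summary table of the solution.\n",
    "    return sol.members"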
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is all we need to run the analysis in parallel."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "8ecb59a4c4ad44478e97a948503ecb36",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "from micom.workflows import workflow\n",
    "\n",
    "results = workflow(run_tradeoff, taxonomies, threads=2)"
   ]
  },
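  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `threads=2` used above is a conservative choice. As a minimal sketch, the rule of thumb from the introduction (one CPU and about 1 GB of RAM per sample) could be turned into a thread count as follows. The helper `suggested_threads` and its parameters are hypothetical and not part of micom, and the available RAM is passed in by hand because the standard library has no portable way to query it.\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "\n",
    "def suggested_threads(n_samples, ram_gb, cpus=None, gb_per_sample=1.0):\n",
    "    # Hypothetical helper: one CPU and about `gb_per_sample` GB of RAM per sample.\n",
    "    if cpus is None:\n",
    "        cpus = os.cpu_count() or 1  # cpu_count() can return None\n",
    "    by_memory = int(ram_gb // gb_per_sample)\n",
    "    # Use no more workers than samples, CPUs, or memory allow, but at least one.\n",
    "    return max(1, min(n_samples, cpus, by_memory))\n",
    "\n",
    "\n",
    "# 10 samples on a machine with 16 cores and 16 GB of RAM.\n",
    "print(suggested_threads(n_samples=10, ram_gb=16, cpus=16))\n",
    "```\n",
    "\n",
    "The returned value could then be passed as `threads` to `workflow`."
   ]
  },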
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`results` is a list that contains one entry for each result (in the correct order)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>abundance</th>\n",
       "      <th>growth_rate</th>\n",
       "      <th>reactions</th>\n",
       "      <th>metabolites</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>compartments</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Escherichia_coli_1</th>\n",
       "      <td>0.25</td>\n",
       "      <td>0.873922</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Escherichia_coli_2</th>\n",
       "      <td>0.25</td>\n",
       "      <td>0.873922</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Escherichia_coli_3</th>\n",
       "      <td>0.25</td>\n",
       "      <td>0.873922</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Escherichia_coli_4</th>\n",
       "      <td>0.25</td>\n",
       "      <td>0.873922</td>\n",
       "      <td>95</td>\n",
       "      <td>72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>medium</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>20</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                    abundance  growth_rate  reactions  metabolites\n",
       "compartments                                                      \n",
       "Escherichia_coli_1       0.25     0.873922         95           72\n",
       "Escherichia_coli_2       0.25     0.873922         95           72\n",
       "Escherichia_coli_3       0.25     0.873922         95           72\n",
       "Escherichia_coli_4       0.25     0.873922         95           72\n",
       "medium                    NaN          NaN         20           20"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "results[2]"
   ]
  },
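  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the example above the worker function receives one list entry per sample, so any additional per-sample settings have to be bundled into that entry, for example as a tuple. The sketch below illustrates the pattern with a stand-in analysis and the standard library's `ThreadPoolExecutor` in place of micom, so it runs without any models. The names `analyze`, `samples`, and `scale` are hypothetical. Like `results` above, the output of `map` follows the order of the inputs.\n",
    "\n",
    "```python\n",
    "from concurrent.futures import ThreadPoolExecutor\n",
    "\n",
    "\n",
    "def analyze(args):\n",
    "    # Stand-in worker: unpack the bundled arguments for one sample.\n",
    "    sample_id, values, scale = args\n",
    "    return sample_id, scale * sum(values)\n",
    "\n",
    "\n",
    "# One bundled entry per sample: (sample ID, data, extra setting).\n",
    "samples = [\n",
    "    ('s1', [1.0, 2.0], 0.5),\n",
    "    ('s2', [3.0], 1.0),\n",
    "    ('s3', [4.0, 4.0], 0.25),\n",
    "]\n",
    "\n",
    "# `map` yields results in input order, whichever worker finishes first.\n",
    "with ThreadPoolExecutor(max_workers=2) as pool:\n",
    "    bundled_results = list(pool.map(analyze, samples))\n",
    "\n",
    "print(bundled_results)\n",
    "```\n",
    "\n",
    "With micom, the same list of tuples could be passed as the second argument to `workflow`, with the worker unpacking each tuple in the same way."
   ]
  },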
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}