{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Work with solutions\n", "\n", "Here is an example* of working with the Cadbiom API to process the solutions obtained from a causality search.\n", "\n", "*: see the notes at the end of this page.\n", " \n", "## Solutions handling" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "# Use the Python 3 print function on legacy Python 2\n", "from __future__ import print_function\n", "#import mpld3\n", "#mpld3.enable_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define a function to get the entities (boundaries) in Minimal Activation Conditions (MAC) from `*mac.txt` files.\n", "The command line package offers high-level functions to process such data.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from cadbiom_cmd.tools.solutions import get_all_macs\n", "\n", "def load_macs(filepath):\n", " \"\"\"Return a set of frozensets of entities, one per MAC line, read from a directory or from a file\"\"\"\n", " # Each MAC line is a whitespace-separated list of entities\n", " return set(frozenset(mac.split()) for mac in get_all_macs(filepath))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Venn diagram\n", "\n", "Thanks to the package `matplotlib_venn` we can quickly visualize the relationships between sets of boundaries across multiple queries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib_venn import venn3\n", "from matplotlib import pyplot as plt\n", "import itertools as it\n", "\n", " \n", "def venn(files):\n", " \"\"\"Display a Venn diagram to show relationships between sets of boundaries \n", " in the given files.\n", " \n", " :param files: Dictionary of filepaths as keys and corresponding titles as values\n", " (titles will be printed for the corresponding areas in the diagram).\n", " Ex: ``{filepath1: \"Query 1\"}``\n", " :type files: dict[str, str]\n", " 
\"\"\" \n", " places = {title: set(it.chain(*load_macs(filepath))) for filepath, title in files.items()}\n", " # Tweak the size of the plot\n", " plt.figure(figsize=(8, 8))\n", " # venn3 expects exactly 3 sets; list() keeps the call working on Python 3 too\n", " venn3(list(places.values()), list(places.keys()))\n", "\n", " \n", "venn({\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2 and PERP_mac.txt\": \"MMP2 and PERP\",\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2_mac.txt\": \"MMP2\",\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_PERP_mac.txt\": \"PERP\",\n", "})\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get content of areas\n", "\n", "To list the places in each subset, we define a new function that can also take an arbitrary number of input files (Venn diagrams are limited in this respect)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "from copy import deepcopy\n", "\n", "def venn_data(files):\n", " \"\"\"Return the content of the areas of the Venn diagram built from sets of\n", " boundaries in the given files.\n", "\n", " Example with 3 files (7 independent sets):\n", " \n", " - PERP & MMP2 & MMP2 and PERP\n", " - PERP & MMP2 - (PERP & MMP2 & MMP2 and PERP)\n", " - PERP & MMP2 and PERP - (PERP & MMP2 & MMP2 and PERP)\n", " - MMP2 & MMP2 and PERP - (PERP & MMP2 & MMP2 and PERP)\n", " - PERP\n", " - MMP2\n", " - MMP2 and PERP\n", "\n", " :param files: Dictionary of filepaths as keys and corresponding titles as values\n", " (titles will be printed for the corresponding areas in the diagram).\n", " Ex: ``{filepath1: \"Query 1\"}``\n", " :type files: dict[str, str]\n", " :return: Dictionary of boundaries per subset.\n", " Names of subsets as keys; subsets as values.\n", " :rtype: dict[str, set[str]]\n", " \"\"\"\n", " result = dict()\n", " boundaries = {title: set(it.chain(*load_macs(filepath))) for filepath, title in files.items()}\n", " \n", " common_boundaries = 
set.intersection(*boundaries.values())\n", " result[\" & \".join(boundaries.keys())] = common_boundaries\n", "\n", " uniq_places = deepcopy(boundaries)\n", "\n", " for (file_1, places_1), (file_2, places_2) in it.combinations(boundaries.items(), 2):\n", "     pair_common_boundaries = places_1 & places_2\n", "     intersect = pair_common_boundaries - common_boundaries\n", "\n", "     # Pairwise intersection, excluding the boundaries common to all files\n", "     result[file_1 + \" & \" + file_2] = intersect\n", "\n", "     # Prune the shared places from the 2 files of the pair\n", "     uniq_places[file_1] -= pair_common_boundaries\n", "     uniq_places[file_2] -= pair_common_boundaries\n", "\n", " # Handle specific places (places that belong to only 1 file)\n", " for file, macs in uniq_places.items():\n", "     result[file] = macs\n", "\n", " return result\n", " \n", " \n", "subsets = venn_data({\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2 and PERP_mac.txt\": \"MMP2 and PERP\",\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2_mac.txt\": \"MMP2\",\n", " \"_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_PERP_mac.txt\": \"PERP\",\n", "})\n", "\n", "# Display subsets content\n", "# Use compact=True on Python 3 for nicer output\n", "import pprint\n", "pprint.pprint(subsets)\n", "\n", "# Count boundaries in each subset\n", "{k: len(v) for k, v in subsets.items()}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notes \n", "\n", "**About the use of Jupyter shown on this page:**\n", "\n", "Jupyter is a fancy tool, but it lets you execute Python code block by block,\n", "in a global context (i.e., with variables that persist and are mutated in that context,\n", "execution after execution). 
This is a very bad working practice that is,\n", "however, encouraged by this kind of tool and by IDEs that are unfortunately offered\n", "to beginners (Spyder, for example).\n", "\n", "These methods are directly inherited from the practices of the community\n", "using the R language and the RStudio \"IDE\". To avoid side effects such as\n", "the persistence of variables, one MUST reset the console/notebook between runs\n", "by restarting the kernel as often as possible.\n", "While this may seem redundant or heavy-handed, it is an extremely effective way\n", "of reducing unwanted side effects and bugs in your code." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.9" } }, "nbformat": 4, "nbformat_minor": 2 }