# Work with solutions

Here is an example* of using the Cadbiom API to process the solutions obtained from a causality search.

*: see notes at the end of this page.
 
## Solutions handling

In [None]:
%matplotlib inline

# Make print() behave as in Python 3 when running on Python 2 ("Dino-Python")
from __future__ import print_function
#import mpld3
#mpld3.enable_notebook()

Let's define a function that collects the entities (boundaries) found in the Minimal Activation Conditions (MAC) stored in `*mac.txt` files.
The command line package offers high-level functions to process such data.


In [None]:
from cadbiom_cmd.tools.solutions import get_all_macs

def load_macs(filepath):
    """Return the set of MACs (each one a frozenset of entities) read from a directory or from a file"""
    # Each MAC line is a whitespace-separated list of boundary names
    return set(frozenset(mac.split()) for mac in get_all_macs(filepath))
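
To make the returned structure concrete, here is a minimal sketch that mimics `load_macs` on in-memory lines instead of calling `get_all_macs`. The helper name `parse_mac_lines` and the place names are hypothetical; the only assumption taken from the code above is that each MAC is a whitespace-separated line of boundary names.

```python
def parse_mac_lines(lines):
    """Hypothetical stand-in for load_macs(): one frozenset of places per MAC line."""
    return set(frozenset(line.split()) for line in lines)


# Made-up MAC lines, for illustration only
demo_lines = ["A B C", "C B A", "A D"]
macs = parse_mac_lines(demo_lines)

# The first two lines hold the same places, so they collapse into a single MAC
print(len(macs))

# Flatten all the MACs into a single set of places
places = set().union(*macs)
print(sorted(places))
```

Using `frozenset` makes each MAC hashable, so duplicates that differ only by the order of their places are removed by the enclosing `set`.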

## Venn diagram

Thanks to the `matplotlib_venn` package, we can quickly visualize the relationships between sets of boundaries across multiple queries:

In [None]:
from matplotlib_venn import venn3
from matplotlib import pyplot as plt
import itertools as it

 
def venn(files):
    """Display a Venn diagram to show relationships between sets of boundaries
    in the given files.

    :param files: Dictionary of filepaths as keys and corresponding titles as values
        (titles will be printed for the corresponding areas in the diagram).
        Ex: ``{filepath1: "Query 1"}``
    :type files: dict
    """
    # Flatten the MACs of each file into a single set of places
    places = {title: set(it.chain(*load_macs(filepath))) for filepath, title in files.items()}
    # Tweak the size of the plot
    plt.figure(figsize=(8, 8))
    # venn3 expects exactly 3 sets; lists keep sets and labels indexable on Python 3
    venn3(list(places.values()), list(places.keys()))

 
venn({
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2 and PERP_mac.txt": "MMP2 and PERP",
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2_mac.txt": "MMP2",
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_PERP_mac.txt": "PERP",
})
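
The numbers printed in the seven areas of a three-set diagram can also be computed by hand, which helps when checking the plot. The sketch below uses made-up sets and a hypothetical helper, `venn3_region_sizes`; it does not call `matplotlib_venn`.

```python
def venn3_region_sizes(a, b, c):
    """Return the sizes of the 7 disjoint areas of a three-set Venn diagram."""
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A & B only": len((a & b) - c),
        "A & C only": len((a & c) - b),
        "B & C only": len((b & c) - a),
        "A & B & C": len(a & b & c),
    }


# Made-up sets of boundaries, for illustration only
set_a = {"p1", "p2", "p3", "p4"}
set_b = {"p3", "p4", "p5"}
set_c = {"p4", "p5", "p6", "p7"}

sizes = venn3_region_sizes(set_a, set_b, set_c)
# The areas are disjoint, so their sizes add up to the size of the union
assert sum(sizes.values()) == len(set_a | set_b | set_c)
```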
 

## Get content of areas

To list the places in each area, we define a new function that also accepts an arbitrary number of input files (Venn diagrams are limited in this respect).

In [None]:
from copy import deepcopy

def venn_data(files):
    """Return the content of the areas of the Venn diagram built from sets of
    boundaries in the given files.

    Example with 3 files (7 independent sets):

    - PERP & MMP2 & MMP2 and PERP
    - PERP & MMP2 - (PERP & MMP2 & MMP2 and PERP)
    - PERP & MMP2 and PERP - (PERP & MMP2 & MMP2 and PERP)
    - MMP2 & MMP2 and PERP - (PERP & MMP2 & MMP2 and PERP)
    - PERP
    - MMP2
    - MMP2 and PERP

    :param files: Dictionary of filepaths as keys and corresponding titles as values
        (titles will be printed for the corresponding areas in the diagram).
        Ex: ``{filepath1: "Query 1"}``
    :type files: dict
    :return: Dictionary of boundaries per subsets.
        Names of subsets as keys; subsets as values.
    :rtype: dict[str, set]
    """
    result = dict()
    boundaries = {title: set(it.chain(*load_macs(filepath))) for filepath, title in files.items()}

    # Boundaries shared by all the files
    common_boundaries = set.intersection(*boundaries.values())
    result[" & ".join(boundaries.keys())] = common_boundaries

    uniq_places = deepcopy(boundaries)

    for (file_1, places_1), (file_2, places_2) in it.combinations(boundaries.items(), 2):
        pair_common_boundaries = places_1 & places_2
        intersect = pair_common_boundaries - common_boundaries

        # Intersections
        # (with only 2 files this key is the global intersection stored above:
        # setdefault avoids overwriting it with an empty set)
        result.setdefault(file_1 + " & " + file_2, intersect)

        # Prune specific places of the 2 distinct files
        uniq_places[file_1] -= pair_common_boundaries
        uniq_places[file_2] -= pair_common_boundaries

    # Handle specific places (places that only belong to 1 file)
    for file, macs in uniq_places.items():
        result[file] = macs

    return result
 
 
subsets = venn_data({
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2 and PERP_mac.txt": "MMP2 and PERP",
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_MMP2_mac.txt": "MMP2",
 "_static/demo_files/EMT2_results/model_pid_mars_fix_p53_without_scc_PERP_mac.txt": "PERP",
})

# Display subsets content
# Use compact=True on Python3 for nicer output
import pprint
pprint.pprint(subsets)

# Count boundaries in each subset
{k: len(v) for k, v in subsets.items()}
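
With more than three files, the pairwise intersections computed above can overlap each other, so the returned areas are no longer guaranteed to be disjoint. One possible alternative, sketched here with made-up data and a hypothetical helper named `exclusive_areas`, groups each boundary by the exact combination of queries that contain it.

```python
from collections import defaultdict


def exclusive_areas(named_sets):
    """Group elements by the exact combination of sets that contain them.

    named_sets: dict of title -> set of elements.
    Returns a dict of " & "-joined titles -> set of the elements found in
    exactly those sets and in no other.
    """
    # For each element, collect the titles of the sets it belongs to
    membership = defaultdict(set)
    for title, elements in named_sets.items():
        for element in elements:
            membership[element].add(title)

    # Elements sharing the same combination of titles form one area
    areas = defaultdict(set)
    for element, titles in membership.items():
        areas[" & ".join(sorted(titles))].add(element)
    return dict(areas)


# Made-up data, for illustration only
demo = {
    "Q1": {"a", "b", "c"},
    "Q2": {"b", "c", "d"},
    "Q3": {"c", "e"},
    "Q4": {"c", "d"},
}
areas = exclusive_areas(demo)
for name in sorted(areas):
    print(name, sorted(areas[name]))
```

Because every element is assigned to exactly one combination of titles, the areas are disjoint by construction, and empty areas are simply absent from the result.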

## Notes 

**About the use of Jupyter shown on this page:**

Jupyter is a fancy tool, but it lets you execute Python code block by block
in a global context (i.e., with variables that persist and are mutated in that context,
execution after execution). This is a very bad working practice, yet it is
encouraged by this kind of tool and by the IDEs unfortunately offered
to beginners (Spyder, for example).

These methods are directly inherited from the practices of the community
using the R language and the RStudio "IDE". To avoid side effects such as
the persistence of variables, one MUST reset the console/notebook between runs
by restarting the kernel as often as possible.
While this may seem redundant or tedious, it is an extremely effective way
of reducing unwanted side effects and bugs in your code.