Command line package¶

Tools¶

Models¶

This module groups functions directly related to the management and the extraction of data of a Cadbiom model.

Here we find high-level functions to manage the logical formulas of the events and conditions defining the transitions, as well as useful functions to manage the entities, for example to obtain their metadata or the frontier places of the model.

cadbiom_cmd.tools.models.decompile_condition(tree, inhibitors_nodes)[source]¶

Recursive function to decompile conditions

Parameters:

tree (<expression>) –

Example of tree argument:
	tree = ('H', 'v', ( ('F', 'v', 'G'), '^', ( ('A', 'v', 'B'), '^', ('C', 'v', ('D', '^', 'E')) ) ))

inhibitors_nodes (<set>) – Set of inhibitors

Returns:

List of valid paths composed of entities (except inhibitors). Inhibitors are added to inhibitors_nodes.
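
The recursion described above can be sketched as follows. This is a minimal illustration, not the library's code: it assumes that 'v' stands for OR and '^' for AND, and it ignores inhibitors because the tree encoding of negations is not shown here. The name decompile_paths is hypothetical.

```python
from itertools import product


def decompile_paths(tree):
    """Return the entity paths that satisfy an and/or expression tree.

    A leaf is an entity name (str); an inner node is a tuple
    (left, operator, right). Assumption: 'v' is OR and '^' is AND.
    """
    if isinstance(tree, str):
        return [(tree,)]
    left, operator, right = tree
    left_paths = decompile_paths(left)
    right_paths = decompile_paths(right)
    if operator == 'v':
        # OR: each branch is a valid path on its own
        return left_paths + right_paths
    if operator == '^':
        # AND: every left path is combined with every right path
        return [l + r for l, r in product(left_paths, right_paths)]
    raise ValueError("Unknown operator: %r" % (operator,))


tree = ('H', 'v', (('F', 'v', 'G'), '^', (('A', 'v', 'B'), '^', ('C', 'v', ('D', '^', 'E')))))
paths = decompile_paths(tree)
for path in paths:
    print(path)
```

With this reading, the example tree yields one path for H alone plus eight paths combining one entity of (F, G), one of (A, B), and either C or the pair (D, E).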

cadbiom_cmd.tools.models.get_frontier_places(transitions, all_places)[source]¶

Return frontier places of a model (deduced from its transitions and from all places of the model).

Note

Why do we use all_places from the model instead of (input_places - output_places) to get frontier places? Because some nodes appear only in conditions and not in transitions. Without this, these nodes would be missing when we compute valid paths from conditions.

Parameters:	transitions (<dict> keys: names of events; values: lists of transitions as tuples (with in/output, and label)) – Model’s transitions. Example: {u'h00': [('Ax', 'n1', {u'label': u'h00[]'}),]}
Returns:	Set of frontier places.
Return type:	<set>
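
A minimal sketch of the idea in the note above, assuming that a frontier place is a place that is never the output of a transition. The function name frontier_places_sketch and the place Bx (present only in a condition) are hypothetical.

```python
def frontier_places_sketch(transitions, all_places):
    """Return the places that are never produced by any transition."""
    output_places = {
        target
        for event_transitions in transitions.values()
        for _source, target, _attributes in event_transitions
    }
    # Starting from all_places keeps the nodes that only appear in conditions
    return set(all_places) - output_places


transitions = {'h00': [('Ax', 'n1', {'label': 'h00[]'})]}
all_places = {'Ax', 'n1', 'Bx'}  # Bx: hypothetical place used only in a condition
frontier = frontier_places_sketch(transitions, all_places)
print(sorted(frontier))
```

Computing (input_places - output_places) on the same data would return only Ax, which is the situation the note warns about.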

cadbiom_cmd.tools.models.get_model_identifier_mapping(model_file, external_identifiers)[source]¶

Get Cadbiom names corresponding to the given external identifiers (xrefs)

Note

This function works only on v2 formatted models with additional JSON data.

Parameters:	model_file (<str>) – Model file.
	external_identifiers (<set>) – Set of external identifiers to be mapped.
Returns:	Mapping dictionary with external identifiers as keys and cadbiom names as values.
Return type:	<dict <str>:<list>>

cadbiom_cmd.tools.models.get_places_data(places, model)[source]¶

Get a list of JSON data parsed from each of the given places in the model.

This function is used by cadbiom_cmd.models.low_model_info().

Note

v1 models return a dict with only 1 key: ‘cadbiomName’

Note

Start nodes (with a name like __start__x) are handled even with no JSON data. They are counted in the other_types and other_locations fields.

Example of JSON data that can be found in the model:
	{ "uri": entity.uri, "entityType": entity.entityType, "names": list(entity.synonyms | set([entity.name])), "entityRef": entity.entityRef, "location": entity.location.name if entity.location else None, "modificationFeatures": dict(entity.modificationFeatures), "members": list(entity.members), "reactions": [reaction.uri for reaction in entity.reactions], "xrefs": entity.xrefs, }
Parameters:	places (<set>) – Iterable of names of places.
	model (<MakeModelFromXmlFile>) – Model from handler.
Returns:	List of data parsed from each given place. Note: here is the list of fields retrieved for v2 models: cadbiomName, uri, entityType, entityRef, location, names, xrefs.
Return type:	<list <dict>>

cadbiom_cmd.tools.models.get_places_from_condition(condition)[source]¶

Parse condition string and return all places, regardless of operators.

Note

This function is only used to get all nodes in a condition when we know they are all inhibitors nodes.

Todo

See the workaround in the code, without using very time consuming and badly coded functions.

Param:	Condition string.
Type:	<str>
Returns:	Set of places.
Return type:	<set>

cadbiom_cmd.tools.models.get_transitions(parser)[source]¶

Get all transitions in the given parser.

There are two methods to access the transitions of a model.

Example:

>>> print(dir(parser))
['handler', 'model', 'parser']
>>> # Direct access
>>> events = list()
>>> for transition in parser.model.transition_list:
...     events.append(transition.event)
>>>
>>> # Indirect access via a handler
>>> events = list()
>>> for transitions in parser.handler.top_pile.transitions:
...     # transitions is a list of CTransition objects
...     for transition in transitions:
...         events.append(transition.event)

Todo

This function could be improved. Although it is useful, and required to build networkx graphs from solutions or models, its rather heavy structure dates from the time when the Cadbiom API (of transition objects) was unknown and undocumented.

Param:	Parser opened on a bcx file.
Type:	<MakeModelFromXmlFile>
Returns:	A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition. `{'h00': [('Ax', 'n1', {'label': 'h00[]'}),]}`
Return type:	<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>

cadbiom_cmd.tools.models.get_transitions_from_model_file(model_file)[source]¶

Get all transitions and parser from a model file (bcx format).

Param:	bcx file.
Type:	<str>
Returns:	Transitions (see get_transitions()) and the Parser for the model.
Return type:	<dict>, <MakeModelFromXmlFile>

cadbiom_cmd.tools.models.parse_condition(condition, all_nodes, inhibitors_nodes)[source]¶

Return valid paths according to the given logical formula and nodes, and populate inhibitors_nodes.

Note

inhibitors_nodes (a set) is modified in place by this function.

Raises:	AssertionError – If no valid path was found.
Parameters:	condition (<str>) – Condition string of a transition.
	all_nodes (<set>) – Nodes involved in transitions + frontier places.
	inhibitors_nodes (<set>) – Inactivated nodes in paths of conditions. Modified by the function.
Returns:	Set of paths. Each path is a tuple of nodes.
Return type:	<set>

Graphs¶

This module groups functions directly related to the creation and the management of the graph based on a Cadbiom model.

Here we find high-level functions to create a Networkx graph, and convert it to JSON or GraphML formats.

cadbiom_cmd.tools.graphs.build_graph(solution, steps, transitions)[source]¶

Build a graph for the given solution.

Get & make all needed edges

Build graph

Note

Legend:

Default nodes: grey
Frontier places: red
Transition nodes: blue
Inhibitors nodes: white
Default transition: grey
Inhibition edge: red
Activation edge: green

Parameters:

solution (<str> or <set> or <list>) – Frontier places. String data will be split on spaces.
steps (<list <list>>) – List of steps (with events in each step).
transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) – A dictionary of events as keys, and transitions as values (see get_transitions()).

Returns:

Networkx graph object.
Nodes corresponding to transitions with conditions.
All nodes in the model
Edges between transition node and nodes in condition
Normal transitions without condition

Return type:

<networkx.classes.digraph.DiGraph>, <list>, <list>, <list>, <list>

cadbiom_cmd.tools.graphs.draw_graph(output_dir, frontier_places, solution_index, G, transition_nodes, all_nodes, edges_in_cond, edges)[source]¶

Draw the graph with colors and export it in SVG file format.

This function is no longer used but can still be useful.

Note

Legend:

red: frontier places (in frontier_places variable),
white: middle edges,
blue: transition edges

Parameters:

output_dir (<str>) – Output directory for GraphML files.
frontier_places (<set>) – Solution: a set of frontier places.
solution_index (<int> or <str>) – Index of the solution in the Cadbiom result file (used to distinguish exported filenames).
G (<networkx.classes.digraph.DiGraph>) – Networkx graph object.
transition_nodes (<list>) – Nodes corresponding to transitions with conditions. List of tuples: event, node
all_nodes (<list>) – All nodes in the model.
edges_in_cond (<list>) – Edges between transition node and nodes in condition
edges (<list>) – Normal transitions without condition.

cadbiom_cmd.tools.graphs.export_graph(output_dir, frontier_places, solution_index, G, *args)[source]¶

Export a networkx graph to GraphML format.

Note

Legend: See build_graph().

Parameters:	output_dir (<str>) – Output directory for GraphML files.
	frontier_places (<set>) – Solution: a set of frontier places. This argument is used to build the filename.
	solution_index (<int> or <str>) – Index of the solution in the Cadbiom result file (used to distinguish exported filenames).
	G (<networkx.classes.digraph.DiGraph>) – Networkx graph object.

cadbiom_cmd.tools.graphs.get_json_graph(G)[source]¶

Translate Networkx graph into a dictionary ready to be dumped in a JSON file.

Note

In classical JSON graph, ids of nodes are their names; also, their position in the array of nodes gives their numerical id, which is used as source or target in edges definitions. Here, for readability and debugging purpose, we use distinct attributes id and label for nodes.

Parameters:	G (<networkx.classes.digraph.DiGraph>) – Networkx graph.
Returns:	Serialized graph ready to be dumped in a JSON file.
Return type:	<dict>
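
A minimal sketch of a JSON-ready dictionary with distinct id and label attributes for nodes, as described in the note above. It works on plain lists rather than on a Networkx graph, and the key names "nodes", "links", "source" and "target" are assumptions, not necessarily those produced by get_json_graph().

```python
import json


def to_json_graph(node_names, edges):
    """Build a JSON-serializable dict from node names and (source, target) edges."""
    # The numerical id of a node is its position in the list of nodes
    ids = {name: index for index, name in enumerate(node_names)}
    return {
        "nodes": [{"id": ids[name], "label": name} for name in node_names],
        "links": [{"source": ids[s], "target": ids[t]} for s, t in edges],
    }


graph = to_json_graph(['Ax', 'n1', 'Bx'], [('Ax', 'n1'), ('Bx', 'n1')])
dumped = json.dumps(graph, sort_keys=True)
print(dumped)
```

Keeping the label next to the id makes the dumped file easier to read than a file where edges refer only to array positions.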

cadbiom_cmd.tools.graphs.get_solutions_graph_data(G, info, centralities)[source]¶

Complete the given dictionary with information specific to the graph considered

Doc:

https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.component.html
https://networkx.github.io/documentation/stable/reference/algorithms/shortest_paths.html
average_shortest_path_length
https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.average_shortest_path_length.html#networkx.algorithms.shortest_paths.generic.average_shortest_path_length
weakly_connected_component_subgraphs
https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.components.weakly_connected.weakly_connected_component_subgraphs.html#networkx.algorithms.components.weakly_connected.weakly_connected_component_subgraphs
Measures
https://networkx.github.io/documentation/stable/reference/algorithms/index.html

By default, the following information is added:

- graph_nodes: Number of nodes
- graph_edges: Number of edges
- graph_nodes_places: Number of biological places/entities.
  The graph is a false bipartite graph, we remove the subset of transitions
  in order to have the real count of biological places/entities.

If centralities is True, the following information is added under a new key named “centralities”:

- strongly_connected:
- weakly_connected
- max_degree
- min_degree
- average_degree
- degree
- connected_components_number
- connected_components
- average_shortest_paths

Parameters:	G (<networkx.classes.digraph.DiGraph>) – NetworkX directed graph.
	info (<dict>) – Dictionary of data to be completed.
	centralities (<boolean>) – Flag to activate the computation of centralities.
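
Some of the measures listed above can be illustrated without Networkx. The sketch below computes node, edge and degree figures from a plain list of directed edges; it assumes that the degree of a node is the sum of its incoming and outgoing edges (as for a NetworkX DiGraph) and that the edge list is not empty. The function name degree_summary is hypothetical.

```python
from collections import Counter


def degree_summary(edges):
    """Compute basic size and degree figures from (source, target) edges."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # outgoing edge
        degree[target] += 1  # incoming edge
    values = list(degree.values())
    # Isolated nodes cannot be seen from an edge list and are not counted
    return {
        "graph_nodes": len(degree),
        "graph_edges": len(edges),
        "max_degree": max(values),
        "min_degree": min(values),
        "average_degree": sum(values) / float(len(values)),
    }


summary = degree_summary([('Ax', 'h00'), ('h00', 'n1'), ('Bx', 'h00')])
print(summary)
```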

cadbiom_cmd.tools.graphs.merge_graphs(graphs)[source]¶

Merge graphs in the given iterable; count and add the weights to the edges of the final graph

Parameters:	graphs (<generator <networkx.classes.digraph.DiGraph>>) – Networkx graph objects.
Returns:	Networkx graph object.
Return type:	<networkx.classes.digraph.DiGraph>
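
The counting of edge weights can be sketched on plain edge lists. This is an illustration only: it assumes that the weight of an edge is the number of graphs in which it appears, and the name merge_edge_lists is hypothetical.

```python
from collections import Counter


def merge_edge_lists(graphs):
    """Merge an iterable of edge lists; return a dict of edge -> weight."""
    weights = Counter()
    for edges in graphs:
        # set(): an edge counts once per graph, even if it is listed twice
        weights.update(set(edges))
    return dict(weights)


graph_a = [('Ax', 'n1'), ('n1', 'n2')]
graph_b = [('Ax', 'n1'), ('Bx', 'n1')]
merged = merge_edge_lists(iter([graph_a, graph_b]))
print(merged)
```

Because the graphs are consumed one at a time, the input can be a generator, as in the documented signature.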

Solutions¶

This module groups functions directly related to the parsing and the management of the files generated by the solver of Cadbiom.

Here we find high-level functions to parse or clean mac files, and to extract all their data to JSON, a human-readable data interchange format that is convenient to use in programs.

cadbiom_cmd.tools.solutions.convert_solutions_to_json(sol_steps, transitions, conditions=True)[source]¶

Convert all events for all solutions in a complete MAC file and write them in a separate file in the JSON format.

This is a function to quickly search all transition attributes involved in a solution.

Example:

>>> from tools.models import get_transitions
>>> # Get transitions from the model
>>> model_transitions = get_transitions('model.bcx')
>>> decomp_solutions = convert_solutions_to_json(
...     load_solutions('./solution_mac_complete.txt'),
...     model_transitions,
...     conditions=True,
... )
>>> print(decomp_solutions)
[{
    "solution": "Ax Bx",
    "steps": [
        [{
            "event": "_h_2",
            "transitions": [{
                "ext": "n3",
                "ori": "Bx"
            }]
        }],
    ]
}]

Parameters:	sol_steps (<list>) – List of steps involved in a solution. See load_solutions(). A tuple of “frontier places” and a list of events in each step. `("Bx Ax", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])`
	transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) – A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition. `{'h00': [('Ax', 'n1', {'label': 'h00[]'}),]}` See get_transitions().
	conditions (<bool>) – (Optional) Integrate in the final file the conditions for each transition.
Returns:	Return the JSON data for the given steps. Example: [{ "solution": "Ax Bx", "steps": [ [{ "event": "_h_2", "transitions": [{ "ext": "n3", "ori": "Bx" }] }], ] }]
Return type:	<list>
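
The shape of the returned data can be reproduced with a small sketch. It follows the structure of the example above ("solution", "steps", "event", "transitions", "ori", "ext"); the attribute key "condition" and the function name solution_to_json are assumptions.

```python
import json


def solution_to_json(frontier_places, steps, transitions, conditions=True):
    """Describe one solution: for each step, the events and their transitions."""
    json_steps = []
    for step in steps:
        json_events = []
        for event in step:
            json_transitions = []
            for origin, target, attributes in transitions.get(event, []):
                entry = {"ori": origin, "ext": target}
                # "condition" is an assumed attribute name
                if conditions and "condition" in attributes:
                    entry["condition"] = attributes["condition"]
                json_transitions.append(entry)
            json_events.append({"event": event, "transitions": json_transitions})
        json_steps.append(json_events)
    return {"solution": frontier_places, "steps": json_steps}


# Hypothetical model data: one event with one conditioned transition
transitions = {'_h_2': [('Bx', 'n3', {'label': '_h_2[Ax]', 'condition': 'Ax'})]}
without_conditions = solution_to_json("Ax Bx", [['_h_2']], transitions, conditions=False)
with_conditions = solution_to_json("Ax Bx", [['_h_2']], transitions)
print(json.dumps([without_conditions], sort_keys=True, indent=2))
```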

cadbiom_cmd.tools.solutions.get_all_macs(path)[source]¶

Return a set of all MAC LINES from a directory or from a file.

This function is based on get_solutions() that returns mac lines and stripped mac lines, and get_mac_lines() that returns only mac lines from a file.

Note

Additionally, we do some verifications here:

Detection of duplicated MACS (AssertionError raised)
Print number of MACS per file
Print duplicated MACS
Print number of MACS

Param:	Filepath to be opened and in which solutions will be returned.
Type:	<str>
Returns:	Set of MAC/CAM from the given path.
Return type:	<frozenset <str>>

cadbiom_cmd.tools.solutions.get_mac_lines(filepath)[source]¶

Return only a set of MAC LINES from a single file.

This function is based on get_solutions() that returns mac lines and stripped mac lines.

Note

You should prefer get_all_macs(), which:

Can handle a directory path and return all macs in it,
Can handle a simple file,
Does some verifications on all parsed macs.

Note

We assume that at this point, all MAC lines are sorted in alphabetical order.

Note

We return LINES not a set of places.

Example:	{'Cx Dx', 'Ax Bx'}

Param:	Filepath to be opened and in which solutions will be returned.
Type:	<str>
Returns:	Set of MAC/CAM from the given file.
Return type:	<set <str>>

cadbiom_cmd.tools.solutions.get_query_from_filename(model_file, solution_file)[source]¶

Return the query string according to the given model and solution filenames

Example:

>>> get_query_from_filename(
...     "/path/model.bcx",
...     "/another_path/model_ENTITY_and_not_ENTITY_mac_complete.txt"
... )
"ENTITY_and_not_ENTITY"

Parameters:	model_file (<str>) – Path of a bcx model.
	solution_file (<str>) – Path of a solution file (mac file).
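
A minimal sketch of how such a query string could be recovered from the two filenames. It assumes that the solution filename is built as the model name, an underscore, the query, and the suffix "_mac_complete.txt"; the function name query_from_filename and its suffix parameter are hypothetical.

```python
import os


def query_from_filename(model_file, solution_file, suffix="_mac_complete.txt"):
    """Extract the query located between the model name and the suffix."""
    model_name = os.path.splitext(os.path.basename(model_file))[0]
    solution_name = os.path.basename(solution_file)
    prefix = model_name + "_"
    if not (solution_name.startswith(prefix) and solution_name.endswith(suffix)):
        raise ValueError("Unexpected solution filename: %s" % solution_name)
    # Drop the model prefix and the suffix, keep what is in between
    return solution_name[len(prefix):len(solution_name) - len(suffix)]


query = query_from_filename(
    "/path/model.bcx",
    "/another_path/model_ENTITY_and_not_ENTITY_mac_complete.txt",
)
print(query)
```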

cadbiom_cmd.tools.solutions.get_solutions(file_descriptor)[source]¶

Generator of solution lines and corresponding stripped lines for *mac* file.

Note

This function does not return events! It only yields original lines and cleaned lines containing solutions (i.e. sets of frontier places/boundaries).

We remove the last '\n' and '\t'. Tabs in the middle are replaced by one space ' '.

Param: Opened file.

Type: <file>

Returns:

A generator of tuples; each tuple contains the original line, and the cleaned line.

Example:	For an original line: `'Z\tY\tX\n'` ('Z\tY\tX', 'X Y Z')

Return type: <tuple <str>, <str>>

cadbiom_cmd.tools.solutions.load_solutions(file)[source]¶

Open a file with many solution/MACs (*mac_complete.txt files) and yield them.

Example:

>>> solutions = load_solutions('./solution_mac_complete.txt')
>>> print([solution for solution in solutions])
[('Ax Bx', [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])]

Param:

File name

Type:

<str>

Returns:

A generator of tuples of “frontier places” and a list of events in each step.

Example:	("Ax Bx", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])

Return type:

Display, compare, and query a model¶

cadbiom_cmd.models.graph_isomorph_test(model_file_1, model_file_2, output_dir=u'graphs/', make_graphs=False, make_json=False)[source]¶

Entry point for model consistency checking.

This function checks whether the graphs based on the two given models have the same topology and the same node & edge attributes/roles.

Todo

This function should not write any file, and should be exported to the module tools.

Note

Cf graphmatcher https://networkx.github.io/documentation/development/reference/generated/networkx.algorithms.isomorphism.categorical_edge_match.html

Use in scripts:

>>> from cadbiom_cmd.models import graph_isomorph_test
>>> print(graph_isomorph_test('model_1.bcx', 'model_2.bcx'))
INFO: 3 transitions loaded
INFO: 3 transitions loaded
INFO: Build graph for the solution: Connexin_32_0 Connexin_26_0
INFO: Build graph for the solution: Connexin_32_0 Connexin_26_0
INFO: Topology checking: True
INFO: Nodes checking: True
INFO: Edges checking: True
{'nodes': True, 'edges': True, 'topology': True}

Parameters:	model_file_1 (<str>) – Filepath of the first model.
	model_file_2 (<str>) – Filepath of the second model.
Key output_dir:	Output path.
Key make_graphs:	If True, make a GraphML file in output path.
Key make_json:	If True, make a JSON dump of results in output path.
Returns:	Dictionary with the results of tests. keys: ‘topology’, ‘nodes’, ‘edges’; values: booleans
Return type:	<dict <str>: <boolean>>

cadbiom_cmd.models.low_graph_info(model_file, graph_data=False, centralities=False)[source]¶

Low level function for model_graph().

Get JSON data with information about the graph based on the model.

Merge Minimal Accessibility Conditions¶

cadbiom_cmd.solution_merge.merge_macs_to_csv(directory, output_dir, csvfile=u'merged_macs.csv')[source]¶

Merge *mac.txt files from a directory to a csv file.

Structure of the CSV file:
	<Final property formula>;<boundaries in the solution>

Handle generated files¶

This module provides some functions to analyze the output files of Cadbiom.

Entry points:

queries_2_json()

solutions_2_graphs()

queries_2_common_graph()

Example of the content of a complete solution file:
	Bx Ax % h2 h00 % h3 % h0 h1 % hlast Bx Ax % h2 % h3 h00 % h0 h1 % % hlast Bx Ax % h2 % h3 h00 % h0 h1 % hlast % % Bx Ax % h2 h00 % h3 % h0 h1 % hlast % % %

cadbiom_cmd.solution_sort.get_solution_graphs(sol_steps, transitions)[source]¶

Generator that yields the graphs of the given solutions.

Note

See the doc of a similar function save_solutions_to_graphs().

cadbiom_cmd.solution_sort.occurrence_matrix(output_dir, model_file, path, matrix_filename=u'occurrence_matrix.csv')[source]¶

Make a matrix of occurrences for the solutions in the given path.

Compute occurrences of each place in all mac.txt files.
Save the matrix in csv format with the following columns:

Fieldnames: “patterns (number)/places (number);mac_number;frontier places” Each request (pattern) is accompanied by the number of solutions found.

Todo

Split the creation and writing of the matrix in 2 functions.

Parameters:	output_dir (<str>) – Output path.
	model_file (<str>) – Filepath of the model.
	path (<str>) – Directory of many complete solutions files.
	matrix_filename (<str>) – (Optional) Filename of the matrix file.
Returns:	A dictionary with the matrix object. keys: queries, values: occurrences of frontier places
Return type:	<dict>

cadbiom_cmd.solution_sort.queries_2_common_graph(output_dir, model_file, path, make_graphs=True, make_csv=False, make_json=False, *args, **kwargs)[source]¶

Entry point for queries_2_common_graph

Create a GraphML formatted file containing a unique representation of all trajectories corresponding to all solutions in each complete MAC file (*mac_complete files).

This is a function to visualize paths taken by the solver from the boundaries to the entities of interest.

CSV fields:

- query: Query giving the solutions
- solutions: nb trajectories/solutions
- boundaries: Number of boundary places
- events: Number of events in all solutions
- genes: Number of genes involved in solutions
- Protein: Number of boundaries with the type Protein
    (genes are not counted)
- Complex: Number of boundaries with the type Complex
    (genes are not counted)
- input_boundaries: Boundaries found only as input places
- guard_boundaries: Boundaries found only in guards
- mixed_boundaries: Boundaries found in guards AND in inputs of reactions
- graph_nodes: Total number of nodes in the graph
- graph_nodes_places: Nodes that are biomolecules (do not count reaction nodes)
- graph_edges: Number of edges
- strongly_connected: Is the graph strongly connected?
- max_degree
- min_degree
- average_degree

This function tests if the given path is a directory or a file.

Parameters:	output_dir (<str>) – Output path.
	model_file (<str>) – Filepath of the model.
	path (<str>) – Filepath/directory of a/many complete solutions files.
Key make_graphs:	(optional) Make a GraphML for each query results in path. default: True
Key make_csv:	(optional) Make a global CSV for all query results in path. default: False
Key make_json:	(optional) Make a JSON dump of each query results in path. default: False

cadbiom_cmd.solution_sort.queries_2_json(output_dir, model_file, path, conditions=True)[source]¶

Entry point for queries_2_json

Create a JSON formatted file containing all data from complete MAC files (*mac_complete files). The file will contain frontier places/boundaries and decompiled steps with their respective events for each solution.

This is a function to quickly search all transition attributes involved in a solution.

This function tests if the given path is a directory or a file.

Parameters:	output_dir (<str>) – Output path.
	model_file (<str>) – Filepath of the model.
	path (<str>) – Filepath/directory of a complete solution file.
	conditions (<boolean>) – (Optional) If False, conditions of transitions will not be present in the JSON file. This makes it possible to keep only the places/entities used inside trajectories; thus, inhibitors are avoided.

cadbiom_cmd.solution_sort.queries_2_occcurrence_matrix(output_dir, model_file, path, transposed=False, normalized=False)[source]¶

Entry point for queries_2_occcurrence_matrix

See occurrence_matrix().

Parameters:	output_dir (<str>) – Output path.
	model_file (<str>) – Filepath of the model.
	path (<str>) – Directory of many complete solutions files.
	transposed (<boolean>) – (Optional) Transpose the final matrix (switch columns and rows).

cadbiom_cmd.solution_sort.save_solutions_to_graphs(output_dir, sol_steps, transitions)[source]¶

Build and export graphs based on the given solutions

Each solution is composed of a set of frontier places and steps, themselves composed of events. We construct a graph based on the transitions that occur in the composition of the events of the given solution.

Parameters:

output_dir (<str>) – Output path.
sol_steps (<tuple <str>, <list>>) –
A generator of tuples of “frontier places” and a list of events in each step.
Example:
("Bx Ax", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])
transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) –
A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition.
Example:
{'h00': [('Ax', 'n1', {'label': 'h00[]'}),]}

cadbiom_cmd.solution_sort.solutions_2_graphs(output_dir, model_file, path)[source]¶

Entry point for solutions_2_graphs

Create GraphML formatted files containing a representation of the trajectories for each solution in complete MAC files (*mac_complete files).

This is a function to visualize paths taken by the solver from the boundaries to the entities of interest.

This function tests if the given path is a directory or a file.

Parameters:	output_dir (<str>) – Output path.
	model_file (<str>) – Filepath of the model.
	path (<str>) – Filepath/directory of a/many complete solutions files.

cadbiom_cmd.solution_sort.solutions_sort(path)[source]¶

Entry point for sorting solutions.

Read one or more solution files (*mac* files) and sort all frontier places/boundaries in alphabetical order.

This function tests if the given path is a directory or a file.

Warning

The files will be modified in place.

Param:	Filepath or directory path containing Cadbiom solutions.
Type:	<str>

cadbiom_cmd.solution_sort.sort_solutions_in_file(filepath)[source]¶

Sort all solutions in the given file in alphabetical order.

Warning

The file is modified in place.

Param:	Filepath to be opened and in which solutions will be sorted.
Type:	<str>

cadbiom_cmd.solution_sort.transpose_csv(input_file=u'occurrence_matrix.csv', output_file=u'occurrence_matrix_t.csv')[source]¶

Useful function to transpose a csv file x,y => y,x

Note

The csv file must be semicolon ‘;’ separated.

Parameters:	input_file (<str>) – Input file.
	output_file (<str>) – Output file transposed.
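
The transposition itself can be sketched with the standard csv module. The sketch works on file-like objects instead of filenames so that it stays self-contained; the matrix content and the name transpose_rows are hypothetical.

```python
import csv
import io


def transpose_rows(input_stream, output_stream):
    """Read a ';' separated CSV and write it back with rows and columns swapped."""
    rows = list(csv.reader(input_stream, delimiter=';'))
    writer = csv.writer(output_stream, delimiter=';', lineterminator='\n')
    # zip(*rows) turns each column into a row (truncated to the shortest row)
    writer.writerows(zip(*rows))


source = io.StringIO("query;Ax;Bx\nq1;1;0\nq2;1;1\n")
target = io.StringIO()
transpose_rows(source, target)
transposed = target.getvalue()
print(transposed)
```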

cadbiom_cmd.solution_sort.write_json(output_dir, file_path, file_suffix, data)[source]¶

Write decompiled solutions to a JSON formatted file.

Called by queries_2_json() and queries_2_common_graph().

Parameters:	output_dir (<str>) – Output directory.
	file_path (<str>) – Filepath of the original solution file. We extract the basename in order to name the JSON file.
	file_suffix (<str>) – String added to the solution filename. Ex: filename + file_suffix + ".json"
	data (<list> or <dict> or <whatever>) – Data to be serialized in JSON.
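
The naming rule described in the parameters (filename + file_suffix + ".json") can be sketched as follows. It assumes that "filename" is the basename of the solution file without its extension; the function name write_json_sketch, the "_decomp" suffix and the returned path are additions made for the illustration.

```python
import json
import os
import tempfile


def write_json_sketch(output_dir, file_path, file_suffix, data):
    """Serialize data in output_dir, in a file named after the solution file."""
    # Assumption: the extension of the solution file is dropped
    filename = os.path.splitext(os.path.basename(file_path))[0]
    json_path = os.path.join(output_dir, filename + file_suffix + ".json")
    with open(json_path, "w") as file_descriptor:
        json.dump(data, file_descriptor, sort_keys=True, indent=2)
    return json_path


output_dir = tempfile.mkdtemp()
data = [{"solution": "Ax Bx", "steps": []}]
written = write_json_sketch(
    output_dir, "/results/model_query_mac_complete.txt", "_decomp", data
)
print(written)
```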

Search Minimal Accessibility Conditions¶

Simulation of the system until some halting condition (given with the final property) is satisfied.

class cadbiom_cmd.solution_search.ErrorReporter[source]¶

Cf class CompilReporter(object): gt_gui/utils/reporter.py

cadbiom_cmd.solution_search.compute_combinations(final_properties)[source]¶

Return all combinations of final properties.

Note

(in case of input_file and combinations set).

Param: List of final properties.

Type: <list>

Returns:

List of str. Each str is a combination of final_properties linked by a logical ‘and’.

Example:	`('TGFB1', 'COL1A1'), ('TGFB1', 'decorin')` gives: `['TGFB1 and COL1A1', 'TGFB1 and decorin']`

Return type: <list <str>>

cadbiom_cmd.solution_search.compute_macs(params)[source]¶

Launch Cadbiom search of MACs (Minimal Activation Conditions).

This function is called 1 or multiple times according to the necessity to use multiprocessing (Cf launch_researchs()).

Note

Previous result files will be deleted.

cadbiom_cmd.solution_search.detect_model_type(mclanalyser, filepath)[source]¶

Return the function to use to load the model.

The detection is based on the file extension.

bcx file: Build an MCLAnalyser from a .bcx file: build_from_chart_file()
cal file: Build an MCLAnalyser from a .cal file of PID database: build_from_cadlang()
xml file: Build an MCLAnalyser from a .xml file of PID database: build_from_pid_file()

Parameters:	mclanalyser (<MCLAnalyser>) – MCLAnalyser.
	filepath (<str>) – File that contains the model.
Returns:	The function to use to read the given file.
Return type:	<func>
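
The extension-based dispatch can be sketched with a lookup table. FakeAnalyser is a stand-in written for this illustration: only the three method names come from the list above, and their single filepath argument is an assumption. The name detect_loader is hypothetical.

```python
import os


class FakeAnalyser(object):
    """Stand-in for MCLAnalyser (hypothetical signatures)."""

    def build_from_chart_file(self, filepath):
        return ("bcx", filepath)

    def build_from_cadlang(self, filepath):
        return ("cal", filepath)

    def build_from_pid_file(self, filepath):
        return ("xml", filepath)


LOADERS = {
    ".bcx": "build_from_chart_file",
    ".cal": "build_from_cadlang",
    ".xml": "build_from_pid_file",
}


def detect_loader(analyser, filepath):
    """Return the bound method to use to load the given model file."""
    extension = os.path.splitext(filepath)[1].lower()
    if extension not in LOADERS:
        raise ValueError("Unsupported model file: %s" % filepath)
    return getattr(analyser, LOADERS[extension])


loader = detect_loader(FakeAnalyser(), "model.bcx")
print(loader("model.bcx"))
```

Returning the bound method rather than calling it leaves the caller free to decide when the model is actually loaded.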

cadbiom_cmd.solution_search.find_mac(mcla, mac_file, mac_step_file, mac_complete_file, steps, final_prop, start_prop, inv_prop, previous_frontier_places)[source]¶

Search for 1 solution, save timings, save frontiers, and return it with the current step (deprecated, see find_macs()).

For every new solution, the system is reinitialized, and a satisfiability test is made on a new query to evaluate the minimal number of steps for reachability.

The side effect is that this process is generally expensive, and that parsing the properties in text format (for example the logical formulas of the frontier places of the previous solutions) is very expensive because it is done by the ANTLR grammar.

Parameters:	previous_frontier_places (<set <tuple <str>>>) – Set of frontier places tuples from previous solutions. These tuples will be banned from the future solutions.
Returns:	A tuple of activated frontiers and the current step. None if there is no new Solution or if problem is not satisfiable.

cadbiom_cmd.solution_search.find_macs(mcla, mac_file, mac_step_file, mac_complete_file, steps, final_prop, start_prop, inv_prop, limit, current_nb_sols, previous_frontier_places)[source]¶

Search for many solutions, save timings, and save frontiers.

For every new solution, the system is NOT reinitialized, and a satisfiability test is made ONLY when there is no solution for the current step. This test is made to evaluate the minimal number of steps for reachability.

Unlike find_mac(), this function is autonomous and takes into account the limitation of the number of solutions.

Todo

Handle the all_macs flag like the old method with find_mac(). Not used very often, but can be useful sometimes…

Parameters:	limit (<int>) – Limit the number of solutions.
	current_nb_sols (<int>) – The current number of solutions already found. This number is used to limit the number of searched solutions.
	previous_frontier_places (<set <tuple <str>>>) – Set of frontier places tuples from previous solutions. These tuples will be banned from the future solutions.
Returns:	None

cadbiom_cmd.solution_search.get_dimacs_start_properties(mcla, previous_frontier_places)[source]¶

Translate frontier places to their numerical values thanks to the current unfolder.

It’s much more efficient than using the ANTLR grammar to parse formulas for each new query.

Returns:	List of previous solutions (list of negative values of frontier places) Ex: [[-1, -2], [-2, -3], …]
Return type:	<list <list <int>>>

cadbiom_cmd.solution_search.logical_operator(elements, operator)[source]¶

Join elements with the given logical operator.

Parameters:	elements (<list>) – Iterable of elements to join with a logical operator.
	operator (<str>) – Logical operator to use: 'and' or 'or'.
Returns:	logical_formula: str - AND/OR of the input list
Return type:	<str>

cadbiom_cmd.solution_search.make_logical_formula(previous_frontier_places, start_prop)[source]¶

Make a logical formula based on previous results of MAC.

The aim is to exclude previous solutions.

1 line: "A B" => (A and B)
another line: "B C" => (B and C)
merge all lines: (A and B) or (B and C)
forbid all combinations: not((A and B) or (B and C))

Parameters:	previous_frontier_places (<set>) – Set of previous frontier places (previous solutions).
	start_prop (<str>) – Original property (constraint) for the solver.
Returns:	A logical formula which excludes all the previous solutions.
Return type:	<str>
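
The construction described above can be sketched in a few lines. The way the exclusion is combined with the original start property (a logical 'and') is an assumption, and the names join_with_operator and exclusion_formula are hypothetical.

```python
def join_with_operator(elements, operator):
    """Join elements with 'and' or 'or' and wrap the result in parentheses."""
    return "(" + (" " + operator + " ").join(elements) + ")"


def exclusion_formula(previous_frontier_places, start_prop=""):
    """Build a formula that forbids every previous solution."""
    if not previous_frontier_places:
        return start_prop
    # One conjunction per previous solution; sorted() gives a stable output
    conjunctions = [
        join_with_operator(solution, "and")
        for solution in sorted(previous_frontier_places)
    ]
    exclusion = "not" + join_with_operator(conjunctions, "or")
    if start_prop:
        # Assumption: the original constraint is kept with a logical 'and'
        return start_prop + " and (" + exclusion + ")"
    return exclusion


formula = exclusion_formula({("A", "B"), ("B", "C")})
constrained = exclusion_formula({("A", "B"), ("B", "C")}, "D")
print(formula)
print(constrained)
```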

cadbiom_cmd.solution_search.read_mac_file(file)[source]¶

Return the frontier places already found in a mac file.

Note

use make_logical_formula() to get the new start_prop of the run.

Param:	Mac file of a previous run
Type:	<str>
Returns:	A set of frontier places.
Return type:	<set>

cadbiom_cmd.solution_search.search_entry_point(model_file, mac_file, mac_step_file, mac_complete_file, mac_strong_file, steps, final_prop, start_prop, inv_prop, all_macs, continue_run, limit)[source]¶

Search solutions

Parameters:

model_file (<str>) – Model file (bcx, xml, cal).
mac_file (<str>) – File used to store Minimal Activation Condition (MAC/CAM).
mac_step_file (<str>) – File used to store Minimal step numbers for each solution.
mac_complete_file (<str>) – File used to store MAC & trajectories.
mac_strong_file (<str>) –
???
steps (<int>) – Maximal steps to reach the solutions.
final_prop (<str>) – Formula: Property that the solver looks for.
start_prop (<str>) – Formula: Property that will be part of the initial state of the model. In concrete terms, some entities can be activated by this mechanism without modifying the model.
inv_prop (<str>) – Formula: Invariant property that will always occur during the simulation. The given logical formula will be checked at each step of the simulation.

all_macs (<boolean>) –

If set to True (not default), search all MACs reachable in at most the maximum number of steps defined by the steps argument. If set to False, the solver searches all solutions with the minimum number of steps found for the first returned solution.

Example:
	all_macs = False, steps = 10;
	First solution found with 4 steps;
	The next solution will be searched with a maximum of 4 steps;

	all_macs = True, steps = 10;
	First solution found with 4 steps;
	The next solution is not reachable with 4 steps but with 5 steps (which is still less than 10 steps);
	Get the solution for 5 steps;

continue_run (<boolean>) – If set to True (not the default), MACs from a previous run are reloaded.
limit (<int>) – Limit the number of solutions.
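
The effect of all_macs on the step bound can be summarised with a small sketch. The function sketch_step_bound is hypothetical and only restates the example given for all_macs; it is not part of cadbiom_cmd.

```python
def sketch_step_bound(all_macs, steps, first_solution_steps):
    """Hypothetical helper: step bound used after the first solution."""
    if all_macs:
        # Keep exploring up to the user-defined maximum.
        return steps
    # Otherwise stay at the number of steps of the first solution.
    return min(steps, first_solution_steps)


print(sketch_step_bound(False, 10, 4))
print(sketch_step_bound(True, 10, 4))
```

The first call gives 4 and the second gives 10, matching the two cases of the example.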

cadbiom_cmd.solution_search.solutions_search(params)[source]¶

Launch the search for Minimum Activation Conditions (MAC) for entities of interest.

If there is no input file, there will be only one process.
If an input file is given, there will be 1 process per line (per logical formula on each line).

Make an interaction graph based on molecules of interest¶

This module groups functions directly related to the design of an interaction weighted graph based on the search of molecules of interest.

Entry point: json_2_interaction_graph().

cadbiom_cmd.interaction_graph.build_graph(output_dir, all_genes, all_stimuli, genes_interactions, stimulis_interactions, genes_stimuli_interactions, molecule_stimuli_interactions)[source]¶

Make an interaction weighted graph based on the search of molecules of interest

Legend of the edges:

Meaning:
gene - gene: Two genes present simultaneously in a solution.
stimulus - stimulus: Two stimuli present simultaneously in a solution.
gene - stimulus (deprecated): One gene and one stimulus present simultaneously in a solution.
molecule of interest - stimulus: A molecule of interest in a trajectory related to a solution that contains a stimulus.

Colors:
gene - gene: red
stimulus - stimulus: blue
gene - stimulus (deprecated): red
molecule of interest - stimulus: yellow

Legend of the nodes:

genes: red
stimuli: blue
molecules of interest: yellow
Parameters:

output_dir (<str>) – Output path.
all_genes (<set>) – All genes in all the solutions.
all_stimuli (<set>) – All stimuli in all the solutions.
genes_interactions (<Counter>) – Interactions between genes in the same solution.
stimulis_interactions (<Counter>) – Interactions between stimuli in the same solution.
genes_stimuli_interactions (<Counter>) – Interactions between genes and stimuli in the same solution.
molecule_stimuli_interactions (<Counter>) – Interactions between molecules of interest and frontier places that are not genes (stimuli) in trajectories (i.e. `(molecule, stimulus)`).

cadbiom_cmd.interaction_graph.build_interactions(filtered_macs, binary_interactions)[source]¶

Make binary interactions used by the graph as edges

Note: genes and stimuli are both frontier places.

Parameters:

filtered_macs (<tuple <tuple <str>>>) –
All solutions related to the molecules of interest.
```
(("frontier_1", "frontier_2", "frontier_3"),)
```

binary_interactions (<dict <str>: <Counter <str>: <int>>>) –

A dictionary of related frontier places.

# For molecules of interest "A" and "B"
{"A": {
   "frontier_1": 1,
   "frontier_2": 1,
 },
 "B": {
   "frontier_3": 1,
 },
}

Returns:

Various Counters of binary interactions:

all_genes: All genes in all the solutions.
all_stimuli: All stimuli in all the solutions.
genes_interactions: Interactions between genes in the same solution.
stimulis_interactions: Interactions between stimuli in the same solution.
genes_stimuli_interactions: Interactions between genes and stimuli in the same solution.
molecule_stimuli_interactions: Interactions between molecules of interest and frontier places that are not genes (stimuli) in trajectories (i.e. (molecule, stimulus)).

Return type:
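
As an illustration of how such pair counts can be built from filtered_macs, here is a minimal sketch based on collections.Counter and itertools.combinations. It covers only the pairs found inside solutions, not molecule_stimuli_interactions. The names sketch_build_interactions and is_gene are hypothetical, and recognising genes by a "_gene" suffix is an assumption made for the demo, not something stated in this documentation.

```python
from collections import Counter
from itertools import combinations


def sketch_build_interactions(filtered_macs, is_gene):
    """Hypothetical sketch: count pairs of places sharing a solution."""
    all_genes, all_stimuli = set(), set()
    genes_interactions = Counter()
    stimuli_interactions = Counter()
    genes_stimuli_interactions = Counter()
    for solution in filtered_macs:
        # Sorting gives each pair a single canonical orientation.
        genes = sorted(p for p in solution if is_gene(p))
        stimuli = sorted(p for p in solution if not is_gene(p))
        all_genes.update(genes)
        all_stimuli.update(stimuli)
        genes_interactions.update(combinations(genes, 2))
        stimuli_interactions.update(combinations(stimuli, 2))
        genes_stimuli_interactions.update(
            (gene, stimulus) for gene in genes for stimulus in stimuli
        )
    return (all_genes, all_stimuli, genes_interactions,
            stimuli_interactions, genes_stimuli_interactions)


# Assumption for the demo only: gene names end with "_gene".
macs = (("A_gene", "B_gene", "stim1"), ("A_gene", "stim1", "stim2"))
result = sketch_build_interactions(macs, lambda name: name.endswith("_gene"))
print(result[4])
```

Each counter value can then be used as the weight of the corresponding edge in the graph.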

cadbiom_cmd.interaction_graph.filter_trajectories(trajectories, molecules_of_interest)[source]¶

Get solutions and count frontier places related to the given molecules of interest.

Parameters:

trajectories (<generator <tuple <tuple>, <set>>>) –
A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.
```
(("Ax", "Bx"), {"n3", "Bx"})
```
molecules_of_interest (<tuple>) – Iterable of molecules of interest.

Returns:

A tuple of all solutions related to the molecules of interest, and a dictionary of related frontier places and their occurrences for each molecule of interest.

# For molecules of interest "A" and "B"
((("frontier_1", "frontier_2", "frontier_3"),),
 {"A": {
    "frontier_1": 1,
    "frontier_2": 1,
  },
  "B": {
    "frontier_3": 1,
  },
 })

Return type:
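
The filtering described above can be sketched as follows. The sketch assumes that a solution is "related" to a molecule of interest when the molecule appears in the set of places involved in the transitions; this criterion and the name sketch_filter_trajectories are assumptions, not taken from the cadbiom_cmd source.

```python
from collections import Counter, defaultdict


def sketch_filter_trajectories(trajectories, molecules_of_interest):
    """Hypothetical sketch of the filtering and counting step."""
    filtered_macs = []
    binary_interactions = defaultdict(Counter)
    for frontier_places, places in trajectories:
        # Assumption: a molecule is related to a solution when it is
        # one of the places involved in the transitions.
        matched = [m for m in molecules_of_interest if m in places]
        if not matched:
            continue
        filtered_macs.append(frontier_places)
        for molecule in matched:
            # Count each frontier place of the solution for this molecule.
            binary_interactions[molecule].update(frontier_places)
    return tuple(filtered_macs), dict(binary_interactions)


trajectories = [(("Ax", "Bx"), {"n3", "Bx"}), (("Cx",), {"n4"})]
macs, related = sketch_filter_trajectories(trajectories, ("n3",))
print(macs, dict(related["n3"]))
```

Only the first trajectory is kept here, because "n3" does not occur in the places of the second one.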

cadbiom_cmd.interaction_graph.get_solutions_and_related_places(path)[source]¶

Read decompiled solution files (*.json* files).

This function tests whether the given path is a directory or a file.

Parameters:	path (<str>) – Filepath/directory of a decompiled JSON file.
Returns:	A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.

(("Ax", "Bx"), {"n3", "Bx"})

Return type:	<generator <tuple <tuple>, <set>>>
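
The directory-or-file test can be sketched with the standard library. The helper sketch_list_json_files is hypothetical, and selecting files with the "*.json" pattern is an assumption based on the description above.

```python
import glob
import os
import tempfile


def sketch_list_json_files(path):
    """Hypothetical helper: JSON files to read for a given path."""
    if os.path.isdir(path):
        # Directory: take every *.json file it contains.
        return sorted(glob.glob(os.path.join(path, "*.json")))
    # Otherwise the path is treated as a single file.
    return [path]


# Demo in a temporary directory with two JSON files and one other file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.json", "a.json", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    found = [os.path.basename(p) for p in sketch_list_json_files(folder)]
print(found)
```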

cadbiom_cmd.interaction_graph.get_solutions_and_related_places_from_file(file_path)[source]¶

Get frontier places and other places involved in transitions.

Parameters:

file_path (<str>) –

Path of a JSON file; this file is generated by convert_solutions_to_json().

A solution is composed of steps with events, composed of transitions:

[{
    "solution": "Ax Bx",
    "steps": [
        [
            {
                "event": "_h_2",
                "transitions": [{"ext": "n3", "ori": "Bx"}]
            }
        ]
    ]
}]

Returns:

A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.

(("Ax", "Bx"), {"n3", "Bx"})

Return type:
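
A minimal sketch of this extraction, applied to the JSON structure shown above, could look like the following. The name sketch_iter_solutions is hypothetical; the sketch works on already parsed JSON data rather than on a file path so that it stays self-contained.

```python
import json

EXAMPLE = """
[{"solution": "Ax Bx",
  "steps": [[{"event": "_h_2",
              "transitions": [{"ext": "n3", "ori": "Bx"}]}]]}]
"""


def sketch_iter_solutions(solutions):
    """Hypothetical sketch: yield (frontier places, places in transitions)."""
    for solution in solutions:
        frontier_places = tuple(solution["solution"].split())
        # Collect the origin and the target of every transition.
        places = {
            transition[key]
            for step in solution["steps"]
            for event in step
            for transition in event["transitions"]
            for key in ("ori", "ext")
        }
        yield frontier_places, places


result = list(sketch_iter_solutions(json.loads(EXAMPLE)))
print(result)
```

For the example document, the generator yields the single tuple (("Ax", "Bx"), {"n3", "Bx"}).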

cadbiom_cmd.interaction_graph.json_2_interaction_graph(output_dir, molecules_of_interest, path)[source]¶

Entry point for json_2_interaction_graph

Read decompiled solutions files (*.json* files produced by the directive queries_2_json) and make a graph of the relationships between one or more molecules of interest, the genes and other frontier places/boundaries found among all the solutions.

More information about the graph and its legend: build_graph().

Parameters:

output_dir (<str>) – Output path.
molecules_of_interest (<tuple>) – Iterable of molecules of interest.
path (<str>) – Filepath/directory of a JSON solution file.

Make heatmaps¶

Module used to create a hierarchically-clustered heatmap of boundaries.

cadbiom_cmd.queries_2_clustermap.draw_matrix_heatmap(df, filepath)[source]¶

Draw and save a clustermap from the given dataframe.

Parameters:

df (<pandas.core.frame.DataFrame>) – Pandas dataframe.
filepath (<str>) – Filepath of the matrix. Used to build the SVG file.

cadbiom_cmd.queries_2_clustermap.open_dataframe(filepath)[source]¶

Get Pandas dataframe from CSV file

Because yes, pandas knows to open a CSV file (not like R). It’s awesome. Don’t teach this in bio-info please. You should always prefer complex and legacy technologies it makes you smart (especially for the first ones ><).

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Returns:	Pandas dataframe
Return type:	<pandas.core.frame.DataFrame>

cadbiom_cmd.queries_2_clustermap.payload(output_dir, filepath)[source]¶

Make a clustermap based on an occurrence matrix for the given solution file

Parameters:

output_dir (<str>) – Output path.
filepath (<str>) – Solution filepath.

cadbiom_cmd.queries_2_clustermap.queries_2_clustermap(output_dir, path, *args, **kwargs)[source]¶

Entry point for queries_2_clustermap

Create a hierarchically-clustered heatmap of boundaries in mac files.

Parameters:

output_dir (<str>) – Output path.
path (<str>) – Filepath/directory of one or more complete solution files.

cadbiom_cmd.queries_2_clustermap.write_matrix(filepath, output_dir)[source]¶

Make an occurrence matrix of boundaries found in the given solution file

Example of CSV produced:

Columns: Frontier places

Lines: Solution with a ‘1’ in columns corresponding to an occurrence of the frontier place.

solution_number;boundary_1;boundary_2;...
1;0;1;...
2;1;0;...

Parameters:

filepath (<str>) – Solution filepath.
output_dir (<str>) – Output path.
Returns:	Filepath of the CSV file produced. Filename is of the form <solution_file>_sol_matrix.csv
Return type:	<str>
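
The layout of the occurrence matrix can be reproduced with a short sketch. It builds the CSV text in memory from a list of solutions instead of reading a solution file; the name sketch_occurrence_matrix and the alphabetical ordering of the columns are assumptions.

```python
import csv
import io


def sketch_occurrence_matrix(solutions):
    """Hypothetical sketch: CSV text with one line per solution."""
    # Columns: every frontier place seen in at least one solution.
    boundaries = sorted({place for solution in solutions for place in solution})
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    writer.writerow(["solution_number"] + boundaries)
    for number, solution in enumerate(solutions, start=1):
        present = set(solution)
        # 1 when the frontier place occurs in the solution, 0 otherwise.
        writer.writerow([number] + [1 if b in present else 0 for b in boundaries])
    return buffer.getvalue()


matrix = sketch_occurrence_matrix([("boundary_2",), ("boundary_1",)])
print(matrix)
```

With these two solutions, the output has the same shape as the CSV example above: a header line followed by 1;0;1 and 2;1;0.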

Parameters:	model_file (<str>) – File for the model.
Key centralities:	If True, compute centralities (degree, closeness, betweenness).
Key graph_data:	Also return a dictionary with the results of measures on the given graph. Keys: measure’s name; values: measure’s value. Example:

{
    'modelFile': 'string',
    'modelName': 'string',
    'events': int,
    'entities': int,
    'transitions': int,
    'graph_nodes': int,
    'graph_edges': int,
    'centralities': {
        'degree': {
            'entity_1': float,
            'entity_2': float
        },
        'strongly_connected': boolean,
        'weakly_connected': boolean,
        'max_degree': int,
        'min_degree': int,
        'average_degree': float,
        'connected_components_number': int,
        'connected_components': list,
        'average_shortest_paths': int,
    }
}

Returns:	Tuple of tuples from `tools.graphs.build_graph()`, set of frontier places, and dictionary with the results of measures on the given graph if requested.
Return type:	<tuple>, <str>, <dict>

Parameters:	model_file (<str>) – File for the model.
Key external_file:	File with 1 external identifier per line.
Key external_identifiers:	List of external identifiers to be mapped.

Parameters:	model_file (<str>) – File for the ‘.bcx’ model.
Key output_dir:	Output directory.
Key all_entities:	If True, data for all places of the model are returned (optional).
Key boundaries:	If True, only data for the frontier places of the model are returned (optional).
Key genes:	If True, only data for the genes of the model are returned (optional).
Key smallmolecules:	If True, only data for the small molecules of the model are returned (optional).
Key default:	Display a quick description of the model (number of places, transitions, entity types, entity locations).
Key json:	If True, make a JSON dump of the results in the output path (optional).
Key csv:	If True, make a CSV dump of information about the filtered places.
