Command line package

Tools

Models

This module groups functions directly related to managing a Cadbiom model and extracting data from it.

It provides high-level functions to manage the logical formulas of the events and conditions that define the transitions, as well as helper functions to manage the entities, for example to obtain their metadata or the frontier places of the model.

cadbiom_cmd.tools.models.decompile_condition(tree, inhibitors_nodes)[source]

Recursive function to decompile conditions

Parameters:
  • tree (<expression>) –
    Example of tree argument:
     
    tree = ('H', 'v', (
        ('F', 'v', 'G'),
        '^',
        (
            ('A', 'v', 'B'),
            '^',
            ('C', 'v', ('D', '^', 'E'))
        )
    ))
    
  • inhibitors_nodes (<set>) – Set of inhibitors
Returns:

List of valid paths composed of entities (except inhibitors). Inhibitors are added to inhibitors_nodes.
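
The recursion over such a tree can be illustrated with a minimal sketch. It is not the Cadbiom implementation: expand_paths is a hypothetical helper that assumes only the 'v' (or) and '^' (and) operators of the example tree, and it ignores inhibitors entirely.

```python
def expand_paths(tree):
    """Return the list of paths (tuples of entities) encoded by a nested tuple."""
    # A leaf is a single entity name: it forms a path on its own.
    if isinstance(tree, str):
        return [(tree,)]

    left, operator, right = tree
    left_paths = expand_paths(left)
    right_paths = expand_paths(right)

    if operator == 'v':
        # OR: either branch is sufficient, so alternatives are kept side by side.
        return left_paths + right_paths
    if operator == '^':
        # AND: both branches are required, so every pair of paths is combined.
        return [lp + rp for lp in left_paths for rp in right_paths]
    raise ValueError("Unknown operator: %r" % (operator,))


tree = ('H', 'v', (
    ('F', 'v', 'G'),
    '^',
    (
        ('A', 'v', 'B'),
        '^',
        ('C', 'v', ('D', '^', 'E'))
    )
))
paths = expand_paths(tree)
for path in paths:
    print(path)
```

With this sketch, an "or" node concatenates the alternatives of its branches and an "and" node takes their Cartesian product, which is why the example tree expands to several distinct paths.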

cadbiom_cmd.tools.models.get_frontier_places(transitions, all_places)[source]

Return frontier places of a model (deduced from its transitions and from all places of the model).

Note

Why do we use all_places from the model instead of (input_places - output_places) to get frontier places? Because some nodes appear only in conditions and not in transitions. Without all_places, these nodes would be missing when we compute valid paths from conditions.

Parameters:transitions (<dict>; keys: names of events; values: lists of transitions as tuples (with input/output places and label)) – Model’s transitions. {u'h00': [('Ax', 'n1', {u'label': u'h00[]'}),]}
Returns:Set of frontier places.
Return type:<set>
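
The note above can be made concrete with a small sketch. It assumes that a frontier place is a place that is never the output of a transition; frontier_places_sketch and the sample data (including the 'h1[Bx]' label) are hypothetical and only mimic the documented structure of the transitions dictionary.

```python
def frontier_places_sketch(transitions, all_places):
    """Return the places that are never produced by any transition."""
    output_places = set()
    for event_transitions in transitions.values():
        # Each transition is a tuple: (origin node, final node, attributes).
        for _origin, target, _attributes in event_transitions:
            output_places.add(target)
    # Starting from all_places keeps the nodes that only occur in conditions.
    return set(all_places) - output_places


transitions = {
    'h00': [('Ax', 'n1', {'label': 'h00[]'})],
    'h1': [('n1', 'n2', {'label': 'h1[Bx]'})],
}
all_places = {'Ax', 'Bx', 'n1', 'n2'}
frontiers = frontier_places_sketch(transitions, all_places)
print(sorted(frontiers))
```

Here 'Bx' never appears as an origin or final node of a transition, so a computation based on (input_places - output_places) alone would miss it.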
cadbiom_cmd.tools.models.get_model_identifier_mapping(model_file, external_identifiers)[source]

Get Cadbiom names corresponding to the given external identifiers (xrefs)

Note

This function works only on v2 formatted models with additional JSON data.

Parameters:
  • model_file (<str>) – Model file.
  • external_identifiers (<set>) – Set of external identifiers to be mapped.
Returns:

Mapping dictionary with external identifiers as keys and cadbiom names as values.

Return type:

<dict <str>:<list>>
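
As an illustration of the returned mapping, here is a minimal sketch that works on already parsed place data rather than on a model file. It assumes each place is a dictionary with a 'cadbiomName' key and an optional 'xrefs' dictionary of database names to identifier lists; map_identifiers and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def map_identifiers(places_data, external_identifiers):
    """Map each requested external identifier to the Cadbiom names that reference it."""
    mapping = defaultdict(list)
    for place in places_data:
        # Places without JSON data (e.g. start nodes) simply have no xrefs.
        for database_identifiers in place.get('xrefs', {}).values():
            found = external_identifiers.intersection(database_identifiers)
            for identifier in found:
                mapping[identifier].append(place['cadbiomName'])
    return dict(mapping)


places_data = [
    {'cadbiomName': 'Ax', 'xrefs': {'database_a': ['ID1', 'ID2']}},
    {'cadbiomName': 'Bx', 'xrefs': {'database_b': ['ID2']}},
    {'cadbiomName': '__start__0'},
]
mapping = map_identifiers(places_data, {'ID2', 'ID3'})
print(mapping)
```

Identifiers that match no place (such as 'ID3' here) are absent from the result in this sketch; the actual function may handle them differently.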

cadbiom_cmd.tools.models.get_places_data(places, model)[source]

Get a list of JSON data parsed from each of the given places in the model.

This function is used by cadbiom_cmd.models.low_model_info().

Note

v1 models return a dict with only 1 key: ‘cadbiomName’

Note

Start nodes (with a name like __start__x) are handled even with no JSON data. They are counted in the other_types and other_locations fields.

Example of JSON data that can be found in the model:
 
{
    "uri": entity.uri,
    "entityType": entity.entityType,
    "names": list(entity.synonyms | set([entity.name])),
    "entityRef": entity.entityRef,
    "location": entity.location.name if entity.location else None,
    "modificationFeatures": dict(entity.modificationFeatures),
    "members": list(entity.members),
    "reactions": [reaction.uri for reaction in entity.reactions],
    "xrefs": entity.xrefs,
}
Parameters:
  • places (<set>) – Iterable of place names.
  • model (<MakeModelFromXmlFile>) – Model from handler.
Returns:

List of data parsed from each given place.

Note

Here is the list of fields retrieved for v2 models:

  • cadbiomName
  • uri
  • entityType
  • entityRef
  • location
  • names
  • xrefs

Return type:

<list <dict>>

cadbiom_cmd.tools.models.get_places_from_condition(condition)[source]

Parse condition string and return all places, regardless of operators.

Note

This function is only used to get all nodes in a condition when we know they are all inhibitors nodes.

Todo

See the workaround in the code, without using very time consuming and badly coded functions.

Param:Condition string.
Type:<str>
Returns:Set of places.
Return type:<set>
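
A possible way to collect places regardless of operators is a simple tokenization. This is only a sketch under the assumption that conditions use the keywords and, or and not together with parentheses; places_in_condition is a hypothetical name and is not the function documented above.

```python
import re

# Assumed logical keywords of a condition string.
OPERATORS = {'and', 'or', 'not'}


def places_in_condition(condition):
    """Return every identifier of the condition that is not a logical keyword."""
    tokens = re.findall(r'[A-Za-z_][A-Za-z0-9_]*', condition)
    return {token for token in tokens if token not in OPERATORS}


places = places_in_condition('(A and not B) or C')
print(sorted(places))
```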
cadbiom_cmd.tools.models.get_transitions(parser)[source]

Get all transitions in the given parser.

There are two methods to access the transitions of a model.

Example:
>>> print(dir(parser))
['handler', 'model', 'parser']
>>> # Direct access
>>> events = list()
>>> for transition in parser.model.transition_list:
...     events.append(transition.event)
>>>
>>> # Indirect access via a handler
>>> events = list()
>>> for transitions in parser.handler.top_pile.transitions:
...     # transitions is a list of CTransition objects
...     for transition in transitions:
...         events.append(transition.event)

Todo

This function could be improved. Although it is useful and required to build networkx graphs from solutions or models, its structure is rather heavy and dates from the time when the Cadbiom API (of transition objects) was unknown and undocumented.

Param:Parser opened on a bcx file.
Type:<MakeModelFromXmlFile>
Returns:A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition. {'h00': [('Ax', 'n1', {'label': 'h00[]'}),]}
Return type:<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>
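
The shape of the returned dictionary can be reproduced with a short sketch. It does not use the Cadbiom parser: group_by_event is a hypothetical helper fed with flat (event, origin, final, condition) records, and the label format event[condition] is an assumption inferred from the 'h00[]' example.

```python
from collections import defaultdict


def group_by_event(records):
    """Group flat transition records into {event: [(origin, final, attributes), ...]}."""
    transitions = defaultdict(list)
    for event, origin, final, condition in records:
        attributes = {
            'label': '%s[%s]' % (event, condition),  # assumed label format
            'condition': condition,
        }
        # Several transitions can share one event, hence the list of tuples.
        transitions[event].append((origin, final, attributes))
    return dict(transitions)


records = [
    ('h00', 'Ax', 'n1', ''),
    ('h00', 'Bx', 'n2', ''),
    ('h1', 'n1', 'n3', 'Cx'),
]
transitions = group_by_event(records)
print(transitions['h00'])
```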
cadbiom_cmd.tools.models.get_transitions_from_model_file(model_file)[source]

Get all transitions and parser from a model file (bcx format).

Param:bcx file.
Type:<str>
Returns:Transitions (see get_transitions()) and the Parser for the model.
Return type:<dict>, <MakeModelFromXmlFile>
cadbiom_cmd.tools.models.parse_condition(condition, all_nodes, inhibitors_nodes)[source]

Return valid paths according to the given logical formula and nodes, and fill inhibitors_nodes.

Note

inhibitors_nodes is modified in place by this function.

Raises:

AssertionError – If no valid path was found.

Parameters:
  • condition (<str>) – Condition string of a transition.
  • all_nodes (<set>) – Nodes involved in transitions + frontier places.
  • inhibitors_nodes (<set>) – Inactivated nodes in paths of conditions. Modified by the function.
Returns:

Set of paths. Each path is a tuple of nodes.

Return type:

<set>

Graphs

This module groups functions directly related to the creation and the management of the graph based on a Cadbiom model.

Here we find high-level functions to create a Networkx graph, and convert it to JSON or GraphML formats.

cadbiom_cmd.tools.graphs.build_graph(solution, steps, transitions)[source]

Build a graph for the given solution.

  • Get & make all needed edges
  • Build graph

Note

Legend:

  • Default nodes: grey
  • Frontier places: red
  • Transition nodes: blue
  • Inhibitors nodes: white
  • Default transition: grey
  • Inhibition edge: red
  • Activation edge: green
Parameters:
  • solution (<str> or <set> or <list>) – Frontier places. String data will be split on spaces.
  • steps (<list <list>>) – List of steps (with events in each step).
  • transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) – A dictionary of events as keys, and transitions as values (see get_transitions()).
Returns:

  • Networkx graph object.
  • Nodes corresponding to transitions with conditions.
  • All nodes in the model
  • Edges between transition node and nodes in condition
  • Normal transitions without condition

Return type:

<networkx.classes.digraph.DiGraph>, <list>, <list>, <list>, <list>

cadbiom_cmd.tools.graphs.draw_graph(output_dir, frontier_places, solution_index, G, transition_nodes, all_nodes, edges_in_cond, edges)[source]

Draw graph with colors and export it in SVG file format.

This function is no longer used but can still be useful.

Note

Legend:

  • red: frontier places (in frontier_places variable),
  • white: middle edges,
  • blue: transition edges
Parameters:
  • output_dir (<str>) – Output directory for GraphML files.
  • frontier_places (<set>) – Solution: a set of frontier places.
  • solution_index (<int> or <str>) – Index of the solution in the Cadbiom result file (used to distinguish exported filenames).
  • G (<networkx.classes.digraph.DiGraph>) – Networkx graph object.
  • transition_nodes (<list>) – Nodes corresponding to transitions with conditions. List of tuples: event, node
  • all_nodes (<list>) – All nodes in the model.
  • edges_in_cond (<list>) – Edges between transition node and nodes in condition
  • edges (<list>) – Normal transitions without condition.
cadbiom_cmd.tools.graphs.export_graph(output_dir, frontier_places, solution_index, G, *args)[source]

Export a networkx graph to GraphML format.

Note

Legend: See build_graph().

Parameters:
  • output_dir (<str>) – Output directory for GraphML files.
  • frontier_places (<set>) – Solution: a set of frontier places. This argument is used to build the filename.
  • solution_index (<int> or <str>) – Index of the solution in the Cadbiom result file (used to distinguish exported filenames).
  • G (<networkx.classes.digraph.DiGraph>) – Networkx graph object.
cadbiom_cmd.tools.graphs.get_json_graph(G)[source]

Translate Networkx graph into a dictionary ready to be dumped in a JSON file.

Note

In classical JSON graph, ids of nodes are their names; also, their position in the array of nodes gives their numerical id, which is used as source or target in edges definitions. Here, for readability and debugging purpose, we use distinct attributes id and label for nodes.

Parameters:graph (<networkx.classes.digraph.DiGraph>) – Networkx graph.
Returns:Serialized graph ready to be dumped in a JSON file.
Return type:<dict>
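
The node/edge convention described in the note can be sketched without NetworkX. The helper to_json_graph and the key names 'nodes' and 'links' are assumptions chosen for illustration; only the idea of separate id and label attributes comes from the note.

```python
import json


def to_json_graph(nodes, edges):
    """Serialize node names and (source, target) pairs into a JSON-ready dict."""
    # The position of a node in the list gives its numerical id.
    ids = {name: index for index, name in enumerate(nodes)}
    return {
        # id and label are kept distinct for readability and debugging.
        'nodes': [{'id': ids[name], 'label': name} for name in nodes],
        # Edges refer to nodes through their numerical ids.
        'links': [
            {'source': ids[source], 'target': ids[target]}
            for source, target in edges
        ],
    }


graph = to_json_graph(['Ax', 'h00', 'n1'], [('Ax', 'h00'), ('h00', 'n1')])
print(json.dumps(graph, indent=2))
```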
cadbiom_cmd.tools.graphs.get_solutions_graph_data(G, info, centralities)[source]

Complete the given dictionary with information specific to the graph considered

Doc:

https://networkx.github.io/documentation/networkx-1.10/reference/algorithms.component.html
https://networkx.github.io/documentation/stable/reference/algorithms/shortest_paths.html
average_shortest_path_length
https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.average_shortest_path_length.html#networkx.algorithms.shortest_paths.generic.average_shortest_path_length
weakly_connected_component_subgraphs
https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.algorithms.components.weakly_connected.weakly_connected_component_subgraphs.html#networkx.algorithms.components.weakly_connected.weakly_connected_component_subgraphs
Measures
https://networkx.github.io/documentation/stable/reference/algorithms/index.html

By default, the following information is added:

- graph_nodes: Number of nodes
- graph_edges: Number of edges
- graph_nodes_places: Number of biological places/entities.
  The graph is a false bipartite graph, we remove the subset of transitions
  in order to have the real count of biological places/entities.

If centralities is True, the following information is added under a new key named “centralities”:

- strongly_connected
- weakly_connected
- max_degree
- min_degree
- average_degree
- degree
- connected_components_number
- connected_components
- average_shortest_paths
Parameters:
  • G (<networkx.classes.digraph.DiGraph>) – NetworkX directed graph
  • info (<dict>) – Dictionary of data to be completed
  • centralities (<boolean>) – Flag to activate the computation of centralities.
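
A few of these measures can be approximated with the standard library alone. The sketch below is not the NetworkX-based implementation: degree_summary is a hypothetical helper working on a plain edge list, it only sees nodes that have at least one edge, and it expects a non-empty edge list.

```python
from collections import Counter


def degree_summary(edges, transition_nodes):
    """Compute node/edge counts and degree statistics from (source, target) pairs."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    values = list(degree.values())
    return {
        'graph_nodes': len(degree),
        'graph_edges': len(edges),
        # Transition nodes are removed to count only biological places/entities.
        'graph_nodes_places': len(set(degree) - set(transition_nodes)),
        'max_degree': max(values),
        'min_degree': min(values),
        'average_degree': sum(values) / float(len(values)),
    }


info = degree_summary(
    [('Ax', 'h00'), ('Bx', 'h00'), ('h00', 'n1')],
    transition_nodes={'h00'},
)
print(info)
```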
cadbiom_cmd.tools.graphs.merge_graphs(graphs)[source]

Merge graphs in the given iterable; count and add the weights to the edges of the final graph

Parameters:graphs (<generator <networkx.classes.digraph.DiGraph>>) – Networkx graph objects.
Returns:Networkx graph object.
Return type:<networkx.classes.digraph.DiGraph>
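
The weighting idea can be sketched on plain edge lists: the weight of an edge is taken here to be the number of graphs that contain it. This reading of "count and add the weights" is an assumption, and merge_edge_lists is a hypothetical stand-in that does not use NetworkX.

```python
from collections import Counter


def merge_edge_lists(graphs):
    """Return a Counter mapping each edge to the number of graphs containing it."""
    weights = Counter()
    for edges in graphs:
        # set() ensures an edge is counted at most once per graph.
        weights.update(set(edges))
    return weights


merged = merge_edge_lists([
    [('Ax', 'h00'), ('h00', 'n1')],
    [('Ax', 'h00'), ('Bx', 'h1')],
])
for edge, weight in sorted(merged.items()):
    print(edge, weight)
```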

Solutions

This module groups functions directly related to the parsing and the management of the files generated by the solver of Cadbiom.

Here we find high-level functions to parse or clean mac files, and extract all their data to JSON, a human-readable data interchange format that is convenient for programming.

Generic functions

Handle *mac_complete.txt files
Handle *mac* files
cadbiom_cmd.tools.solutions.convert_solutions_to_json(sol_steps, transitions, conditions=True)[source]

Convert all events for all solutions in a complete MAC file and write them in a separate file in the JSON format.

This is a function to quickly search all transition attributes involved in a solution.

Example:
>>> from tools.models import get_transitions_from_model_file
>>> # Get transitions (and the parser) from the model file
>>> model_transitions, parser = get_transitions_from_model_file('model.bcx')
>>> decomp_solutions = convert_solutions_to_json(
...     load_solutions('./solution_mac_complete.txt'),
...     model_transitions,
...     conditions=True,
... )
>>> print(decomp_solutions)
[{
    "solution": "Ax Bx",
    "steps": [
        [{
            "event": "_h_2",
            "transitions": [{
                "ext": "n3",
                "ori": "Bx"
            }]
        }],
    ]
}]
Parameters:
  • sol_steps (<list>) – List of steps involved in a solution. See load_solutions(). A tuple of “frontier places” and a list of events in each step. ("Bx Ax", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])
  • transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) – A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition. {'h00': [('Ax', 'n1', {'label': 'h00[]'}),]} See get_transitions().
  • conditions (<bool>) – (Optional) Include the conditions of each transition in the final file.
Returns:

Return the JSON data for the given steps.

Example:

[{
    "solution": "Ax Bx",
    "steps": [
        [{
            "event": "_h_2",
            "transitions": [{
                "ext": "n3",
                "ori": "Bx"
            }]
        }],
    ]
}]

Return type:

<list>
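
The decompilation into the structure shown above can be sketched as follows. solutions_to_json_sketch is a hypothetical helper: it reproduces only the "solution", "steps", "event", "ori" and "ext" keys of the example and leaves out the conditions option.

```python
def solutions_to_json_sketch(sol_steps, transitions):
    """Expand the events of each step into the transitions that define them."""
    decompiled = []
    for solution, steps in sol_steps:
        json_steps = []
        for step in steps:
            json_steps.append([
                {
                    'event': event,
                    # Each transition is a tuple: (origin, final node, attributes).
                    'transitions': [
                        {'ori': ori, 'ext': ext}
                        for ori, ext, _attributes in transitions[event]
                    ],
                }
                for event in step
            ])
        decompiled.append({'solution': solution, 'steps': json_steps})
    return decompiled


transitions = {'_h_2': [('Bx', 'n3', {'label': '_h_2[]'})]}
decomp_solutions = solutions_to_json_sketch([('Ax Bx', [['_h_2']])], transitions)
print(decomp_solutions)
```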

cadbiom_cmd.tools.solutions.get_all_macs(path)[source]

Return a set of all MAC LINES from a directory or from a file.

This function is based on get_solutions() that returns mac lines and stripped mac lines, and get_mac_lines() that returns only mac lines from a file.

Note

In addition, some verifications are done here:

  • Detection of duplicated MACS (AssertionError raised)
  • Print number of MACS per file
  • Print duplicated MACS
  • Print number of MACS
Param:Filepath to be opened and in which solutions will be returned.
Type:<str>
Returns:Set of MAC/CAM from the given path.
Return type:<frozenset <str>>
cadbiom_cmd.tools.solutions.get_mac_lines(filepath)[source]

Return only a set of MAC lines from a single file.

This function is based on get_solutions() that returns mac lines and stripped mac lines.

Note

Prefer get_all_macs(), which:

  • Can handle a directory path and return all MACs in it,
  • Can handle a single file,
  • Performs some verifications on all parsed MACs.

Note

We assume that at this point, all MAC lines are sorted in alphabetical order.

Note

We return LINES not a set of places.

Example:
{'Cx Dx', 'Ax Bx'}
Param:Filepath to be opened and in which solutions will be returned.
Type:<str>
Returns:Set of MAC/CAM from the given file.
Return type:<set <str>>
cadbiom_cmd.tools.solutions.get_query_from_filename(model_file, solution_file)[source]

Return the query string according to the given model and solution filenames

Example:
>>> get_query_from_filename(
...     "/path/model.bcx",
...     "/another_path/model_ENTITY_and_not_ENTITY_mac_complete.txt"
... )
"ENTITY_and_not_ENTITY"
Parameters:
  • model_file (<str>) – Path of a bcx model.
  • solution_file (<str>) – Path of a solution file (*mac* file).
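
One way to obtain the query shown in the example is to strip the model name and the mac suffix from the solution filename. This is a sketch under the assumption that solution files are named <model name>_<query>_mac...; query_from_filenames is a hypothetical name, not the documented function.

```python
import os


def query_from_filenames(model_file, solution_file):
    """Extract the query located between the model name and the '_mac' suffix."""
    model_name = os.path.splitext(os.path.basename(model_file))[0]
    solution_name = os.path.basename(solution_file)
    prefix = model_name + '_'
    if not solution_name.startswith(prefix):
        raise ValueError("Solution file does not match the model name")
    # Split on the last '_mac' so that the query itself may contain underscores.
    query, separator, _suffix = solution_name[len(prefix):].rpartition('_mac')
    if not separator:
        raise ValueError("Solution file is not a *mac* file")
    return query


query = query_from_filenames(
    "/path/model.bcx",
    "/another_path/model_ENTITY_and_not_ENTITY_mac_complete.txt",
)
print(query)
```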
cadbiom_cmd.tools.solutions.get_solutions(file_descriptor)[source]

Generator of solution lines and corresponding stripped lines for *mac* file.

Note

This function does not return events! It only yields original lines and cleaned lines containing solutions (i.e. sets of frontier places/boundaries).

We remove the last '\n' and '\t'. Tabs in the middle are replaced by one space ' '.

Param:Opened file.
Type:<file>
Returns:A generator of tuples; each tuple contains the original line, and the cleaned line.
Example:

For an original line: 'Z\tY\tX\n'

('Z\tY\tX', 'X Y Z')
Return type:<tuple <str>, <str>>
cadbiom_cmd.tools.solutions.load_solutions(file)[source]

Open a file with many solution/MACs (*mac_complete.txt files) and yield them.

Example:
>>> solutions = load_solutions('./solution_mac_complete.txt')
>>> print(list(solutions))
[('Ax Bx', [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])]
Param:

File name

Type:

<str>

Returns:

A generator of tuples of “frontier places” and a list of events in each step.

Example:
("Ax Bx", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])

Return type:

<tuple <str>, <list>>
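
The parsing can be sketched on any iterable of lines. The sketch assumes the layout of complete solution files given in this reference (a line of frontier places followed by step lines prefixed with '%') and it drops empty steps, which is an assumption; iter_solutions is a hypothetical name.

```python
import io


def iter_solutions(lines):
    """Yield (frontier places, list of steps) for each solution found in lines."""
    solution, steps = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('%'):
            # A step line lists the events fired during that step.
            events = line[1:].split()
            if events:
                steps.append(events)
        else:
            # A new solution line closes the previous solution, if any.
            if solution is not None:
                yield solution, steps
            solution, steps = ' '.join(line.split()), []
    if solution is not None:
        yield solution, steps


content = io.StringIO(
    u"Bx  Ax\n% h2 h00\n% h3\n% h0 h1\n% hlast\n"
    u"Bx  Ax\n% h2\n% h3 h00\n% h0 h1\n%\n% hlast\n"
)
solutions = list(iter_solutions(content))
print(solutions)
```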

Display, compare, and query a model

cadbiom_cmd.models.graph_isomorph_test(model_file_1, model_file_2, output_dir=u'graphs/', make_graphs=False, make_json=False)[source]

Entry point for model consistency checking.

This function checks whether the graphs based on the two given models have the same topology and the same node and edge attributes/roles.

Todo

This function should not write any file, and should be exported to the module tools.

Use in scripts:
>>> from cadbiom_cmd.models import graph_isomorph_test
>>> print(graph_isomorph_test('model_1.bcx', 'model_2.bcx'))
INFO: 3 transitions loaded
INFO: 3 transitions loaded
INFO: Build graph for the solution: Connexin_32_0 Connexin_26_0
INFO: Build graph for the solution: Connexin_32_0 Connexin_26_0
INFO: Topology checking: True
INFO: Nodes checking: True
INFO: Edges checking: True
{'nodes': True, 'edges': True, 'topology': True}
Parameters:
  • model_file_1 (<str>) – Filepath of the first model.
  • model_file_2 (<str>) – Filepath of the second model.
Key output_dir:

Output path.

Key make_graphs:
 

If True, make a GraphML file in output path.

Key make_json:

If True, make a JSON dump of results in output path.

Returns:

Dictionary with the results of tests. keys: ‘topology’, ‘nodes’, ‘edges’; values: booleans

Return type:

<dict <str>: <boolean>>

cadbiom_cmd.models.low_graph_info(model_file, graph_data=False, centralities=False)[source]

Low level function for model_graph().

Get JSON data with information about the graph based on the model.

See also

tools.graphs.get_solutions_graph_data().

Parameters:

model_file (<str>) – File for the model.

Key graph_data:

Also return a dictionary with the results of measures on the given graph. keys: measure’s name; values: measure’s value

Example:

{
    'modelFile': 'string',
    'modelName': 'string',
    'events': int,
    'entities': int,
    'transitions': int,
    'graph_nodes': int,
    'graph_edges': int,
    'centralities': {
        'degree': {
            'entity_1': float,
            'entity_2': float
        },
        'strongly_connected': boolean,
        'weakly_connected': boolean,
        'max_degree': int,
        'min_degree': int,
        'average_degree': float,
        'connected_components_number': int,
        'connected_components': list,
        'average_shortest_paths': int,
    }
}
Key centralities:
 

If True, compute centralities (degree, closeness, betweenness).

Returns:

Tuple of tuples from tools.graphs.build_graph(), set of frontier places, and dictionary with the results of measures on the given graph if requested.

Return type:

<tuple>, <str>, <dict>

cadbiom_cmd.models.low_model_info(model_file, all_entities=False, boundaries=False, genes=False, smallmolecules=False)[source]

Low level function for model_info().

Get JSON data with information about the model and its entities.

Todo

  • add dump of transitions (option)
  • See get_transitions remark about its deprecation for the current use case
  • Dump roles of boundaries, computed here or in ChartModel… Already implemented for queries_2_common_graph and for pie charts.

See also

Output format of: tools.solutions.convert_solutions_to_json()

Parameters:model_file (<str>) – File for the model.
Key all_entities:
 If True, data for all places of the model are returned (optional).
Key boundaries:If True, only data for the frontier places of the model are returned (optional).
Key genes:If True, only data for the genes of the model are returned (optional).
Key smallmolecules:
 If True, only data for the smallmolecules of the model are returned (optional).
Returns:Dictionary with information about the model and the queried nodes.
Example:
{
    'modelFile': 'string',
    'modelName': 'string',
    'events': int,
    'entities': int,
    'boundaries': int,
    'transitions': int,
    'entitiesLocations': {
        'cellular_compartment_a': int,
        'cellular_compartment_b': int,
        ...
    },
    'entitiesTypes': {
        'biological_type_a': int,
        'biological_type_b': int,
        ...
    },
    'entitiesData': {
        [{
            'cadbiomName': 'string',
            'immediateSuccessors': ['string', ...],
            'uri': 'string',
            'entityType': 'string',
            'entityRef': 'string',
            'location': 'string',
            'names': ['string', ...],
            'xrefs': {
                'external_database_a': ['string', ...],
                'external_database_b': ['string', ...],
                ...
            }
        }],
        ...
    }
}
Return type:<dict>
cadbiom_cmd.models.model_graph(model_file, output_dir=u'./graphs/', centralities=False, **kwargs)[source]

Get quick information and make a graph based on the model.

Parameters:
  • model_file (<str>) – File for the ‘.bcx’ model.
  • output_dir (<str>) – Output directory.
  • centralities (<boolean>) – If True with --json, compute centralities (degree, in_degree, out_degree, closeness, betweenness).
  • graph (<boolean>) – If True, make a GraphML file based on the graph made from the model (optional).
  • json (<boolean>) – If True, make a JSON dump of results in output path (optional).
cadbiom_cmd.models.model_identifier_mapping(model_file, *args, **kwargs)[source]

Entry point for the mapping of identifiers from external databases

Parameters:model_file (<str>) – File for the model.
Key external_file:
 File with 1 external identifier per line.
Key external_identifiers:
 List of external identifiers to be mapped.
cadbiom_cmd.models.model_info(model_file, output_dir=u'./', all_entities=False, boundaries=False, genes=False, smallmolecules=False, default=True, **kwargs)[source]

Get quick and full information about the model structure and places.

Parameters:model_file (<str>) – File for the ‘.bcx’ model.
Key output_dir:Output directory.
Key all_entities:
 If True, data for all places of the model are returned (optional).
Key boundaries:If True, only data for the frontier places of the model are returned (optional).
Key genes:If True, only data for the genes of the model are returned (optional).
Key smallmolecules:
 If True, only data for the smallmolecules of the model are returned (optional).
Key default:Display quick description of the model (Number of places, transitions, entities types, entities locations).
Key json:If True, make a JSON dump of results in output path (optional).
Key csv:If True, make a CSV dump of information about filtered places.

Merge Minimal Accessibility Conditions

cadbiom_cmd.solution_merge.merge_macs_to_csv(directory, output_dir, csvfile=u'merged_macs.csv')[source]

Merge *mac.txt files from a directory to a csv file.

Structure of the CSV file:
 <Final property formula>;<boundaries in the solution>
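
The two-column structure can be written with the csv module. In this sketch the boundaries of a solution are joined with spaces inside the second column, which is an assumption, and write_merged_macs with its sample formula is hypothetical.

```python
import csv
import io


def write_merged_macs(rows, handle):
    """Write (formula, boundaries) pairs as semicolon separated lines."""
    writer = csv.writer(handle, delimiter=';', lineterminator='\n')
    for formula, boundaries in rows:
        writer.writerow([formula, ' '.join(boundaries)])


buffer = io.StringIO()
write_merged_macs([('A and not B', ['Ax', 'Bx'])], buffer)
print(buffer.getvalue())
```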

Handle generated files

This module provides functions to analyze the output files of Cadbiom.

Entry points:

Example of the content of a complete solution file:
 
Bx  Ax
% h2 h00
% h3
% h0 h1
% hlast
Bx  Ax
% h2
% h3 h00
% h0 h1
%
% hlast
Bx  Ax
% h2
% h3 h00
% h0 h1
% hlast
%
%
Bx  Ax
% h2 h00
% h3
% h0 h1
% hlast
%
%
%
cadbiom_cmd.solution_sort.get_solution_graphs(sol_steps, transitions)[source]

Generator that yields the graphs of the given solutions.

Note

See the doc of a similar function save_solutions_to_graphs().

cadbiom_cmd.solution_sort.occurrence_matrix(output_dir, model_file, path, matrix_filename=u'occurrence_matrix.csv')[source]

Make a matrix of occurrences for the solutions in the given path.

  • Compute occurrences of each place in all mac.txt files.
  • Save the matrix in csv format with the following columns:
    Fieldnames: “patterns (number)/places (number);mac_number;frontier places” Each request (pattern) is accompanied by the number of solutions found.

Todo

Split the creation and writing of the matrix in 2 functions.

Parameters:
  • output_dir (<str>) – Output path.
  • model_file (<str>) – Filepath of the model.
  • path (<str>) – Directory of many complete solutions files.
  • matrix_filename (<str>) – (Optional) Filename of the matrix file.
Returns:

A dictionary with the matrix object. keys: queries, values: occurrences of frontier places

Return type:

<dict>
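
The counting step behind the matrix can be sketched with collections.Counter. The input format (a dictionary of queries mapped to lists of solution lines) and the name count_occurrences are assumptions made for this illustration; file parsing and CSV writing are left out.

```python
from collections import Counter


def count_occurrences(solutions_by_query):
    """Count, for each query, how many solutions contain each frontier place."""
    matrix = {}
    for query, solutions in solutions_by_query.items():
        matrix[query] = Counter(
            place
            for solution in solutions
            for place in solution.split()
        )
    return matrix


matrix = count_occurrences({
    'query_1': ['Ax Bx', 'Ax Cx'],
    'query_2': ['Bx'],
})
print(matrix['query_1'])
```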

cadbiom_cmd.solution_sort.queries_2_common_graph(output_dir, model_file, path, make_graphs=True, make_csv=False, make_json=False, *args, **kwargs)[source]

Entry point for queries_2_common_graph

Create a GraphML formatted file containing a unique representation of all trajectories corresponding to all solutions in each complete MAC file (*mac_complete files).

This is a function to visualize paths taken by the solver from the boundaries to the entities of interest.

CSV fields:

- query: Query giving the solutions
- solutions: nb trajectories/solutions
- boundaries: Number of boundary places
- events: Number of events in all solutions
- genes: Number of genes involved in solutions
- Protein: Number of boundaries with the type Protein
    (genes are not counted)
- Complex: Number of boundaries with the type Complex
    (genes are not counted)
- input_boundaries: Boundaries found only as input places
- guard_boundaries: Boundaries found only in guards
- mixed_boundaries: Boundaries found in guards AND in inputs of reactions
- graph_nodes: Total number of nodes in the graph
- graph_nodes_places: Nodes that are biomolecules (do not count reaction nodes)
- graph_edges: Number of edges
- strongly_connected: Is the graph strongly connected?
- max_degree
- min_degree
- average_degree

This function tests if the given path is a directory or a file.

Parameters:
  • output_dir (<str>) – Output path.
  • model_file (<str>) – Filepath of the model.
  • path (<str>) – Filepath/directory of a/many complete solutions files.
Key make_graphs:
 

(optional) Make a GraphML for each query results in path. default: True

Key make_csv:

(optional) Make a global CSV for all query results in path. default: False

Key make_json:

(optional) Make a JSON dump of each query results in path. default: False

cadbiom_cmd.solution_sort.queries_2_json(output_dir, model_file, path, conditions=True)[source]

Entry point for queries_2_json

Create a JSON formatted file containing all data from complete MAC files (*mac_complete files). The file will contain frontier places/boundaries and decompiled steps with their respective events for each solution.

This is a function to quickly search all transition attributes involved in a solution.

This function tests if the given path is a directory or a file.

Parameters:
  • output_dir (<str>) – Output path.
  • model_file (<str>) – Filepath of the model.
  • path (<str>) – Filepath/directory of a complete solution file.
  • conditions (<boolean>) – (Optional) If False, conditions of transitions will not be present in the JSON file. Only the places/entities used inside trajectories are then kept; thus, inhibitors are excluded.
cadbiom_cmd.solution_sort.queries_2_occcurrence_matrix(output_dir, model_file, path, transposed=False, normalized=False)[source]

Entry point for queries_2_occcurrence_matrix

See occurrence_matrix().

Parameters:
  • output_dir (<str>) – Output path.
  • model_file (<str>) – Filepath of the model.
  • path (<str>) – Directory of many complete solutions files.
  • transposed (<boolean>) – (Optional) Transpose the final matrix (switch columns and rows).
cadbiom_cmd.solution_sort.save_solutions_to_graphs(output_dir, sol_steps, transitions)[source]

Build and export graphs based on the given solutions

Each solution is composed of a set of frontier places and steps, themselves composed of events. We construct a graph based on the transitions that occur in the composition of the events of the given solution.

Parameters:
  • output_dir (<str>) – Output path.
  • sol_steps (<tuple <str>, <list>>) –

    A generator of tuples of “frontier places” and a list of events in each step.

    Example:
    ("Bx Ax", [['h2', 'h00'], ['h3'], ['h0', 'h1'], ['hlast']])
    
  • transitions (<dict <list <tuple <str>, <str>, <dict <str>: <str>>>>) –

    A dictionary of events as keys, and transitions as values. Since many transitions can define an event, values are lists. Each transition is a tuple with: origin node, final node, attributes like label and condition.

    Example:
    {'h00': [('Ax', 'n1', {'label': 'h00[]'}),]}
    
cadbiom_cmd.solution_sort.solutions_2_graphs(output_dir, model_file, path)[source]

Entry point for solutions_2_graphs

Create GraphML formatted files containing a representation of the trajectories for each solution in complete MAC files (*mac_complete files).

This is a function to visualize paths taken by the solver from the boundaries to the entities of interest.

This function tests if the given path is a directory or a file.

Parameters:
  • output_dir (<str>) – Output path.
  • model_file (<str>) – Filepath of the model.
  • path (<str>) – Filepath/directory of a/many complete solutions files.
cadbiom_cmd.solution_sort.solutions_sort(path)[source]

Entry point for sorting solutions.

Read one or more solution files (*mac* files) and sort all frontier places/boundaries in alphabetical order.

This function tests if the given path is a directory or a file.

Warning

The files will be modified in place.

Param:Filepath or directory path containing Cadbiom solutions.
Type:<str>
cadbiom_cmd.solution_sort.sort_solutions_in_file(filepath)[source]

Sort all solutions in the given file in alphabetical order.

Warning

The file is modified in place.

Param:Filepath to be opened and in which solutions will be sorted.
Arg:<str>
cadbiom_cmd.solution_sort.transpose_csv(input_file=u'occurrence_matrix.csv', output_file=u'occurrence_matrix_t.csv')[source]

Useful function to transpose a csv file x,y => y,x

Note

The csv file must be semicolon ‘;’ separated.

Parameters:
  • input_file (<str>) – Input file.
  • output_file (<str>) – Output file transposed.
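
The transposition itself can be sketched with the csv module and zip. transpose_rows is a hypothetical helper that works on open file-like objects instead of filenames; note that zip stops at the shortest row, so the sketch assumes all rows have the same number of columns.

```python
import csv
import io


def transpose_rows(source, target):
    """Read semicolon separated rows from source and write them transposed."""
    rows = list(csv.reader(source, delimiter=';'))
    writer = csv.writer(target, delimiter=';', lineterminator='\n')
    # zip(*rows) turns the i-th column into the i-th row.
    writer.writerows(zip(*rows))


source = io.StringIO(u'a;b\n1;2\n3;4\n')
target = io.StringIO()
transpose_rows(source, target)
print(target.getvalue())
```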
cadbiom_cmd.solution_sort.write_json(output_dir, file_path, file_suffix, data)[source]

Write decompiled solutions to a JSON formatted file.

Called by queries_2_json() and queries_2_common_graph().

Parameters:
  • output_dir (<str>) – Output directory
  • file_path (<str>) – Filepath of the original solution file. We extract the basename in order to name the JSON file.
  • file_suffix (<str>) – String added to the solution filename. Ex: filename + file_suffix + “.json”
  • data (<list> or <dict> or <whatever>) – Data to be serialized in JSON

Make an interaction graph based on molecules of interest

This module groups functions directly related to the design of an interaction weighted graph based on the search of molecules of interest.

Entry point: json_2_interaction_graph().

cadbiom_cmd.interaction_graph.build_graph(output_dir, all_genes, all_stimuli, genes_interactions, stimulis_interactions, genes_stimuli_interactions, molecule_stimuli_interactions)[source]

Make an interaction weighted graph based on the search of molecules of interest

Edges:
  • gene - gene: Two genes present simultaneously in a solution
  • stimulus - stimulus: Two stimuli present simultaneously in a solution
  • gene - stimulus: One gene and one stimulus present simultaneously in a solution (deprecated)
  • molecule of interest - stimulus: A molecule of interest in a trajectory related to a solution that contains a stimulus.
Legend of the edges:
 
  • gene - gene: red
  • stimulus - stimulus: blue
  • gene - stimulus: red (deprecated)
  • molecule of interest - stimulus: yellow
Legend of the nodes:
 
  • genes: red
  • stimuli: blue
  • molecules of interest: yellow
Parameters:
  • output_dir (<str>) – Output path.
  • all_genes (<set>) – All genes in all the solutions
  • all_stimuli (<set>) – All stimuli in all the solutions
  • genes_interactions (<Counter>) – Interactions between genes in the same solution
  • stimulis_interactions (<Counter>) – Interactions between stimuli in the same solution
  • genes_stimuli_interactions (<Counter>) – Interactions between genes and stimuli in the same solution
  • molecule_stimuli_interactions (<Counter>) – Counter of interactions between molecules of interest and the frontier places that are not genes (i.e. stimuli) found in trajectories, counted as (molecule, stimulus) pairs.
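
To make the relation between the Counters and the colored, weighted edges more concrete, here is a minimal sketch. It assumes that the weight of an edge is the number of co-occurrences stored in the Counter, which this page does not confirm; weighted_edges and EDGE_COLORS are hypothetical names. The resulting (node, node, attributes) triples have the shape accepted by networkx's add_edges_from(), but no graph library is required to run the sketch.

```python
from collections import Counter

# Edge colors taken from the legend above.
EDGE_COLORS = {
    "gene-gene": "red",
    "stimulus-stimulus": "blue",
    "molecule-stimulus": "yellow",
}


def weighted_edges(interactions, kind):
    """Turn a Counter of (node, node) pairs into (node, node, attributes) edges.

    ASSUMPTION: the edge weight is the number of co-occurrences.
    """
    color = EDGE_COLORS[kind]
    return [
        (source, target, {"weight": count, "color": color})
        for (source, target), count in sorted(interactions.items())
    ]


# Made-up interaction counts for illustration only.
genes_interactions = Counter({("geneA", "geneB"): 3})
molecule_stimuli_interactions = Counter({("A", "stim1"): 2})

edges = weighted_edges(genes_interactions, "gene-gene")
edges += weighted_edges(molecule_stimuli_interactions, "molecule-stimulus")
```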
cadbiom_cmd.interaction_graph.build_interactions(filtered_macs, binary_interactions)[source]

Make binary interactions used by the graph as edges

Note: genes and stimuli are frontier places.

Parameters:
  • filtered_macs (<tuple <tuple <str>>>) –

    All solutions related to the molecules of interest.

    (("frontier_1", "frontier_2", "frontier_3"),)
    
  • binary_interactions (<dict <str>: <Counter <str>: <int>>>) –

    A dictionary of related frontier places.

    # For molecules of interest "A" and "B"
    {"A": {
       "frontier_1": 1,
       "frontier_2": 1,
     },
     "B": {
       "frontier_3": 1,
     },
    }
    
Returns:

Sets of frontier places and Counters of binary interactions:

  • all_genes: All genes in all the solutions
  • all_stimuli: All stimuli in all the solutions
  • genes_interactions: Interactions between genes in the same solution
  • stimulis_interactions: Interactions between stimuli in the same solution
  • genes_stimuli_interactions: Interactions between genes and stimuli in the same solution
  • molecule_stimuli_interactions: Counter of interactions between molecules of interest and the frontier places that are not genes (i.e. stimuli) found in trajectories, counted as (molecule, stimulus) pairs.

Return type:

<set>, <set>, <Counter>, <Counter>, <Counter>, <Counter>
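
The pair counting described above can be sketched with itertools.combinations and collections.Counter. This is not the library's implementation: the rule used to tell genes from stimuli (a "_gene" suffix) is an assumption made for the example, and is_gene, count_pairs and molecule_stimulus_pairs are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def is_gene(place):
    # ASSUMPTION: genes are recognised by a "_gene" suffix (hypothetical rule).
    return place.endswith("_gene")


def count_pairs(solutions):
    """Count gene-gene and stimulus-stimulus pairs found in the same solution."""
    all_genes, all_stimuli = set(), set()
    genes_interactions, stimuli_interactions = Counter(), Counter()
    for solution in solutions:
        # Sorting gives each pair a single canonical orientation.
        genes = sorted(p for p in solution if is_gene(p))
        stimuli = sorted(p for p in solution if not is_gene(p))
        all_genes.update(genes)
        all_stimuli.update(stimuli)
        genes_interactions.update(combinations(genes, 2))
        stimuli_interactions.update(combinations(stimuli, 2))
    return all_genes, all_stimuli, genes_interactions, stimuli_interactions


def molecule_stimulus_pairs(binary_interactions):
    """Flatten {molecule: {frontier: count}} into (molecule, stimulus) counts."""
    pairs = Counter()
    for molecule, frontier_counts in binary_interactions.items():
        for place, count in frontier_counts.items():
            if not is_gene(place):
                pairs[(molecule, place)] += count
    return pairs


macs = (("A_gene", "B_gene", "s1"), ("A_gene", "B_gene", "s2", "s1"))
genes, stimuli, gene_pairs, stimulus_pairs = count_pairs(macs)
molecule_pairs = molecule_stimulus_pairs({"A": {"s1": 2, "A_gene": 1}})
```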

cadbiom_cmd.interaction_graph.filter_trajectories(trajectories, molecules_of_interest)[source]

Get solutions and count frontier places related to the given molecules of interest.

Parameters:
  • trajectories (<generator <tuple <tuple>, <set>>>) –

    A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.

    (("Ax", "Bx"), {"n3", "Bx"})
    
  • molecules_of_interest (<tuple>) – Iterable of molecules of interest.
Returns:

A tuple of all solutions related to the molecules of interest, and a dictionary of related frontier places and their occurrences for each molecule of interest.

# For molecules of interest "A" and "B"
((("frontier_1", "frontier_2", "frontier_3"),),
 {"A": {
    "frontier_1": 1,
    "frontier_2": 1,
  },
  "B": {
    "frontier_3": 1,
  },
 })

Return type:

<tuple <tuple <tuple <str>>>, <dict <str>: <Counter <str>: <int>>>>
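
A possible reading of this filtering step is sketched below. It assumes that a trajectory is "related" to a molecule of interest when that molecule appears in the set of places involved in its transitions; this criterion is an interpretation, not something confirmed by this page, and filter_trajectories_sketch is a hypothetical name.

```python
from collections import Counter, defaultdict


def filter_trajectories_sketch(trajectories, molecules_of_interest):
    """Keep solutions whose trajectory involves a molecule of interest.

    ASSUMPTION: "related" means the molecule is in the set of places
    involved in the transitions of the trajectory.
    """
    filtered = []
    binary_interactions = defaultdict(Counter)
    for frontier_places, involved_places in trajectories:
        matched = [m for m in molecules_of_interest if m in involved_places]
        if not matched:
            continue
        filtered.append(frontier_places)
        for molecule in matched:
            # Count each frontier place of the solution for this molecule.
            binary_interactions[molecule].update(frontier_places)
    return tuple(filtered), dict(binary_interactions)


trajectories = [
    (("Ax", "Bx"), {"n3", "Bx"}),
    (("Cx",), {"n4"}),
]
solutions, related = filter_trajectories_sketch(trajectories, ("n3",))
```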

Read decompiled solution files (*.json* files)

This function tests whether the given path is a directory or a file.

Parameters:path (<str>) – Filepath of a decompiled JSON file, or directory containing such files.
Returns:A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.
(("Ax", "Bx"), {"n3", "Bx"})
Return type:<generator <tuple <tuple>, <set>>>

Get frontier places and other places involved in transitions.

Parameters:
  • file_path (<str>) –

    Path of a JSON file; this file is generated by convert_solutions_to_json().

    A solution is composed of steps; each step is composed of events, and each event is composed of transitions:
     
    [{
        "solution": "Ax Bx",
        "steps": [
            [
                {
                    "event": "_h_2",
                    "transitions": [{
                        "ext": "n3",
                        "ori": "Bx"
                    }]
                }
            ]
        ]
    }]
    
Returns:

A generator of tuples with tuple of frontier places as keys and set of places involved in transitions as values.

(("Ax", "Bx"), {"n3", "Bx"})

Return type:

<generator <tuple <tuple>, <set>>>
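
The JSON layout shown above can be walked as in the following minimal sketch, which reproduces the documented return value for the documented example. It parses a string rather than a file, and trajectories_from_json is a hypothetical name; splitting the "solution" field on whitespace is an assumption based on the example.

```python
import json


def trajectories_from_json(text):
    """Yield (frontier places, places involved in transitions) per solution."""
    for solution in json.loads(text):
        # ASSUMPTION: frontier places are whitespace-separated in "solution".
        frontier_places = tuple(solution["solution"].split())
        involved = set()
        for step in solution["steps"]:
            for event in step:
                for transition in event["transitions"]:
                    involved.add(transition["ori"])
                    involved.add(transition["ext"])
        yield frontier_places, involved


example = """
[{
    "solution": "Ax Bx",
    "steps": [[{"event": "_h_2",
                "transitions": [{"ext": "n3", "ori": "Bx"}]}]]
}]
"""
parsed = list(trajectories_from_json(example))
```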

cadbiom_cmd.interaction_graph.json_2_interaction_graph(output_dir, molecules_of_interest, path)[source]

Entry point for json_2_interaction_graph

Read decompiled solution files (*.json* files produced by the directive queries_2_json) and make a graph of the relationships between one or more molecules of interest, the genes and other frontier places/boundaries found among all the solutions.

More information about the graph and its legend: build_graph().

Parameters:
  • output_dir (<str>) – Output path.
  • molecules_of_interest (<tuple>) – Iterable of molecules of interest.
  • path (<str>) – Filepath of a JSON solution file, or directory containing such files.

Make heatmaps

Module used to create a hierarchically-clustered heatmap of boundaries.

cadbiom_cmd.queries_2_clustermap.draw_matrix_heatmap(df, filepath)[source]

Draw and save clustermap from the given dataframe

Parameters:
  • df (<pandas.core.frame.DataFrame>) – Pandas dataframe
  • filepath (<str>) – Filepath of the matrix. Used to build the SVG file.
cadbiom_cmd.queries_2_clustermap.open_dataframe(filepath)[source]

Get Pandas dataframe from CSV file

Because yes, pandas knows how to open a CSV file (unlike R). It’s awesome. Don’t teach this in bio-info please. You should always prefer complex and legacy technologies; it makes you smart (especially for the first ones ><).

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Returns:Pandas dataframe
Return type:<pandas.core.frame.DataFrame>
cadbiom_cmd.queries_2_clustermap.payload(output_dir, filepath)[source]

Make a clustermap based on an occurrence matrix for the given solution file

Parameters:
  • output_dir (<str>) – Output path.
  • filepath (<str>) – Solution filepath.
cadbiom_cmd.queries_2_clustermap.queries_2_clustermap(output_dir, path, *args, **kwargs)[source]

Entry point for queries_2_clustermap

Create a hierarchically-clustered heatmap of boundaries in mac files.

Parameters:
  • output_dir (<str>) – Output path.
  • path (<str>) – Filepath or directory of one or more complete solution files.
cadbiom_cmd.queries_2_clustermap.write_matrix(filepath, output_dir)[source]

Make an occurrence matrix of boundaries found in the given solution file

Example of CSV produced:

  • Columns: Frontier places
  • Rows: One per solution, with a ‘1’ in each column whose frontier place occurs in that solution and a ‘0’ elsewhere.
solution_number;boundary_1;boundary_2;...
1;0;1;...
2;1;0;...
Parameters:
  • filepath (<str>) – Solution filepath.
  • output_dir (<str>) – Output path.
Returns:

Filepath of the CSV file produced. Filename is of the form <solution_file>_sol_matrix.csv

Return type:

<str>
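
The occurrence matrix format shown above can be produced as in the following minimal sketch. It works on solutions already parsed into tuples of frontier places and returns the CSV text instead of writing a file; occurrence_matrix_csv is a hypothetical name, and the alphabetical ordering of the columns is an assumption.

```python
def occurrence_matrix_csv(solutions):
    """Build a semicolon-separated occurrence matrix, one row per solution."""
    # ASSUMPTION: columns are the boundaries in alphabetical order.
    boundaries = sorted({place for solution in solutions for place in solution})
    lines = [";".join(["solution_number"] + boundaries)]
    for number, solution in enumerate(solutions, 1):
        present = set(solution)
        cells = ["1" if boundary in present else "0" for boundary in boundaries]
        lines.append(";".join([str(number)] + cells))
    return "\n".join(lines) + "\n"


matrix = occurrence_matrix_csv([("boundary_2",), ("boundary_1",)])
```

With the two single-place solutions used here, the result matches the first three lines of the CSV example above (without the trailing "..." columns).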