Documentation of the package for developers

biopax_converter

This module is used to translate BioPAX data to CADBIOM models.

biopax2cadbiom.biopax_converter.addCadbiomSympyCondToReactions(dictReaction, dictPhysicalEntity)[source]

Elaborate condition for each event attached to a reaction.

Note

Condition: i.e., the guard of a transition in the Cadbiom formalism.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_cadbiom_names_to_entities(dictPhysicalEntity)[source]

Add ‘listOfCadbiomNames’ attribute to entities.

The aim is to have the list of elements contained in each entity, together with their names.

Note

We process essentially entities with subunits: components or members (complexes or classes).

Note

The attribute ‘listOfCadbiomNames’ corresponds to a list of unique cadbiom IDs for the entity (Complex, Class). Each member of the list is the unique cadbiom ID of each subcomponent present in the attribute ‘flat_components’.

Warning

To fill ‘listOfCadbiomNames’, we first handle complexes that can be classes; BUT classes are not necessarily complexes (without ‘flat_components’), so a recursive decomposition is made. For that, see get_cadbiom_names()

Note

Because complexes are already developed in developComplexEntity(), entities of this type do not have to be decompiled recursively here.

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_controllers_to_reactions(dictReaction, dictControl)[source]

Fill the attribute controllers of Reaction objects with Control objects.

Note

Thanks to filter_controls(), pathways are removed from the controllers; only entities remain in controllers.

Note

The controllers attribute of a Reaction corresponds to a set of controller entities involved in it.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
biopax2cadbiom.biopax_converter.add_locations_to_entities(dictPhysicalEntity, dictLocation)[source]

Add Location objects to PhysicalEntities

biopax2cadbiom.biopax_converter.add_modifications_features_to_entities(dictPhysicalEntity, dictModificationFeatures)[source]

Add modifications and their number to the entity name

biopax2cadbiom.biopax_converter.add_reactions_and_controllers_to_entities(dictReaction, dictControl, dictPhysicalEntity)[source]

Fill the attribute reactions of PhysicalEntity objects with Reaction and Control objects.

Note

The reactions attribute corresponds to a set of reactions in which the entity is involved (as controller or participant). We use this attribute in order to know if complexes have to be deconstructed (only if a subentity is used elsewhere in a reaction).

Note

Supported roles in reactions are:

  • productComponent
  • participantComponent
  • leftComponents
  • rightComponents
  • controller of

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_unique_cadbiom_name_to_entities(dictPhysicalEntity)[source]

Add cadbiomName attribute to entities in dictPhysicalEntity.

Note

The attribute cadbiomName corresponds to a unique cadbiom ID for the entity (Protein, Complex, Class, etc.).

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_xrefs_to_entities(dictPhysicalEntity, dictEntities_db_refs)[source]

Add xrefs to entities

Parameters:dictEntities_db_refs (<dict <str>: <dict <str>: <list>>>) – Dictionary of entityRefs. keys: uris; values: dict of databases keys: database names; values: ids
biopax2cadbiom.biopax_converter.clean_name(name)[source]

Clean name for correct cadbiom parsing.

biopax2cadbiom.biopax_converter.compute_locations_names(dictLocation, numeric_compartments_names=False)[source]

Create a cadbiom ID for each location.

Warning

It updates the key ‘cadbiomName’ of entities in dictLocation[location].

Parameters:
  • dictLocation (<dict>) – Dictionary of BioPAX locations created by query.getLocations(). keys: CellularLocationVocabulary uri; values: Location object
  • numeric_compartments_names (<bool>) – (optional) If True, names of compartments will be based on numeric values instead of their real names.
Returns:

Dict of encoded locations. keys: numeric value or real location name; values: Location object

Return type:

<dict <str>:<Location>>

biopax2cadbiom.biopax_converter.createControlFromEntityOnBothSides(dictReaction, dictControl)[source]

Remove entities on both sides of reactions and create a control instead.

We assume that an entity present in both the reagents and the products is in fact a catalyst without which the reaction cannot take place.

We remove this entity from the reaction and add an ACTIVATION controller to the list of BioPAX Controls.

Note

This function must be called before adding reactions to entities.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
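
As a rough illustration of the idea behind createControlFromEntityOnBothSides() (not the module's actual code), the sketch below works on plain sets of URIs instead of Reaction and Control objects. The helper name split_entities_on_both_sides, the sample URIs and the dictionary layout of the created controls are all assumptions made for this example.

```python
def split_entities_on_both_sides(left, right):
    """Separate entities found on both sides of a reaction.

    Hypothetical helper: `left` and `right` are sets of entity URIs.
    Returns the cleaned left side, the cleaned right side and the
    set of entities that appeared on both sides.
    """
    on_both_sides = left & right
    return left - on_both_sides, right - on_both_sides, on_both_sides


left, right, catalysts = split_entities_on_both_sides(
    {"uri#A", "uri#E"}, {"uri#B", "uri#E"}
)

# One ACTIVATION control is created per entity removed from the reaction.
# The dict layout is an assumption; the real code builds Control objects.
new_controls = [
    {"controller": uri, "controlType": "ACTIVATION", "reaction_uri": "uri#R1"}
    for uri in sorted(catalysts)
]
print(left, right, new_controls)
```

Here the entity uri#E is dropped from both sides and becomes the controller of the reaction.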
biopax2cadbiom.biopax_converter.detectMembersUsedInEntities(dictPhysicalEntity, convertFullGraph=False, keepEmptyClasses=False)[source]

Set the attribute ‘membersUsed’ of generic entities (classes).

Note

The value is False if the entity does not have members; if at least one member is involved in a reaction the value is True.

Warning

A generic entity can be any of the subclasses of PhysicalEntity (a complex can also be a class); BUT complexes are the only entities whose ‘flat_components’ value is ALWAYS != None. (Exception: complexes that are also classes; we check that these entities have no flat_components in develop_complexes().)

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • convertFullGraph (<bool>) – (optional) Convert all entities to cadbiom nodes, even the entities that are not used elsewhere.
  • keepEmptyClasses – (optional) (deprecated) If some members are not used, we add the entity to the membersUsed attribute with the aim to represent all the members not used. => This will break some conversions and unit tests because Translation implies the removal of genericity.
biopax2cadbiom.biopax_converter.developComplexEntity(complex_entity, dictPhysicalEntity, new_physical_entities)[source]

Fill flat_components attribute of the given complex.

Note

Search recursively all components of the given complex.

Some complexes have subcomplexes, as in Reactome v56 from PC8. Example:

  • Complex_c33f6c2be7551100a54e716b3bf8ec8a:
  • Complex_0088fc0fe989a0b0abc3635b20df8d90
  • Complex_b87d9cb2e60df79cdde88a9f8f45e80d

Warning

Here we handle ONLY COMPLEXES! Even if some complexes have sub-entities that are classes, flat_components contains ONLY uris of atomic entities. ONLY flat_components_primitives contains generic entities. In flat_components_primitives, we just want items (including generic ones) in the same order as any flat_component in flat_components.

Warning

Complexes are the only entities that have a flat_components attribute set. However, Complexes that are also classes should have an empty flat_components.

Note

Some complexes are classes, and some of these classes may have components. Before erasing the components from such a class, we produce a new complex: a copy of the class that keeps its components but has no members.

  • The new complex is added to new_physical_entities and must be added later to dictPhysicalEntity.
  • The class is left in dictPhysicalEntity.

Note

Empty complexes (without components) are processed like any basic entity. Cf VirtualCase19: ‘B_bottom’

Todo

When a class occurs multiple times through components of complexes we should remove it and make a set of primitives. This will avoid cartesian product of members, duplication of complexes on useless flat_components and multiple replacements in replace_and_build(). Cf VirtualCase19: ‘B’ class in C_top and C_bottom.

Note

Full explanations: developed_components is a list of tuples that contain combinations of all recursively searched sub-entities in the given complex.

developed_classes is a list of primitive sub-entities in the given complex. Classes are not replaced by their members. Entities are in the same order as in a flat_component. The aim is to dynamically rebuild the flat_component of a complex when we remove genericity in replace_and_build().

Example:

A: complex composed with components:
    B: complex with components:
        W: protein
        X: generic smallmolecule with members:
            Y: smallmolecule (used elsewhere)
            Z: smallmolecule (not used elsewhere)
    C: protein

For X: developed_components = [X, Y] (edit: just [Y] now)
For W: developed_components = [W]
So for B: developed_components = [[X, Y], [W]] (edit: [[Y], [W]])
and flat_components = [(X, W), (Y, W)] (edit: [(Y, W)])
For A: developed_components = [[C], [(X, W), (Y, W)]] (edit: [[C], [(Y, W)]])
and flat_components = [(C, X, W), (C, Y, W)]
where X is a class that represents Z, and Y a true smallmolecule
(edit: now X is removed, and the final result is [(C, Y, W)])

If Z had been used elsewhere, we would have had the following
final result for developed_components of A:
[[C], [(Y, W), (Z, W)]]
and flat_components:
[(C, Y, W), (C, Z, W)]

developed_classes = [C, X, W]
flat_components_primitives = [C, X, W]

PS:
'A' can be Complex_6e3d8ef563cbcc0c9e2a4afb2a920c38
(Reactome v56 in PC8); in this complex, Z is also used,
so X is totally removed.
Parameters:
  • complex_entity (<PhysicalEntity>) – Complex entity
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
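
The combination step described in the example of developComplexEntity() behaves like a cartesian product over the developed sub-entities. The sketch below reproduces the case where both Y and Z are used elsewhere; it uses plain strings instead of URIs, and the helper names flatten and combine are hypothetical, not part of the module.

```python
from itertools import product


def flatten(item):
    """Yield atomic names from a name or a (possibly nested) tuple of names."""
    if isinstance(item, tuple):
        for sub_item in item:
            for name in flatten(sub_item):
                yield name
    else:
        yield item


def combine(developed_components):
    """Return one flat tuple per combination of the developed sub-entities."""
    return [
        tuple(flatten(combination))
        for combination in product(*developed_components)
    ]


# B: the class X is replaced by its used members Y and Z; W is a protein.
flat_b = combine([["Y", "Z"], ["W"]])
# A: C is a protein; B contributes its own flat components.
flat_a = combine([["C"], flat_b])

print(flat_b)  # [('Y', 'W'), ('Z', 'W')]
print(flat_a)  # [('C', 'Y', 'W'), ('C', 'Z', 'W')]
```

The product grows multiplicatively with the number of members of each class, which is the duplication that the Todo of this function aims to avoid.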
biopax2cadbiom.biopax_converter.develop_complexes(dictPhysicalEntity)[source]

Set the attribute ‘flat_components’ of complex entities.

Note

‘flat_components’ is a list of tuples of component URIs.

Note

This function depends on detectMembersUsedInEntities()

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.filter_controls(controls, pathways_names, blacklisted_entities)[source]

Remove pathways and cofactors from controls and keep other entities.

Note

Entities that control pathways are also removed.

Note

We want ONLY entities and by default, there are pathways + entities.

Parameters:
  • arg1 (<dict>) – Dict of Controllers. keys: URIs; values: <Control>
  • arg2 (<dict>) – Dict of pathways URIs and names. keys: URIs; values: names (or uri if no name)
Returns:

Filtered controllers dict.

Return type:

<dict>

biopax2cadbiom.biopax_converter.filter_entities(dictPhysicalEntity, blacklisted_entities)[source]

Remove blacklisted entities from BioPAX entities.

Note

Blacklisted entities are removed from dictPhysicalEntity, from components and from members.

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • blacklisted_entities (<set>) – set of entity uris blacklisted
Returns:

Dictionary of BioPAX physicalEntities without blacklisted entities

Return type:

<dict <str>: <PhysicalEntity>>

biopax2cadbiom.biopax_converter.filter_reactions(dictReaction, blacklisted_entities)[source]

Remove blacklisted entities from reactions.

Note

Effects:

  • productComponent and participantComponent can be set to None
  • blacklisted entities are removed from leftComponents and rightComponents

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • blacklisted_entities (<set>) – set of entity uris blacklisted
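
The two effects listed for filter_reactions() can be sketched as follows. This is a simplified illustration that represents a reaction as a plain dict rather than a Reaction object; the helper name filter_reaction and the sample URIs are assumptions.

```python
def filter_reaction(reaction, blacklisted_entities):
    """Remove blacklisted URIs from a reaction given as a dict.

    Hypothetical helper: single-valued roles are reset to None,
    set-valued roles lose their blacklisted members.
    """
    for role in ("productComponent", "participantComponent"):
        if reaction.get(role) in blacklisted_entities:
            reaction[role] = None
    for role in ("leftComponents", "rightComponents"):
        reaction[role] = set(reaction.get(role, ())) - blacklisted_entities
    return reaction


reaction = {
    "productComponent": "uri#ATP",
    "participantComponent": "uri#P1",
    "leftComponents": {"uri#A", "uri#ATP"},
    "rightComponents": {"uri#B"},
}
filtered = filter_reaction(reaction, {"uri#ATP"})
print(filtered)
```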
biopax2cadbiom.biopax_converter.find_unique_synonyms(cadbiom_name, entity_uris, unique_cadbiom_names, dictPhysicalEntity)[source]

Build unique names for the given uris, having the same cadbiom name.

Note

First, we use synonyms from the BioPAX database to find a unique name. When there are no more usable synonyms to build a unique name, we add a version number based on the given cadbiom name for all the remaining entities.

Note

The merging procedure for similar entities greatly reduces the number of entity groups proposed to this function.

Parameters:
  • cadbiom_name (<str>) – The redundant cadbiom name
  • entity_uris (<set>) – Set of uris of entities having the same name
  • unique_cadbiom_names (<set>) – Set of unique cadbiom names already used
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Dictionary of uris as keys and unique names as values.

Return type:

<dict>
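
The two-stage strategy of find_unique_synonyms() (synonyms first, then version numbers) might look like the sketch below. It is only an approximation: the helper name build_unique_names, the mapping synonyms_by_uri and the "_v<N>" suffix format are assumptions, and the real function reads synonyms from dictPhysicalEntity.

```python
def build_unique_names(cadbiom_name, synonyms_by_uri, used_names):
    """Map each URI to a name that is not yet in `used_names`.

    Hypothetical helper: `synonyms_by_uri` maps entity URIs to sets of
    synonyms; `used_names` is updated in place.
    """
    unique_names = {}
    without_synonym = []
    for uri in sorted(synonyms_by_uri):
        free_synonyms = sorted(synonyms_by_uri[uri] - used_names)
        if free_synonyms:
            unique_names[uri] = free_synonyms[0]
            used_names.add(free_synonyms[0])
        else:
            without_synonym.append(uri)

    version = 1
    for uri in without_synonym:
        # Assumed suffix format; skip versions that are already taken.
        while "{}_v{}".format(cadbiom_name, version) in used_names:
            version += 1
        name = "{}_v{}".format(cadbiom_name, version)
        unique_names[uri] = name
        used_names.add(name)
    return unique_names


names = build_unique_names(
    "A",
    {"uri#1": {"A_alpha"}, "uri#2": {"A_alpha"}, "uri#3": set()},
    {"A"},
)
print(names)
```

The first entity takes the free synonym; the two others fall back to versioned names.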

biopax2cadbiom.biopax_converter.getCadbiomName(entity, synonym=None)[source]

Get entity name formatted for Cadbiom.

Parameters:
  • arg1 (<PhysicalEntity>) – PhysicalEntity for which the name will be encoded.
  • arg2 (<dict>) – Dictionary of BioPAX locations created by query.getLocations(). keys: CellularLocationVocabulary uri; values: Location object
  • arg3 (<str>) – (Optional) Synonym that will be used instead of the name of the given entity.
Returns:

Encoded name with location if it exists.

Return type:

<str>

biopax2cadbiom.biopax_converter.getPathwayToPhysicalEntities(dictReaction, dictControl, dictPhysicalEntity)[source]

This function creates the dictionary pathwayToPhysicalEntities.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

pathwayToPhysicalEntities keys: pathway uris; values set of entities involved in the pathway.

Return type:

<dict <str>: <set>>

biopax2cadbiom.biopax_converter.getTransitions(dictReaction, dictPhysicalEntity)[source]

Return transitions with (ori/ext nodes) and their respective events.

Warning

dictPhysicalEntity is modified in place. We add “virtual nodes” for genes that are not in BioPAX format.

Todo

handle TRASH nodes => will crash cadbiom writer because they are not entities…

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionnary of biopax reactions, created by the function query.getReactions()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Dictionary of transitions and their respective set of events.

Example:
subDictTransition[(cadbiomL, right)].append({
    'event': transition['event'],
    'reaction': reaction,
    'sympyCond': transitionSympyCond
})

Return type:

<dict <tuple <str>, <str>>: <list <dict>>>

biopax2cadbiom.biopax_converter.get_cadbiom_names(entity, dictPhysicalEntity)[source]

To be called recursively or by add_cadbiom_names_to_entities()

Note

See add_cadbiom_names_to_entities() for more information.

Note

The attribute ‘listOfCadbiomNames’ corresponds to a list of unique cadbiom IDs for the entity (Complex, Class). Each member of the list is the unique cadbiom ID of each subcomponent present in the attribute ‘flat_components’.

Parameters:
  • entity (<PhysicalEntity>) – A PhysicalEntity.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Set of cadbiom names for the given entity.

Return type:

<set>

biopax2cadbiom.biopax_converter.get_control_group_condition(controls, dictPhysicalEntity)[source]

Get condition for a group of controllers.

Activators are linked by a logical ‘OR’, inhibitors are linked by a logical ‘OR’, but activators and inhibitors are linked together by a logical ‘AND’.

Warning

controlType can be as follows (* are currently supported because they are general terms; others are from EcoCyc and will raise an exception):

  • ACTIVATION*
  • INHIBITION*
  • INHIBITION-ALLOSTERIC
  • INHIBITION-COMPETITIVE
  • INHIBITION-IRREVERSIBLE
  • INHIBITION-NONCOMPETITIVE
  • INHIBITION-OTHER
  • INHIBITION-UNCOMPETITIVE
  • ACTIVATION-NONALLOSTERIC
  • ACTIVATION-ALLOSTERIC

Note

Controllers/classes are processed in get_cadbiom_names(). Here we just use listOfCadbiomNames to distinguish entities.

Parameters:
  • controls (<set <Control>>) – Set of Control objects.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Sympy condition.

Return type:

<sympy.core.symbol.Symbol>
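
To make the combination rule of get_control_group_condition() concrete, the sketch below builds a textual guard from entity names. It deliberately uses plain strings instead of the Sympy objects returned by the real function, and it assumes that inhibitors contribute through the negation of their disjunction; neither the helper name group_condition nor this negation is confirmed by the documentation above.

```python
def group_condition(activators, inhibitors):
    """Build a textual guard from names of activators and inhibitors.

    Hypothetical helper: activators are OR-ed together, inhibitors are
    OR-ed together, and both groups are joined with AND.
    """
    parts = []
    if activators:
        parts.append("(" + " or ".join(sorted(activators)) + ")")
    if inhibitors:
        # Assumption: an inhibitor group acts through its negation.
        parts.append("not (" + " or ".join(sorted(inhibitors)) + ")")
    return " and ".join(parts)


condition = group_condition({"A", "B"}, {"C", "D"})
print(condition)  # (A or B) and not (C or D)
```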

biopax2cadbiom.biopax_converter.load_blacklisted_entities(blacklist)[source]

Get all URIs of blacklisted elements in the given file.

Note

The CSV file can be written with either of the following delimiters: ‘,;’. The first column must contain the URI; in the second column users can put the corresponding cadbiom name (currently not used).

Param:Blacklist filename.
Type:<str>
Returns:Set of uris.
Return type:<set>
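
A minimal sketch of the blacklist format read by load_blacklisted_entities(), assuming that URIs themselves contain neither ‘,’ nor ‘;’. The helper name read_blacklisted_uris is hypothetical and takes an iterable of lines (for instance an open file) instead of a filename, so that it can be tried without touching the disk.

```python
import re


def read_blacklisted_uris(lines):
    """Collect the first column (the URI) of each non-empty line.

    Hypothetical helper: columns may be separated by ',' or ';'.
    The optional second column (cadbiom name) is ignored.
    """
    uris = set()
    for line in lines:
        first_column = re.split(r"[,;]", line.strip(), maxsplit=1)[0].strip()
        if first_column:
            uris.add(first_column)
    return uris


sample = [
    "http://simulated/test#ATP,ATP\n",
    "http://simulated/test#ADP;ADP\n",
    "\n",
    "http://simulated/test#Pi\n",
]
blacklist = read_blacklisted_uris(sample)
print(sorted(blacklist))
```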
biopax2cadbiom.biopax_converter.main(params)[source]

Entry point

Here we detect the presence of the pickle backup and its settings. If there is no backup or if the user doesn’t want to use this functionality, queries are made against the triplestore.

Then, we construct a Cadbiom model with all the retrieved data.

biopax2cadbiom.biopax_converter.merge_duplicated_entities(dictPhysicalEntity, modelfilepath)[source]

Merge multiple occurrences of the same entity in the model

Note

The duplicates can come from the BioPAX database, as well as from the process of transferring post-translational modifications of classes to their daughter entities in transfer_class_attributes_on_child_entities()

Note

In order to group the entities, they are ordered according to some of their attributes:

  • entityType
  • entityRef
  • name
  • components_uris
  • location_uri
  • modificationFeatures

Note

About reactions attached to duplicate entities: Reactions from all duplicates are merged together.

Todo

During the merge of entities, prefer existing uris in the BioPAX model rather than those formed by duplication.

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • modelfilepath (<str>) – Filepath of the final model.
biopax2cadbiom.biopax_converter.shortening_modifications(modificationFeatures, length=1)[source]

Return a short version of all given modification names and occurrences.

Note

Some terms can be corrected before shortening:

  • residue modification, inactive: inactive
  • residue modification, active: active

Parameters:
  • arg1 (<Counter>) – Counter of modificationFeatures
  • arg2 (<int>) – Length of the shortening; put None for entire strings.
Returns:

Short and merged version of the given modificationFeatures.

Return type:

<str>
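
The sketch below illustrates the kind of transformation shortening_modifications() performs: terms are corrected, truncated to the requested length and merged with their number of occurrences. The exact output format of the real function is not documented here, so the "count followed by truncated name" layout, the alphabetical ordering and the helper name shorten are assumptions.

```python
from collections import Counter

# Corrections applied before shortening (taken from the note above).
CORRECTIONS = {
    "residue modification, inactive": "inactive",
    "residue modification, active": "active",
}


def shorten(modification_features, length=1):
    """Return a compact string for a Counter of modification names.

    Hypothetical helper: `length=None` keeps entire names, because
    slicing with None returns the whole string.
    """
    corrected = Counter()
    for name, count in modification_features.items():
        corrected[CORRECTIONS.get(name, name)] += count
    return "".join(
        "{}{}".format(count, name[:length])
        for name, count in sorted(corrected.items())
    )


features = Counter({"phosphorylation": 2, "residue modification, active": 1})
print(shorten(features))               # 1a2p
print(shorten(features, length=None))  # 1active2phosphorylation
```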

biopax2cadbiom.biopax_converter.sort_callback(elem)[source]

Order of the sort of PhysicalEntities on their attributes

Note

The sort of all entities must respect lexicographic order of all attributes.

=> if the component URIs are not cast into a sorted list, the order is modified, and itertools.groupby will then be fooled:

  • [‘W’, ‘X’] < [‘X’, ‘Y’] => True
  • {‘X’, ‘W’} < {‘X’, ‘Y’} => False
['W', 'X'] is less than ['X', 'Y']:
Ater;A;['W', 'X'];http://simulated/test#anywhere;
A;A;['X', 'Y'];http://simulated/test#anywhere;
Abis;A;['X', 'Y'];http://simulated/test#anywhere;

If we do not cast the sets into lists:
A;A;['Y', 'X'];http://simulated/test#anywhere;
Abis;A;['Y', 'X'];http://simulated/test#anywhere;
Ater;A;['X', 'W'];http://simulated/test#anywhere;
Parameters:elem (<PhysicalEntity>) – PhysicalEntity
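
The difference between comparing lists and comparing sets can be checked directly in Python: lists are compared lexicographically, whereas `<` on sets tests for a strict subset. The sketch below also shows why itertools.groupby needs sorted lists as keys. The tuples stand in for PhysicalEntity objects, and the helper name sort_key is hypothetical.

```python
from itertools import groupby

# Lists are compared lexicographically; '<' on sets means "strict subset".
print(["W", "X"] < ["X", "Y"])  # True
print({"X", "W"} < {"X", "Y"})  # False

# (name, components) pairs used instead of PhysicalEntity objects.
entities = [
    ("A", {"Y", "X"}),
    ("Ater", {"X", "W"}),
    ("Abis", {"X", "Y"}),
]


def sort_key(entity):
    """Cast the set of components into a sorted list to get a total order."""
    _, components = entity
    return sorted(components)


ordered = sorted(entities, key=sort_key)
groups = [
    [name for name, _ in group]
    for _, group in groupby(ordered, key=sort_key)
]
print(groups)  # [['Ater'], ['A', 'Abis']]
```

Because groupby only merges consecutive items with equal keys, the duplicates A and Abis end up in the same group only when the input is sorted with that same key.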
biopax2cadbiom.biopax_converter.transfer_class_attributes_on_child_entities(dictPhysicalEntity)[source]

Transfer modificationFeatures and location of classes on child entities

If a child entity does not have the same attributes as its class, it is inserted in the list of BioPAX entities under a specific (new) URI, with its new inherited attributes. It is possible that an entity describing this state is already in the BioPAX ontology. In this case, the two entities will then be grouped by the function merge_duplicated_entities().

Todo

If the duplicated entity is already in the dictionary:

  • it is already used elsewhere in a class
    => it must be decompiled even if it does not take part in any reaction.
  • Otherwise, remove the reactions
    => there is no point in creating entities that are not used in the model.

Note

In general, sub-entities are not duplicated if the class doesn’t provide information that is not already in the sub-entity. The transfer of similar post-translational modifications AND location is useless.

Note

We try not to overwrite modifications or location if they are the same. => Avoid the duplication of entities. However, we cannot exclude that inconsistent / conflicting modifications are applied to the sub-entities such as: residue modification, active and residue modification, inactive

Note

About reactions attached to duplicate entities: We reset all reactions (the reactions attribute of sub-entities) involving the entity in its old context (without the transfer of attributes that we operate here). This prevents entities that are not reused anywhere else from appearing in the model. If the entity must be present in the model, this is decided during the merge by merge_duplicated_entities(), which also merges the reactions of the duplicates. There are four cases to consider:

  • none of the duplicates contains a reaction

    => the merged entity will be absent from the model

  • the duplicate entity has no reaction but the duplicate already in the model contains one

    => the merged entity will be in the model

  • the duplicate entity has a reaction but the duplicate already in the model does not contain one

    => if the attribute reactions is not reset, the merged entity will be wrongly in the model because it will be flagged as being reused elsewhere by detectMembersUsedInEntities().

    => if the attribute reactions is reset, a side effect described in testCase xxx will appear: the decompilation of classes participating in reactions causes the formation of incorrect relations between the entities of these classes.

Warning

dictPhysicalEntity is modified here.

Todo

Update the reactions of duplicated entities => test whether the entity is already present in the dictionary: add to the set of reactions

cadbiom_writer

This module groups functions used to export BioPAX-processed data to a Cadbiom model file.

biopax2cadbiom.cadbiom_writer.build_json_data(entity)[source]

Build JSON data about the given entities.

Note

We can handle reactions from dictReaction, or entities from dictPhysicalEntity

Note

Return these attributes if they exist:

  • PhysicalEntity:
    • uri
    • entityType
    • name + synonyms
    • entityRef
    • location
    • modificationFeatures
    • members
    • reactions
  • Reaction:
    • uri
    • interactionType
Parameters:arg1 (<str>) – URI of an entity or a list of reactions.
Returns:JSON formatted str.
Return type:<str>
biopax2cadbiom.cadbiom_writer.createCadbiomFile(dictTransition, dictPhysicalEntity, dictReaction, nameModel, filePath, no_scc_fix)[source]

Export data into a Cadbiom file format.

Parameters:
  • arg1 (<dict <tuple <str>, <str>>: <list <dict>>>) –

    Dictionary of transitions and their respective set of events.

    Example:
    subDictTransition[(cadbiomL, right)].append({
        'event': transition['event'],
        'reaction': reaction,
        'sympyCond': transitionSympyCond
    })
    
  • arg2 (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • arg3 (<str>) – Name of the model.
  • arg4 (<str>) – File path.
biopax2cadbiom.cadbiom_writer.formatCadbiomSympyCond(cadbiomSympyCond)[source]
Return type:<str>
biopax2cadbiom.cadbiom_writer.formatEventAndCond(setOfEventAndCond)[source]
Return type:<str>
biopax2cadbiom.cadbiom_writer.get_names_of_missing_physical_entities(dictPhysicalEntity)[source]

Get URI and cadbiom name for each entity in the model.

Param:Dictionary of uris as keys and PhysicalEntities as values.
Type:<dict>
Returns:Dictionary of names as keys and uris as values.
Return type:<dict>

classes

This module describes the classes that wrap the BioPAX formalism.

class biopax2cadbiom.classes.Control(uri, interactionType, controlType, reaction_uri, controller)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity, biopax2cadbiom.classes.GenericBioPAXInteraction

Class for Control

Attributes:
  • interactionType => subclass of Control
  • controlType => type of control (ACTIVATION or INHIBITION). Since some Controls (Catalysis) have no controlType, this attribute is optional. By default, controlType for a Catalysis object will be “ACTIVATION”.
  • reaction_uri => entity that is controlled (supposed to be a subclass of Interaction)
  • controller => entity that controls the reaction
  • controllers => entities that control the reaction. It happens in some borderline cases (Kegg). TODO: Not currently supported!
  • evidences => set of evidences uris (identify controllers of the same reaction)

Warning

Only Catalysis is allowed to have a default controlType. If you try to create a Modulation or any other class with a controlType which is None, an exception will be raised.

class biopax2cadbiom.classes.GenericBioPAXEntity[source]

Bases: object

Generic class for BioPAX entities which brings basic common functions

short_uri

Return the URI without the prefix of the host.

Example:

Note

This attribute is read-only.

class biopax2cadbiom.classes.GenericBioPAXInteraction[source]

Bases: object

Generic class for BioPAX interactions which brings basic common functions

class biopax2cadbiom.classes.Location(uri, locationTerm)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for Location

Attributes:
  • locationTerm
  • dbRef
Optional:
  • xrefs => UnificationXref with db as keys and terms as values
  • cadbiomName
add_xref(dbref, idref)[source]

Add xref to the existing dict

Note

dbref as key, idref as value in set
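
The storage convention described in this note (database name as key, set of identifiers as value) can be sketched with a defaultdict. The class XrefHolder is a hypothetical stand-in for Location or PhysicalEntity, and the database names and identifiers are illustrative only.

```python
from collections import defaultdict


class XrefHolder(object):
    """Hypothetical stand-in for an object that stores xrefs."""

    def __init__(self):
        # Each database name maps to a set of identifiers.
        self.xrefs = defaultdict(set)

    def add_xref(self, dbref, idref):
        """Register `idref` under the database `dbref`."""
        self.xrefs[dbref].add(idref)


holder = XrefHolder()
holder.add_xref("uniprot", "P12345")
holder.add_xref("uniprot", "P12345")  # duplicates are absorbed by the set
holder.add_xref("uniprot", "Q00001")
holder.add_xref("chebi", "15422")
print(dict(holder.xrefs))
```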

class biopax2cadbiom.classes.PhysicalEntity(uri, name, location_uri, entityType, entityRef)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for Physical Entity

Attributes:
  • name
  • location_uri
  • entityType
  • entityRef (str) (not a set!)
Optional:
  • synonym (set)
  • components_uris (set)
  • members (set)
  • location (Location)
  • xrefs (dict) => UnificationXref with db as keys and terms as values
  • reactions (set)
  • membersUsed (set)
  • cadbiomName (set)
  • modificationFeatures (dict/Counter)
  • flat_components (list)
  • flat_components_primitives (list) => list of all primitive objects in flat_components. Only intermediate complexes are replaced by their components. Classes and nested classes are not decompiled and are kept as they are. This attribute is used to rebuild a flat_component when we remove the genericity during the duplication of the reactions.
  • listOfCadbiomNames (list)

add_xref(dbref, idref)[source]

Add xref to the existing dict

Note

dbref as key, idref as value in set

is_class

Return True if the object is a class, False otherwise

is_complex

Return True if the object is a complex, False otherwise

class biopax2cadbiom.classes.Reaction(uri, name, interactionType, productComponent, participantComponent)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity, biopax2cadbiom.classes.GenericBioPAXInteraction

Class for reaction

Attributes:
  • name
  • interactionType => subclass of Interaction
  • productComponent
  • participantComponent
Optional:
  • pathways (set)
  • leftComponents
  • rightComponents
  • controllers (set) => Control entities that control the reaction
  • cadbiomSympyCond
  • event
  • complexes => used during duplication of reactions to replace generic complexes by their unique flat_component.

TODO: Handle conversionDirection attr.

sparql_biopaxQueries

This module contains a list of functions to query any SPARQL endpoint with BioPAX data.

biopax2cadbiom.sparql_biopaxQueries.getControls(listOfGraphUri, provenance_uri)[source]

Note

controlType is in (ACTIVATION, INHIBITION). Since some Controls (Catalysis) have no controlType, this attribute is optional. By default, controlType in a Control object will be “ACTIVATION”.

Warning

Only Catalysis is allowed to have a default controlType. If you try to create a Modulation or any other class with a controlType which is None, an exception will be raised.

Note

PID: number of evidences: 15523, for number of controls: 8203

biopax2cadbiom.sparql_biopaxQueries.getModificationFeature(listOfGraphUri, provenance_uri)[source]

Get ModificationFeatures that occur on PhysicalEntities, grouped by entity, modification type and number of modifications per type.

Returns:A dict of dicts (not Counters)! Each dict contains the modifications as keys and their number as values.
Return type:<dict <dict>>
biopax2cadbiom.sparql_biopaxQueries.getPhysicalEntities(listOfGraphUri, provenance_uri)[source]

Note

From the BioPAX documentation, about the use of memberPhysicalEntity:

Using this property is not recommended. memberPhysicalEntity is only defined to support legacy data in certain databases. It is used to define a generic physical entity that is a collection of other physical entities. In general, EntityReference class should be used to create generic groups of physical entities, however, there are some cases where this is not possible, and the property has to be used. For instance, when an entity reference is used to define a generic physical entity with generic features, the generic features of the same type must be grouped. If you do not have grouping information for features of generic physical entities, you cannot use entity reference to define generic physical entities and must use the memberPhysicalEntity property. Another example for using this property is to create generic complexes, which are currently not supported with the EntityReference scheme (there is no “ComplexReference” class).

biopax2cadbiom.sparql_biopaxQueries.getReactions(listOfGraphUri, provenance_uri)[source]

Query all Interactions of the database, minus Control objects.

Warning

We would also get ‘Control’ objects with ‘rdfs:subClassOf* biopax3:Interaction’, but these must be handled by getControls().

Therefore, Controls are removed from the results via MINUS {}.
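
A query using this technique might look like the sketch below. It is not the query shipped with the package: the function name, the variable names and the absence of a GRAPH clause are assumptions, and it presumes that the BioPAX class hierarchy is loaded in the triplestore so that `rdfs:subClassOf*` can be resolved.

```python
def build_reactions_query():
    """Return a SPARQL query selecting Interactions that are not Controls."""
    return """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>

SELECT DISTINCT ?interaction ?interactionType
WHERE {
    ?interaction a ?interactionType .
    ?interactionType rdfs:subClassOf* biopax3:Interaction .
    MINUS {
        ?interaction a ?controlClass .
        ?controlClass rdfs:subClassOf* biopax3:Control .
    }
}
"""


reactions_query = build_reactions_query()
```

MINUS discards every solution whose ?interaction also matches the inner pattern, so Control objects never reach the reaction-building code.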

Note

Control class contains (Catalysis, TemplateReactionRegulation, …)

Note

We correct the BioPAX hierarchy generated by some tools like BiNOM. This tool declares the entire hierarchy of parent classes for each BioPAX object instead of letting users rely on the RDFS reasoner and the rdfs:subClassOf property. As a result, objects are returned as many times as they have parent classes. Fortunately, we remove Control derivatives from Interaction objects. However, Interaction objects are far too generic to be interpreted/used in the program, so we must ensure that objects created here have the most accurate interactionType attribute possible. In practice, Virtuoso returns the most specific rdf:type first, then the parent classes (e.g. BiochemicalReaction then Conversion for an object that declares both types). In theory, nothing seems to guarantee that this always happens.
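
One way to avoid depending on the order of results is to pick the most specific declared type explicitly. The sketch below is only an illustration of that idea, not the package's behaviour; `PARENT_CLASS` is a hand-written excerpt of the BioPAX hierarchy and both function names are hypothetical.

```python
# Hypothetical excerpt of the BioPAX class hierarchy: child -> parent.
PARENT_CLASS = {
    "BiochemicalReaction": "Conversion",
    "Conversion": "Interaction",
    "Interaction": "Entity",
}


def ancestors(biopax_class):
    """Return the set of all parent classes of biopax_class."""
    found = set()
    parent = PARENT_CLASS.get(biopax_class)
    while parent is not None:
        found.add(parent)
        parent = PARENT_CLASS.get(parent)
    return found


def most_specific_type(declared_types):
    """Pick the declared type that is not a parent of another declared type."""
    declared = set(declared_types)
    parents = set()
    for biopax_class in declared:
        parents |= ancestors(biopax_class)
    leaves = declared - parents
    # Sort to stay deterministic if several unrelated types remain.
    return sorted(leaves)[0]


interaction_type = most_specific_type(
    ["Interaction", "Conversion", "BiochemicalReaction"]
)
```

The result does not depend on the order in which the triplestore returns the rdf:type values. An empty list of types is not handled here.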

Note

conversionDirection and catalysisDirection apply to Conversion and Catalysis subclasses, respectively. Do not forget that the Catalysis direction overrides the Conversion direction. Currently we assume that Conversions are LEFT_TO_RIGHT (although this is not recommended in the standard). Order of priority for directions: catalysisDirection > conversionDirection > spontaneous > thermodynamic constants and FBA
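
The first two levels of this priority order can be sketched as below. The function name and its parameters are hypothetical; the spontaneous attribute, thermodynamic constants and FBA are deliberately not modelled, and the fallback mirrors the LEFT_TO_RIGHT assumption stated above.

```python
def effective_direction(catalysis_direction=None, conversion_direction=None):
    """Return the direction to use, following the documented priority."""
    # Highest priority first: catalysisDirection, then conversionDirection.
    for direction in (catalysis_direction, conversion_direction):
        if direction is not None:
            return direction
    # Nothing declared: fall back on the current assumption.
    return "LEFT_TO_RIGHT"


direction = effective_direction(
    catalysis_direction="RIGHT_TO_LEFT",
    conversion_direction="LEFT_TO_RIGHT",
)
```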

biopax2cadbiom.sparql_biopaxQueries.get_graphs_from_triplestore()[source]

Get the list of graph URIs in the triplestore

Note

The queried graphs are named graphs.

Returns:Iterable of tuples (1 graph URI per tuple)
Return type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_info_from_triplestore(listOfGraphUri=[])[source]

List graphs and subgraphs from the triplestore and retrieve some metadata

Parameters:listOfGraphUri (<list>) – List of graph URIs (optional)
Returns:Generator of tuples: (graph_uri, provenance_uri, name, comment)
Return type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_subgraphs_from_triplestore(listOfGraphUri)[source]

Get URIs of BioPAX graphs in the configured triplestore

Note

We assume that graphs are in full BioPAX format, i.e. that the dataSource attribute is set on entities. That’s the only way to separate one database from another in a merged graph (cf. PathwayCommons).

Note

SPARQL query:

PREFIX bp: <http://www.biopax.org/release/biopax-level3.owl#>

SELECT ?graph ?Provenance ?dname ?name ?comment
WHERE {
    GRAPH ?graph {
        ?Provenance a bp:Provenance.
        OPTIONAL {
            ?Provenance bp:standardName ?name.
        }
        OPTIONAL {
            ?Provenance bp:displayName ?dname.
        }
        OPTIONAL {
            ?Provenance bp:comment ?comment.
        }
    }
}
ORDER BY ?graph ?name
Returns:Iterable of tuples. (graph_uri, provenance_uri, name, comment)

Note

If you get an encoding error in name or comment, please put ‘from __future__ import unicode_literals’ at the beginning of your Python script.

Return type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_xref_from_database(listOfGraphUri, provenance_uri, database_name=None)[source]

Get xrefs of all entities in the given database (if specified)

  • An Xref is a reference from an instance of a class in the current ontology
    to an object in an external resource.
  • An Xref can be an instance of PublicationXref, RelationshipXref, or
    UnificationXref.

Warning

WE DO NOT filter the references according to the relation of identity or similarity that they define, i.e. UnificationXref relationships have the same weight as RelationshipXref relationships, and the relationshipType attributes of RelationshipXref objects are not used to show the degree of similarity between the current object and the object in the external database (see the note below).

Note

Classes inherit xref from their members.

Note

Each ontology may name its databases differently, e.g. ‘UniProt’ vs ‘uniprot knowledgebase’, ‘ChEBI’ vs ‘chebi’.
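
Callers who compare database names across ontologies may want to normalize them first. The following is a minimal sketch under that assumption; the alias table and the function name are hypothetical and not part of the package.

```python
# Hypothetical alias table: normalized spelling -> canonical name.
DATABASE_ALIASES = {
    "uniprot knowledgebase": "uniprot",
}


def normalize_database_name(name):
    """Lowercase a database name and map known aliases to one spelling."""
    key = name.strip().lower()
    return DATABASE_ALIASES.get(key, key)


normalized = [
    normalize_database_name(name)
    for name in ("UniProt", "uniprot knowledgebase", "ChEBI", "chebi")
]
```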

Note

Some objects (RelationshipXref, ?) have relationshipType attributes pointing to RelationshipTypeVocabulary objects. These objects use the PSI Molecular Interaction ontology (MI).

Returns:Dictionary of entityRefs. Keys: URIs; values: dicts of databases (keys: database names; values: lists of ids).
Return type:<dict <str>: <dict <str>: <list>>>
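
The nested return value can be read as in the sketch below. The URIs and ids are placeholders, and `ids_for` is a hypothetical helper shown only to illustrate the two levels of keys.

```python
def ids_for(xrefs, entity_uri, database_name):
    """Return the ids of an entity in one database, or an empty list."""
    return xrefs.get(entity_uri, {}).get(database_name, [])


# Placeholder data with the documented shape: uri -> database -> list of ids.
xrefs = {
    "http://example.org/entity1": {
        "uniprot": ["ID_1", "ID_2"],
        "chebi": ["ID_3"],
    },
}

uniprot_ids = ids_for(xrefs, "http://example.org/entity1", "uniprot")
missing_ids = ids_for(xrefs, "http://example.org/unknown", "uniprot")
```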

namespaces

This module is used to load all RDF Namespaces.

Use: from namespaces import *

biopax2cadbiom.namespaces.get_RDF_prefixes()[source]

Prefixes sent in SPARQL queries.

sparql_wrapper

Module used to query the SPARQL endpoint.

biopax2cadbiom.sparql_wrapper.auto_add_prefixes(func)[source]

Decorator: add all prefixes to the SPARQL query given as the first argument of sparql_query()
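
A decorator of this kind can be sketched as follows. It is an illustration of the pattern rather than the package's implementation: `get_prefixes` is a hypothetical stand-in for namespaces.get_RDF_prefixes(), and `run_query` simply returns the query it receives.

```python
import functools


def get_prefixes():
    """Hypothetical stand-in for namespaces.get_RDF_prefixes()."""
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    )


def add_prefixes(func):
    """Prepend the prefixes to the first argument of the decorated function."""
    @functools.wraps(func)
    def wrapper(query, *args, **kwargs):
        return func(get_prefixes() + query, *args, **kwargs)
    return wrapper


@add_prefixes
def run_query(query):
    # A real implementation would send the query to the endpoint.
    return query


sent_query = run_query("SELECT ?s WHERE { ?s rdf:type rdfs:Class }")
```

Using functools.wraps keeps the name and docstring of the decorated function, which matters for generated documentation.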

biopax2cadbiom.sparql_wrapper.load_sparql_endpoint()[source]

Connect to the SPARQL endpoint and retrieve a cursor.

Returns:SPARQL cursor in version 1, i.e. we do not use the SPARQLWrapper2 cursor, which provides the SPARQLWrapper.SmartWrapper.Bindings class to convert the JSON returned by the server.
Return type:<SPARQLWrapper>
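
Without the Bindings helper, the JSON returned by the server has to be unpacked by hand. The sketch below assumes the standard SPARQL JSON results layout (`head.vars` and `results.bindings`); the function name and the sample data are hypothetical.

```python
def bindings_to_tuples(json_result, variables):
    """Yield one tuple per result row; unbound variables become None."""
    for binding in json_result["results"]["bindings"]:
        yield tuple(
            binding[var]["value"] if var in binding else None
            for var in variables
        )


# Sample document following the SPARQL JSON results layout.
json_result = {
    "head": {"vars": ["graph", "name"]},
    "results": {
        "bindings": [
            {
                "graph": {"type": "uri", "value": "http://example.org/graph1"},
                "name": {"type": "literal", "value": "Example"},
            },
            # OPTIONAL variables may be absent from a binding.
            {"graph": {"type": "uri", "value": "http://example.org/graph2"}},
        ]
    },
}

rows = list(bindings_to_tuples(json_result, json_result["head"]["vars"]))
```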
biopax2cadbiom.sparql_wrapper.order_results(query, orderby='?uri', limit=4000)[source]

Build nested query for access points with restrictions.

Build the nested query by enclosing the original query between the same SELECT command (minus the useless DISTINCT clause) and the OFFSET & LIMIT clauses at the end. PS: don’t forget to add the ORDER BY at the end of the original query.

http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksHowToHandleBandwidthLimitExceed
https://etl.linkedpipes.com/components/e-sparqlendpointselectscrollablecursor

Warning

WE ASSUME THAT THE SECOND LINE OF THE QUERY CONTAINS THE FULL SELECT COMMAND !!!

Parameters:
  • query (<str>) – Original normal SPARQL query.
  • orderby (<str>) – Order queries by this variable.
  • limit (<int>) – Max items queried for 1 block.
Returns:

A generator of lines of results.

Return type:

<dict>
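
The nesting described above can be sketched as follows. This is not the package's code: `build_paginated_query`, `iterate_blocks` and the `run_query` callable (assumed to return a list of rows) are hypothetical, and the sketch keeps the same assumption that the second line of the query holds the full SELECT command.

```python
def build_paginated_query(query, orderby="?uri", limit=4000, offset=0):
    """Wrap an ordered copy of the query in an outer SELECT with OFFSET/LIMIT."""
    # Assumption from the warning above: line 2 holds the full SELECT command.
    select_line = query.split("\n")[1]
    # DISTINCT is useless in the outer query; also collapse leftover spaces.
    outer_select = " ".join(select_line.replace("DISTINCT", "").split())
    template = (
        "{select}\n"
        "WHERE {{\n"
        "    {{\n"
        "{inner}\n"
        "ORDER BY {orderby}\n"
        "    }}\n"
        "}}\n"
        "OFFSET {offset}\n"
        "LIMIT {limit}\n"
    )
    return template.format(
        select=outer_select,
        inner=query.strip(),
        orderby=orderby,
        offset=offset,
        limit=limit,
    )


def iterate_blocks(run_query, query, orderby="?uri", limit=4000):
    """Yield rows block by block until a block is shorter than limit."""
    offset = 0
    while True:
        rows = run_query(build_paginated_query(query, orderby, limit, offset))
        for row in rows:
            yield row
        if len(rows) < limit:
            break
        offset += limit
```

Ordering inside the subquery gives every block the same stable ordering, so consecutive OFFSET windows neither overlap nor skip rows.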

biopax2cadbiom.sparql_wrapper.sparql_query(*args, **kwargs)[source]

Return the modified function, with prefixes added to its first argument