Documentation of the package for developers

biopax_converter

This module is used to translate BioPAX data to CADBIOM models.

biopax2cadbiom.biopax_converter.addCadbiomSympyCondToReactions(dictReaction, dictPhysicalEntity)[source]

Elaborate condition for each event attached to a reaction.

Note

Condition: i.e., the guard of a transition in the Cadbiom formalism.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_cadbiom_names_to_entities(dictPhysicalEntity)[source]

Add ‘listOfCadbiomNames’ attribute to entities.

The aim is to have the list of elements contained in each entity, together with their names.

Note

We process essentially entities with subunits: components or members (complexes or classes).

Note

The attribute ‘listOfCadbiomNames’ corresponds to a list of unique cadbiom IDs for the entity (Complex, Class). Each member of the list is the unique cadbiom ID of each subcomponent present in the attribute ‘flat_components’.

Warning

To fill ‘listOfCadbiomNames’, we first handle complexes that can be classes; BUT classes are not necessarily complexes (without ‘flat_components’), so a recursive decomposition is made. For that, see get_cadbiom_names()

Note

Because complexes are already developed in developComplexEntity(), entities of this type do not have to be decomposed recursively here.

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_controllers_to_reactions(dictReaction, dictControl)[source]

Fill the attribute controllers of Reaction objects.

Note

Thanks to filter_controls(), pathways are removed from the controllers; only entities remain in controllers.

Note

The controllers attribute of a Reaction corresponds to a set of controller entities involved in it.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
biopax2cadbiom.biopax_converter.add_locations_to_entities(dictPhysicalEntity, dictLocation)[source]

Add Location objects to PhysicalEntities

biopax2cadbiom.biopax_converter.add_modifications_features_to_entities(dictPhysicalEntity, dictModificationFeatures)[source]

Add modifications and their number to the entity name

biopax2cadbiom.biopax_converter.add_reactions_to_entities(dictReaction, dictControl, dictPhysicalEntity)[source]

Fill the attribute reactions of PhysicalEntity objects.

Note

The reactions attribute corresponds to a set of reactions in which the entity is involved (as controller or participant). We use this attribute in order to know if complexes have to be deconstructed (only if a subentity is used elsewhere in a reaction).

Note

Supported roles in reactions are:

  • productComponent
  • participantComponent
  • leftComponents
  • rightComponents
  • controller of

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_unique_cadbiom_name_to_entities(dictPhysicalEntity)[source]

Add cadbiomName attribute to entities in dictPhysicalEntity.

Note

The attribute cadbiomName corresponds to a unique cadbiom ID for the entity (Protein, Complex, Class, etc.).

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.add_xrefs_to_entities(dictPhysicalEntity, dictEntities_db_refs)[source]

Add xrefs to entities

Parameters:dictEntities_db_refs (<dict <str>: <dict <str>: <list>>>) – Dictionary of entityRefs. keys: uris; values: dict of databases keys: database names; values: ids
biopax2cadbiom.biopax_converter.clean_name(name)[source]

Clean name for correct cadbiom parsing.

biopax2cadbiom.biopax_converter.compute_locations_names(dictLocation, numeric_compartments_names=False)[source]

Create a cadbiom ID for each location.

Warning

It updates the key ‘cadbiomName’ of entities in dictLocation[location].

Parameters:
  • dictLocation (<dict>) – Dictionary of BioPAX locations created by query.getLocations(). keys: CellularLocationVocabulary uri; values: Location object
  • numeric_compartments_names (<bool>) – (optional) If True, names of compartments will be based on numeric values instead of their real names.
Returns:

Dict of encoded locations. keys: numeric value or real location name; values: Location object

Return type:

<dict <str>:<Location>>
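The effect of numeric_compartments_names can be illustrated with a minimal sketch. It assumes a plain dict of location URI to location term instead of Location objects, and the cleaning rule (spaces replaced by underscores) is a hypothetical stand-in for the real encoding.

```python
def encode_location_names(location_terms, numeric_names=False):
    """Hypothetical helper: location_terms maps a location URI to its term."""
    encoded = {}
    for index, uri in enumerate(sorted(location_terms)):
        if numeric_names:
            # Name based on a numeric value instead of the real name
            name = str(index)
        else:
            # Assumed cleaning rule: spaces become underscores
            name = location_terms[uri].replace(' ', '_')
        encoded[name] = uri
    return encoded


terms = {'loc1': 'plasma membrane', 'loc2': 'cytosol'}
real_names = encode_location_names(terms)
numeric_names = encode_location_names(terms, numeric_names=True)
```

Numeric names keep identifiers short, at the cost of readability in the resulting model.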

biopax2cadbiom.biopax_converter.createControlFromEntityOnBothSides(dictReaction, dictControl)[source]

Remove entities on both sides of reactions and create a control instead.

We assume that entities present among both the reagents and the products are in fact catalysts without which the reaction cannot take place.

We remove this entity from the reaction and add an ACTIVATION controller to the list of BioPAX Controls.

Note

This function must be called before adding reactions to entities.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
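The replacement of two-sided entities by controls can be sketched with plain sets. This is only an illustration: reactions are represented here as dicts with 'left' and 'right' sets and controls as tuples, which are hypothetical structures and not the package's Reaction and Control classes.

```python
def controls_from_both_sides(reactions):
    """Hypothetical helper: reactions maps a reaction URI to
    {'left': set of entity URIs, 'right': set of entity URIs}."""
    new_controls = []
    for reaction_uri, reaction in sorted(reactions.items()):
        # Entities present among both the reagents and the products
        shared = reaction['left'] & reaction['right']
        for entity_uri in sorted(shared):
            # Assumed representation of an ACTIVATION control
            new_controls.append((reaction_uri, 'ACTIVATION', entity_uri))
        # Remove the shared entities from both sides of the reaction
        reaction['left'] -= shared
        reaction['right'] -= shared
    return new_controls


reactions = {'r1': {'left': {'A', 'E'}, 'right': {'B', 'E'}}}
controls = controls_from_both_sides(reactions)
```

Here E is removed from both sides of r1 and becomes an activator of r1.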
biopax2cadbiom.biopax_converter.detectMembersUsedInEntities(dictPhysicalEntity, convertFullGraph=False, keepEmptyClasses=False)[source]

Set the attribute ‘membersUsed’ of generic entities (classes).

Note

The value is False if the entity does not have members; if at least one member is involved in a reaction the value is True.

Warning

A generic entity can be any of the subclasses of PhysicalEntity (a complex can also be a class); BUT complexes are the only entities whose ‘flat_components’ attribute is ALWAYS set (!= None).

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • convertFullGraph (<bool>) – (optional) Convert all entities to cadbiom nodes, even the entities that are not used elsewhere.
  • keepEmptyClasses – (optional) (deprecated) If some members are not used, we add the entity to the membersUsed attribute with the aim to represent all the members not used. => This will break some conversions and unit tests because Translation implies the removal of genericity.
biopax2cadbiom.biopax_converter.developComplexEntity(complex_entity, dictPhysicalEntity)[source]

Fill ‘flat_components’ attribute of the given complex.

Note

Search recursively all components of the given complex.

Some complexes have subcomplexes, as in Reactome v56 from PC8. Example:

  • Complex_c33f6c2be7551100a54e716b3bf8ec8a:
  • Complex_0088fc0fe989a0b0abc3635b20df8d90
  • Complex_b87d9cb2e60df79cdde88a9f8f45e80d

Warning

Here we handle ONLY COMPLEXES! Even if some complexes have sub-entities that are classes, ‘flat_components’ can contain URIs of partially decomposed classes. EDIT: not anymore

Warning

Complexes are the only entities that have a ‘flat_components’ attribute set.

Note

Full explanations: developped_components is a list of tuples that contain combinations of all recursively searched sub-entities in the given complex.

Example:

A: complex composed with components:
    B: complex with components:
        W: protein
        X: generic smallmolecule with members:
            Y: smallmolecule (used elsewhere)
            Z: smallmolecule (not used elsewhere)
    C: protein

For X: developped_components = [X, Y] (edit: just [Y] now)
For W: developped_components = [W]
So for B: developped_components = [[X, Y], [W]]
and flat_components = [(X, W), (Y, W)]
For A: developped_components = [[C], [(X, W), (Y, W)]]
and flat_components = [(C, X, W), (C, Y, W)]
where X is a class that represents Z, and Y a true smallmolecule
(edit: now X is removed, and the final result is [(C, Y, W)])

If Z had been used elsewhere, we would have had the following
final result for developped_components of A:
[[C], [(Y, W), (Z, W)]]
and flat_components:
[(C, Y, W), (C, Z, W)]

PS:
'A' can be Complex_6e3d8ef563cbcc0c9e2a4afb2a920c38
(Reactome v56 inPC8); In this complex, Z is also used,
so X is totally removed.
Parameters:
  • complex_entity (<PhysicalEntity>) – Complex entity
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
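The A/B/C/W/X/Y/Z example above can be reproduced with a short recursive sketch based on itertools.product. The function name and the three input structures (components, members, used) are hypothetical simplifications; the real function works on PhysicalEntity objects.

```python
import itertools


def flatten(entity, components, members, used):
    """Return the list of tuples of leaf entities for the given entity.

    components: complex -> ordered list of its components (assumed)
    members: generic entity -> ordered list of its members (assumed)
    used: set of members that are used elsewhere in a reaction
    """
    if entity in members:
        # Generic entity: keep only the members used elsewhere
        return [(member,) for member in members[entity] if member in used]
    if entity in components:
        # One list of possibilities per component, searched recursively
        developed = [flatten(c, components, members, used)
                     for c in components[entity]]
        # Cartesian product of the possibilities, flattened into tuples
        return [tuple(itertools.chain.from_iterable(combination))
                for combination in itertools.product(*developed)]
    # Simple entity (protein, small molecule, ...)
    return [(entity,)]


components = {'A': ['C', 'B'], 'B': ['X', 'W']}
members = {'X': ['Y', 'Z']}
only_y = flatten('A', components, members, used={'Y'})
y_and_z = flatten('A', components, members, used={'Y', 'Z'})
```

With only Y used elsewhere the result is [('C', 'Y', 'W')]; when Z is also used, the second combination ('C', 'Z', 'W') appears, as in the example.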
biopax2cadbiom.biopax_converter.develop_complexes(dictPhysicalEntity)[source]

Set the attribute ‘flat_components’ of complex entities.

Note

‘flat_components’ is a list of tuples of component URIs.

Note

This function depends on detectMembersUsedInEntities()

Parameters:dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
biopax2cadbiom.biopax_converter.filter_controls(controls, pathways_names, blacklisted_entities)[source]

Remove pathways and cofactors from controls and keep other entities.

Note

Remove also entities that control pathways.

Note

We want ONLY entities and by default, there are pathways + entities.

Parameters:
  • arg1 (<dict>) – Dict of Controllers. keys: URIs; values: <Control>
  • arg2 (<dict>) – Dict of pathways URIs and names. keys: URIs; values: names (or uri if no name)
Returns:

Filtered controllers dict.

Return type:

<dict>

biopax2cadbiom.biopax_converter.filter_entities(dictPhysicalEntity, blacklisted_entities)[source]

Remove blacklisted entities from BioPAX entities.

Note

Blacklisted entities are removed from dictPhysicalEntity, from components and from members.

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • blacklisted_entities (<set>) – set of entity uris blacklisted
Returns:

Dictionary of BioPAX physicalEntities without blacklisted entities

Return type:

<dict <str>: <PhysicalEntity>>
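A minimal sketch of this filtering, assuming entities are plain dicts with 'components' and 'members' sets of URIs (hypothetical stand-ins for PhysicalEntity attributes):

```python
def remove_blacklisted(entities, blacklisted):
    """Hypothetical helper: drop blacklisted URIs from the dict itself,
    and from the components and members of the remaining entities."""
    filtered = {}
    for uri, entity in entities.items():
        if uri in blacklisted:
            continue
        filtered[uri] = {
            'components': entity['components'] - blacklisted,
            'members': entity['members'] - blacklisted,
        }
    return filtered


entities = {
    'complex_1': {'components': {'protein_1', 'atp'}, 'members': set()},
    'atp': {'components': set(), 'members': set()},
    'protein_1': {'components': set(), 'members': set()},
}
filtered = remove_blacklisted(entities, {'atp'})
```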

biopax2cadbiom.biopax_converter.filter_reactions(dictReaction, blacklisted_entities)[source]

Remove blacklisted entities from reactions.

Note

Effects:

  • productComponent and participantComponent can be set to None
  • blacklisted entities are removed from leftComponents and rightComponents

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • blacklisted_entities (<set>) – set of entity uris blacklisted
biopax2cadbiom.biopax_converter.find_unique_synonyms(cadbiom_name, entity_uris, unique_cadbiom_names, dictPhysicalEntity)[source]

Build unique names for the given uris, having the same cadbiom name.

Note

First, we use synonyms from the BioPAX database to find a unique name. When there are no more usable synonyms to build a unique name, we add a version number based on the given cadbiom name for all the remaining entities.

Note

The merging procedure for similar entities greatly reduces the number of entity groups proposed to this function.

Parameters:
  • cadbiom_name (<str>) – The redundant cadbiom name
  • entity_uris (<set>) – Set of uris of entities having the same name
  • unique_cadbiom_names (<set>) – Set of unique cadbiom names already used
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Dictionary of uris as keys and unique names as values.

Return type:

<dict>
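The two-step strategy described in the notes (synonyms first, then version numbers) might look like the sketch below. The function name, the synonyms_by_uri mapping and the "_vN" suffix format are assumptions made for illustration; only the general strategy comes from the description above.

```python
def unique_names(cadbiom_name, synonyms_by_uri, used_names):
    """Hypothetical helper: synonyms_by_uri maps each URI to a set of synonyms."""
    result = {}
    remaining = []
    for uri in sorted(synonyms_by_uri):
        # Step 1: use the first synonym that is not already taken
        candidates = [s for s in sorted(synonyms_by_uri[uri])
                      if s not in used_names]
        if candidates:
            result[uri] = candidates[0]
            used_names.add(candidates[0])
        else:
            remaining.append(uri)
    version = 1
    for uri in remaining:
        # Step 2: fall back on a version number (assumed suffix format)
        while '%s_v%d' % (cadbiom_name, version) in used_names:
            version += 1
        name = '%s_v%d' % (cadbiom_name, version)
        result[uri] = name
        used_names.add(name)
    return result


names = unique_names(
    'A',
    {'u1': {'A_alpha'}, 'u2': set(), 'u3': {'A_alpha'}},
    used_names={'A'},
)
```

In this run u1 takes the only free synonym, so u2 and u3 receive versioned names.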

biopax2cadbiom.biopax_converter.getCadbiomName(entity, synonym=None)[source]

Get entity name formatted for Cadbiom.

Parameters:
  • arg1 (<PhysicalEntity>) – PhysicalEntity for which the name will be encoded.
  • arg2 (<dict>) – Dictionary of BioPAX locations created by query.getLocations(). keys: CellularLocationVocabulary uri; values: Location object
  • arg3 (<str>) – (Optional) Synonym that will be used instead of the name of the given entity.
Returns:

Encoded name with location if it exists.

Return type:

<str>

biopax2cadbiom.biopax_converter.getEntityNameUnmatched(entities, entityToEntitiesMatched, dictPhysicalEntity)[source]
biopax2cadbiom.biopax_converter.getListOfPossibilitiesAndCadbiomNames(entity, dictPhysicalEntity)[source]

Return list of tuples with URIs associated to a cadbiomName.

Todo

listOfCadbiomNames should be a dictionary with the component uris as keys and the associated names as values => without this, it is impossible to re-associate the names with the uris they come from. This could be useful, for example, in get_names_of_missing_physical_entities() to avoid the risk of associating a member with a uri that is not its own (its parent here)…

Returns:list of tuples ((uris), name)
Return type:<list <tuple <tuple>, <str>>>
biopax2cadbiom.biopax_converter.getPathwayToPhysicalEntities(dictReaction, dictControl, dictPhysicalEntity)[source]

This function creates the dictionary pathwayToPhysicalEntities.

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictControl (<dict <str>: <Control>> keys: uris; values control objects) – Dictionary of BioPAX controls, created by the function query.getControls()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

pathwayToPhysicalEntities keys: pathway uris; values: set of entities involved in the pathway.

Return type:

<dict <str>: <set>>

biopax2cadbiom.biopax_converter.getProductCadbioms(entities_uris, entityToListOfEquivalentsAndCadbiomName)[source]

Get all cartesian products of possible names of entities.

Example:
entities_uris:
{
    'http://pathwaycommons.org/pc2/#Complex_aa82041945ad0f6a68f33a25a9720863',
    'http://pathwaycommons.org/pc2/#Complex_3fe118eaca425fc3b269b691cd9239df',
    'http://pathwaycommons.org/pc2/#Protein_597e0393013973540c8ec5d34766c8b0',
}

return:
[
    ('AP_2_adaptor_complex', 'IL8_CXCR2_v2_integral_to_membrane', 'beta_Arrestin1'),
    ('AP_2_adaptor_complex', 'IL8_CXCR2_v2_integral_to_membrane', 'beta_Arrestin2_v1')
]
Parameters:arg1 (<set>) – Set of entities.
Returns:List of tuples of possible names for the given entities.
Return type:<list <tuple <str>>>
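The cartesian product itself can be sketched with itertools.product. The structure of entityToListOfEquivalentsAndCadbiomName is not documented here, so this sketch assumes a simpler, hypothetical mapping from each URI to its list of possible names; the URIs are shortened placeholders.

```python
import itertools


def product_of_names(entities_uris, names_by_uri):
    """Hypothetical helper: one tuple per combination of possible names."""
    # URIs are sorted so that the order of the result is deterministic
    possibilities = [sorted(names_by_uri[uri]) for uri in sorted(entities_uris)]
    return list(itertools.product(*possibilities))


names_by_uri = {
    'uri_complex_1': ['AP_2_adaptor_complex'],
    'uri_complex_2': ['IL8_CXCR2_v2_integral_to_membrane'],
    'uri_protein_1': ['beta_Arrestin1', 'beta_Arrestin2_v1'],
}
result = product_of_names(set(names_by_uri), names_by_uri)
```

An entity with two possible names doubles the number of tuples, which matches the two tuples of the example above.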
biopax2cadbiom.biopax_converter.getProductCadbiomsMatched(entities_uris, entityToListOfEquivalentsAndCadbiomName, entityToEntitiesMatched)[source]
biopax2cadbiom.biopax_converter.getSetOfCadbiomPossibilities(entity, dictPhysicalEntity)[source]

=> obsolete: set of possible components for one complex

Note

First call: entity is a controller of a/multiple reaction(s)

Note

The attribute ‘listOfCadbiomNames’ corresponds to a list of unique cadbiom IDs for the entity (Complex, Class). Each member of the list is the unique cadbiom ID of each subcomponent present in the attribute ‘flat_components’.

Parameters:
  • entity (<PhysicalEntity>) – A Physical entity object that is a controller.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Return type:

<set>

biopax2cadbiom.biopax_converter.getTransitions(dictReaction, dictPhysicalEntity)[source]

Return transitions with (ori/ext nodes) and their respective events.

Warning

dictPhysicalEntity is modified in place. We add “virtual nodes” for genes that are not in BioPAX format.

Todo

handle TRASH nodes => will crash cadbiom writer because they are not entities…

Parameters:
  • dictReaction (<dict <str>: <Reaction>> keys: uris; values reaction objects) – Dictionary of BioPAX reactions, created by the function query.getReactions()
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Dictionary of transitions and their respective set of events.

Example:
subDictTransition[(cadbiomL,right)].append({
    'event': transition['event'],
    'reaction': reaction,
    'sympyCond': transitionSympyCond
})

Return type:

<dict <tuple <str>, <str>>: <list <dict>>>
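The shape of the returned dictionary can be illustrated with a defaultdict of lists. The helper name, the node names and the event names below are hypothetical; conditions are plain strings here, whereas the real values are SymPy expressions.

```python
from collections import defaultdict


def add_transition(transitions, left, right, event, reaction, condition):
    """Hypothetical helper: register one event for the (left, right) pair."""
    transitions[(left, right)].append({
        'event': event,
        'reaction': reaction,
        'sympyCond': condition,
    })


transitions = defaultdict(list)
# Two reactions producing the same transition A -> B
add_transition(transitions, 'A', 'B', '_h_1', 'reaction_1', 'C')
add_transition(transitions, 'A', 'B', '_h_2', 'reaction_2', 'D')
events = [item['event'] for item in transitions[('A', 'B')]]
```

Keying on the (origin, target) tuple lets several reactions contribute events to the same transition.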

biopax2cadbiom.biopax_converter.get_cadbiom_names(entity, dictPhysicalEntity)[source]

To be called recursively or by add_cadbiom_names_to_entities()

Note

See add_cadbiom_names_to_entities() for more information.

Parameters:
  • entity (<PhysicalEntity>) – A PhysicalEntity.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Set of cadbiom names for the given entity.

Return type:

<set>

biopax2cadbiom.biopax_converter.get_control_group_condition(controls, dictPhysicalEntity)[source]

Get condition for a group of controllers.

Activators are linked by a logical ‘OR’, inhibitors are linked by a logical ‘OR’, but activators and inhibitors are linked together by a logical ‘AND’.

Warning

controlType can be as follows (* are currently supported because they are general terms; others are from EcoCyc and will raise an exception):

  • ACTIVATION*
  • INHIBITION*
  • INHIBITION-ALLOSTERIC
  • INHIBITION-COMPETITIVE
  • INHIBITION-IRREVERSIBLE
  • INHIBITION-NONCOMPETITIVE
  • INHIBITION-OTHER
  • INHIBITION-UNCOMPETITIVE
  • ACTIVATION-NONALLOSTERIC
  • ACTIVATION-ALLOSTERIC
Parameters:
  • controls (<set <Control>>) – Set of Control objects.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

Sympy condition.

Return type:

<sympy.core.symbol.Symbol>
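The combination rule (OR inside each group, AND between the groups) can be sketched without SymPy by building a textual condition. Two points are assumptions of this sketch and not confirmed by the description: the inhibitor group is negated, and unsupported control types raise a ValueError. Controls are represented as hypothetical (name, controlType) tuples.

```python
def group_condition(controls):
    """Hypothetical helper: controls is a set of (name, controlType) tuples."""
    supported = ('ACTIVATION', 'INHIBITION')
    for _, control_type in controls:
        if control_type not in supported:
            # The exception type is an assumption of this sketch
            raise ValueError('Unsupported controlType: %s' % control_type)
    activators = sorted(n for n, t in controls if t == 'ACTIVATION')
    inhibitors = sorted(n for n, t in controls if t == 'INHIBITION')
    parts = []
    if activators:
        # Activators are linked by a logical OR
        parts.append('(' + ' or '.join(activators) + ')')
    if inhibitors:
        # Inhibitors are linked by a logical OR; negation is assumed
        parts.append('not (' + ' or '.join(inhibitors) + ')')
    # Both groups are linked together by a logical AND
    return ' and '.join(parts)


condition = group_condition({
    ('A', 'ACTIVATION'), ('B', 'ACTIVATION'), ('C', 'INHIBITION'),
})
```

The real function returns a SymPy object instead of a string, so the logical structure is the only part this sketch tries to convey.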

biopax2cadbiom.biopax_converter.load_blacklisted_entities(blacklist)[source]

Get all URIs of blacklisted elements in the given file.

Note

The CSV file can be written with the following delimiters: ‘,;’. In the first column we expect the URI; in the second column users can put the corresponding cadbiom name (currently not used).

Param:Blacklist filename.
Type:<str>
Returns:Set of uris.
Return type:<set>
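A minimal sketch of such a parser, assuming that URIs themselves contain neither ',' nor ';'. It reads from any iterable of lines instead of a filename so that it can be tried without a file; the function name and the sample URIs are hypothetical.

```python
import io
import re


def load_blacklist(lines):
    """Hypothetical helper: return the set of URIs found in the first column."""
    uris = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on either supported delimiter; the URI is the first column
        uris.add(re.split(r'[,;]', line)[0].strip())
    return uris


sample = io.StringIO(
    u'http://example.org/#Protein_1,ATP\n'
    u'http://example.org/#SmallMolecule_2;H2O\n'
    u'http://example.org/#Complex_3\n'
)
blacklist = load_blacklist(sample)
```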
biopax2cadbiom.biopax_converter.main(params)[source]

Entry point

Here we detect the presence of the pickle backup and its settings. If there is no backup or if the user doesn’t want to use this functionality, queries are made against the triplestore.

Then, we construct a Cadbiom model with all the retrieved data.

biopax2cadbiom.biopax_converter.merge_duplicated_entities(dictPhysicalEntity, modelfilepath)[source]

Merge multiple occurrences of the same entity in the model

Note

The duplicates can come from the BioPAX database, as well as from the process of transferring post-translational modifications of classes to their daughter entities in transfer_class_attributes_on_child_entities()

Note

In order to group the entities, they are ordered according to some of their attributes:

  • entityType
  • entityRef
  • name
  • components_uris
  • location_uri
  • modificationFeatures

Note

About reactions attached to duplicate entities: Reactions from all duplicates are merged together.

Todo

During the merge of entities, prefer existing uris in the BioPAX model rather than those formed by duplication.

Parameters:
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • modelfilepath (<str>) – Filepath of the final model.
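The grouping step can be sketched with sorted() and itertools.groupby. Entities are plain dicts here, only a subset of the attributes listed above is used in the key, and keeping the first entity of each group is an arbitrary choice of this sketch.

```python
import itertools


def sort_key(entity):
    """Hypothetical key: sets are cast to sorted lists before comparison."""
    return (
        entity['entityType'],
        entity['name'],
        sorted(entity['components_uris']),
        entity['location_uri'],
    )


def merge_duplicates(entities):
    merged = []
    ordered = sorted(entities, key=sort_key)
    for _, group in itertools.groupby(ordered, key=sort_key):
        group = list(group)
        kept = group[0]
        # Reactions from all duplicates are merged together
        kept['reactions'] = set().union(*(e['reactions'] for e in group))
        merged.append(kept)
    return merged


entities = [
    {'uri': 'u1', 'entityType': 'Protein', 'name': 'A',
     'components_uris': set(), 'location_uri': 'loc', 'reactions': {'r1'}},
    {'uri': 'u2', 'entityType': 'Protein', 'name': 'A',
     'components_uris': set(), 'location_uri': 'loc', 'reactions': {'r2'}},
    {'uri': 'u3', 'entityType': 'Protein', 'name': 'B',
     'components_uris': set(), 'location_uri': 'loc', 'reactions': set()},
]
merged = merge_duplicates(entities)
```

itertools.groupby only groups adjacent items, which is why the entities must be sorted with the same key beforehand.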
biopax2cadbiom.biopax_converter.refInCommon(entities1, entities2, dictPhysicalEntity)[source]

Check common references between 2 sets of entities.

Note

This function is used to make a transition between 2 sets of entities. => Is there any transition between these 2 sets?

Parameters:
  • entities1 (<list>) – List of uris of entities.
  • entities2 (<list>) – List of uris of entities.
  • dictPhysicalEntity (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
Returns:

False if one set of entities has no entityRefs, or [if entityRefs in the first are not in the second and entityRefs in the second are not in the first]. True otherwise, or if entities1 is a subset of entities2, or if entities2 is a subset of entities1.

Return type:

<bool>

biopax2cadbiom.biopax_converter.shortening_modifications(modificationFeatures, length=1)[source]

Return a short version of all given modification names and occurrences.

Note

Some terms can be corrected before shortening:

  • residue modification, inactive: inactive
  • residue modification, active: active

Parameters:
  • arg1 (<Counter>) – Counter of modificationFeatures
  • arg2 (<int>) – Length of the shortening; put None for entire strings.
Returns:

Short and merged version of the given modificationFeatures.

Return type:

<str>
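A possible reading of this shortening is sketched below. The output format (count prefix, underscore separator) is purely an assumption of this sketch; only the term corrections and the meaning of the length parameter come from the description above.

```python
from collections import Counter

# Corrections applied before shortening (from the note above)
CORRECTIONS = {
    'residue modification, inactive': 'inactive',
    'residue modification, active': 'active',
}


def shorten(modifications, length=1):
    """Hypothetical helper: modifications is a Counter of modification names."""
    parts = []
    for name, count in sorted(modifications.items()):
        name = CORRECTIONS.get(name, name)
        # name[:None] keeps the entire string
        short = name[:length]
        # Assumed format: the count is written only when greater than 1
        parts.append('%d%s' % (count, short) if count > 1 else short)
    return '_'.join(parts)


modifications = Counter({'phosphorylation': 2, 'residue modification, active': 1})
short_version = shorten(modifications)
full_version = shorten(modifications, length=None)
```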

biopax2cadbiom.biopax_converter.sort_callback(elem)[source]

Define the sort order of PhysicalEntities based on their attributes

Note

The sort of all entities must respect the lexicographic order of all attributes.

=> if the component URIs are not cast into a sorted list, the order is modified, and then itertools.groupby will be fooled:

  • ['W', 'X'] < ['X', 'Y'] => True
  • {'X', 'W'} < {'X', 'Y'} => False
['W', 'X'] is less than ['X', 'Y']
Ater;A;['W', 'X'];http://simulated/test#anywhere;
A;A;['X', 'Y'];http://simulated/test#anywhere;
Abis;A;['X', 'Y'];http://simulated/test#anywhere;

If we do not cast set into list:
A;A;['Y', 'X'];http://simulated/test#anywhere;
Abis;A;['Y', 'X'];http://simulated/test#anywhere;
Ater;A;['X', 'W'];http://simulated/test#anywhere;
Parameters:elem (<PhysicalEntity>) – PhysicalEntity
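The difference between list and set comparison can be checked directly in Python. The component sets below are hypothetical; the point is that "<" on sets is a proper-subset test, not an ordering, so sets must be cast to sorted lists before sorting and grouping.

```python
import itertools

# Lists compare lexicographically, element by element
lists_in_order = ['W', 'X'] < ['X', 'Y']
# For sets, "<" tests for a proper subset, which is not an ordering
sets_in_order = {'X', 'W'} < {'X', 'Y'}

# Hypothetical component URIs of three entities, stored as sets
components = [{'X', 'Y'}, {'W', 'X'}, {'Y', 'X'}]
# Cast each set into a sorted list before sorting and grouping
keys = sorted(sorted(c) for c in components)
groups = [(key, len(list(group))) for key, group in itertools.groupby(keys)]
```

After the cast, the two entities with components X and Y end up adjacent and fall into the same group.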
biopax2cadbiom.biopax_converter.transfer_class_attributes_on_child_entities(dictPhysicalEntity)[source]

Transfer modificationFeatures and location of classes on child entities

If a child entity does not have the same attributes as its class, it is inserted in the list of BioPAX entities under a specific (new) URI, with its new inherited attributes. It is possible that an entity describing this state is already in the BioPAX ontology. In this case, the two entities will then be grouped by the function merge_duplicated_entities().

Todo

If the duplicated entity is already in the dictionary:

  • it is already used elsewhere in a class
    => it must be decomposed even if it does not take part in any reaction.
  • Otherwise, remove the reactions
    => there is no point in creating entities that are not used in the model.

Note

In a general way, sub-entities are not duplicated if the class doesn’t provide information that is not already in the sub-entity. The transfer of similar post-translational modifications AND location is useless.

Note

We try not to overwrite modifications or location if they are the same. => Avoid the duplication of entities. However, we cannot exclude that inconsistent / conflicting modifications are applied to the sub-entities such as: residue modification, active and residue modification, inactive

Note

About reactions attached to duplicate entities: We reset all reactions (the attributes reactions of sub-entities) involving the entity in its old context (without the transfer of attributes that we operate here). This prevents entities that are not reused anywhere else from appearing in the model. If the entity must be present in the model, it will be decided during the merge by the function merge_duplicated_entities() that also merges the reactions of the duplicates. There are four cases to consider:

  • none of the duplicates contains a reaction

    => the merged entity will be absent from the model

  • the duplicate entity has no reaction but the duplicate already in the model contains one

    => the merged entity will be in the model

  • the duplicate entity has a reaction but the duplicate already in the model does not contain one

    => if the attribute reactions is not reset, the merged entity will be wrongly in the model because it will be flagged as being reused elsewhere by detectMembersUsedInEntities().

    => if the attribute reactions is reset, a side effect described in testCase xxx will appear: the decompilation of classes participating in reactions causes the formation of incorrect relations between the entities of these classes.

Warning

dictPhysicalEntity is modified here.

Todo

update the reactions of duplicated entities => test whether the entity is already present in the dictionary: add to the set of reactions

biopax2cadbiom.biopax_converter.updateTransitions(original_reaction, dictPhysicalEntity, dictTransition)[source]

TODO: check what addCadbiomSympyCondToReactions does … the creation of events seems redundant since reactions are now duplicated; however, adding the conditions due to controllers is still useful

cadbiom_writer

This module groups functions used to export BioPAX-processed data to a Cadbiom model file.

biopax2cadbiom.cadbiom_writer.build_json_data(entity)[source]

Build JSON data about the given entities.

Note

We can handle reactions from dictReaction, or entities from dictPhysicalEntity

Note

Return these attributes if they exist:

  • PhysicalEntity:
    • uri
    • entityType
    • name + synonyms
    • entityRef
    • location
    • modificationFeatures
    • members
    • reactions
  • Reaction:
    • uri
    • reactiontype
Parameters:arg1 (<str>) – URI of an entity or a list of reactions.
Returns:JSON formatted str.
Return type:<str>
biopax2cadbiom.cadbiom_writer.createCadbiomFile(dictTransition, dictPhysicalEntity, dictReaction, nameModel, filePath, no_scc_fix)[source]

Export data into a Cadbiom file format.

Parameters:
  • arg1 (<dict <tuple <str>, <str>>: <list <dict>>>) –

    Dictionary of transitions and their respective set of events.

    Example:
    subDictTransition[(cadbiomL,right)].append({
        'event': transition['event'],
        'reaction': reaction,
        'sympyCond': transitionSympyCond
    })
    
  • arg2 (<dict <str>: <PhysicalEntity>> keys: uris; values entity objects) – Dictionary of BioPAX physicalEntities, created by the function query.getPhysicalEntities()
  • arg3 (<str>) – Name of the model.
  • arg4 (<str>) – File path.
biopax2cadbiom.cadbiom_writer.formatCadbiomSympyCond(cadbiomSympyCond)[source]
Return type:<str>
biopax2cadbiom.cadbiom_writer.formatEventAndCond(setOfEventAndCond)[source]
Return type:<str>
biopax2cadbiom.cadbiom_writer.get_names_of_missing_physical_entities(dictPhysicalEntity)[source]

Get URI and cadbiom name for each entity in the model.

Param:Dictionary of uris as keys and PhysicalEntities as values.
Type:<dict>
Returns:Dictionary of names as keys and uris as values.
Return type:<dict>

classes

This module describes the classes that wrap the BioPAX formalism.

class biopax2cadbiom.classes.Control(uri, classType, controlType, reaction_uri, controller)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for Control

Attributes:
  • classType => subclass of Control

  • controlType => type of control (ACTIVATION or INHIBITION)

  • reaction_uri => entity that is controlled

    (supposed to be a subclass of Interaction)

  • controller => entity that controls the reaction

  • evidences => set of evidences uris (identify controllers of

    the same reaction)

class biopax2cadbiom.classes.GenericBioPAXEntity[source]

Bases: object

Generic class for BioPAX entities which brings basic common functions

short_uri

Return the URI without the prefix of the host.

Example:

Note

This attribute is read-only.
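A read-only attribute of this kind is typically written as a property without a setter. The sketch below assumes that the short URI is the fragment after the last '#'; this rule and the class name are hypothetical, since the original example is not available here.

```python
class Entity(object):
    """Hypothetical stand-in for GenericBioPAXEntity."""

    def __init__(self, uri):
        self.uri = uri

    @property
    def short_uri(self):
        # Assumed rule: keep what follows the last '#' of the URI
        return self.uri.rsplit('#', 1)[-1]


entity = Entity(
    'http://pathwaycommons.org/pc2/#Complex_aa82041945ad0f6a68f33a25a9720863'
)
short = entity.short_uri

try:
    # No setter is defined, so assignment fails
    entity.short_uri = 'other'
    read_only = False
except AttributeError:
    read_only = True
```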

class biopax2cadbiom.classes.Location(uri, locationTerm)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for Location

Attributes:
  • locationTerm
  • dbRef
Optional:
  • xrefs => UnificationXref with db as keys and terms as values
  • cadbiomName
add_xref(dbref, idref)[source]

Add xref to the existing dict

Note

dbref as key, idref as value in set
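The "dbref as key, idref as value in set" layout corresponds to a dict of sets. A minimal sketch, using a hypothetical holder class and a defaultdict (the real classes may initialise the attribute differently):

```python
from collections import defaultdict


class XrefHolder(object):
    """Hypothetical stand-in for an entity that carries xrefs."""

    def __init__(self):
        # Database name -> set of identifiers in that database
        self.xrefs = defaultdict(set)

    def add_xref(self, dbref, idref):
        self.xrefs[dbref].add(idref)


holder = XrefHolder()
holder.add_xref('uniprot', 'P12345')
holder.add_xref('uniprot', 'Q67890')
holder.add_xref('uniprot', 'P12345')  # duplicates are ignored by the set
```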

class biopax2cadbiom.classes.PhysicalEntity(uri, name, location_uri, entityType, entityRef)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for Physical Entity

Attributes:
  • name
  • location_uri
  • entityType
  • entityRef (str) (not a set!)
Optional:
  • synonym (set)
  • components_uris (set)
  • members (set)
  • location (Location)
  • xrefs (dict) => UnificationXref with db as keys and terms as values
  • reactions (set)
  • membersUsed (set)
  • cadbiomName (set)
  • modificationFeatures (dict/Counter)
  • flat_components (list)
  • listOfCadbiomNames (list)
add_xref(dbref, idref)[source]

Add xref to the existing dict

Note

dbref as key, idref as value in set

is_class

Return True if the object is a class, False otherwise

is_complex

Return True if the object is a complex, False otherwise

class biopax2cadbiom.classes.Reaction(uri, name, reactiontype, productComponent, participantComponent)[source]

Bases: biopax2cadbiom.classes.GenericBioPAXEntity

Class for reaction

Attributes:
  • name
  • reactiontype => subclass of Interaction
  • productComponent
  • participantComponent
Optional:
  • pathways (set)
  • leftComponents
  • rightComponents
  • controllers (set) => controls/entities that control the reaction
  • cadbiomSympyCond
  • event
class biopax2cadbiom.classes.SetEncoder(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)[source]

Bases: json.encoder.JSONEncoder

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
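The body of SetEncoder is not shown here; judging by its name, it probably makes sets serializable. Under that assumption, a comparable encoder could be sketched as follows (the class name SetAwareEncoder is hypothetical):

```python
import json


class SetAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder that serializes sets as sorted lists."""

    def default(self, obj):
        if isinstance(obj, (set, frozenset)):
            # Sorting gives a stable output for sets of comparable items
            return sorted(obj)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, obj)


encoded = json.dumps({'reactions': {'r2', 'r1'}}, cls=SetAwareEncoder,
                     sort_keys=True)
```

Such an encoder is useful because many attributes of the entities described above (reactions, members, synonyms) are sets, which the json module cannot serialize by default.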

sparql_biopaxQueries

This module contains a list of functions to query any SPARQL endpoint with BioPAX data.

biopax2cadbiom.sparql_biopaxQueries.getControls(listOfGraphUri, provenance_uri)[source]

Note

controlType is in (ACTIVATION, INHIBITION)

Note

In PID: 15523 evidences for 8203 controls.

biopax2cadbiom.sparql_biopaxQueries.getModificationFeature(listOfGraphUri, provenance_uri)[source]

Get ModificationFeatures that occur on PhysicalEntities, grouped by entity, modification type and number of modifications per type.

Returns:A dict of dicts (not Counters)! Each dict contains the modifications as keys and their number as values.
Return type:<dict <dict>>
biopax2cadbiom.sparql_biopaxQueries.getPhysicalEntities(listOfGraphUri, provenance_uri)[source]

Note

From the BioPAX documentation, about the use of memberPhysicalEntity:

Using this property is not recommended. memberPhysicalEntity is only defined to support legacy data in certain databases. It is used to define a generic physical entity that is a collection of other physical entities. In general, EntityReference class should be used to create generic groups of physical entities, however, there are some cases where this is not possible, and the property has to be used. For instance, when an entity reference is used to define a generic physical entity with generic features, the generic features of the same type must be grouped. If you do not have grouping information for features of generic physical entities, you cannot use entity reference to define generic physical entities and must use the memberPhysicalEntity property. Another example for using this property is to create generic complexes, which are currently not supported with the EntityReference scheme (there is no “ComplexReference” class).

biopax2cadbiom.sparql_biopaxQueries.getReactions(listOfGraphUri, provenance_uri)[source]

Warning

Querying ‘rdfs:subClassOf* biopax3:Interaction’ also returns ‘Control’ objects, but these must be retrieved by getControls().

Therefore, controls are removed from the results via MINUS {}.

Note

Control class contains (Catalysis, TemplateReactionRegulation, …)

biopax2cadbiom.sparql_biopaxQueries.get_graphs_from_triplestore()[source]

Get the list of graph URIs in the triplestore

Note

The queried graphs are named graphs.

Returns:Iterable of tuples (1 graph URI per tuple)
Return type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_infos_from_triplestore(listOfGraphUri=[])[source]

List graphs and subgraphs from the triplestore and retrieve some metadata

Parameters:listOfGraphUri (<list>) – List of graph URIs (optional)
Returns:Generator of tuples: (graph_uri, provenance_uri, name, comment)
Return type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_subgraphs_from_triplestore(listOfGraphUri)[source]

Get URIs of BioPAX graphs in the configured triplestore

Note

We assume that graphs are in full BioPAX format, i.e. that the dataSource attribute is set on entities. That’s the only way to extract one database from another in a merged graph (cf. PathwayCommons).

Note

SPARQL query:

PREFIX bp: <http://www.biopax.org/release/biopax-level3.owl#>

SELECT ?graph ?Provenance ?dname ?name ?comment
WHERE {
    GRAPH ?graph {
        ?Provenance a bp:Provenance.
        OPTIONAL {
            ?Provenance bp:standardName ?name.
        }
        OPTIONAL {
            ?Provenance bp:displayName ?dname.
        }
        OPTIONAL {
            ?Provenance bp:comment ?comment.
        }
    }
}
ORDER BY ?graph ?name
Returns:Iterable of tuples: (graph_uri, provenance_uri, name, comment)

Note

If you get an encoding error in name or comment, please put ‘from __future__ import unicode_literals’ at the beginning of your Python script.

Type:<generator>
biopax2cadbiom.sparql_biopaxQueries.get_xref_from_database(listOfGraphUri, provenance_uri, database_name=None)[source]

Get xrefs of all entities in the given database (if specified)

  • An Xref is a reference from an instance of a class in the current ontology
    to an object in an external resource.
  • An Xref can be an instance of PublicationXref, RelationshipXref or
    UnificationXref.

Warning

WE DO NOT filter the references according to the relation of identity or similarity that they define, i.e., UnificationXref relationships have the same weight as RelationshipXref relationships, and the relationshipType attributes of RelationshipXref objects are not used to show the degree of similarity between the current object and the object in the external database (see the note below).

Note

Classes inherit xref from their members.

Note

Each ontology may name its databases differently. Ex: ‘UniProt’ vs ‘uniprot knowledgebase’, ‘ChEBI’ vs ‘chebi’

Note

Some objects (RelationshipXref, ?) have relationshipType attributes pointing to RelationshipTypeVocabulary objects. These objects use the PSI Molecular Interaction ontology (MI).

Returns:Dictionary of entityRefs. keys: uris; values: dict of databases keys: database names; values: ids
Return type:<dict <str>: <dict <str>: <list>>>
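
Because database names vary between ontologies, code consuming this dictionary may want to normalize them first. The sketch below is one possible approach and is not part of the package: the alias table, the function name normalize_xrefs and the sample identifiers are all hypothetical.

```python
# Hypothetical alias table built from the names quoted in the note above
DATABASE_ALIASES = {
    "uniprot": "uniprot",
    "uniprot knowledgebase": "uniprot",
    "chebi": "chebi",
}


def normalize_xrefs(xrefs):
    """Merge ids filed under different spellings of the same database name."""
    normalized = {}
    for uri, databases in xrefs.items():
        merged = {}
        for db_name, ids in databases.items():
            lowered = db_name.lower()
            key = DATABASE_ALIASES.get(lowered, lowered)
            bucket = merged.setdefault(key, [])
            # Keep the first occurrence of each id, preserving order
            for identifier in ids:
                if identifier not in bucket:
                    bucket.append(identifier)
        normalized[uri] = merged
    return normalized


# Illustrative input in the documented shape: uri -> database name -> ids
xrefs = {
    "http://example.org/entity_1": {
        "UniProt": ["P12345"],
        "uniprot knowledgebase": ["P12345", "Q67890"],
        "ChEBI": ["CHEBI:15377"],
    }
}
normalized = normalize_xrefs(xrefs)
```
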

namespaces

This module is used to load all RDF Namespaces.

Use: from namespaces import *

biopax2cadbiom.namespaces.get_RDF_prefixes()[source]

Prefixes sent in SPARQL queries.

sparql_wrapper

Module used to query SPARQL endpoint.

biopax2cadbiom.sparql_wrapper.auto_add_prefixes(func)[source]

Decorator: Add all prefixes to the SPARQL query passed as the first argument of sparql_query()
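
A decorator of this kind can be sketched as follows. This is an assumption about the general technique, not the package's actual code: get_prefixes(), add_prefixes() and run_query() are hypothetical stand-ins for namespaces.get_RDF_prefixes(), auto_add_prefixes() and sparql_query(), and no endpoint is contacted.

```python
import functools


def get_prefixes():
    """Hypothetical stand-in for namespaces.get_RDF_prefixes()."""
    return "PREFIX bp: <http://www.biopax.org/release/biopax-level3.owl#>\n"


def add_prefixes(func):
    """Prepend the prefixes to the query given as first positional argument."""
    @functools.wraps(func)
    def wrapper(query, *args, **kwargs):
        return func(get_prefixes() + query, *args, **kwargs)
    return wrapper


@add_prefixes
def run_query(query):
    # Stand-in for sparql_query(): returns the final query text instead of sending it
    return query


final_query = run_query("SELECT ?s WHERE { ?s a bp:Protein }")
```

Using functools.wraps keeps the name and docstring of the wrapped function, which is why generated documentation can still show the original docstring for a decorated function.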

biopax2cadbiom.sparql_wrapper.load_sparql_endpoint()[source]

Make a connection to the SPARQL endpoint & retrieve a cursor.

Returns:SPARQL cursor in version 1! => we don’t use the SPARQLWrapper2 cursor, which provides the SPARQLWrapper.SmartWrapper.Bindings class to convert JSON from the server.
Return type:<SPARQLWrapper>
biopax2cadbiom.sparql_wrapper.order_results(query, orderby='?uri', limit=4000)[source]

Build nested query for access points with restrictions.

Build the nested query by encapsulating the original between the same SELECT command (minus useless DISTINCT clause), and the OFFSET & LIMIT clauses at the end. PS: don’t forget to add the ORDER BY at the end of the original query.

http://vos.openlinksw.com/owiki/wiki/VOS/VirtTipsAndTricksHowToHandleBandwidthLimitExceed
https://etl.linkedpipes.com/components/e-sparqlendpointselectscrollablecursor

Warning

WE ASSUME THAT THE SECOND LINE OF THE QUERY CONTAINS THE FULL SELECT COMMAND !!!

Parameters:
  • query (<str>) – Original normal SPARQL query.
  • orderby (<str>) – Order queries by this variable.
  • limit (<int>) – Max items queried for 1 block.
Returns:

A generator of lines of results.

Return type:

<dict>
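
The nested-query technique described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: paginate_query() and fake_execute() are hypothetical names, the endpoint call is replaced by a callable so the example runs offline, the query is assumed to carry no PREFIX declarations, and the SELECT line is located by its keyword rather than by its position.

```python
import re


def paginate_query(query, execute, orderby="?uri", limit=4000):
    """Yield result rows block by block using a nested query.

    `execute` is a hypothetical callable taking a query string and
    returning a list of rows; it stands in for the endpoint call.
    """
    lines = query.strip().splitlines()
    # Assumption: the full SELECT command sits on a single line
    select_line = next(
        line for line in lines if line.strip().upper().startswith("SELECT"))
    # The outer SELECT repeats the inner one, minus the useless DISTINCT
    outer_select = select_line.replace("DISTINCT ", "")
    # The ORDER BY goes at the end of the original (inner) query
    inner = query.strip() + "\nORDER BY " + orderby
    offset = 0
    while True:
        nested = "{}\nWHERE {{\n{{\n{}\n}}\n}}\nOFFSET {}\nLIMIT {}".format(
            outer_select, inner, offset, limit)
        rows = execute(nested)
        for row in rows:
            yield row
        if len(rows) < limit:
            # A short block means the results are exhausted
            break
        offset += limit


# Demo with a fake endpoint serving 10 rows, for illustration only
data = list(range(10))
calls = []


def fake_execute(nested_query):
    calls.append(nested_query)
    offset = int(re.search(r"OFFSET (\d+)", nested_query).group(1))
    limit = int(re.search(r"LIMIT (\d+)", nested_query).group(1))
    return data[offset:offset + limit]


demo_query = "SELECT DISTINCT ?uri\nWHERE { ?uri a bp:Protein }"
rows = list(paginate_query(demo_query, fake_execute, limit=4))
```

Ordering the inner query is what makes successive OFFSET windows consistent: without a stable order, two blocks could overlap or skip rows.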

biopax2cadbiom.sparql_wrapper.sparql_query(*args, **kwargs)[source]

Return modified function with prefix added on the first argument