Data Integrator (Python API)
Functional similarity computation base class.
Public Member Functions
def __init__ (s)
def ComputePWSSMatrix (s, sp1, sp2)
    Compute pairwise semantic similarity matrix for a pair of proteins.
def ComputePWGO (s, go1, go2)
    Compute pairwise semantic similarity for a pair of GOTerms.
def Initialize (s, paths, ns, graphByFileName="")
    Initialize module.
def SetSimMeasure (s, measure)
    Choose computational model for semantic similarity.
def SetManual (s, only)
    Select manual filtering of GO annotations.
def SetNDRemoval (s, flag)
    Remove annotations to ND.
def GetIC (s, goID)
    Get information content of a GO term ID.
def SimAvg (s, sp1, sp2)
    Average similarity.
def SimMax (s, sp1, sp2)
    Maximum similarity.
def SimRCAvgMax (s, sp1, sp2)
    Maximum row/column averaged maxima.
def SimBMA (s, sp1, sp2)
    Best match average.
def SimBMA2 (s, sp1, sp2)
    Best match average averaged.
Functional similarity computation base class.
Computes functional similarity as a combination of GO-annotated pairs of protein identifiers.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.__init__(s)
Reimplemented in cls.FunctionalSimilarity.CFunctionalSimilarity.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.ComputePWGO(s, go1, go2)
Compute pairwise semantic similarity for a pair of GOTerms.
The similarity measure must be set beforehand via SetSimMeasure.
Parameters:
    go1: First GOTerm.
    go2: Second GOTerm.
Returns: A float if the value could be computed, otherwise None.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.ComputePWSSMatrix(s, sp1, sp2)
Compute pairwise semantic similarity matrix for a pair of proteins.
The similarity measure must be set beforehand via SetSimMeasure.
Parameters:
    sp1: Swiss-Prot accession number of the first protein.
    sp2: Swiss-Prot accession number of the second protein.
Returns: A tuple (goIDs1, goIDs2, matrix). matrix is a numpy array with rows corresponding to goIDs1 and columns corresponding to goIDs2. If a pair of GO IDs has no semantic similarity value, the corresponding matrix element is NaN. This matters when using numpy functions: NaN normally propagates through a computation, so functions that explicitly ignore NaNs are needed.
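The NaN caveat can be illustrated with a minimal sketch. It uses plain Python lists instead of a numpy array so that it runs without any dependency, and the matrix values and the helper names `nan_max` and `row_and_column_maxima` are made up for illustration; they are not part of this API. With a real numpy array, functions such as `numpy.nanmax` and `numpy.nanmean` play the same role.

```python
import math

def nan_max(values):
    """Maximum of the non-NaN entries, or None if every entry is NaN."""
    finite = [v for v in values if not math.isnan(v)]
    return max(finite) if finite else None

def row_and_column_maxima(matrix):
    """Return (row maxima, column maxima) of a list-of-lists matrix, skipping NaN."""
    row_max = [nan_max(row) for row in matrix]
    col_max = [nan_max(col) for col in zip(*matrix)]
    return row_max, col_max

# Hypothetical 2 x 3 matrix: rows follow goIDs1, columns follow goIDs2.
nan = float("nan")
matrix = [[0.8, nan, 0.1],
          [0.3, 0.6, nan]]

# A plain mean over a row containing NaN is itself NaN.
naive_mean = sum(matrix[0]) / len(matrix[0])

# Skipping NaN entries yields usable maxima.
row_max, col_max = row_and_column_maxima(matrix)
```

Here `naive_mean` is NaN, while the NaN-aware helper still reports a maximum for every row and column that has at least one valid entry.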
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.GetIC(s, goID)
Get information content of a GO term ID.
Parameters:
    goID: GO term ID.
Returns: A float, or None if the term ID is unknown.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.Initialize(s, paths, ns, graphByFileName="")
Initialize module.
Set the GO namespace and load the corresponding populated graph.
Parameters:
    paths: cls.Paths.CPaths object.
    ns: Gene Ontology namespace, e.g. 'biological_process'.
    graphByFileName: Full path and file name of a graph file. This replaces the default graph files normally read from the DI repository.
Returns: True if the namespace has been set and the graph file has been loaded; False if the namespace is unknown or the graph file was not found (an error message is issued in the latter case).
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SetManual(s, only)
Select manual filtering of GO annotations.
Parameters:
    only: If True, only manual annotations are considered, i.e. 'IEA' type records are ignored. If False, all annotations are considered.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SetNDRemoval(s, flag)
Remove annotations to ND.
Parameters:
    flag: If True, annotations with evidence code ND are omitted. If False, ND annotations are included.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SetSimMeasure(s, measure)
Choose computational model for semantic similarity.
Unlike the namespace, the measure can be changed on the fly. To compute functional similarities with several different semantic similarity measures, it is enough to call this function before each call to SimAvg, SimMax, SimRCAvgMax, SimBMA, or SimBMA2.
Parameters:
    measure: One of the strings defined in SS_MEASURES.
Returns: True if the measure is known and has been set; False if the measure is unknown.
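Switching measures on the fly might look like the following sketch. The helper `compare_measures` is hypothetical and not part of this API. It assumes `fs` is an already initialized object exposing SetSimMeasure and SimBMA as documented here, and it takes the measure names as an argument because the strings defined in SS_MEASURES are not listed on this page.

```python
def compare_measures(fs, sp1, sp2, measures):
    """Return {measure: SimBMA score} for one protein pair.

    `fs` is assumed to be initialized (namespace set, graph loaded).
    Measures rejected by SetSimMeasure are skipped.
    """
    scores = {}
    for measure in measures:
        # SetSimMeasure returns False for an unknown measure.
        if not fs.SetSimMeasure(measure):
            continue
        # The score may be None if a protein has no GO annotation.
        scores[measure] = fs.SimBMA(sp1, sp2)
    return scores
```

Because only the measure changes between iterations, the namespace and graph set up by Initialize are reused for every score.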
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SimAvg(s, sp1, sp2)
Average similarity.
Each of the two input proteins is associated with GO IDs. For each possible pair of GO IDs, the semantic similarity is computed according to the measure selected by a prior call to SetSimMeasure. The resulting semantic similarity matrix is then combined into a single value. In this case, the average over all matrix entries is computed.
Parameters:
    sp1: First UniProt/Swiss-Prot accession number.
    sp2: Second UniProt/Swiss-Prot accession number.
Returns: The protein similarity as a float, or None if either protein is not associated with any GO term.
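The combination step can be sketched on a small hand-written matrix. The function `sim_avg` below is an illustration, not the library's implementation. In particular, skipping NaN entries is an assumption made here; this page does not state how SimAvg treats pairs without a similarity value.

```python
import math

def sim_avg(matrix):
    """Mean of all non-NaN entries; None if there is no usable entry.

    Assumption: NaN entries (pairs without a similarity value) are skipped.
    """
    values = [v for row in matrix for v in row if not math.isnan(v)]
    return sum(values) / len(values) if values else None

# Hypothetical 2 x 2 semantic similarity matrix with one missing pair.
matrix = [[0.5, 1.0],
          [0.25, float("nan")]]
result = sim_avg(matrix)  # mean of 0.5, 1.0 and 0.25
```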
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SimBMA(s, sp1, sp2)
Best match average.
Computes the sum of the row maxima and the sum of the column maxima of the semantic similarity matrix, then normalizes their total by dividing by the number of rows plus the number of columns. See SimAvg.
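A minimal sketch of this computation follows. It reads the normalization as division by the number of rows plus the number of columns, which is the usual best match average definition but is an interpretation of the description above. The function name `sim_bma` and the matrix are made up, and the matrix is assumed to contain no NaN entries.

```python
def sim_bma(matrix):
    """Best match average of a NaN-free list-of-lists matrix."""
    row_max = [max(row) for row in matrix]
    col_max = [max(col) for col in zip(*matrix)]
    # Total of all best matches, divided by the number of rows plus columns.
    return (sum(row_max) + sum(col_max)) / (len(row_max) + len(col_max))

# Hypothetical 2 x 3 matrix: row maxima 1.0 and 0.8, column maxima 1.0, 0.8, 0.4.
matrix = [[1.0, 0.2, 0.4],
          [0.6, 0.8, 0.0]]
score = sim_bma(matrix)  # (1.8 + 2.2) / 5
```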
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SimBMA2(s, sp1, sp2)
Best match average averaged.
Computes the average of the row maxima and the average of the column maxima of the semantic similarity matrix, then takes the mean of these two values. See SimAvg.
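As a sketch under the assumption of a NaN-free matrix, the computation can be written as below. The function name `sim_bma2` and the values are illustrative only. Because the two averages are weighted equally, a protein with many GO terms does not dominate the score; for a square matrix the result coincides with the sum-based best match average.

```python
def sim_bma2(matrix):
    """Mean of the average row maximum and the average column maximum."""
    row_max = [max(row) for row in matrix]
    col_max = [max(col) for col in zip(*matrix)]
    row_avg = sum(row_max) / len(row_max)
    col_avg = sum(col_max) / len(col_max)
    return (row_avg + col_avg) / 2

# Hypothetical 3 x 2 matrix: row maxima 0.9, 0.7, 0.2; column maxima 0.9, 0.7.
matrix = [[0.9, 0.3],
          [0.1, 0.7],
          [0.2, 0.2]]
score = sim_bma2(matrix)  # (0.6 + 0.8) / 2
```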
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SimMax(s, sp1, sp2)
Maximum similarity.
Computes the maximum value of the semantic similarity matrix. See SimAvg.
def cls.FunctionalSimilarity.CFunctionalSimilarityBase.SimRCAvgMax(s, sp1, sp2)
Maximum row/column averaged maxima.
Computes the average of the row maxima and the average of the column maxima of the semantic similarity matrix and returns the larger of the two. See SimAvg.
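Read as "average the row maxima, average the column maxima, then take the larger value", this strategy can be sketched as follows. The function name `sim_rc_avg_max` and the matrix are illustrative, and the matrix is assumed to contain no NaN entries.

```python
def sim_rc_avg_max(matrix):
    """Larger of the average row maximum and the average column maximum."""
    row_max = [max(row) for row in matrix]
    col_max = [max(col) for col in zip(*matrix)]
    row_avg = sum(row_max) / len(row_max)
    col_avg = sum(col_max) / len(col_max)
    return max(row_avg, col_avg)

# Hypothetical 2 x 3 matrix: row maxima 1.0, 0.5; column maxima 1.0, 0.5, 0.5.
matrix = [[1.0, 0.5, 0.0],
          [0.5, 0.5, 0.5]]
score = sim_rc_avg_max(matrix)  # row average 0.75 exceeds column average 2/3
```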