Data Integrator (Python API)
cls.Stats Namespace Reference

Collection of statistics tools.

Functions

def ComputePValueFromSets (setA, setB, total, alternative=TEST_GREATER)
 Compute p-value based on the overlap of two sets and the total background.
 
def ComputePValueFromMatrix (a, b, c, d, alternative=TEST_GREATER)
 Compute p-value using Fisher's exact test on a 2x2 contingency table.
 
def PAdjust (p, method)
 Adjust p-values for multiple testing.
 

Variables

string TEST_GREATER = "greater";
 
string TEST_LESS = "less";
 
string TEST_2SIDED = "two-sided";
 
string MTEST_BONFERRONI = "bonferroni";
 
string MTEST_BH = "benjamini-hochberg";
 

Detailed Description

Collection of statistics tools.

Provides some basic functions used in hypothesis testing.

Authors
Chris X. Weichenberger
Hagen Blankenburg
Date
2014-11-24

Function Documentation

◆ ComputePValueFromMatrix()

def cls.Stats.ComputePValueFromMatrix (   a,
  b,
  c,
  d,
  alternative = TEST_GREATER 
)

Compute p-value using Fisher's exact test on a 2x2 contingency table:

          | SetA            | Not SetA              | Sum
----------+-----------------+-----------------------+------------
SetB      | a (=SetA&SetB)  | b (=SetB\SetA)        | SetB
Not SetB  | c (=SetA\SetB)  | d (=total\SetA\SetB)  | total\SetB
----------+-----------------+-----------------------+------------
Sum       | SetA            | total\SetA            | total

Parameters
a            Upper left element of the matrix
b            Upper right element of the matrix
c            Lower left element of the matrix
d            Lower right element of the matrix
alternative  One of TEST_GREATER, TEST_LESS, TEST_2SIDED.
Returns
A tuple containing the odds ratio and the p-value. The odds ratio may be any positive number, but can also be nan or inf.
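
The test described above can be illustrated with a minimal pure-Python sketch. This is not the implementation behind ComputePValueFromMatrix; the function name fisher_exact_sketch, the use of the sample odds ratio (a*d)/(b*c), and the tie tolerance in the two-sided case are assumptions made for illustration only.

```python
from math import comb, inf, nan


def fisher_exact_sketch(a, b, c, d, alternative="greater"):
    """Illustrative Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Hypothetical helper, not part of cls.Stats.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the upper-left cell,
        # with all row and column sums held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    # Range of upper-left values compatible with the fixed margins.
    lo, hi = max(0, col1 - row2), min(row1, col1)

    if alternative == "greater":
        p = sum(prob(x) for x in range(a, hi + 1))
    elif alternative == "less":
        p = sum(prob(x) for x in range(lo, a + 1))
    elif alternative == "two-sided":
        # Sum all tables that are at most as likely as the observed one;
        # the small tolerance guards against floating-point ties.
        p_obs = prob(a)
        p = sum(q for q in map(prob, range(lo, hi + 1))
                if q <= p_obs * (1 + 1e-7))
    else:
        raise ValueError("unknown alternative: %r" % (alternative,))

    # Sample odds ratio (assumption): nan for 0/0, inf for x/0.
    if b * c == 0:
        odds = nan if a * d == 0 else inf
    else:
        odds = a * d / (b * c)
    return odds, min(p, 1.0)


odds_ratio, p_value = fisher_exact_sketch(3, 1, 1, 3)
print(odds_ratio, p_value)
```

For the table a=3, b=1, c=1, d=3, the one-sided "greater" p-value in this sketch is 17/70, the sum of the probabilities for upper-left values 3 and 4.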

◆ ComputePValueFromSets()

def cls.Stats.ComputePValueFromSets (   setA,
  setB,
  total,
  alternative = TEST_GREATER 
)

Compute p-value based on the overlap of two sets and the total background.

          | SetA            | Not SetA              | Sum
----------+-----------------+-----------------------+------------
SetB      | a (=SetA&SetB)  | b (=SetB\SetA)        | SetB
Not SetB  | c (=SetA\SetB)  | d (=total\SetA\SetB)  | total\SetB
----------+-----------------+-----------------------+------------
Sum       | SetA            | total\SetA            | total

Parameters
setA         The first set (i.e. positives on one condition)
setB         The second set (i.e. positives on the other condition)
total        The total background set, includes the negatives of both conditions.
alternative  One of TEST_GREATER, TEST_LESS, TEST_2SIDED.
Returns
A tuple containing the odds ratio and the p-value.
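
As a rough sketch of how the four cells of the table above could be derived from the input sets, consider the helper below. The name contingency_from_sets is hypothetical, and the sketch assumes that setA and setB are both subsets of total; how the library itself handles elements outside the background is not documented here.

```python
def contingency_from_sets(set_a, set_b, total):
    """Return the cell counts (a, b, c, d) of the 2x2 table.

    Hypothetical helper, not part of cls.Stats. Assumes set_a and
    set_b are subsets of total.
    """
    set_a, set_b, total = set(set_a), set(set_b), set(total)
    a = len(set_a & set_b)          # in both sets
    b = len(set_b - set_a)          # in SetB only
    c = len(set_a - set_b)          # in SetA only
    d = len(total - set_a - set_b)  # in neither set
    return a, b, c, d


cells = contingency_from_sets({1, 2, 3, 4}, {3, 4, 5}, range(1, 11))
print(cells)  # (2, 1, 2, 5)
```

Under the subset assumption the four counts add up to the size of the background, which makes a convenient sanity check before running the test.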

◆ PAdjust()

def cls.Stats.PAdjust (   p,
  method 
)

Adjust p-values for multiple testing.

This code is based on the source of the function p.adjust in the statistical programming language R.

Parameters
p       (List) List of float values, ideally between 0 and 1. Values greater than 1 are truncated to 1 (except inf values, which are kept). Negative values are not truncated and receive the highest weights, as they are the lowest p-values. nan entries in the list are ignored and do not count toward the number of p-values to correct for. A probability of 0 remains 0.
method  Either MTEST_BONFERRONI or MTEST_BH, for multiple test correction according to Bonferroni or Benjamini-Hochberg, respectively.
Returns
(List) List of corrected p-values in the same order as the original list and with the same number of entries, including inf and nan entries.
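
The two corrections can be sketched in plain Python as follows. This is an illustration modelled on the general approach of R's p.adjust, not the code behind PAdjust; the name p_adjust_sketch is hypothetical, and the sketch assumes every input is either nan or a finite value between 0 and 1, so the special handling of inf and negative values described above is not reproduced.

```python
from math import isnan, nan


def p_adjust_sketch(p, method):
    """Illustrative Bonferroni / Benjamini-Hochberg adjustment.

    Hypothetical helper, not part of cls.Stats. nan entries are
    skipped and do not count toward the number of tests.
    """
    valid = [i for i, v in enumerate(p) if not isnan(v)]
    n = len(valid)
    adjusted = [nan] * len(p)  # nan entries stay in place

    if method == "bonferroni":
        for i in valid:
            adjusted[i] = min(1.0, p[i] * n)
    elif method == "benjamini-hochberg":
        # Walk from the largest to the smallest p-value, keeping a
        # running minimum so adjusted values stay monotone.
        order = sorted(valid, key=lambda i: p[i], reverse=True)
        running_min = 1.0
        for k, i in enumerate(order):
            rank = n - k  # rank of p[i] in ascending order
            running_min = min(running_min, p[i] * n / rank)
            adjusted[i] = running_min
    else:
        raise ValueError("unknown method: %r" % (method,))
    return adjusted


raw = [0.01, 0.04, nan, 0.03, 0.5]
print(p_adjust_sketch(raw, "bonferroni"))
print(p_adjust_sketch(raw, "benjamini-hochberg"))
```

In this example the nan entry leaves four p-values to correct for, so the Bonferroni factor is 4 rather than 5.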

Variable Documentation

◆ MTEST_BH

string cls.Stats.MTEST_BH = "benjamini-hochberg";

◆ MTEST_BONFERRONI

string cls.Stats.MTEST_BONFERRONI = "bonferroni";

◆ TEST_2SIDED

string cls.Stats.TEST_2SIDED = "two-sided";

◆ TEST_GREATER

string cls.Stats.TEST_GREATER = "greater";

◆ TEST_LESS

string cls.Stats.TEST_LESS = "less";