Programs

 

1.     GO term summarization: Given a list of (downstream) genes, summarize them with a small number of GO terms from the biological process domain. (Put all four files into the same folder and run userCode_1.py.)

a.     userCode_1.py --- Sample code showing how to use the class GoGraph.

b.     GotermSummarization_PV_AllGenesInAssociationFile_quick.py --- The class that uses GO terms to summarize a list of genes.

c.      gene_association.goa_human_2012 --- The gene-to-GO-term association file, from http://www.geneontology.org.

d.     newWeightedPubMedGO.xml --- The weighted GO structure file, obtained by adding a semantic distance*1 to every edge of the Gene Ontology structure from http://www.geneontology.org.
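
As a rough illustration of the input side of this task, the sketch below reads gene-to-GO-term associations and counts which biological process terms are shared by the most genes in a list. It assumes the association file follows the tab-separated GAF layout (gene symbol in column 3, qualifier in column 4, GO ID in column 5, aspect in column 9, with "P" for biological process). The function names and the toy lines are hypothetical, and a plain frequency count is only a stand-in: it is not the weighted-graph method implemented in the GoGraph class.

```python
from collections import Counter


def read_bp_annotations(lines):
    """Map gene symbol -> set of biological-process GO IDs (GAF-style lines)."""
    gene_to_terms = {}
    for line in lines:
        if line.startswith("!"):          # GAF header/comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:                 # skip malformed lines
            continue
        symbol, qualifier, go_id, aspect = cols[2], cols[3], cols[4], cols[8]
        if aspect != "P":                 # keep the biological process domain only
            continue
        if "NOT" in qualifier.split("|"):  # skip negated annotations
            continue
        gene_to_terms.setdefault(symbol, set()).add(go_id)
    return gene_to_terms


def most_common_terms(genes, gene_to_terms, top_n=5):
    """Count how many of the given genes are annotated with each GO term."""
    counts = Counter()
    for gene in genes:
        counts.update(gene_to_terms.get(gene, ()))
    return counts.most_common(top_n)


def toy_gaf_line(symbol, go_id, aspect, qualifier=""):
    """Build a made-up 15-column line; only the columns used above matter."""
    cols = ["x"] * 15
    cols[2], cols[3], cols[4], cols[8] = symbol, qualifier, go_id, aspect
    return "\t".join(cols) + "\n"


# Toy data for illustration only (placeholder gene symbols and GO IDs).
lines = [
    "!gaf-version: 2.0\n",
    toy_gaf_line("GENE_A", "GO:0000001", "P"),
    toy_gaf_line("GENE_A", "GO:0000002", "P"),
    toy_gaf_line("GENE_A", "GO:0000003", "F"),
    toy_gaf_line("GENE_B", "GO:0000001", "P"),
    toy_gaf_line("GENE_B", "GO:0000002", "P", qualifier="NOT"),
]
gene_to_terms = read_bp_annotations(lines)
print(most_common_terms(["GENE_A", "GENE_B"], gene_to_terms, top_n=1))
```

To try it on the real association file, the list of toy lines could be replaced by an open file handle, for example `open("gene_association.goa_human_2012")`, assuming that file is in the working directory.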

 

2.     Find a highly dense sub-graph: Given a bipartite graph, find a high-density sub-graph with maximum score. (Put both files into the same folder and run userCode_2.py.)

a.     userCode_2.py --- Sample code showing how to use the class DenseSubGraph.

b.     DenseSubGraph_Sorted.py --- The class that finds a highly dense sub-graph.
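
The score that DenseSubGraph maximizes is not defined here, so the following is only a sketch of the general idea under an assumed score: density measured as the number of edges divided by the number of nodes. It uses the well-known greedy "peeling" heuristic (repeatedly drop the lowest-degree node and remember the densest intermediate graph). The function name and the toy graph are hypothetical, and this is not the algorithm in DenseSubGraph_Sorted.py.

```python
def densest_subgraph_by_peeling(edges):
    """Greedy peeling on a bipartite graph given as (left, right) pairs.

    Assumed score: density = number of edges / number of nodes.
    Returns (left_nodes, right_nodes, density) of the best graph seen.
    """
    # Tag each node with its side so equal names on both sides stay distinct.
    adj = {}
    for u, v in edges:
        left, right = ("L", u), ("R", v)
        adj.setdefault(left, set()).add(right)
        adj.setdefault(right, set()).add(left)
    if not adj:
        return set(), set(), 0.0

    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes, best_density = set(adj), n_edges / len(adj)

    while len(adj) > 1:
        # Remove the node with the fewest remaining neighbours.
        node = min(adj, key=lambda n: len(adj[n]))
        nbrs = adj.pop(node)
        for nbr in nbrs:
            adj[nbr].discard(node)
        n_edges -= len(nbrs)
        density = n_edges / len(adj)
        if density > best_density:
            best_nodes, best_density = set(adj), density

    left = {name for side, name in best_nodes if side == "L"}
    right = {name for side, name in best_nodes if side == "R"}
    return left, right, best_density


# Toy graph: a complete 3x3 block plus one loosely attached node "b4".
edges = [(a, b) for a in ("a1", "a2", "a3") for b in ("b1", "b2", "b3")]
edges.append(("a1", "b4"))
left, right, density = densest_subgraph_by_peeling(edges)
print(sorted(left), sorted(right), density)
```

The linear scan for the minimum-degree node keeps the sketch short; a bucket or heap structure would be the usual choice for large graphs.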

 

3.     ME algorithm: Given a gene-tumor relation graph in which each gene has a real-valued weight, find a set of genes with minimum total weight that covers the maximum number of tumors. (Put all four files into the same folder and compile userCode_3.cpp on a Linux or Unix system.)

a.     userCode_3.cpp --- Sample code showing how to use the ME algorithm.

b.     BiGraph_ME.h, MutuallyExclusive.h --- The main ME program.

c.      sampleData_4_ME_algorithm.txt --- Sample gene-tumor relation graphs.
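
The problem statement above resembles weighted set cover, so a small sketch of the classic greedy heuristic for that problem may help make it concrete: at each step, pick the gene with the lowest weight per newly covered tumor. This is written in Python for brevity and is not the ME algorithm from the C++ headers. It also assumes strictly positive weights, and the function name and toy data are hypothetical.

```python
def greedy_weighted_cover(gene_tumors, gene_weight):
    """Greedy weighted set cover.

    gene_tumors: dict mapping gene -> set of tumors it covers.
    gene_weight: dict mapping gene -> positive weight (assumption).
    Returns (chosen_genes, total_weight); every coverable tumor is covered.
    """
    uncovered = set().union(*gene_tumors.values())
    chosen = []
    while uncovered:
        best_gene, best_ratio = None, None
        for gene, tumors in gene_tumors.items():
            gain = len(tumors & uncovered)   # tumors this gene would newly cover
            if gain == 0:
                continue
            ratio = gene_weight[gene] / gain  # cost per newly covered tumor
            if best_ratio is None or ratio < best_ratio:
                best_gene, best_ratio = gene, ratio
        chosen.append(best_gene)
        uncovered -= gene_tumors[best_gene]
    return chosen, sum(gene_weight[g] for g in chosen)


# Toy gene-tumor relation graph for illustration only.
gene_tumors = {
    "g1": {"t1", "t2", "t3"},
    "g2": {"t1", "t2"},
    "g3": {"t3", "t4"},
    "g4": {"t4"},
}
gene_weight = {"g1": 3.0, "g2": 1.0, "g3": 1.5, "g4": 2.0}
chosen, total = greedy_weighted_cover(gene_tumors, gene_weight)
print(chosen, total)
```

The greedy choice is a heuristic: it always covers every tumor that some gene covers, but the total weight it returns is not guaranteed to be the minimum.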

 

Note: The code requires the NetworkX package, version 1.3 or later.
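
One way to check the installed version before running the sample code is sketched below. The helper names are hypothetical; the only library attribute used is `networkx.__version__`.

```python
import re


def version_tuple(version):
    """Turn '1.3' or '2.8.1rc1' into a tuple of leading integers, e.g. (1, 3)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=(1, 3)):
    """True if the version string is at least the given minimum."""
    return version_tuple(version) >= minimum


try:
    import networkx
    installed = networkx.__version__
    status = "OK" if meets_minimum(installed) else "older than 1.3"
    print("networkx", installed, status)
except ImportError:
    print("networkx is not installed")
```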

Footnote:

*1: Bo Jin, Xinghua Lu: Identifying informative subsets of the Gene Ontology with information bottleneck methods. Bioinformatics 26(19): 2445-2451 (2010).