Programs
1. GO term summarization: Given a list of genes (downstream), summarize them with a number of GO terms
(biological process domain). (Put all four files into the same folder and run userCode_1.py.)
a. userCode_1.py
--- Sample code showing how to use the class GoGraph.
b. GotermSummarization_PV_AllGenesInAssociationFile_quick.py
--- The class that uses GO terms to summarize a list of genes.
c. gene_association.goa_human_2012
--- The gene-to-GO-term association file from http://www.geneontology.org.
d. newWeightedPubMedGO.xml
--- Weighted GO structure file, obtained by adding the semantic
distance*1 to every edge in the Gene Ontology structure from http://www.geneontology.org.
2. Find a highly dense sub-graph: Given a bipartite graph, find a high-density sub-graph with
maximum score. (Put the two files into the same folder and run userCode_2.py.)
a. userCode_2.py
--- Sample code showing how to use the class DenseSubGraph.
b. DenseSubGraph_Sorted.py
--- The class that finds a highly dense sub-graph.
3. ME algorithm: Given a gene-tumor relation graph in which each gene has a real-valued weight, find a
set of genes with minimum weight sum that covers the maximum number of tumors. (Put all four files into the same folder and
compile userCode_3.cpp on a Linux or Unix system.)
a. userCode_3.cpp
--- Sample code showing how to use the ME algorithm.
b. BiGraph_ME.h, MutuallyExclusive.h
--- The main ME program.
c. sampleData_4_ME_algorithm.txt
--- Sample gene-tumor relation graphs.
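
To make the problem in item 3 concrete, here is a minimal Python sketch of a greedy heuristic for it. It is not the ME algorithm implemented in BiGraph_ME.h and MutuallyExclusive.h, and it is not guaranteed to find the minimum weight sum. The function name, the dictionary-based input format, and the assumption that all gene weights are positive are hypothetical choices made only for illustration.

```python
def greedy_weighted_cover(gene_tumors, gene_weight):
    """Greedy heuristic (illustrative only, not the packaged ME algorithm).

    gene_tumors: dict mapping a gene name to the set of tumors it covers.
    gene_weight: dict mapping a gene name to a positive real weight (assumed).
    Returns the genes in the order they were picked.
    """
    # Every tumor that at least one gene can cover.
    uncovered = set()
    for tumors in gene_tumors.values():
        uncovered |= set(tumors)

    chosen = []
    while uncovered:
        # Candidates are genes that still cover at least one uncovered tumor.
        candidates = [g for g in gene_tumors if uncovered & set(gene_tumors[g])]
        # Pick the lowest weight per newly covered tumor; break ties by name.
        best = min(
            candidates,
            key=lambda g: (gene_weight[g] / len(uncovered & set(gene_tumors[g])), g),
        )
        chosen.append(best)
        uncovered -= set(gene_tumors[best])
    return chosen


if __name__ == "__main__":
    # Made-up toy data, not taken from sampleData_4_ME_algorithm.txt.
    toy_graph = {"G1": {"T1", "T2"}, "G2": {"T2", "T3"}, "G3": {"T1", "T2", "T3"}}
    toy_weight = {"G1": 1.0, "G2": 1.0, "G3": 2.5}
    print(greedy_weighted_cover(toy_graph, toy_weight))
```

On the toy data the sketch picks G1 and G2 (total weight 2.0) rather than G3 alone (weight 2.5), which shows how gene weights and tumor coverage trade off in the problem statement.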
Note: The Python code requires the NetworkX package, version 1.3 or later.
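
One way to check the installed NetworkX version before running the Python programs is sketched below. The helper names parse_version and meets_minimum are hypothetical and not part of this package. The sketch only compares version numbers. It does not test whether a much newer NetworkX release is still API-compatible with this code.

```python
def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as "rc1" or "b2"
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(text, minimum=(1, 3)):
    """True if the version string is at least the required minimum (default 1.3)."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    try:
        import networkx
    except ImportError:
        print("NetworkX is not installed")
    else:
        version = networkx.__version__
        print(version, "OK" if meets_minimum(version) else "older than 1.3")
```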
Footnote:
*1 -- Bo Jin, Xinghua Lu: Identifying informative subsets of the Gene Ontology with
information bottleneck methods. Bioinformatics 26(19): 2445-2451 (2010).