GOSSIP: Biological Profiling of Gene Groups utilizing Gene Ontology

Download GOSSIP software package for Windows

Papers using GOSSIP

Supplemental Tables and figures

Preprint at arxiv q-bio

Preprint at q-bio, accession number 0407034

Complete tables showing the results from the paper

Entries in boldface meet our criterion of p ≤ 5%

Whitfield et al. cell cycle data

A heuristic approach to control the FWER

Luthi-Carter et al. Huntington data

As an example of a study in which there was no prior expectation of whether a certain term is enriched, we profiled genes differentially expressed in the brain of the R6/2 mouse model for Huntington's disease (data from Luthi-Carter et al., 2002).

For the profiles, we used the family-wise error rate (FWER) as the multiple testing correction, with a threshold of 0.05. Details on our method to estimate the family-wise error rate are given below.

Huntington's Disease (HD) is a progressive neurodegenerative disorder characterised by neuronal loss in cortex and striatum. It is caused by the expansion of a CAG repeat translated into a polyglutamine stretch in the first exon of the IT15 gene encoding huntingtin (Htt) (Huntington 1993). Extended Htt forms intracellular inclusions, in which several transcription factors (e.g. TBP, CBP, TFIID, mSin3a and mSin3b) are sequestered. Additionally, a number of transcription factors, such as CBP, CA150, TAFII130 and p53, have been detected as interacting with Htt (MacDonald 2003). Altered transcription is therefore one of the hypotheses for the death or dysfunction of neurons in HD. Others include excitotoxicity (Albin 1992), proteasomal insufficiency and changes in neurotransmission across synapses. Considering the diversity of processes Htt is involved in and the variety of transcription factors affected by Htt, it is not clear whether changes in gene expression in the affected cells are rather unspecific or focussed on just a few biological processes.
We address this question by profiling gene groups obtained in a microarray study by Luthi-Carter et al., who surveyed the expression of approx. 11000 genes in the cerebral cortex, cerebellum, and striatum of the symptomatic R6/2 mouse model for HD utilizing Affymetrix Mu11K chips. For all three brain regions we profiled groups of up- and down-regulated genes, using the significantly expressed genes as the reference group. Since we do not have an expectation of whether any GO term will be enriched in these groups, we used the p-value that controls the FWER to define significance. To our surprise, functional profiles of the differentially expressed genes show several significantly enriched biological processes. This figure (pdf) shows the GO terms in the biological processes branch that are significantly enriched in the gene groups; a detailed list can be found here.
Interestingly, the lists of GO terms that are enriched in both up- and down-regulated gene groups contain several terms related to signal transduction. In the down-regulated gene groups, G-protein-coupled receptors such as dopamine D1 and D2 and cannabinoid CB1 are among the most significant terms. Dopamine has also been postulated to play a toxic role in HD, with an excess of dopamine potentially producing neurotoxicity (Jakel 2000). Our profiles also imply calcium dysregulation in HD. This might be explained by the physical association of mutant huntingtin with a calmodulin-containing complex (Bao 1996). Moreover, many of the proteins in the signal transduction group are regulated by calcium or calcium-sensing proteins. The chemokine-related terms as well as growth factor binding cannot be assigned readily to current knowledge of Huntington's disease progression. Consequently, they can be taken as a starting point to derive new hypotheses.

The data used in this analysis is from http://www.neumetrix.com with kind permission of Jim Olson

GO annotation for all genes on the arrays: GOSSIP Annotation File

Background: genes that have a present call in at least half of the experiments (as evaluated with the Bioconductor affymetrix package).
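The background rule above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the function name `background_genes`, the gene names, and the call codes ("P" for present, "M" for marginal, "A" for absent) are assumptions made for this example.

```python
def background_genes(calls, min_fraction=0.5):
    """Return genes with a present call in at least min_fraction of experiments.

    calls maps a gene name to a list of detection calls, one per experiment.
    The call code "P" (present) is an assumption for this sketch.
    """
    background = []
    for gene, gene_calls in calls.items():
        if not gene_calls:
            continue  # no experiments recorded for this gene
        present = sum(1 for call in gene_calls if call == "P")
        if present >= min_fraction * len(gene_calls):
            background.append(gene)
    return sorted(background)


# Hypothetical example data: three genes measured in four experiments.
example_calls = {
    "geneA": ["P", "P", "A", "M"],  # present in 2 of 4 experiments, kept
    "geneB": ["A", "A", "P", "A"],  # present in 1 of 4 experiments, dropped
    "geneC": ["P", "P", "P", "P"],  # present in all experiments, kept
}
selected = background_genes(example_calls)
```

Marginal calls are counted as not present here; whether the original analysis treated them this way is not stated.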

Our method to estimate the FWER

The description of our heuristic approach to calculate the FWER can be found here (pdf).

Additional material

Profiles of Monocytes after 240 min of LPS stimulation

The data were produced by the Alliance for Cellular Signaling (Sangdun Choi's and Robert Hsueh's labs)

These profiles have been analyzed in the paper: Inferring Combinatorial Regulation of Transcription in silico.

Raw data is available with accession numbers MAE040216Z53, MAE040217Z53, MAE040218Z53, MAE040216Z63, MAE040217Z63 and MAE040218Z63 under the URL: http://www.signaling-gateway.org/data/micro/cgi-bin/operon.cgi

These are expression profiles of RAW 264.7 mouse monocytes four hours after LPS treatment, including dye swaps and three replicates

The probesets were mapped to UniGene clusters. For each UniGene cluster, if one of the probesets is both well above the background (column isWellAboveBG) and has a p-value (PValueLogRatio) below the threshold (0.01 and 0.05), the gene is included in the up- or down-regulated gene list based on the log ratio (column LogRatio)

All genes that are well above the background are used as the background set for this microarray
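The selection rule described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the original pipeline: the function name `select_genes`, the `cluster` key, and the example rows are hypothetical; only the column names isWellAboveBG, PValueLogRatio and LogRatio come from the text. The sketch also assumes that a positive log ratio means up-regulation, and that a cluster counts as background if at least one of its probesets is well above the background.

```python
from collections import defaultdict


def select_genes(rows, p_threshold=0.05):
    """Split UniGene clusters into up-regulated, down-regulated and background sets.

    rows is a list of dicts, one per probeset, with the keys
    "cluster" (hypothetical name), "isWellAboveBG", "PValueLogRatio", "LogRatio".
    """
    # Group probesets by the UniGene cluster they were mapped to.
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row)

    up, down, background = set(), set(), set()
    for cluster, probes in by_cluster.items():
        above = [p for p in probes if p["isWellAboveBG"]]
        if not above:
            continue  # no probeset well above background: cluster is ignored
        background.add(cluster)
        for probe in above:
            if probe["PValueLogRatio"] < p_threshold:
                # Assumption: the sign of the log ratio gives the direction.
                if probe["LogRatio"] > 0:
                    up.add(cluster)
                elif probe["LogRatio"] < 0:
                    down.add(cluster)
    return up, down, background


# Hypothetical probeset rows for illustration only.
rows = [
    {"cluster": "Mm.1", "isWellAboveBG": True, "PValueLogRatio": 0.001, "LogRatio": 1.2},
    {"cluster": "Mm.1", "isWellAboveBG": False, "PValueLogRatio": 0.5, "LogRatio": -0.1},
    {"cluster": "Mm.2", "isWellAboveBG": True, "PValueLogRatio": 0.03, "LogRatio": -0.8},
    {"cluster": "Mm.3", "isWellAboveBG": True, "PValueLogRatio": 0.4, "LogRatio": 0.1},
    {"cluster": "Mm.4", "isWellAboveBG": False, "PValueLogRatio": 0.0001, "LogRatio": 2.0},
]
up, down, background = select_genes(rows, p_threshold=0.05)
up_strict, down_strict, _ = select_genes(rows, p_threshold=0.01)
```

Calling the function once per threshold (0.05 and 0.01) yields the two sets of gene lists. A cluster whose probesets disagree in direction would land in both lists in this sketch; how such cases were handled originally is not stated.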