topGO: get the genes after GO term enrichment

Retrieve the genes (from your query or from the annotation) in a GO term after enrichment.

topGO has a built-in function, genesInTerm, to retrieve the genes associated with a GO term, but by default it returns all annotated genes rather than only the significant ones. We can pick the terms of interest from the result table allRes:

selcTerm <- allRes$GO.ID[which(allRes$topGO<0.05)] # select those terms with a p-value < 0.05
selcGenes <- genesInTerm(myGOdata, whichGO=selcTerm)

But what if we just want to list the significant genes? Thanks to a tip from Lidia, we can select them with a function:

allRes$genes <- sapply(allRes$GO.ID, function(x) {
  genes <- genesInTerm(myGOdata, x)          # all annotated genes for this term
  genes[[1]][genes[[1]] %in% myGenes]        # myGenes is the queried gene list
})
allRes$genes[which(allRes$topGO < 0.05)]     # print only those with p-value < 0.05
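The filtering step can be tried without a topGO object. The following is a minimal sketch in base R, assuming that genesInTerm returns a named list mapping each GO ID to a character vector of gene IDs; the names term_genes, my_genes and sig_genes, and all gene and term IDs, are made up for illustration.

```r
# Toy stand-in for the named list returned by genesInTerm():
# each GO ID maps to the genes annotated to it (hypothetical IDs).
term_genes <- list(
  "GO:0000001" = c("geneA", "geneB", "geneC"),
  "GO:0000002" = c("geneC", "geneD")
)

# Hypothetical query gene list (plays the role of myGenes)
my_genes <- c("geneB", "geneC", "geneX")

# For each term, keep only the annotated genes that are also in the query list.
# lapply always returns a list, even if every term keeps the same number of genes.
sig_genes <- lapply(term_genes, function(g) g[g %in% my_genes])

print(sig_genes)
```

Using lapply here rather than sapply is a design choice for the sketch: sapply may simplify the result to a matrix or vector when all terms happen to keep the same number of genes.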

To add the genes to the GenTable output (e.g. "results <- GenTable()") and export the new table, we need to convert the list to a character vector:

allRes$genes <- vapply(allRes$genes, paste, character(1L), collapse = ",")

Then we can export the table with the write.table function.
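As a rough illustration of the collapse-and-export step, here is a small base R sketch on toy data. The data frame res, the list genes_list and the output file are hypothetical stand-ins for the real result table, and the tab-separated format is just one possible choice of write.table options.

```r
# Toy result table with two GO terms (hypothetical IDs)
res <- data.frame(GO.ID = c("GO:0000001", "GO:0000002"),
                  stringsAsFactors = FALSE)

# Per-term gene lists, as produced by the filtering step
genes_list <- list(c("geneB", "geneC"), "geneC")

# Collapse each gene vector into a single comma-separated string
res$genes <- vapply(genes_list, paste, character(1L), collapse = ",")

# Export as a tab-separated file; a temporary file is used here for the demo
out_file <- tempfile(fileext = ".tsv")
write.table(res, file = out_file, sep = "\t", quote = FALSE, row.names = FALSE)

print(readLines(out_file))
```

Tabs are used as the field separator so that the commas inside the genes column do not clash with the column delimiter.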

Update 2019-09-11:

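The subset code that outputs terms with p < 0.05 is not flawless: it ignores terms whose p-value is written in scientific notation, e.g. 1e-10, because the p-value column of the table is stored as text and is compared as strings. To fix this, convert the column to numeric or grep for "e-":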

allRes <- subset(allRes, as.numeric(allRes[, "topGO"]) < 0.05 | grepl("e-", allRes[, "topGO"]))
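The pitfall can be reproduced on a toy vector. This is only a sketch, assuming the p-values arrive as character strings and that very small values may carry a leading "<" (such as "< 1e-30"); the values and the names pvals, numeric_p and keep are invented for the example. Stripping the "<" before converting is an alternative to the grepl approach, not the method used above.

```r
# Toy p-value column stored as text (hypothetical values)
pvals <- c("0.0012", "1e-10", "0.2", "< 1e-30")

# Comparing text with a number compares strings, so "1e-10" is not
# reliably recognised as smaller than 0.05
naive <- pvals < 0.05

# Strip an optional leading "<" and any spaces, then convert to numeric
numeric_p <- as.numeric(sub("^<\\s*", "", pvals))

# Numeric comparison now treats scientific notation correctly
keep <- numeric_p < 0.05

print(data.frame(pvals, numeric_p, keep))
```

With the column converted this way, the same logical vector can be passed to subset or used for row indexing.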
Z. Lu