Most of my research uses the chick model of refractive error, but at the moment I’m working to characterize the overlap in transcriptome and proteome findings across the range of model species used in my field. The GeneOverlap R package has been a great tool for testing the overlap between my gene lists. However, I recently came across the SuperExactTest R package, which provides some neat additional options for testing and visualizing multi-set intersections.
I used SuperExactTest to make the circular plot below showing intersections between differentially-expressed protein lists from tilapia, chick, mouse, guinea pig, and tree shrew. The five tracks in the middle represent the five protein lists, with individual blocks showing the presence (fill) or absence (no fill) of the protein sets in each intersection. The height of the bars on the outermost track is proportional to the intersection size (also indicated by the number next to each bar), and the fill colour of the outermost bars represents the statistical significance of the intersection. I’ve ordered this plot from lowest to highest p-value; the arrow on the right points to the most significant intersection (six shared protein findings between guinea pig and tree shrew).
If you’re interested, check out the paper ‘Efficient Test and Visualization of Multi-Set Intersections’ (Wang et al., 2015), which describes the theoretical framework for the SuperExactTest package.
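For anyone who wants to see the basic idea in code, here is a minimal Python sketch (not the SuperExactTest or GeneOverlap implementation). It lists every intersection among a few named sets and computes a pairwise overlap p-value from the hypergeometric upper tail. The protein identifiers, list sizes, and the background size of 20 are made-up values for illustration only, and the function names are my own.

```python
from itertools import combinations
from math import comb


def hypergeom_upper_tail(overlap, size_a, size_b, background):
    """P(X >= overlap) for two lists drawn at random from `background` items."""
    total = comb(background, size_b)
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(background - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


def all_intersections(named_sets):
    """Map every combination of two or more set names to the items they share."""
    names = sorted(named_sets)
    result = {}
    for r in range(2, len(names) + 1):
        for combo in combinations(names, r):
            result[combo] = set.intersection(*(named_sets[n] for n in combo))
    return result


# Hypothetical protein lists (placeholders, not real findings).
protein_lists = {
    "chick": {"P1", "P2", "P3", "P4"},
    "mouse": {"P3", "P4", "P5"},
    "tree_shrew": {"P4", "P6"},
}

intersections = all_intersections(protein_lists)
for combo, shared in intersections.items():
    print(" & ".join(combo), len(shared))

# Chick vs mouse: 2 shared proteins, assuming a background of 20 proteins.
p_value = hypergeom_upper_tail(overlap=2, size_a=4, size_b=3, background=20)
print(round(p_value, 4))
```

The pairwise test here only covers two lists at a time; the multi-set case is what the Wang et al. framework addresses.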
Thursday, 4 August 2016
Thursday, 2 June 2016
GOPlot package for visualizing functional omics data
I recently stumbled upon the GOPlot R package, developed by Wencke Walter and colleagues. The package provides some great tools for visualizing omics data. My favourite is the chord diagram, which captures the complex relationship between differential gene expression (shown on the left) and enriched functional categories (shown on the right). Here’s one I made for my ARVO conference poster this year:
The package is easy to use and provides a range of other plot options (bar plot, bubble plot, circle plot, cluster plot). I highly recommend you check it out!
Tuesday, 10 May 2016
ARVO 2016: Seattle
In defiance of the 'rainy city' moniker, Seattle turned on the sunshine for ARVO 2016. It was a hectic 5 days packed with some great talks, many posters, and a few blissful quiet moments soaking up some rays in the park outside the conference center.
This year I presented a systematic review and meta-analysis comparing exploratory omics findings across human and animal studies of refractive error. I got some great ideas for refining the methods of this study from other ARVO goers; can't wait to get home and start working on it again!
A high-res version of my poster is available on ResearchGate.
Thursday, 1 October 2015
Chord diagrams: when there’s too many categories to Venn!
At the moment I’m comparing differentially-expressed genes from six different transcriptome studies. With this many categories, graphically representing cross-study commonalities is challenging. Chord diagrams are visually appealing, and provide several advantages over the traditional Venn diagram:
Venn diagrams
- Gene lists are represented as shapes
- List representation is not proportional to list size (although area-proportional Euler diagrams can do this)
- Common elements (shared gene findings) are shown by shape intersection
Chord diagrams
- Gene lists are represented as annulus segments (with larger segments indicating more implicated genes)
- Common elements (shared gene findings) are represented by the lines connecting two segments
- This allows emphasis of important commonalities (e.g. lines are shown in colour when two gene lists contain statistically significant overlap)
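To make the chord representation concrete, here is a small Python sketch of the two ingredients described above: a segment size for each gene list and a link for each pair of lists that shares genes. The study names and gene symbols are hypothetical placeholders, and `chord_inputs` is my own helper name rather than part of any plotting tool.

```python
from itertools import combinations


def chord_inputs(gene_lists):
    """Return segment sizes and pairwise links for a chord diagram."""
    # Each annulus segment is sized by the number of genes in that list.
    segments = {name: len(genes) for name, genes in gene_lists.items()}
    # Each link joins two segments and carries the genes they share.
    links = {}
    for a, b in combinations(sorted(gene_lists), 2):
        shared = gene_lists[a] & gene_lists[b]
        if shared:
            links[(a, b)] = sorted(shared)
    return segments, links


# Hypothetical gene lists from three studies (placeholders only).
studies = {
    "study_1": {"GENE_A", "GENE_B", "GENE_C"},
    "study_2": {"GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "study_3": {"GENE_F"},
}

segments, links = chord_inputs(studies)
print(segments)
print(links)
```

Pairs with no shared genes simply produce no link, which is why a chord diagram stays readable with six or more lists.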
Creating a Chord diagram
Using NetworkAnalyst you can create a chord diagram within a few minutes.
- Choose the ‘starting with gene or protein lists’ module
- Upload your gene lists
- Choose ‘chord diagrams’
- Save the image as an .svg so you can edit it in Illustrator (or freeware vector editing software)
- Click somewhere in the middle of the chord diagram
- From the top menu, choose ‘select/same/fill colour’. This should select everything in the centre of the diagram, and leave the annulus and text unselected
- In the colour window, change the fill colour to null (leaving only a stroke colour). For help using the colour tool see my earlier post
- Add the finishing touches to the diagram (labels, colour scheme etc)
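If you would rather script the fill-removal step than do it by hand, something like the Python sketch below might work. It rests on assumptions I have not checked against NetworkAnalyst's actual export: that the chords carry their colour in a plain `fill` attribute (not in a `style` attribute or a stylesheet) and that they all share one fill value. The sample SVG and the `clear_fill` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Keep the default SVG namespace unprefixed when writing the file back out.
ET.register_namespace("", SVG_NS)


def clear_fill(svg_text, target_fill):
    """Set fill to 'none' on every element whose fill attribute matches target_fill."""
    root = ET.fromstring(svg_text)
    changed = 0
    for element in root.iter():
        if element.get("fill", "").lower() == target_fill.lower():
            # Any existing stroke attribute is left untouched.
            element.set("fill", "none")
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Tiny stand-in for an exported diagram (assumed structure, not a real export).
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M0 0 L10 10" fill="#cccccc" stroke="#333333"/>'
    '<path d="M0 0 L5 5" fill="#ff0000" stroke="#333333"/>'
    '<text fill="#000000">chick</text>'
    "</svg>"
)

edited_svg, n_changed = clear_fill(sample_svg, "#CCCCCC")
print(n_changed)
```

Elements without a stroke would become invisible after this change, so check the result in a vector editor before adding the finishing touches.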
Wednesday, 30 September 2015
15th International Myopia Conference
I recently got back from the 15th International Myopia Conference in Wenzhou, China. Dr. Jia Qu did a great job organising speakers (and some interesting local food; I got my first taste of cuttlefish)! I particularly liked the ON-OFF visual pathways session where Machelle Pardue and colleagues discussed their knockout models.
Sunday, 12 July 2015
Creating figures to depict enriched KEGG pathways: A first attempt
Gene set enrichment analysis (GSEA) of my microarray dataset implicated a number of KEGG pathways. I’m now exploring ways to create figures summarising the findings. In my last post I described mapping the core implicated genes (leading edge subset) from the analysis onto KEGG pathway diagrams. Although useful when interpreting the data, these figures are difficult to take in at a glance. In the figure below I’ve attempted to provide a stylized adaptation of the KEGG pathways to convey that many of the leading edge genes are involved in acetyl-CoA production feeding into the citrate cycle. It’s still not quite right, but getting there!
Wednesday, 1 July 2015
Mapping a gene list onto KEGG pathways
In a previous post I discussed using GSEA to identify enriched KEGG pathways in my microarray dataset. After running the GSEA leading edge analysis I had a list of the genes that contributed most to the enrichment score for each pathway. I then wanted to examine where these genes fell within the KEGG pathway. The ‘user data mapping’ function in KEGG Mapper is a nice tool to achieve this quickly.
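For readers unfamiliar with the leading edge subset mentioned above, here is a simplified Python sketch of the running-sum enrichment score and the leading edge it defines. This is an illustration of the general idea, not the GSEA software's implementation (no permutation testing or normalisation), and the gene names and scores are made up.

```python
def enrichment_and_leading_edge(ranked_genes, scores, gene_set, weight=1.0):
    """Return the enrichment score and leading edge genes for one gene set.

    ranked_genes and scores must already be sorted from most positive to
    most negative score.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [abs(s) ** weight if hit else 0.0 for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    # Walk down the ranked list: step up at set members, step down otherwise.
    running = 0.0
    profile = []
    for w, hit in zip(hit_weights, in_set):
        running += w / total_hit if hit else -1.0 / n_miss
        profile.append(running)

    # The enrichment score is the largest deviation from zero.
    peak = max(range(len(profile)), key=lambda i: abs(profile[i]))
    es = profile[peak]

    # Leading edge: set members up to the peak (positive ES) or after it (negative ES).
    if es >= 0:
        window = zip(ranked_genes[: peak + 1], in_set[: peak + 1])
    else:
        window = zip(ranked_genes[peak:], in_set[peak:])
    leading_edge = [gene for gene, hit in window if hit]
    return es, leading_edge


# Hypothetical ranked list (placeholders only).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
gene_scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es, leading_edge = enrichment_and_leading_edge(genes, gene_scores, {"g1", "g2", "g5"})
print(es, leading_edge)
```

In this toy example the running sum peaks after the second gene, so only the first two set members count as leading edge genes.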
User data mapping
Select the ‘user data mapping’ option when viewing the reference KEGG pathway. In the pop-up window enter the gene symbols followed by the background and foreground color in hexadecimal numbers (if no color is specified the default is red). Clicking ‘pathway mapping’ updates the reference to highlight the entered gene list. In this case it appears that my leading edge includes a sub-set of genes involved in methylation of histone lysine residues (in addition to the main pathway functions).
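If you have a long gene list, typing the entries by hand gets tedious, so here is a small Python sketch that builds the text to paste into the pop-up window. It assumes one gene per line, with the symbol separated from the colours by a space and the background and foreground colours separated by a comma; check this against the current KEGG Mapper instructions before relying on it. The gene symbols and the `kegg_colour_lines` helper are hypothetical.

```python
import re

HEX_COLOUR = re.compile(r"^#[0-9a-fA-F]{6}$")


def kegg_colour_lines(genes, background="#ff0000", foreground="#000000"):
    """Build one 'SYMBOL background,foreground' line per gene (assumed format)."""
    for colour in (background, foreground):
        if not HEX_COLOUR.match(colour):
            raise ValueError(f"not a six-digit hex colour: {colour}")
    return "\n".join(f"{gene} {background},{foreground}" for gene in genes)


# Hypothetical leading edge genes (placeholders only).
leading_edge_genes = ["GENE_A", "GENE_B", "GENE_C"]
mapping_text = kegg_colour_lines(leading_edge_genes, background="#ffcc00")
print(mapping_text)
```

Using a different background colour per pathway or per study makes it easier to tell the highlighted genes apart when comparing diagrams.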