Friday, 12 August 2016

Validating RNA-seq results using previously published data

Large-scale omics studies usually require some form of validation following the initial analysis. For RNA-seq or microarray studies, this validation frequently takes the form of qPCR. However, qPCR isn't always possible and, as pointed out here and here, is not without limitations.

Comparing single-gene results against a previously published dataset of similar design offers an alternative (or complementary) approach to traditional qPCR validation. The 2010 paper by Suarez-Farinas et al. provides a nice example of how to do this using GSEA. The authors' approach is based on the reasoning that the genes classified as significantly up- and down-regulated in a given study should be enriched at the top and bottom, respectively, of a ranked gene list from a microarray or RNA-seq dataset testing the same treatment effect. This method is potentially more meaningful than choosing a handful of results for qPCR replication, as it allows a large number of differentially expressed genes to be tested (e.g. 15-1000). Moreover, it can validate genes that respond robustly to the treatment effect of interest under the varied conditions encompassed by the two datasets, whilst ruling out the influence of the many small, undesirable methodological quirks that a within-lab qPCR validation is likely to reproduce.
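To make the reasoning concrete, here is a minimal Python sketch of the kind of running-sum statistic that GSEA is built on. It is not the code used by Suarez-Farinas et al. or by the GSEA software: it uses the simple unweighted (Kolmogorov-Smirnov-like) form of the score, a gene-set permutation null, and made-up gene names, and the function names `enrichment_score` and `permutation_pvalue` are my own. The ranked list is assumed to run from most up-regulated to most down-regulated in the published dataset, with each gene appearing once.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style running-sum score (a simplification).

    ranked_genes: genes from the published dataset, ordered from most
    up-regulated to most down-regulated, each gene listed once.
    gene_set: genes called significant in the study being validated.
    Returns the running-sum value with the largest absolute deviation
    from zero: positive means enrichment near the top of the ranking,
    negative means enrichment near the bottom.
    """
    hits = set(gene_set) & set(ranked_genes)
    n_total, n_hits = len(ranked_genes), len(hits)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must overlap part, but not all, of the ranking")
    hit_step = 1.0 / n_hits               # step up when a set gene is met
    miss_step = 1.0 / (n_total - n_hits)  # step down otherwise
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += hit_step if gene in hits else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


def permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Two-sided p-value from random gene sets of the same size (assumed null)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    extreme = 0
    for _ in range(n_perm):
        null_set = rng.sample(ranked_genes, size)
        if abs(enrichment_score(ranked_genes, null_set)) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: 20 ranked genes and two small gene lists.
    ranking = [f"g{i}" for i in range(1, 21)]
    up_genes = ["g1", "g2", "g3"]        # expected near the top
    down_genes = ["g18", "g19", "g20"]   # expected near the bottom
    print(permutation_pvalue(ranking, up_genes))
    print(permutation_pvalue(ranking, down_genes))
```

If the two studies agree, the up-regulated list should give a positive score and the down-regulated list a negative one, each with a small permutation p-value. For a real analysis I would use the GSEA software itself, which weights the steps by the ranking metric and handles multiple testing.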

If you’re interested, check out the paper by Suarez-Farinas et al. (2010) and our recently published RNA-seq study.

Thursday, 4 August 2016

Plotting multi-set intersections

Most of my research uses the chick model of refractive error, but at the moment I’m working to characterize the overlap in transcriptomic and proteomic findings across the range of model species used in my field. The GeneOverlap R package has been a great tool for testing the overlap between my gene lists; however, I recently came across the SuperExactTest R package, which provides some neat additional options for testing and visualizing multi-set intersections.
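For anyone unfamiliar with this kind of overlap test, the two-list case comes down to a one-sided hypergeometric (Fisher-type) calculation: given a background of N genes, how likely is an overlap at least as large as the one observed if the two lists were drawn independently? The Python sketch below illustrates that calculation only. It is not the code of either R package, the function name `overlap_pvalue` and the example gene names are made up, and the multi-set generalization is what the SuperExactTest paper describes.

```python
from math import comb


def overlap_pvalue(list_a, list_b, background_size):
    """One-sided hypergeometric test for the overlap of two gene lists.

    background_size is the number of genes that could have appeared in
    either list (an assumption the analyst has to supply).
    Returns (overlap size, P(overlap >= observed) under independence).
    """
    a, b = set(list_a), set(list_b)
    n_a, n_b = len(a), len(b)
    if max(n_a, n_b) > background_size:
        raise ValueError("background must be at least as large as each list")
    k = len(a & b)
    total = comb(background_size, n_b)
    # Sum the upper tail: probability of i shared genes for every i >= k.
    tail = sum(
        comb(n_a, i) * comb(background_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / total


if __name__ == "__main__":
    # Hypothetical lists drawn from a background of 10 genes.
    chick = ["A", "B", "C"]
    mouse = ["B", "C", "D"]
    print(overlap_pvalue(chick, mouse, background_size=10))
```

The choice of background matters a lot here: a background that is too large makes almost any overlap look significant, so it should reflect the genes or proteins that were actually detectable in both studies.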

I used SuperExactTest to make the circular plot below showing intersections between differentially-expressed protein lists from tilapia, chick, mouse, guinea pig, and tree shrew. The five tracks in the middle represent the five protein lists, with individual blocks showing the presence (fill) or absence (no fill) of the protein sets in each intersection. The height of the bars on the outermost track is proportional to the intersection size (also indicated by the number next to each bar), and the fill colour of the outermost bars represents the statistical significance of the intersection. I’ve ordered this plot from lowest to highest p-value; the arrow on the right points to the most significant intersection (six shared protein findings between guinea pig and tree shrew).


If you’re interested, check out the paper ‘Efficient Test and Visualization of Multi-Set Intersections’ (Wang et al., 2015), which describes the theoretical framework for the SuperExactTest package.

Thursday, 2 June 2016

GOplot package for visualizing functional omics data

I recently stumbled upon the GOplot R package, developed by Wencke Walter and colleagues. The package provides some great tools for visualizing omics data. My favourite is the chord diagram, which captures the complex relationship between differential gene expression (shown on the left) and enriched functional categories (shown on the right). Here’s one I made for my ARVO conference poster this year:

The package is easy to use and provides a range of other plot options (bar plot, bubble plot, circle plot, cluster plot). I highly recommend you check it out!

Tuesday, 10 May 2016

ARVO 2016: Seattle

In defiance of the 'rainy city' moniker, Seattle turned on the sunshine for ARVO 2016. It was a hectic five days packed with some great talks, many posters, and a few blissful quiet moments soaking up some rays in the park outside the conference center.

This year I presented a systematic review and meta-analysis comparing exploratory omics findings across human and animal studies of refractive error. I got some great ideas for refining the methods of this study from other ARVO goers; can't wait to get home and start working on it again!

A high-res version of my poster is available on ResearchGate.