I’ve recently started experimenting with making Shiny apps, and today I wanted to make a basic app for calculating and visualizing principal components analysis (PCA). Here is the basic interface I came up with. Test drive the app for yourself or check out the the R code HERE.
Above is an example of the user interface which consists of data upload (from.csv for now), and options for conducting PCA using the pcaMethods package. The various outputs include visualization of the eigenvalues and cross-validated eigenvalues (q2), which are helpful for selecting the optimal number of model components.The PCA scores plot can be used to evaluate extreme (leverage) or moderate (DmodX) outliers. A Hotelling’s T-squared confidence intervals as an ellipse would also be a good addition for this.
Here is an example of a network leveraging three dominant aspects of metabolomic experiments (biochemical, chemical and mass spectral knowledge) to connect measured variables. This is a network for a blinded data set (sample ids are not known), which I’ve made for a member of my lab presenting their work at the Metabolomics Society Conference in Glasgow, Scotland.
With out knowing the experimental design we can still analyze our data for analytical effects. For example below is a principal components analysis of ~400 samples and 600 variables, where I’ve annotated the sample scores to show data aquisition date (color) and experimental samples or laboratory quality controls (shape). One thing to look for are trends or scores grouping in the PCA scores which are correlated to analytical conditions like, batch, date, technician, etc.
Finally we can take a look at the PCA variable loadings which highlights a major bottleneck in metabolomics experiments, the large amount of structurally unknown molecular features.
Even using feature rich electron impact mass spectra (GC/TOF) only 40% of the unknown variables could be connected to known species based on a cosine correlation >0.75. To give you an idea the cosine correlation or dot product between the mass spectra of two structurally very similar molecules xylose and xylitol is ~ 0.8.
After being busy the last two weeks teaching and attending academic conferences, I finally found some time to do what I love, program data visualizations using R. After being interested in Shiny for a while, I finally decided to pull the trigger and build my first Shiny app!
I wanted to make a proof of concept app which contained the following dynamics which are the basics of any UI design:
1) dynamic UI options
2) dynamically updated plot based on UI inputs
Here is what I came up with.
The app consists of a user interface (UI) for selecting the data, variable to plot , grouping factor for colors and four plotting options: boxplot (above), histogram, density plot and bar graph. As an added bonus the user can select to show or hide jittered points in the boxplot visualization.
Generally #2 above was well described and easy to implement, but it took a lot of trial and error to figure out how to implement #1. Basically to generate dynamic UI objects, the UI objects need to be called using the function shiny:::uiOutput() in the ui.R file and their arguments set in the server.R file using the function shiny:::renderUI(). After getting this to work everything else fell in place.
Having some experience with making UI’s in VBA (visual basic) and gWidgets; Shiny is a joy to work with once you understand some of its inner workings. One aspect I felt which made the learning experience frustrating was the lack of informative errors coming from Shiny functions. Even using all the R debugging tools having Shiny constantly tell me something was not correctly called from a reactive environment or the error was in the runApp() did not really help. My advice to anyone learning Shiny is to take a look at the tutorials, and particularly the section on Dynamic UI. Then pick a small example to reverse engineer. Don’t start off too complicated else you will have a hard time understanding which sections of code are not working as expected.
Finally here are some screen shots, and keep an eye out for more advanced shiny apps in the near future.
I am getting ready to present at the upcoming American Society for Mass Spectrometry (ASMS) conference in Minneapolis, Minnesota (dont’cha know).
If you are around check out my talk in the section Oral: ThOB am – Informatics: Metabolomics on Thursday (06/14) at 8:30 am in room L100. Here is teaser
Above is a network representation of biochemical (red edges, KEGG RPAIRS) and structural similarities (gray edges, Tanimoto coefficient> 0.7) of > 1100 biological molecules (see here for some of their descriptions). Keep an eye out for all the R code used to generate this network as well as all the slides from my talk.
Here is my talk abstract.