In part 1, we went over how to use data visualization and data analysis prior to machine learning. For example, we discussed how to visualize the data to identify potential issues in the dataset, examine the variable distributions, etc. In this blog post, we’ll continue by building a very simple model and using data visualization
Introduction The new R package, manhattanly, creates interactive manhattan plots using the plotly.js engine. The plots can be used from the R console, in the RStudio viewer pane, in R Markdown documents, and in Shiny apps; they can also be embedded in websites and exported as .png files. By hovering the mouse over a point, you can see annotation information such
In every statistical analysis, the first thing one should do is try to visualise the data before any modeling. In microarray studies, a common visualisation is a heatmap of gene expression data. In this post I simulate some gene expression data and visualise it using the heatmaply package in R by Tal Galili. This package
Today we’re excited to announce flexdashboard, a new package that enables you to easily create flexible, attractive, interactive dashboards with R. Authoring and customization of dashboards is done using R Markdown and you can optionally include Shiny components for additional interactivity. Highlights of the flexdashboard package include: Support for a wide variety of components including interactive htmlwidgets; base, lattice, and
Today is my birthday and it happens to be the release day of Bioconductor 3.3. Once again, it's time to reflect on what I’ve done in the past year.
Although ChIPseeker was designed for ChIP-seq annotation, I am very glad to find that others are using it to annotate other kinds of data, including
I’ve been staring at this homeless data set for a few weeks now since I’m using it both here and in the data science class I’m teaching. It’s been one of the most mindful data sets I’ve worked with in a while. Even when reduced to pure numbers in named columns, the names really stick...
Introduction So, I'm not really a geographer. But any good analyst worth their salt will eventually have to do some kind of mapping or spatial visualization. Mapping is not really a forte of mine, though I have played around with it some in the past. I was working with some shapefile data a while ago and thought about how...
R 3.2.4 (codename “Very Secure Dishes”) was released today. You can get the latest binary version from here (or the .tar.gz source code from here). The full list of new features and bug fixes is provided below. Upgrading to R 3.2.4 on Windows: If you are using Windows, you can easily upgrade to the latest version of R using the installr …
It is a truth universally acknowledged that sentiment analysis is super fun, and Pride and Prejudice is probably my very favorite book in all of literature, so let’s do some Jane Austen natural language processing.
Project Gutenberg makes e-texts available for many, many books, including Pride and Prejudice which is available here. I am using the plain text...
If you’ve read my blog, taken one of my classes, or sat next to me on an airplane, you probably know I’m a big fan of Hadley Wickham’s ggplot2 package, especially compared to base R plotting.
Not everyone agrees. Among the anti-ggplot2 crowd is JHU Professor Jeff Leek, who yesterday wrote up his thoughts on the Simply Statistics...