Where do letters occur in words

July 26, 2015

A while back I encountered an interesting graphic showing where letters are located in English words (http://www.prooffreader.com/2014/05/graphing-distribution-of-english.html). The other day I decided to make a similar one for letters in Danish words, using R. I downloaded all abstracts from the Danish Wikipedia and made my own version as you can see...

Read more »
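For readers who want to try the idea before clicking through, here is a minimal base R sketch of counting where letters fall within words. It is not the post's code: the `words` vector is a tiny made-up stand-in for the Wikipedia abstracts, and `letter_positions()` and its `bins` argument are hypothetical names chosen for illustration.

```r
# Tiny stand-in vocabulary (an assumption; the post used Danish Wikipedia abstracts)
words <- c("kat", "hund", "fisk", "hest")

# Hypothetical helper: count how often each letter falls in each position bin
letter_positions <- function(words, bins = 3) {
  words <- tolower(words[nzchar(words)])  # drop empty strings
  rows <- lapply(words, function(w) {
    chars <- strsplit(w, "")[[1]]
    n <- length(chars)
    # Map positions 1..n onto 0..1 (single-letter words land at 0)
    rel <- if (n > 1) (seq_len(n) - 1) / (n - 1) else 0
    data.frame(letter = chars,
               bin = pmin(floor(rel * bins) + 1, bins),
               stringsAsFactors = FALSE)
  })
  all_rows <- do.call(rbind, rows)
  # Count table: rows are letters, columns are position bins
  table(all_rows$letter, factor(all_rows$bin, levels = seq_len(bins)))
}

counts <- letter_positions(words)
# Row-normalise so each letter's distribution over bins sums to 1
prop.table(counts, margin = 1)
```

With a real corpus, the row-normalised table is what one would plot per letter, for example as small bar charts or density curves.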

Predicting Titanic deaths on Kaggle II: gbm

July 26, 2015

Following my previous post I have decided to try a different method: generalized boosted regression models (gbm). I have read the background in Elements of Statistical Learning and Arthur Charpentier's nice post on it. This data ...

Read more »

Rcpp 0.12.0: Now with more Big Data!

July 25, 2015

A new release 0.12.0 of Rcpp arrived on the CRAN network for GNU R this morning, and I also pushed a Debian package upload. Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 423 packages on CRAN depend on Rcpp...

Read more »

Roll Your Own Gist Comments Notifier in R

July 25, 2015

As I was putting together the coord_proj ggplot2 extension I had posted a gist (https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7) that I shared on Twitter. Said gist received a comment (several, in fact) and a bunch of us were painfully reminded of the fact that there is no built-in way to receive notifications from said comment activity. @jennybryan posited that it

Read more »

IEEE Spectrum Puts R in 6th Place

July 25, 2015

R has moved up three positions to 6th place in the IEEE Spectrum ranking. How long will it be before Julia is on the list? The post IEEE Spectrum Puts R in 6th Place appeared first on Exegetic Analytics.

Read more »

Logistic Growth, S Curves, Bifurcations, and Lyapunov Exponents in R

July 24, 2015

If you’ve ever wondered how logistic population growth (the Verhulst model), S curves, the logistic map, bifurcation diagrams, sensitive dependence on initial conditions, “orbits”, deterministic chaos, and Lyapunov exponents are related to

Read more »
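As a rough illustration of how the logistic map and Lyapunov exponents fit together, here is a small base R sketch. It is not taken from the linked post: the function names `logistic_orbit()` and `lyapunov()`, the starting value 0.2, and the burn-in length are all assumptions made for this example.

```r
# Logistic map: x[t+1] = r * x[t] * (1 - x[t])
logistic_orbit <- function(r, x0 = 0.2, n = 1000) {
  x <- numeric(n)
  x[1] <- x0
  for (t in seq_len(n - 1)) {
    x[t + 1] <- r * x[t] * (1 - x[t])
  }
  x
}

# Lyapunov exponent estimate: average of log|f'(x)| along the orbit,
# where f'(x) = r * (1 - 2 * x); the first `burn` iterates are discarded
lyapunov <- function(r, x0 = 0.2, n = 5000, burn = 500) {
  x <- logistic_orbit(r, x0, n)[-seq_len(burn)]
  mean(log(abs(r * (1 - 2 * x))))
}

lyapunov(2.8)  # negative: the orbit settles on a stable fixed point
lyapunov(4.0)  # positive: chaotic regime, sensitive to initial conditions
```

Sweeping `r` over a grid and plotting the tail of each orbit against `r` would give a basic bifurcation diagram; the sign change of the estimated exponent marks where order gives way to chaos.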

New quantmod and TTR on CRAN

July 24, 2015

I just sent quantmod_0.4-5 to CRAN, and TTR_0.23-0 has been there for a couple of weeks. I'd like to thank Ivan Popivanov for many useful reports and patches to TTR. He provided patches to add HMA (Hull MA), ALMA, and ultimateOscillator functions. Jam...

Read more »

A Path Towards Easier Map Projection Machinations with ggplot2

July 24, 2015

The $DAYJOB doesn’t afford much opportunity to work with cartographic datasets, but I really like maps and tinker with shapefiles and geo-data when I can, plus answer a ton of geo-questions on StackOverflow. R makes it easy—one might even say too easy—to work with maps. All it takes to make a map of the continental

Read more »

{Long Vs. Wide} Data Frames

July 24, 2015

Introduction: This is an excellent resource for understanding the two types of data frame format, long and wide. Just take a look at figure 1 inside the article. 1) Long format: ggplot2 needs this kind of format in certain scenarios to work (generally grouped...

Read more »
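To make the long versus wide distinction concrete, here is a small sketch using base R's `reshape()`. The data frame, its column names (`id`, `x_2014`, `x_2015`) and the values are invented for illustration and do not come from the linked article.

```r
# Wide format: one row per id, one column per year (made-up data)
wide <- data.frame(id = 1:3,
                   x_2014 = c(5, 6, 7),
                   x_2015 = c(8, 9, 10))

# Long format: one row per id-year combination, with the year as its own column
long <- reshape(wide,
                direction = "long",
                varying = c("x_2014", "x_2015"),
                v.names = "x",
                timevar = "year",
                times = c(2014, 2015),
                idvar = "id")

long
```

The long version has six rows and the columns `id`, `year` and `x`, which is the shape that grouped ggplot2 plots typically expect.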

R #6 in IEEE 2015 Top Programming Languages, Rising 3 Places

July 24, 2015

IEEE Spectrum has published its 2015 list of Top Programming Languages, and R ranks in 6th place, jumping 3 places from its 2014 ranking. Here's what the IEEE has to say about the top 10 from the table above: The big five—Java, C, C++, Python, and C#—remain on top, with their ranking undisturbed, but C has edged to within...

Read more »

Why I use Panel/Multilevel Methods

July 24, 2015

I don’t understand why any researcher would choose not to use panel/multilevel methods on panel/hierarchical data. Let’s take the following linear regression as an example: , where is a random effect for the i-th group. A pooled OLS regression model for the above is unbiased and consistent. However, it will be inefficient, unless for all

Read more »

mapView: basic interactive viewing of spatial data in R

July 24, 2015

Working with spatial data in R, I quite often find myself needing to quickly check visually whether a certain analysis has produced reasonable results. There are two ways I usually do this. Either I: (sp)plot the data in …

Read more »

CACM Highlights R

July 23, 2015

The Association for Computing Machinery is the main professional organization for computer science, largely for academia but still with a broad membership. ACM publishes a number of journals, most of them for research, but its flagship publication is a magazine, the Communications of the ACM. The current issue of the CACM includes an article, “Bringing …

Read more »

A 15-Week Intro Statistics Course Featuring R

July 23, 2015

Do you teach introductory statistics or data science? Need some help planning your fall class? I apply the 10 Principles of Burning Man in the design and conduct of all my undergraduate

Read more »

An alternative presentation of the ProPublica Surgeon Scorecard

July 23, 2015

ProPublica, an independent investigative journalism organisation, have published surgeon-level complication rates based on Medicare data. I have already highlighted problems with the reporting of the data: surgeons are described as having a “high adjusted rate of complications” if they fall in the red zone, despite there being too little data to say whether this has happened

Read more »

Revolution R Open 3.2.1 now available

July 23, 2015

The latest update to Revolution R Open, RRO 3.2.1, is now available for download from MRAN. This release upgrades to the latest R engine (3.2.1), enables package downloads via HTTPS by default, and adds new supported Linux platforms. Revolution R Open 3.2.1 includes: The latest R engine, R 3.2.1. Improvements in this release include more flexible character string handling,...

Read more »

Sunbelt XXXV, Social Network Analysis, Statnet and R

July 23, 2015

by Joseph Rickert

The XXXV Sunbelt Conference of the International Network for Social Network Analysis (INSNA) was held last month at Brighton beach in the UK. (And I am still bummed out that I was not there.) A run of 35 conferences is impressive indeed, but the social network analysts have been at it for an even longer time...

Read more »

Administrative Maps and Projections in R

July 23, 2015

Today I will demonstrate how to create maps of “other countries”, and use map projections, with the choroplethr package in R. I write “other countries” in quotes because, like most things, translating a high level wish into software can be complicated. If you want to skip ahead and play with a web app that lets you explore

Read more »

Waterfall plots – what and how?

July 23, 2015

“Waterfall plots” are nowadays often used in oncology clinical trials for a graphical representation of the quantitative response of each subject to treatment. For an informative article explaining waterfall plots see Understanding Waterfall Plots. In this post, we illustrate the creation of waterfall plots in R. In a typical waterfall plot, the x-axis serves as

Read more »

Call for participation: AusDM 2015, Sydney, 8-9 August

July 23, 2015

The 13th Australasian Data Mining Conference (AusDM 2015), Sydney, Australia, 8–9 August 2015. URL: http://ausdm15.ausdm.org/

The Australasian Data Mining Conference is devoted to the art and science of intelligent data mining: the meaningful analysis of (usually large) data …

Read more »

htmltab v.0.6.0

July 23, 2015

The next version of the htmltab package has just been released on CRAN and GitHub. The goal behind htmltab is to make the collection of structured information from HTML tables as easy and painless as possible (read about the package here and here). The most recent update got rid of many smaller bugs, inconsistencies...

Read more »

Stan 2.7 (CRAN, variational inference, and much much more)

July 22, 2015

Stan 2.7 is now available for all interfaces. As usual, everything you need can be found starting from the Stan home page: http://mc-stan.org/

Highlights: RStan is on CRAN! (1) Variational Inference in CmdStan!! (2) Two new Stan developers!!! A whole new logo!!!! Math library with autodiff now available in its own repo!!!!!

(1) Just doing install.packages("rstan") isn’t

Read more »

Le Monde puzzle [#920]

July 22, 2015

A puzzling Le Monde mathematical puzzle (or blame the heat wave): A pocket calculator with ten keys (0,1,…,9) starts with a random digit n between 0 and 9. A number on the screen can then be modified into another number by two rules: 1. pressing k changes the k-th digit v whenever it exists into

Read more »

Introducing the cymruservices R Package

July 22, 2015

The R world has come a long way since Jay & I wrote Data-Driven Security. We had to make a conscious decision to stick with R 2.14.0 (R is at version 3.2.1 now) and packages such as knitr and dplyr either didn’t exist or were in their infancy. In Chapter 4, we showed some very basic exploratory data analysis and...

Read more »

Count data: To Log or Not To Log

July 22, 2015

Count data are widely collected in ecology, for example when one counts the number of birds or the number of flowers. These data naturally follow a Poisson or negative binomial distribution and are therefore sometimes tricky to fit with standard LMs. A traditional approach has been to log-transform such data and then fit LMs to

Read more »
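The two approaches mentioned in the excerpt can be put side by side in a few lines of base R. This is only a sketch on simulated data, not the analysis from the linked post: the sample size, the true coefficients (0.5 and 1.0) and the `+ 1` offset used before taking logs are all assumptions chosen for illustration.

```r
set.seed(42)
n <- 200
x <- runif(n, 0, 2)
# Simulated counts with a log-linear mean: E[y] = exp(0.5 + 1.0 * x)
y <- rpois(n, lambda = exp(0.5 + 1.0 * x))

# Option 1: log-transform the counts (adding 1 to cope with zeros), then fit an LM
fit_lm <- lm(log(y + 1) ~ x)

# Option 2: model the counts directly with a Poisson GLM (log link by default)
fit_glm <- glm(y ~ x, family = poisson)

# Compare the estimated intercepts and slopes with the simulated truth
coef(fit_lm)
coef(fit_glm)
```

The GLM estimates the coefficients of the log-mean directly, whereas the LM on `log(y + 1)` targets the mean of a transformed variable, so its coefficients are not guaranteed to match the simulated values.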

New R Markdown articles section, plus .Rmd to .docx super powers!

July 22, 2015

We’ve added a new articles section to the R Markdown development center at rmarkdown.rstudio.com/articles.html. Here you can find expert advice and tips on how to use R Markdown efficiently. In one of the first articles, Richard Layton of graphdoctor.com explains the best tips for using R Markdown to generate Microsoft Word documents. You’ll learn how to set Word styles, tweak

Read more »

New caret Version (6.0-52)

July 22, 2015

A new version of caret (6.0-52) is on CRAN. Here is the news file but the Cliff Notes are: sub-sampling for class imbalances is now integrated with train and is used inside of standard resampling. There are four methods available right now: up- and...

Read more »

Doodling With 3d Animated Charts in R

July 22, 2015

Doodling with some Gapminder data on child mortality and GDP per capita in PPP$, I wondered whether a 3d plot of the data over time would show different trajectories for different countries, perhaps revealing different development pathways. Here are a couple of quick sketches, generated using R (this is the

Read more »

Chronicles from useR! – summing up

July 21, 2015

Dear R users, here is my last post on useR! in

Read more »