Blog Archives

Age of U.S. President Candidates

January 8, 2016

This is a remake of a chart posted on reddit six months ago. I had an idea back then but did not work it out, and the discussion is now closed. The data comes from Wikipedia, dimdat and NYT. The graph was created with R; here is the source code.

Read more »

Consecutive Numbers in Lottery Draws

March 2, 2014

A historian, a data scientist, a programmer, a mathematician, and a philosopher discuss how likely it is that a lottery draw (6 out of 49) contains two consecutive numbers. The historian argues that from 1955 up to 2011, there were 5026 lottery draws in Germany, every Saturday, and from 2000 on, two draws every...
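
For reference, a standard counting argument (not necessarily the one any of the five characters uses) gives the exact answer, assuming "contains two consecutive numbers" means at least one adjacent pair. A draw of 6 numbers from 49 with no two adjacent corresponds to a free choice of 6 out of 49 - 6 + 1 = 44 positions, so:

```latex
\[
P(\text{at least one consecutive pair})
  = 1 - \frac{\binom{44}{6}}{\binom{49}{6}}
  = 1 - \frac{7\,059\,052}{13\,983\,816}
  \approx 0.495
\]
```

In other words, under this assumption roughly every second draw should contain a consecutive pair.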

Read more »

Unit conversion in R

May 17, 2013

Last weekend I submitted an update of my R package datamart to CRAN. It has been more than half a year since the last update; however, there are only minor advances. The package is still in its early stages, and very experimental. One new feature is the function uconv. Think iconv, but instead of converting character vectors between different encodings,...

Read more »

Some of Excel’s Finance Functions in R

February 16, 2013

Last year I took a free online class on finance by Gautam Kaul. I recommend it, although I cannot compare it to other classes. The instructor put great effort into motivating the concepts, structuring the material, and enabling critical thinking and intuition. I believe this is an advantage of video...

Read more »

ScraperWiki in R

July 29, 2012

ScraperWiki describes itself as an online tool for gathering, cleaning and analysing data from the web. It takes a programming-oriented approach: users can implement ETL processes in Python, PHP or Ruby, share these processes with the community (or pay for privacy), and schedule automated runs. The software behind the service is open source, and there is...

Read more »

Convenient access to Gapminder’s datasets from R

July 16, 2012

In April, Hans Rosling examined the influence of religion on fertility. I used R to replicate a graphic of his talk:

> library(datamart)
> gm <- gapminder()
> #queries(gm)
> #
> # babies per woman
> tmp <- query(gm, "TotalFertilityRate")
> babies <- as.vector(tmp)
> names(babies) <- names(tmp)
> babies <- babies
> countries <- names(babies)
> #
> # income per capita, PPP adjusted
> tmp <- query(gm, "IncomePerCapita")
> ...

Read more »

Querying DBpedia from R

June 24, 2012

DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using SPARQL, an SQL-like query language for RDF. There is already an R package for this kind of query, named SPARQL. My datamart package includes an S4 class, Dbpedia, that aims to support the creation of predefined parameterized queries. Here is...

Read more »

A wrapper for R’s data() function

June 19, 2012

The workflow for statistical analyses is discussed in several places. Common recommendations are: never change the raw data, but transform it; keep your analysis reproducible; separate functions and data; use the R package system as an organizing structure. In some recent projects I tried an S4 class approach for this workflow, which I want to present and discuss. It makes use of...

Read more »

Working with strings

April 10, 2012

R has a lot of string functions; many of them can be found with ls("package:base", pattern="str"). Additionally, there are add-on packages such as stringr, gsubfn and brew that enhance R's string processing capabilities. As a statistical language and environment, R has an edge compared to other programming languages when it comes to text mining algorithms or natural language processing....

Read more »

Berlin’s children

February 4, 2012

A few years ago, a newspaper claimed that the block I live in, Prenzlauer Berg in Berlin, is the most fertile region in Europe. It was a hoax, as this (German) newspaper article points out. (The article has become quite famous because it coined the term Bionade Biedermeier to describe the lifestyle in this area.) However,...

Read more »
