Blog Archives

Some of Excel’s Finance Functions in R

February 16, 2013

Last year I took a free online class on finance by Gautam Kaul. I recommend it, although I have not taken other classes to compare it with. The instructor took great effort to motivate the concepts, structure the material, and encourage critical thinking and intuition. I believe this is an advantage of video...
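As a rough illustration of what the post title refers to, here is a minimal sketch of three Excel-style finance functions in R. The function names excel_pv, excel_pmt and excel_npv are made up for this example and are not taken from the original post; the formulas are the standard time-value-of-money identities with payments at the end of each period and Excel's sign convention (money paid out is negative).

```r
# Hypothetical helpers, named for this sketch only.

# Present value of equal end-of-period payments plus an optional future value.
excel_pv <- function(rate, nper, pmt, fv = 0) {
  if (rate == 0) return(-(pmt * nper + fv))
  discount <- (1 + rate)^(-nper)
  -(pmt * (1 - discount) / rate + fv * discount)
}

# Periodic payment that amortizes a present value over nper periods.
excel_pmt <- function(rate, nper, pv, fv = 0) {
  if (rate == 0) return(-(pv + fv) / nper)
  growth <- (1 + rate)^nper
  -(pv * growth + fv) * rate / (growth - 1)
}

# Net present value; like Excel's NPV, the first cash flow is discounted
# by one full period.
excel_npv <- function(rate, values) {
  sum(values / (1 + rate)^seq_along(values))
}

# Monthly payment on a 200,000 loan at 5% per year over 30 years.
monthly <- excel_pmt(0.05 / 12, 360, 200000)
print(round(monthly, 2))
```

The payment comes out negative because it is money leaving the borrower's account; feeding it back into excel_pv recovers the loan amount.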

Read more »

ScraperWiki in R

July 29, 2012

ScraperWiki describes itself as an online tool for gathering, cleaning and analysing data from the web. It takes a programming-oriented approach: users can implement ETL processes in Python, PHP or Ruby, share these processes with the community (or pay for privacy) and schedule automated runs. The software behind the service is open source, and there is...

Read more »

Convenient access to Gapminder’s datasets from R

July 16, 2012

In April, Hans Rosling examined the influence of religion on fertility. I used R to replicate a graphic from his talk:

> library(datamart)
> gm <- gapminder()
> #queries(gm)
> #
> # babies per woman
> tmp <- query(gm, "TotalFertilityRate")
> babies <- as.vector(tmp)
> names(babies) <- names(tmp)
> babies <- babies
> countries <- names(babies)
> #
> # income per capita, PPP adjusted
> tmp <- query(gm, "IncomePerCapita")
>...

Read more »

Querying DBpedia from R

June 24, 2012

DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using an SQL-like query language for RDF called SPARQL. An R package named SPARQL already exists for this kind of query.
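To make the idea of a parameterized query concrete, here is a minimal sketch in base R. The helper population_query is hypothetical and is not part of the datamart or SPARQL packages; it only fills a resource name into a query template. The property dbo:populationTotal and the public endpoint URL are assumptions about the current DBpedia service.

```r
# Hypothetical helper: build a SPARQL query for the population of a
# DBpedia resource. Not part of any package.
population_query <- function(resource) {
  # Accept only simple resource names so that no arbitrary SPARQL
  # can be injected through the parameter.
  stopifnot(grepl("^[A-Za-z0-9_()-]+$", resource))
  template <- paste(
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "SELECT ?population WHERE {",
    "  <http://dbpedia.org/resource/%s> dbo:populationTotal ?population .",
    "}",
    sep = "\n"
  )
  sprintf(template, resource)
}

q <- population_query("Berlin")
cat(q, "\n")

# Sending the query needs network access and the SPARQL package,
# so the call is left commented out:
# res <- SPARQL::SPARQL(url = "https://dbpedia.org/sparql", query = q)$results
```

Keeping the template and the parameter check in one function means callers only supply the resource name and never edit query text by hand.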

My datamart package includes an S4 class, Dbpedia, that aims to support the creation of predefined, parameterized queries. Here is...

Read more »

A wrapper for R’s data() function

June 19, 2012

The workflow for statistical analyses is discussed in several places. Common recommendations are:

  • never change the raw data, but transform it,
  • keep your analysis reproducible,
  • separate functions and data,
  • use the R package system as an organizing structure.

In some recent projects I tried an S4 class approach for this workflow, which I want to present and discuss. It makes use of...

Read more »

Working with strings

April 10, 2012

R has a lot of string functions; many of them can be found with ls("package:base", pattern="str"). Additionally, there are add-on packages such as stringr, gsubfn and brew that enhance R's string processing capabilities. As a statistical language and environment, R has an edge compared to other programming languages when it comes to text mining algorithms or natural language processing....
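A short sketch of the lookup mentioned above, followed by a few of the base functions it turns up. The sample string and variable names are invented for this example; only base R is used.

```r
# List base functions whose names contain "str".
str_funs <- ls("package:base", pattern = "str")
print(head(str_funs))

# A made-up sample string with surrounding whitespace.
x <- trimws("  Prenzlauer Berg, Berlin  ")

# Split on a fixed separator; strsplit returns a list, so take element 1.
parts <- strsplit(x, ", ", fixed = TRUE)[[1]]

first_word <- substr(x, 1, 10)               # extract by position
shouted    <- toupper(parts[2])              # change case
no_vowels  <- gsub("[aeiou]", "", parts[2])  # regex replacement
summary    <- sprintf("%s has %d characters", parts[2], nchar(parts[2]))

print(summary)
```

Note that the pattern argument of ls() is a regular expression, so it also matches names such as structure that have nothing to do with strings.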

Read more »

Berlin’s children

February 4, 2012

A few years ago, a newspaper claimed the neighborhood I live in, Prenzlauer Berg in Berlin, is the most fertile region in Europe. It was a hoax, as this (German) newspaper article points out. (The article has become quite famous because it coined the term Bionade Biedermeier to describe the lifestyle in this area.)

However,...

Read more »

Categorizing my expenses

January 28, 2012

To analyse my expenses, I need a classification scheme with categories that are meaningful to me. I decided to go with the “Classification of Individual Consumption by Purpose” (COICOP), for three reasons:

  • It is made by people who have thought more about consumption classification than I ever will.
  • It is feasible to assign bank transactions...

Read more »

Tracking my expenses

January 8, 2012

One New Year's resolution I made last year was to understand where my money goes. From previous experiments I know that expense tracking has to be as simple as possible. My approach is to

  • Use my cash card as often as possible. This automatically tracks the date and some information on the vendor.
  • Use Twitter to track my cash expenses. This supplements...

Read more »

How much is a shower?

December 29, 2011

After looking at my heating expenses, I turned to the costs for water heating. For some time, I looked at my water meter before and after taking a shower or a bath. Quite often, I forgot one or the other measurement, but I collected about 40 observations. Here is what they look like:

The data suggest that for a...

Read more »