Optimal sorting using rpart

June 24, 2012

Some time ago I read a nice post, "Solving easy problems the hard way", where linear regression is used to solve an interesting puzzle. Following the idea, I used rpart to find an optimal decision tree for sorting five elements. It is well known that...

Read more »

Querying DBpedia from R

June 24, 2012

DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using an SQL-like query language for RDF called SPARQL. There is already an R package for this kind of query, named SPARQL. The S4 class Dbpedia, part of my datamart package, aims to support the creation of predefined parameterized queries. Here is...

Read more »

Blog with R Markdown and tumblr: Part I

June 24, 2012

I finally got a chance this weekend to settle on a way to include R Markdown into my blogging process. I needed to do this as my subsequent postings will involve more code chunks regarding Rook deployment and examples, and R Markdown formats and highlights code chunks like a boss! If you want to incorporate R code,...

Read more »

R 2.15.1 Available

June 24, 2012

Ubuntu packages for the latest release of R are now available on CRAN and the RutteR PPA. If you have either repository installed, Ubuntu should have updated R automatically. If you do not have either repository installed, see the Installing R tab above....

Read more »

Highlights of the useR! 2012 – Review of the reviews

June 24, 2012

I’m gutted that I could not attend useR! 2012 after having a great time at useR! 2011 at Warwick. However, a great, active online community of useRs lets me get lots of good stuff from the conference even though I did not attend in person. By following the #useR2012 and #rstats hashtags on Twitter and reading some great posts from R-bloggerRs on the

Read more »

Visualization in R with ggplot2 taught by Hadley Wickham at Statistics.com

June 24, 2012

Hadley Wickham, the creator of ggplot2, will present his course “Visualization in R with ggplot2” online at Statistics.com, July 20 – Aug 17.

Upcoming “R” courses, and others:
Jun 22: Smoothing with P-splines using R (taught by Brian Manly and Paul Eilers)
Jun 29: Data Mining in R (taught by Luis Torgo)
Jul 6: Intro to Resampling Methods (taught...

Read more »

Mexico City’s Metro Statistics

June 24, 2012

I used R and ggplot2 to make a bubble map of Mexico City’s Metro passenger count from January to February 2012. The statistics are stunning: some stations, for example Indios Verdes, reached 10 million passengers in just three months. You can see the code below and get the data for the project here. The post Mexico City’s Metro Statistics...

Read more »

setting your working directory permanently in R

June 24, 2012

Most of us R users keep a dedicated working directory for our daily work in R. But I was tired of typing it in at my command line every time before using R. Putting this line at the top of every script was not pleasant either. So how to get around this? There is a special

Read more »

Using twitteR to see what the German press secretary tweets about

June 24, 2012

Find the HTML slides here, and the .Rmd file that was used to generate them here. For how to deal with .Rmd files, see here.

What this is about: These are my first steps in playing around with the interface from R to Twitter, using the twitteR package. We will load the latest 1500 (the maximum the API allows) tweets from the

Read more »

Actuarial models with R, Meielisalp

June 23, 2012

I will be giving a short course in Switzerland next week, at the 6th R/Rmetrics Meielisalp Workshop & Summer School on Computational Finance and Financial Engineering organized by ETH Zürich, https://www.rmetrics.org/. The long...

Read more »

Framing investing as a decision-making process

June 23, 2012

Brian Peterson and I had a chance to visit the University of Washington a couple of weeks ago at the behest of Doug Martin, where we gave a seminar covering various R packages we’ve written. Here are the slides we used. We also had quite a bit of time that we spent with Doug, Eric

Read more »

Retweets, Modified Tweets, vias: what’s in the SoMeLab dataset

June 23, 2012

Since October we have been collecting tweets related to the Occupy movement and so far we’ve picked up 64,298,104 tweets. In future posts we will give you some insight into our process, but today the question is, what is the difference between new style retweets, old style retweets and new emerging tags like MT?

Read more »

Handling large FASTA sequence datasets in R: Shuffle and retrieve "n" number of sequences of fixed length from the whole FASTA file and export them in a new FASTA file.

June 23, 2012

When you work with large FASTA datasets, you will probably find that the sequences are of somewhat mixed quality (depending, obviously, on your scientific question). For example, imagine that you retrieve the whole collection of exons of a...

Read more »

The R-Podcast Screencast 2: Visualization with ggplot2

June 23, 2012

Here is the second screencast episode of the R-Podcast to accompany episode 8 of the R-Podcast: Visualization with ggplot2. In this screencast I demonstrate a real-time session of using ggplot2 to create boxplots for a visualization of hockey attendance in the NHL. The R code created in this screencast is available in our GitHub repository,

Read more »

Rcpp 0.9.11

June 22, 2012

Release 0.9.11 of Rcpp arrived on CRAN this morning and will be in Debian later today. This is a somewhat incremental release with a few internal improvements and a few new features. One interesting new development has been contributed by John Chambers who i...

Read more »

Maps of changes in area boundaries, with R

June 22, 2012

Today a coworker needed some maps showing boundary changes. I used what I learned last week in the useR 2012 geospatial data course to make a few simple maps in R, overlaid on OpenStreetMap tiles. I’m posting my maps and …

Read more »

Updates to Old code

June 22, 2012

I notice from time to time that people download code from the drop box. The drop box lags behind CRAN. I’ve uploaded the newest versions to the drop box. In the future I will post new releases to the drop box before I submit to CRAN.

Read more »

R 2.15.1 includes performance improvements inspired by dataframe package

June 22, 2012

The latest update to open-source R, R 2.15.1, was released this morning. (You can grab sources now, and binary versions will hit the CRAN mirrors over the next couple of days.) In addition to several new features and bug fixes (including the new globalVariables function, which will be a boon to package developers), this update also includes some significant...

Read more »

R and the web (for beginners), Part II: XML in R

June 22, 2012

This second post of my little series on R and the web deals with how to access and process XML data with R. XML is a markup language that is commonly used to interchange data over the Internet. If you want to access some online data over a webpage's AP...

Read more »

Plotting non-overlapping circles…

June 22, 2012

It's a holiday today in Sweden, so happy Midsummer to everyone!!! (I know, it's delayed.) No work for me, so I checked out the latest XKCD. Gorgeous, right? So I decided to see if I could make something similar - but being lazy, I didn't feel like drawing al...

Read more »

Analytics with SAP and R (Windows version)

My good friend and programming guru Piers Harding wrote a blog post called Analytics with SAP and R where he showed us how to link the wonderful worlds of R and SAP. Yes... SAP... not SAP HANA... but the good old NetWeaver... Piers built the RSAP extension usin...

Read more »

When the going gets tough…

June 22, 2012

Getting closer to my personal Euro2012 derby: England v Italy. I find it amusing that both sets of media think that their respective teams have been gifted a good tie. The English are very happy to have avoided Spain, while the Italians don't mind not...

Read more »

eoda publishes interactiveGGPLOT – interactive graphics with ggplot2

June 22, 2012

One of R's great strengths compared to other statistical solutions or programming languages is the number of possibilities for creating well-designed, publication-quality plots. Almost all plot types can be created with any amount of fine tuning. R works on small data sets as well as on big data. In addition to R's base graphics, various add-on

Read more »

Two new, important books on R

June 22, 2012

Two books were recently published that are sure to help R grow even faster. R has a reputation, partially deserved, for being hard to learn. These books will help. The first makes learning easier, the second can make learning less necessary for initiates. I have not yet touched either book. R for Dummies The authors …

Read more »

Video: Getting staRted with R: An accelerated primer by Lyndon Walker – Melbourne R Users

June 22, 2012

This post shares the video from a talk presented on June 20, 2012 by Dr Lyndon Walker (see Meetup page). The talk was titled “Getting staRted with R: An accelerated primer”. To quote the outline of the talk: R …

Read more »

Nonlinear systems

June 21, 2012

There is a long-standing debate over whether financial systems are truly random or contain some structure. From the study of non-linear dynamical systems and chaos one finds it is possible that even perfectly deterministic systems can appear to be random. …

Read more »

Learning a new language

June 21, 2012

It had been a very long time since I’d tried to learn a new programming language. I started C in 1987, S in 1992, and Perl in 1997, but nothing really new in the subsequent 15 years. A friend now has me doing D, wanting to find time to learn Ruby, and, most recently, playing

Read more »

Background to my book project “Empirical Software Engineering with R”

June 21, 2012

This post provides background information that can be referenced by future posts. For the last 18 months I have been working in fits and starts on a book that has the working title “Empirical Software Engineering with R”. The idea is to provide broad coverage of software engineering issues from an empirical perspective (i.e., the

Read more »

Confidence intervals with tiers: functions for between-subjects (independent measures) ANOVA

June 21, 2012

In a previous post I showed how to plot difference-adjusted CIs for between-subjects (independent measures) ANOVA designs (see here). The rationale behind this kind of graphical display is introduced in Chapter 3 of Serious stats (and summarized in my earlier blog post). In a between-subjects – or indeed in a within-subjects (repeated measures) – design

Read more »