Daily casualties in Syria

February 9, 2012

Every new day brings its statistics of new deaths in Syria… Here is an attempt to learn about the Syrian uprising by the figures. Data vary among sources: the Syrian opposition provides the number of casualties by day (here on Dropbox), updated on 8 February 2012, with a total exceeding 8 000. We note first

Read more »

Slides and replay for "A backstage tour of ggplot2"

February 9, 2012

Many thanks to Hadley Wickham for his informative and entertaining webinar yesterday, "A backstage tour of ggplot2". Thanks also to everyone who submitted questions -- with more than 800 attendees live on the line we had many more questions than we had time to answer. For more ggplot2 information, Hadley kindly provided the following resources in his slides: ggplot2...

Read more »

Monitoring Progress Inside a Foreach Loop

February 9, 2012

The foreach package for R is excellent, and allows for code to easily be run in parallel. One problem with foreach is that it creates new RScript instances for each iteration of the loop, which prevents status messages from being logged to the console output. This is particularly frustrating during long-running tasks, when we are often unsure...

Read more »

Intentional Homicide in South America 1995-2010

February 9, 2012

Intentional homicide is defined as unlawful death purposefully inflicted on a person by another person. The source of this stat is The United Nations Office on Drugs and Crime (UNODC). I created the above image using ggplot2 which does 98% of the leg-work in most cases. Count is the number of homicides in a calendar year

Read more »

The reshape function

February 9, 2012

The other day I wrote about the R functions by, apply and friends, which allow me to operate on subsets of data. All those functions work nicely, if the data is given in the right format. More often than not it isn't and I have to reshape the data befo...
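As a rough illustration of the kind of reshaping the teaser refers to, here is a minimal sketch using base R's `reshape()`. The data frame and its column names (`id`, `score.1`, `score.2`) are made up for this example and are not taken from the post.

```r
# Hypothetical wide-format data: one row per subject, one column per visit.
wide <- data.frame(
  id      = 1:3,
  score.1 = c(5, 6, 7),
  score.2 = c(8, 9, 10)
)

# Base R reshape(): stack score.1 and score.2 into a single "score" column.
# The "name.time" naming pattern lets reshape() infer the variable name
# ("score") and the time values (1, 2).
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score.1", "score.2"),
  idvar     = "id"
)

# With one row per id and visit, grouped summaries become straightforward.
means <- tapply(long$score, long$time, mean)

print(long)
print(means)
```

Once the data is in long format, functions such as `by` and `tapply` can operate on the `time` column directly.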

Read more »

GARCH estimation using maximum likelihood

February 9, 2012

In my previous post I presented my findings from my finance project under the guidance of Dr Susan Thomas. The results in my paper suggested that there are macroeconomic variables, particularly the INR/USD exchange rates, that help us understand the dynamics of stock returns. Although the results that I obtained were significant at 5%...

Read more »

Successful Two Day Workshop at UNC-Chapel Hill

This week the Odum Institute at UNC held a two day short course on text classification with RTextTools. The workshop, led by Loren Collingwood, covered the basics of content analysis, supervised learning and text classification, introduction to R, and how to use RTextTools. Participants brought in their own data on the second day, which the instructor helped them classify....

Read more »

Analyzing Twitter Data in R – Part 1

February 8, 2012

I recently began using the twitteR package in R to examine my tweeting patterns. One of my first projects was to identify each of my Twitter followers, where they were located, how many tweets they had, and then plot their location on a map using a bubble which was related to their total number of

Read more »

Movie Recommendations and More via MapReduce and Scalding

February 8, 2012

Scalding is an in-house MapReduce framework that Twitter recently open-sourced. Like Pig, it provides an abstraction on top of MapReduce that makes it easy to write big data jobs in a syntax that’s simple and concise. Unlike Pig, Scalding is written in pure Scala – which means all the power of Scala and the JVM is already built-in....

Read more »

Trust in the EU and National Parliaments

February 8, 2012

I have been playing around with some data from Eurobarometer, to support some arguments for a small comment I am writing for the Maastricht Law Review. I got the data for the following two questions: I would like to ask you a question about how much ...

Read more »

What is the Potential Audience Size for a Hashtag Community?

February 8, 2012

What’s the potential audience size around a Twitter hashtag? Way back when, in the early days of web stats, reported figures tended to centre around the notion of hits, the number of calls made to a server via website activity. I forget the details, but the metric was presumably generated from server logs. This measure

Read more »

Oracle’s strange understanding of R users

February 8, 2012
By

After reading David Smith’s tweet on the price of Oracle R Enterprise (actually free, but it requires Oracle Data Mining at $23K/core as pointed out by Joshua Ulrich.) I went to Oracle’s site to see what was all about. Oracle … Continue reading →

Read more »

discrimination between CpG islands and random sequences using Markov chains

February 8, 2012

A major part of modern research is trying to find patterns in a given dataset using learning methods. One of the methods that can use a priori information for this purpose is Markov chains, in which the probability of symbol occurrence …

Read more »

Revolution R update adds Red Hat 6 support

February 8, 2012

The Dev Team at Revolution Analytics recently released an update to the Revolution R 5 family. Version 5.0.1 adds compatibility with Red Hat Enterprise Linux 6 for all editions (Community, Academic and Enterprise). This expands the platform support to Red Hat 5, Red Hat 6 and Microsoft Windows. For Revolution R Enterprise customers and users of the free Academic...

Read more »

"R": PLS Regression (Gasoline) – 005

February 8, 2012

Let's now see how to plot the scores for the 3 PLS components. We can see the explained variance from each component in the diagonal. We can get it from R with:
> explvar(gas1)
   Comp 1      Comp 2  ...

Read more »

Zero rates with futile.paradigm

February 8, 2012

Here’s a short example of calculating zero rates and discount factors from cash rates using futile.paradigm. Of note is how …

Read more »

We Keep Our Vehicles Longer

February 8, 2012

Description: Average age of passenger cars and light trucks in the United States since 1995. The gray area represents the possible variance of the trend line with 95% confidence. Data: https://www.polk.com/company/news/average_age_of_vehicles_...

Read more »

OpenCPU, R in the Cloud

February 8, 2012

I ran across OpenCPU today. If you have any interest in R and reproducible research this is definitely worth checking out. Also, it looks like I might want to explore the potential of embedding functions in websites. Hm . . . .

Read more »

Hadley Wickham: ggplot2 Webinar (Today!)

February 8, 2012
By

Title: A Backstage Tour of ggplot2 with Hadley WickhamDate: Wednesday, February 8, 2012Time: 11:00AM - 12:00PM PacificPresenter: Hadley Wickham, Professor of Statistics, Rice UniversityRegister here.I used ggplot2 extensively a few years ago, but rever...

Read more »

Recent advances in Monte Carlo Methods

February 8, 2012

Next Thursday (Jan. 16), at the RSS, there will be a special half-day meeting (afternoon, starting at 13:30) on Recent Advances in Monte Carlo Methods organised by the General Application Section. The speakers are:
Richard Everitt, University of Oxford, Missing data, and what to do about it
Anthony Lee, Warwick University, Auxiliary variables and many-core

Read more »

RStudio Server part 3: using an ssh tunnel for high performance

February 8, 2012
By

In part 2 of this series of posts on RStudio Server, I commented that I suspected that RStudio Server would be fast. The first time I tried this from a remote connection, I was disappointed with the performance. Many companies… See more ›

Read more »

A spell-checker in R

February 7, 2012

I came across Dr. Peter Norvig’s blog about writing a basic spell-checker (http://norvig.com/spell-correct.html), and just had to try to implement it in R. Please excuse the ugly-ish code (I have not optimized it or commented it adequately at this point, but you can get the idea of what it does by reading Dr. Norvig’s blog).

Read more »

Two incredibly useful functions to throw into your .rprofile

February 7, 2012
By

I’ve neglected this blog for quite some time but I’m getting around to finishing up a bunch of draft posts. But here is a quick one: Listing objects in your global environment A simple ls() doesn’t really tell you enough useful information at a glance. Most often I just want to know what I named

Read more »

What’s new in futile.matrix 1.1.2

February 7, 2012

This is an exciting release of futile.matrix, in which the package in some ways grows up and finds its purpose. It …

Read more »

updated slides for ABC PhD course

February 7, 2012

Over the weekend, I have added a few slides referring to recent papers mentioning the convergence of ABC algorithms, in particular the very relevant paper by Dean et al. that I had already discussed in an earlier post. (This is taking a larger chunk of my time than expected! I am glad I will use the

Read more »

Example 9.20: visualizing Simpson’s paradox

February 7, 2012

Simpson's paradox is always amazing to explain to students. What's bad for one group, and bad for another group, is good for everyone, if you just collapse over the grouping variable. Unlike many mathematical paradoxes, this arises in a number of real...
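To make the reversal concrete, here is a small sketch in base R. The counts are illustrative numbers chosen for this example and are not taken from the post: treatment B has the lower success rate within each group, yet the higher rate once the groups are pooled.

```r
# Illustrative counts (an assumption for this sketch, not data from the post).
d <- data.frame(
  group     = c("small", "small", "large", "large"),
  treatment = c("A", "B", "A", "B"),
  success   = c(81, 234, 192, 55),
  total     = c(87, 270, 263, 80)
)

# Success rate within each group: A beats B in both groups.
d$rate <- d$success / d$total

# Collapse over the grouping variable by summing counts per treatment.
pooled <- aggregate(cbind(success, total) ~ treatment, data = d, FUN = sum)
pooled$rate <- pooled$success / pooled$total

print(d)
print(pooled)
```

The reversal comes from the unequal group sizes: treatment A is applied mostly to the "large" group, where success rates are lower for both treatments.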

Read more »

"R": PLS Regression (Gasoline) – 004

February 7, 2012

In the previous post we plotted the cross-validation predictions with:
> plot(gas1, ncomp = 3, asp = 1, line = TRUE)
We can plot the fitted values instead with:
> plot(gas1, ncomp = 3, asp = 1, line = TRUE, which = "train")
The graphics are different. Of course, using "train" we get overoptimistic statistics and we should look...

Read more »
