Plotting large amounts of atmospheric data

November 4, 2012

Plotting large amounts of hourly atmospheric data...

Wikipedia Attention and the US elections

November 3, 2012

One of the most interesting challenges of data science is predicting important events such as national elections. With all those data streams of billions of posts, comments, likes, clicks, etc., there should be a way to identify the most important correlations to make predictions about real-world behavior such as: going to the voting booth

Generation of a normal distribution from "scratch" – The Box-Muller method

November 3, 2012

My previous post is about a method to simulate a Brownian motion. A friend of mine emailed me yesterday to tell me that this is useless if we do not know how to simulate a normally distributed variable. My first remark is: use the rnorm() function if t...
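The excerpt is cut off before the method itself, so here is a minimal Python sketch of the standard Box-Muller transform (the post works in R; the function names `box_muller` and `normal_sample` are illustrative, not the author's). Given two independent uniform draws u1 and u2, the pair sqrt(-2 ln u1) cos(2 pi u2) and sqrt(-2 ln u1) sin(2 pi u2) are two independent standard normal draws.

```python
import math
import random


def box_muller(rng=random):
    """Turn two uniform draws into two independent standard normal draws."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_sample(n, seed=None):
    """Draw n standard normal values by repeated Box-Muller pairs."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        draws.extend(box_muller(rng))
    return draws[:n]  # drop the spare value when n is odd
```

Calling `normal_sample(5, seed=42)` returns five draws; a histogram of a few thousand draws should look roughly bell-shaped.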

Reordering factor levels in R plots

November 3, 2012

A few days ago a postdoctoral researcher asked me if I could help him reorder the factor levels on a bar chart. The problem is that R automatically alphabetizes factor levels. I thought this would be fairly straightforward but...

Project Euler — problem 21

November 3, 2012

It’s been over one month since my last post on Euler problem 20, when I was planning to post at least one on either Project Euler or visualization. So I am four posts behind; I’ll try to catch up. Tonight, I’ll solve the 21st Euler …
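Project Euler problem 21 asks for the sum of all amicable numbers under 10000, where a and b are amicable if a differs from b and the proper divisors of each sum to the other. The excerpt stops before the author's R solution, so the following is only a Python sketch of one straightforward approach; the helper names are made up for illustration.

```python
def sum_proper_divisors(n):
    """Sum of the divisors of n that are strictly smaller than n."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def amicable_sum(limit):
    """Sum of every amicable number a with a < limit."""
    total = 0
    for a in range(2, limit):
        b = sum_proper_divisors(a)
        if b != a and sum_proper_divisors(b) == a:
            total += a
    return total
```

The smallest amicable pair is 220 and 284, so `amicable_sum(300)` adds exactly those two numbers; `amicable_sum(10000)` answers the problem as stated.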

SAP HANA and R (The way of the widget)

November 3, 2012

"A real developer never stops learning": that's a quote I always love to repeat, because it applies to my life. You can know a lot of things, but there's always something new to learn, or to re-learn. That's why a couple of days ago I started reading wxPyt...

Breakthroughs in the sas7bdat Reverse Engineering Effort

November 3, 2012

Due largely to the work of Clint Cummins, the sas7bdat file format has become a bit less shrouded. In particular, we now know the following: how to detect files with compressed data (and fail gracefully); more details about the platform that generated the file (e.g., endianness, OS details); how to read files that were generated

Using R to Compare Hurricane Sandy and Hurricane Irene

November 3, 2012

Having just lived through two back-to-back hurricanes (Irene in 2011 and Sandy in 2012) that passed through the New York metro area, I was curious how the paths of the hurricanes differed. I worked up a quick graph in R using data from Unisys. The data also includes wind speed and barometric pressure.

Unstable parallel simulation, or after finishing testing, test some more

November 2, 2012

Lately I have been working on a trading system based on Support Vector Machine (SVM) regression (and yes, if you wonder, there are a few posts planned to share the results). In this post, however, I want to share an interesting problem I had to deal with. A few days ago, I started running simulations using

Simple Bayesian bootstrap

November 2, 2012

Bootstrapping is a very popular statistical technique. However, its Bayesian analogue proposed by Rubin (1981) is not very common. I was looking for an example of its implementation in GNU R and could not find one so I decided to write a snippet presen...
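The author's R snippet is not part of this excerpt. As a rough illustration of the idea in Rubin (1981), the Python sketch below replaces the classical bootstrap's resampling with random observation weights drawn from a flat Dirichlet(1, ..., 1) distribution, then computes a weighted mean for each draw. The function names and the choice of the mean as the statistic are assumptions made for this example.

```python
import random


def flat_dirichlet_weights(n, rng):
    """Draw n weights from Dirichlet(1, ..., 1) by normalizing Exp(1) draws."""
    gaps = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(gaps)
    return [g / total for g in gaps]


def bayesian_bootstrap_mean(data, draws=1000, seed=None):
    """Return `draws` posterior samples of the mean of `data`."""
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        weights = flat_dirichlet_weights(len(data), rng)
        samples.append(sum(w * x for w, x in zip(weights, data)))
    return samples
```

Sorting the returned samples and reading off, say, the 2.5% and 97.5% quantiles gives an approximate credible interval for the mean.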

Which functions in plyr do people use?

November 2, 2012

This is the question that Hadley Wickham recently set out to answer by asking frequent R and plyr users, in an online survey, how they use it. Once a decent number of people had responded, Hadley quickly went forward and produced a short analysis of the plyr usage survey, and published it on RPubs. With his permission, I am...

googleVis 0.3.3 is released and on its way to CRAN

November 2, 2012

I am very grateful to all who provided feedback over the last two weeks and tested the previous versions 0.3.1 and 0.3.2, which were not released on CRAN. So, what changed since version 0.3.2? Not much, but plot.gvis didn't open a browser window when op...

Ryan Peek on Customizing Your R Setup

November 2, 2012

Ryan Peek showed us how to use an .Rprofile file to customize your R setup. Here are his instructions and script:

For Windows
To change profile for R, go here: C:\Program Files\R\R-2.15.1\etc (or whatever version you are using)
Edit the “Rprofile.site” file
Restart R

For Macs
Create your Rprofile file.
-use TextEdit or another editor to create a file called Rprofile.txt
In a...

Slides and replay for "The Rise of Data Science"

November 2, 2012

I had a great time presenting my new webinar yesterday. Thanks to everyone who attended "The Rise of Data Science in the Age of Big Data Analytics", and especially to those who submitted questions. Sorry I didn't have time to get to them all, but feel free to ask here in the comments. There's been some discussion recently about whether...

The New Madrid Fault – Past, Present and Future

November 2, 2012

New Madrid, Territory of Missouri, March 22, 1816 Dear Sir, In compliance with your request, I will now give you a history, as full in detail as the limits of the letter will permit, of the late awful visitation of Providence in this place and vicinity.  On the 16th of December, 1811, about two o'clock, A.M., we were visited by a violent...

Mapping Capabilities in R

November 2, 2012

From time to time, creating a basic map of the United States or other parts of the world to complement some statistical analysis is useful to emphasize a point. The maps package in R provides a good way to produce these maps. The axes of these maps are based on latitude and longitude, so overlaying other information on

GGtutorial: Day 5 – Gradient Colors and Brewer Palettes

November 2, 2012

So, continuing with the short tutorials on how to do relatively simple (but sometimes very frustrating) things in ggplot, today’s post looks at how to use gradient colors and Brewer colors to color either continuous or discrete dependent variab...

RAppArmor video tutorials: security in R!

November 2, 2012

Security and R

One of the more challenging aspects of OpenCPU is security in R (or lack thereof). This is actually one of the reasons OpenCPU runs on Linux only at this point; other operating systems simply lack superpowers to implement open computing. (Maybe one exception is BSD, for which I lack superpowers). Security is ...

PrettyR R

November 1, 2012

When it comes to R blogging I'm a complete newbie. So I'm still struggling with the technical details. Part of the process is prettifying the code snippets. One of the standard ways of doing this involves copying and pasting the R code into the Pretty R ...

Data types, part 1: Ways to store variables

November 1, 2012

I've been alluding to different R data types, or classes, in various posts, so I want to go over them in more detail. This is part 1 of a 3-part series on data types. In this post, I'll describe and give a general overview of useful data types. I...

Watch Obama and Romney criss-cross the US

November 1, 2012

The Washington Post has an interactive graphic showing the rate at which the US presidential candidates Barack Obama and Mitt Romney have visited the various states for campaign rallies and fundraisers. Here's how it looks today: You can clearly see the focus on key swing states like Florida and Ohio, as well as non-competitive (but donor-rich) states like California...

R in the Press

November 1, 2012

Here is the list of press reports and news about R:

Bits (a blog under The New York Times)
R you ready for R? by Ashlee Vance. Published: January 8, 2009, 1:52 PM

The New York Times
Data Analysts Captivated by R’s Power by Ashlee Vance. Published: January 6, 2009

InfoWorld
The BI battle isn’t

Variable probability Bernoulli outcomes – Fast and Slow

November 1, 2012

I am working on a project that requires the generation of Bernoulli outcomes. Typically, I would go about this using the built-in sample() function like so: This works great and is fast, even for large n. Problem is, I want to generate each sample with its own unique probability. Seems straightforward enough, I
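The excerpt ends before the author's fast and slow versions, so the sketch below is not their code. It shows one common way to give each outcome its own success probability, written in Python: draw one uniform number per outcome and compare it with that outcome's probability. The function name `bernoulli_outcomes` is hypothetical.

```python
import random


def bernoulli_outcomes(probs, seed=None):
    """Return one 0/1 draw per entry of probs, each with its own probability."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so p = 0 never succeeds and p = 1 always does.
    return [1 if rng.random() < p else 0 for p in probs]
```

For example, `bernoulli_outcomes([0.1, 0.5, 0.9], seed=7)` returns a list of three 0/1 values, the last of which is a success about 90% of the time across seeds.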

Correlation: Easy as 1-2-3?

November 1, 2012

I recently had a task to take a look at some assessment (audit) data. I was assuming, or rather hoping for, data with a normal distribution, and thought it would be a quick case of Pearson correlation between two columns: "Duration" and "Score". Just conjecture at this point as I did not understand what the assessment process

Upcoming R training by Hadley Wickham: SF Dec 3-4, DC Dec 10-11

November 1, 2012

(By Hadley Wickham) Hi all, I’d like to let you know about four R training courses that RStudio will be offering in December:
* Effective data visualization (http://bit.ly/TY2ONI) Dec 3. San Francisco, CA
* Reports and reproducible research (http://bit.ly/RsZmYr) Dec 4. San Francisco, CA
* Advanced R programming (http://bit.ly/RvZDsd) Dec 10. Washington, DC
* Package development (http://bit.ly/UhTIWz) Dec 11....

New version of RStudio (v0.97)

November 1, 2012

Today a new version of RStudio (v0.97) is available for download from our website.  The principal focus of this release was creating comprehensive tools for R package development. We also implemented many other frequently requested enhancements including a new Vim editing mode and a much improved Find and Replace pane. Here’s a summary of what’s

GGtutorial: Day 4 – More Colors

November 1, 2012

So far we’ve covered melting and casting data using the reshape package, and today we’re going to look at different ways of coloring and selecting palettes for plots. For these plots, we’re going to use the built-in diamonds data...

Why pictures are so important when modeling data?

October 31, 2012

(bis repetita) Consider the following regression summary,

Call:
lm(formula = y1 ~ x1)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.0001     1.1247   2.667  0.02573 *
x1            0.5001     0.1179   4.241  0.00217 **
...
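The excerpt does not say which data produced this summary. The estimates do match the first dataset of Anscombe's quartet, so the Python sketch below assumes that dataset and refits the line with the textbook least-squares formulas. Treat the data choice as a guess rather than the author's stated source; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Assumed data: Anscombe's quartet, dataset I.
x1 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

intercept, slope = fit_line(x1, y1)
```

Under that assumption, `intercept` and `slope` agree with the Estimate column above to four decimal places. The other three Anscombe datasets give almost the same fitted line while looking very different when plotted, which is the usual argument for drawing the picture first.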

Regime Detection

October 31, 2012

Regime Detection comes in handy when you are trying to decide which strategy to deploy. For example, there are periods (regimes) when Trend Following strategies work better and there are periods when Mean Reversion strategies work better. Today I want to show you one way to detect market Regimes. To detect market Regimes, I will fit
