Free e-book on Data Science with R

February 22, 2013

A new book by Jeffrey Stanton from Syracuse University School of Information Studies, An Introduction to Data Science, is now available for free download. The book, developed for Syracuse's Certificate for Data Science, is available under a Creative Commons License as a PDF (20 MB) or as an interactive eBook from iTunes. The book begins with the following clear definition...

Shiny 0.4.0 now available

February 22, 2013

Shiny version 0.4.0 is now available on CRAN. The most visible change is that the API has been slightly simplified. Your existing code will continue to work, although Shiny will print messages about how to migrate your code. Migration should be straightforward, as described below. It will take a bit of work to switch to

Video: IBM Opinionated Infrastructure Hangout

February 22, 2013

Had a great time earlier this week on a Google Hangout as part of the IBM Opinionated Infrastructure series. Moderator James Governor (analyst from RedMonk) kept the conversation lively, with topics ranging from the value of information to the benefits of predictive analytics and the evolution of Hadoop. R gets a mention at several points in the conversation, which...

Migrating from SPSS to R #rstats

February 22, 2013

Preface: Every now and then I will post about my experience with R, a package for statistical analysis. I try to show some solutions for common types of analyses or problems you face when you start working with R. These …

Don’t use correlation to track prediction performance

February 22, 2013

Using correlation to track model performance is “a mistake that nobody would ever make” combined with a vague “what would be wrong if I did do that” feeling. I hope that after reading this you feel at least a small urge to double check your work and presentations to make sure you have not reported correlation where …
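One way to see the problem, as a minimal sketch on simulated data (the variable names and numbers are assumptions for illustration, not taken from the post): Pearson correlation is unchanged by any positive rescaling or shift of the predictions, so a badly miscalibrated model can score exactly as well as a good one. An error measure such as RMSE tells the two apart.

```r
set.seed(42)

# Simulated "true" outcomes (hypothetical data, for illustration only).
actual <- rnorm(100, mean = 50, sd = 10)

# A reasonable model: predictions are the truth plus a little noise.
good_pred <- actual + rnorm(100, sd = 2)

# A miscalibrated model: the same predictions, rescaled and shifted.
biased_pred <- 3 * good_pred + 100

# Root mean squared error between predictions and observations.
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

cor_good    <- cor(good_pred, actual)
cor_biased  <- cor(biased_pred, actual)
rmse_good   <- rmse(good_pred, actual)
rmse_biased <- rmse(biased_pred, actual)

# The correlations agree; the errors do not.
print(c(cor_good = cor_good, cor_biased = cor_biased))
print(c(rmse_good = rmse_good, rmse_biased = rmse_biased))
```

The two correlations match up to floating-point error, while the RMSE of the rescaled predictions is far larger, which is why correlation alone is a weak summary of prediction quality.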

What’s my daughter listening to? HTML chart gen in R

February 22, 2013

My daughter, who turns 10 in April, has discovered pop music. She’s been listening to Virgin Radio 99.9, one of our local stations. Virgin provides an online playlist that goes back four days, so I scraped the data and brought it into R. The chart shown at top shows all of the songs played

bigcor: Large correlation matrices in R

February 22, 2013

As I am working with large gene expression matrices (microarray data) in my job, it is sometimes important to look at the correlation in gene expression of different genes. It has been shown that by calculating the Pearson correlation between genes, one can identify (by high values, i.e. > 0.9) genes that share a common

Why does IFELSE logic work differently on what appear to be the same values?

February 22, 2013

Embarrassingly, I'm stumped on this... I have a program in R for looking at grade distributions in my class. I found something weird recently with my 'ifelse' processing. I noticed that my program seemed to be overcounting Cs and undercounting...

Does native R usage exist?

February 22, 2013

Note to R users: Users of other languages enjoy spending lots of time discussing the minutiae of the language they use, something R users don’t appear to do (perhaps you spend your minutiae time on statistics, which I don’t yet know well enough to spot when it occurs). There follows a minutiae post that may

knitr: Changing chunk options like fig.height programmatically, mid-chunk

February 22, 2013

Knitr is a great tool for doing reproducible research. You can produce all kinds of output inside a single knitr chunk, e.g. you can write a loop to produce lots of figures or tables. The only catch is if you want your figures to have differing captions, heights, etc. (and usually you do). The standard

R in the news: Interviews with Revolution Analytics execs

February 22, 2013

Here are three recent news articles that feature interviews with members of the Revolution Analytics team talking about the importance of the R language: In Forbes, CEO Dave Rich talks to Gil Press about the business landscape for Big Data. In the article, Dave says: SAS and SPSS remind me of Cobol and Fortran circa 1995. The scientific and...

Simulated Power/Precision Analysis

February 21, 2013

I cringe when I see research proposals that describe a sophisticated statistical approach, yet do not evaluate this approach in their power/precision/sample size planning. It's often the case that a simplified version of the proposed statistical approach is used instead. Presumably, this is due to the limited availability of power/precision/sample size planning software for sophisticated

±∞

February 21, 2013

The Cauchy distribution (?dcauchy in R) nails a flashlight over the number line and swings it at a constant speed from 9 o’clock down to 6 o’clock over to 3 o’clock. (Or the other direction, from 3→6→9.) Then counts ...
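The flashlight picture can be sketched in R under one common reading, which is assumed here rather than taken from the post: the flashlight sits one unit above the number line, the beam angle (measured from straight down) is uniform between the two horizontal directions, and the value recorded is the point where the beam hits the line, which is the tangent of the angle. The sample quantiles can then be compared with those of the standard Cauchy distribution.

```r
set.seed(1)
n <- 1e5

# Beam angle measured from straight down (6 o'clock), uniform between
# 9 o'clock (-pi/2) and 3 o'clock (+pi/2). This setup is an assumption.
theta <- runif(n, min = -pi / 2, max = pi / 2)

# Where the beam meets the number line, with the flashlight one unit above it.
x <- tan(theta)

# Compare sample quartiles and median with the standard Cauchy quantiles.
probs <- c(0.25, 0.5, 0.75)
sample_q <- quantile(x, probs = probs, names = FALSE)
theory_q <- qcauchy(probs)

print(rbind(sample = sample_q, theory = theory_q))
```

Quantiles are used for the comparison because the Cauchy distribution has no finite mean or variance, so sample means and standard deviations do not settle down as n grows.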

Removing white space around R figures

February 21, 2013

When I want to insert figures generated in R into a LaTeX document, it looks better if I first remove the white space around the figure. Unfortunately, R does not make this easy as the graphs are generated to look good on a screen, not in a document. There are two things that can be done to fix this...

Le Monde puzzle [#809]

February 21, 2013

Another number theory puzzle, completed on the plane to Hamburg: Integers n are called noble if they can be decomposed as a sum n=a+b+… of distinct integers such that 1/a+1/b+…=1. They are called bourgeois if they are not noble but can be decomposed as a sum n=a+b+… of integers, some of them identical, such that
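The definition of a noble integer given above can be checked by brute force. The following is a minimal sketch, not the post's own solution: the function names (gcd, search_parts, is_noble) are hypothetical, and the search keeps the remaining reciprocal target as an exact fraction so that no floating-point tolerance is needed.

```r
# Greatest common divisor, used to keep fractions in lowest terms.
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)

# Can `remaining` be written as a sum of distinct integers >= min_part
# whose reciprocals add up to exactly num/den?
search_parts <- function(remaining, min_part, num, den) {
  if (remaining == 0) return(num == 0)
  if (num <= 0 || min_part > remaining) return(FALSE)
  for (a in min_part:remaining) {
    # num/den - 1/a = (num * a - den) / (den * a)
    new_num <- num * a - den
    if (new_num < 0) next
    new_den <- den * a
    g <- gcd(new_num, new_den)
    if (search_parts(remaining - a, a + 1, new_num / g, new_den / g)) {
      return(TRUE)
    }
  }
  FALSE
}

# n is noble if it splits into distinct parts with reciprocals summing to 1.
is_noble <- function(n) search_parts(n, 1, 1, 1)

# List the noble integers up to 30.
noble_up_to_30 <- Filter(is_noble, 1:30)
print(noble_up_to_30)
```

For example, 11 = 2 + 3 + 6 with 1/2 + 1/3 + 1/6 = 1, so 11 is noble. The exhaustive search is only practical for small n, since the number of partitions into distinct parts grows quickly.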

Additional Plots on French Breakpoints as Valuation

February 21, 2013

I feel like there might be some merit in Slightly Different Measure of Valuation using Ken French’s Market(ME) to Book(BE) Breakpoints by percentile to offer an additional valuation metric for US stocks.  I thought some additional plots might he...

Elevation Profiles in R

February 21, 2013

First, let's load up our data. The data are available in a gist. You can convert your own GPS data to .csv by following the instructions here, using gpsbabel.

gps <- read.csv("callan.csv", header = TRUE)

Next, we can use the function SMA fr...

A slightly different introduction to R, part IV

February 21, 2013

Now, after reading in data, making plots and organising commands with scripts and Sweave, we’re ready to do some numerical data analysis. If you’re following this introduction, you’ve probably been waiting for this moment, but I really think it’s a good idea to start with graphics and scripting before statistical calculations. We’ll use the silly

Plot ranges of data in R

February 21, 2013

How to control the limits of data values in R plots. R has multiple graphics engines. Here we will talk about the base graphics and the ggplot2 package. We’ll create a bit of data to use in the examples:

one2ten <- 1:10

ggplot2 demands that you have a data frame:

ggdat <- data.frame(first=one2ten, second=one2ten)

Seriously ...
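For the base graphics side, a minimal sketch of setting plot limits might look like the following. It reuses the one2ten vector from the excerpt; the particular limits are arbitrary choices for illustration, and drawing to a null PDF device is only there so the example runs without opening a window or writing a file.

```r
one2ten <- 1:10

# Open a PDF device with no file behind it, so nothing is written to disk.
pdf(NULL)

# xlim and ylim set the range of data values shown on each axis.
plot(one2ten, one2ten, xlim = c(0, 20), ylim = c(-5, 15))

# par("usr") reports the plotting region actually used: c(x1, x2, y1, y2).
usr <- par("usr")

invisible(dev.off())
print(usr)
```

With the default axis style, base graphics pads the requested limits slightly on each side, so the reported region is a little wider than the xlim and ylim values passed to plot.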

plot textual differences in Shiny

February 21, 2013

Wordclouds such as Wordle are pretty rubbish, so I thought I'd try to make a better one, one that actually produces (statistically) meaningful results. I was so happy with the outcome I decided to make it interactive, so go on, have a play! Compare any...

Zurich, Feb 2013 – Spring Lecture

February 21, 2013

(This article was first published on Rmetrics blogs, and kindly contributed to R-bloggers)

Examining Overlapping Meetup Memberships with Venn Diagrams

February 21, 2013

As of the beginning of 2013, Data Community DC ran three Meetup groups: Data Science DC, Data Business DC, and R Users DC. We’ve often wondered how much these three groups overlapped. In this post, I’m going to show you …

Social Media Monitoring tools in R with just a few lines

February 21, 2013

Social Media Analysis has become some kind of new obsession in Marketing. Every company wants to engage existing customers or attract new ones through this communication channel. Therefore, they hire designers, editors, community managers, etc. However, when it comes to …

Video: Survey Package in R

February 20, 2013

Sebastián Duchêne presented a talk at Melbourne R Users on 20th February 2013 on the Survey Package in R. Talk Overview: Complex designs are common in survey data. In practice, collecting random samples from a population is costly and impractical. …

RcppArmadillo 0.3.6.3

February 20, 2013

A new Armadillo version 3.6.3 came out this morning, and the corresponding RcppArmadillo version is now on CRAN. Changes are incremental: Changes in RcppArmadillo version 0.3.6.3 (2013-02-20) Upgraded to Armadillo release Version 3.6.3 ...

Model Selection and Multi-Model Inference

February 20, 2013

At D-RUG this week Rosemary Hartman presented a really useful case study in model selection, based on her work on frog habitat. Here is her code run through ‘knitr’. Original code and data are posted here. (yes, I am just doing this for the flying monkey) Editor’s note: we’re giving away flying monkey dolls from our...

Quandl: A Wikipedia for Time Series Data

February 20, 2013

This guest post is by Tammer Kamel, Founder of Quandl.

Finding and formatting numerical data for analysis in R or Excel or indeed any application is a pain that all real world data analysts know all too well. In aggregate I have probably spent weeks of my life trying to find data on the web. And several more weeks...

Analysis of Public .Rhistory Files

February 20, 2013

GitHub recently launched a more powerful search feature which has been used on more than one occasion to identify sensitive files that may be hosted in a public GitHub repository. When used innocently, there are all sorts of fun things you can find with this search feature. Inspired by Aldo Cortesi's post documenting his exploration

Fixing My Internet With R and Python

February 20, 2013

Last summer, I had some internet connectivity problems. Specifically, I would have massive latency issues that affected my conversations on Skype and my efforts at online gaming, which are relatively pathetic under the best of circumstances. It was driving me up a wall and I couldn't figure it out. It hadn't...
