Passing columns of a dataframe to a function without quotes

May 6, 2013

I love the syntax of calls to lm and ggplot, wherein the dataframe is specified as a variable and specific columns are referenced as though they were separate variables. While developing some of my functions, I’d wanted to introduce something similar. I often find that I have a single large dataframe and want to execute

Read more »
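
As a rough illustration of the idea in the teaser above (referring to columns of a data frame without quoting them), one common base R idiom combines substitute() and eval(). This is a minimal sketch, not the author's code: the helper name col_mean and the example data frame are assumptions made purely for illustration.

```r
# Hypothetical helper: compute the mean of a column that is passed
# unquoted, the way lm() and ggplot() let you refer to columns.
col_mean <- function(data, column) {
  # Capture the expression typed by the caller instead of evaluating it.
  expr <- substitute(column)
  # Evaluate it with the data frame's columns in scope, falling back
  # to the caller's environment for any other names.
  values <- eval(expr, data, parent.frame())
  mean(values)
}

# Illustrative data, not taken from the post.
df <- data.frame(height = c(150, 160, 170), weight = c(50, 60, 70))
col_mean(df, height)           # column referenced without quotes
col_mean(df, weight / height)  # whole expressions work too
```

Because the expression is evaluated inside the data frame first, anything that is not a column name is looked up in the calling environment, which is usually the behaviour people expect from lm-style interfaces.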

xkcd: Visualized

May 6, 2013

Introduction: It's been said that the ideal job is one you love enough to do for free but are good enough at that people will pay you for it. That if you do what you love no matter what others may say, and if you work at it hard enough, and long enough, eventually people will recognize it and...

Read more »

Explaining real-time predictive analytics with big data (video)

May 6, 2013

In my presentation to the Strata Santa Clara 2013 conference earlier this year, my goal was to give a succinct (under 20 minutes!) explanation of three terms that are too often used as mere buzzwords: predictive analytics, real time, and big data. You can download the slides for my presentation, Real-time Big Data Analytics: From Deployment to Production, from...

Read more »

Veterinary Epidemiologic Research: Count and Rate Data – Zero Counts

May 6, 2013

Continuing on the examples from the book Veterinary Epidemiologic Research, we look today at modelling counts when the number of zeros may be higher or lower than expected from a Poisson or negative binomial distribution. When there’s an excess of zero counts, you can fit either a zero-inflated model or a hurdle model. If zero

Read more »
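
Before choosing between a zero-inflated and a hurdle model, it can help to check whether the data really contain more zeros than a Poisson distribution would predict. The sketch below uses simulated counts (the 30% share of structural zeros and the Poisson mean of 2 are assumptions, not data from the book) and base R only; it is a diagnostic, not a model fit.

```r
set.seed(42)
# Simulated counts: a Poisson(2) process where roughly 30% of
# observations are forced to zero (structural zeros).
n <- 1000
counts <- ifelse(rbinom(n, 1, 0.3) == 1, 0, rpois(n, 2))

# Observed share of zeros versus the share that a Poisson distribution
# with the same mean would predict.
observed_zero <- mean(counts == 0)
expected_zero <- dpois(0, lambda = mean(counts))
c(observed = observed_zero, expected = expected_zero)
```

If the observed share is clearly larger than the expected one, functions such as zeroinfl() and hurdle() from the pscl package are one option for fitting the two model types mentioned above.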

When the “reorder” function just isn’t good enough…

May 6, 2013

The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something). Take the following simple data frame: df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c")) I expect that if I call the reorder function on the … Continue reading →

Read more »
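
For reference, here is what reorder() does by default with the data frame from the teaser. This sketch does not attempt to reproduce the problem the author ran into (the teaser is cut off before describing it); it only shows the documented behaviour, with a2 explicitly converted to a factor as an assumption so that the result does not depend on the R version.

```r
df <- data.frame(
  a1 = c(4, 1, 1, 3, 2, 4, 2),
  a2 = factor(c("h", "j", "j", "e", "c", "h", "c"))
)

# Levels start out in alphabetical order.
levels(df$a2)

# reorder() sorts the levels by a summary of a1 within each level;
# the default summary function is mean().
reordered <- reorder(df$a2, df$a1)
levels(reordered)

# The values themselves are untouched; only the level order changes.
as.character(reordered)
```

Note that reorder() changes the order of the factor levels, not the order of the rows, which is a frequent source of surprise.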

Oracle R Distribution for R 2.15.2 available on public-yum

May 6, 2013

Oracle R Distribution (ORD) for R 2.15.2 on Linux is now available for download from Oracle's public-yum repository.  R 2.15.2 is a maintenance update that includes improved performance and reduced memory usage for some commonly-used functions, increased memory available for data on 64-bit systems, enhanced localization for Polish language users, and a number of bug fixes.  Detailed updates...

Read more »

Bayesian and Frequentist Approaches: Ask the Right Question

May 6, 2013

It occurred to us recently that we don’t have any articles about Bayesian approaches to statistics here. I’m not going to get into the “Bayesian versus Frequentist” war; in my opinion, which style of approach to use is less about philosophy, and more about figuring out the best way to answer a question. Once you...

Read more »

Incomplete Data by Design: Bringing Machine Learning to Marketing Research

May 6, 2013

Survey research deals with the problem of question wording by always asking the same question. Thus, the Gallup Daily Tracking is filled with examples of moving averages for the exact same question asked precisely the same way every day. ...

Read more »

New fixed.angle() Function

Hello morphometricians, below you can find a new fixed.angle() function addressing the problem discovered by Fabio Machado in the morphmet mail archive. We will include this function in our next scheduled update to geomorph. Cheers, Erik CODE: ...

Read more »

Mixed Model Example — Wagner et al. (2006)

May 6, 2013

I am preparing for a workshop on mixed models and looked at the paper “Accounting for multilevel data structures in fisheries data using mixed models” by Wagner et al. (2006) (PDF available here).  Wagner et al. (2006) used two examples, with the … Continue reading →

Read more »

Media Monitoring 2 (Monitoring des médias 2)

May 6, 2013

(This article was first published on Learning Data Science, and kindly contributed to R-bloggers) A short monitoring report from our observatory of the media on Twitter. At Mediapart: Le Monde, Le Figaro, Le Parisien, overall view. The code used to produce this post: To leave a comment for the author, please follow the link and comment on their blog: Learning...

Read more »

Creating a QGIS-Style (qml-file) with an R-Script

May 6, 2013

How to get from a txt-file with short names and labels to a QGIS-Style (qml-file)? I used the below R-script to create a style for this legend table where I copy-pasted the parts I needed to a txt-file, like for the WRB-FULL (WRB-FULL: Full soil code o...

Read more »

The half variance approximation for mean returns

May 6, 2013

What’s that thing about arithmetic and geometric returns and the variance? Previously: an introduction to the difference between simple and log returns is "A tale of two returns". Issue: Suppose you are predicting the mean annual return of an asset for some number of years. To simplify the discussion, let’s buy into the fantasy that … Continue reading...

Read more »
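
The rule of thumb usually meant by the "half variance approximation" is that the geometric mean return is roughly the arithmetic mean return minus half the variance of returns. The sketch below checks this on four made-up annual returns; the numbers are assumptions chosen for illustration and do not come from the post.

```r
# Hypothetical simple annual returns.
r <- c(0.10, -0.05, 0.20, 0.02)

# Arithmetic mean return.
arithmetic <- mean(r)

# Geometric mean return: the constant return that would compound
# to the same final wealth.
geometric <- prod(1 + r)^(1 / length(r)) - 1

# Half variance approximation to the geometric mean.
approximation <- arithmetic - var(r) / 2

c(arithmetic = arithmetic, geometric = geometric,
  approximation = approximation)
```

The approximation is only that: it tends to work well when returns are small and degrades as volatility grows.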

analyze the social security administration public use microdata files (ssapumf) with r

May 5, 2013

the social security administration (ssa) must be overflowing with quiet heroes, because their public-use microdata files are as inconspicuous as they are thorough.  sure, ssa publishes enough great statistical research of their own that outside re...

Read more »

Google Analytics + R = FUN!

May 5, 2013

The scope of this post is to show how simple it is to get data out of Google Analytics and create your own reports (which you hope can be at least semi-automated) and your favourite statistical graphs (those that GA is currently missing). As you already know R is a favourite tool

Read more »

… ridiculously photogenic factors (heatmap with p-values)

May 5, 2013

Some months ago, I had to explore a vast amount of categorical variables before making some multivariate analyses. One good way to know your raw data, to make new hypotheses…etc, is to calculate some pairwise “crude” chi-square tests of independence … Continue reading →

Read more »

How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid

Introduction: Today, I will talk about the math behind calculating partial correlation and illustrate the computation in R with an example involving the oxidation of ammonia to make nitric acid using a built-in data set in R called stackloss. In a separate post, I will also share an R function that I wrote to estimate partial correlation.

Read more »
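
As a sketch of the computation described above, the partial correlation between two variables given a third can be obtained either by correlating regression residuals or from the three pairwise correlations. The choice of variables here (stack.loss and Air.Flow, controlling for Water.Temp) is an assumption for illustration; the post may use a different combination.

```r
# Partial correlation between stack.loss and Air.Flow, controlling
# for Water.Temp, using the built-in stackloss data set.

# Approach 1: correlate the residuals after regressing each variable
# on the control variable.
res_y <- resid(lm(stack.loss ~ Water.Temp, data = stackloss))
res_x <- resid(lm(Air.Flow ~ Water.Temp, data = stackloss))
partial_resid <- cor(res_y, res_x)

# Approach 2: the closed-form expression in terms of the three
# pairwise correlations.
r_xy <- cor(stackloss$Air.Flow, stackloss$stack.loss)
r_xz <- cor(stackloss$Air.Flow, stackloss$Water.Temp)
r_yz <- cor(stackloss$stack.loss, stackloss$Water.Temp)
partial_formula <- (r_xy - r_xz * r_yz) /
  sqrt((1 - r_xz^2) * (1 - r_yz^2))

c(residuals = partial_resid, formula = partial_formula)
```

Both approaches give the same number, which is a useful sanity check when writing your own partial correlation function.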

R, D3.js and SNA Course

I took the SNA course by Lada Adamic on Coursera. It's a super interesting course. In fact, I had been using networks only as a visualization tool, which is a little embarrassing because there is more to them, a lot more. You can detec...

Read more »

R/Finance 2013 Is Coming Quickly…

May 5, 2013

There are about two weeks remaining until R/Finance 2013, being held on May 17th and 18th at UIC in Chicago. Make sure you register beforehand to ensure you have a spot, and yes, you do want to come to the conference dinner on Friday. I am particularly excited about the lineup of keynotes

Read more »

Simulation shows gain of clmm over ANOVA is small

May 5, 2013

After the last post's setup for a simulation, it is now time to look at how the models compare. To my disappointment, with my simple simulations of assessors' behavior the gain is minimal. Unfortunately, the simulation took much more time than I ...

Read more »

Volatility Regimes: Part 2

Adam Duncan, January 2013. Also available on R-bloggers.com. Strategy Implications: In this part of the volatility regimes analysis, we’ll use the regime identification framework established in part 1 to draw conclusions about which strategies work best in each regime. That should prove useful to us and goes a long way toward answering the question, “What strategies should I be...

Read more »

Quandl Package – 5,000,000 free datasets at the tip of your fingers!

May 5, 2013

# Yes, you read that correctly, and no, Quandl (http://www.quandl.com/) did not pay me anything. # Quandl is a new database management tool which seeks to become the place to find datasets. They boast of having over 5x10^6 data sets available t...

Read more »

AIC & BIC vs. Crossvalidation

May 4, 2013

Model selection is a process of seeking the model in a set of candidate models that gives the best balance between model fit and complexity (Burnham & Anderson 2002). I have always used AIC for that. But you can also… Read more →

Read more »
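
To make the comparison in the teaser concrete, the sketch below computes AIC, BIC and a leave-one-out cross-validated error for two candidate linear models. The mtcars data set, the two formulas and the helper name loocv_mse are assumptions chosen for illustration; they are not taken from the post.

```r
# Two candidate models on the built-in mtcars data.
small_formula <- mpg ~ wt
large_formula <- mpg ~ wt + hp + qsec + drat
fit_small <- lm(small_formula, data = mtcars)
fit_large <- lm(large_formula, data = mtcars)

# Hypothetical helper: leave-one-out cross-validated mean squared
# error for a linear model given by a formula.
loocv_mse <- function(formula, data) {
  response_name <- all.vars(formula)[1]
  errors <- sapply(seq_len(nrow(data)), function(i) {
    # Fit without observation i, then predict it.
    fit <- lm(formula, data = data[-i, ])
    held_out <- data[i, , drop = FALSE]
    held_out[[response_name]] - predict(fit, newdata = held_out)
  })
  mean(errors^2)
}

comparison <- data.frame(
  model = c("small", "large"),
  AIC   = c(AIC(fit_small), AIC(fit_large)),
  BIC   = c(BIC(fit_small), BIC(fit_large)),
  LOOCV = c(loocv_mse(small_formula, mtcars),
            loocv_mse(large_formula, mtcars))
)
comparison
```

Lower is better for all three columns. AIC and BIC penalise complexity through a formula, while cross-validation measures out-of-sample error directly, so the criteria do not always agree on the preferred model.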

A Prototype of Monotonic Binning Algorithm with R

May 4, 2013

I’ve been asked many times if I have a piece of R code implementing the monotonic binning algorithm, similar to the one that I developed with SAS (http://statcompute.wordpress.com/2012/06/10/a-sas-macro-implementing-monotonic-woe-transformation-in-scorecard-development) and with Python (http://statcompute.wordpress.com/2012/12/08/monotonic-binning-with-python). Today, I finally had time to draft a quick prototype with 20 lines of R code, which is however barely useable without the

Read more »

Backporting R 3.0.0 to Quantal, Precise, and Lucid

May 4, 2013

Today (May 4, 2013) I will begin the process of backporting R 3.0.0 to Quantal, Precise, and Lucid. This will include all the recommended packages and the packages for R found in the universe repository for Ubuntu. Things to keep in mind: If you do...

Read more »

LaTeX in R graphs

May 3, 2013

A nice post was recently published on the rsnippets blog, about the tikzDevice R package. This package is, indeed, awesome, even if it has been removed from the CRAN website. Of course, it can be downloaded from the archive folder, on http://cran.r-project.org/…, but also (for a more recent version) on http://download.r-forge.r-project.org/…. But first, it is necessary to install...

Read more »

Animation, from R to LaTeX

May 3, 2013

Just a short post, to share some code used to generate animated graphs with R. Assume that we would like to illustrate the law of large numbers, and the convergence of the average value from a binomial sample. We can generate samples using > n=200 > k=1000 > set.seed(1) > X=matrix(sample(0:1,size=n*k,replace=TRUE),n,k) Each row will be a trajectory of heads and...

Read more »
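
Building on the sampling code quoted in the teaser, the running average of each trajectory can be computed as follows. The first four lines are the code from the teaser; the name running_mean and the way the averages are computed are assumptions for this sketch, and the post itself may proceed differently.

```r
n <- 200
k <- 1000
set.seed(1)
X <- matrix(sample(0:1, size = n * k, replace = TRUE), n, k)

# Running average along each row: the cumulative sum divided by the
# number of draws so far. apply() returns its result transposed, so
# t() restores one trajectory per row.
running_mean <- t(apply(X, 1, cumsum)) / matrix(1:k, n, k, byrow = TRUE)

# Average of the final running means across all trajectories;
# by the law of large numbers this should be close to 0.5.
mean(running_mean[, k])
```

Calling matplot(t(running_mean[1:5, ]), type = "l") would draw a handful of trajectories, and saving one such plot per time step is one way to obtain the frames of an animation.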
