Monthly Archives: July 2013

“I like this concept of “low volatility, interrupted by occasional periods of high volatility”. I…”

July 29, 2013

“I like this concept of “low volatility, interrupted by occasional periods of high volatility”. I think I will call it “volatility”.” - Daniel Davies via nonergodic (PS: If you didn’t see

Quandl.com for Time Series Datasets

July 29, 2013

If you want to dive into time series data, quandl.com is a good choice. The website claims to have several million datasets, all of them available for free download. It also allows you to upload data to the site with an a...

Tips for speeding up R with byte compilation

July 29, 2013

A byte-compiler for R code, which can improve the execution performance of R functions, was introduced in R 2.13.0 and was automatically applied to the bundled packages in R 2.14.0. Drew Dimmery provides some good advice for identifying targets for compilation among your own R functions: I have some function that will be repeatedly executed n times....

Estimate Age from First Name

July 29, 2013

Today I read a cute post from Flowing Data on the trendiest names in US history. What caught my attention was a link in the article to the source data, which happens to be yearly lists of baby …

Programming instrumental music from scratch

July 29, 2013

I recently posted about automatically making music. The algorithm that I made pulled out interesting sequences of music from existing songs and remixed them. While this worked reasonably well, it also didn’t have full control over the basics of the music; it wasn’t actually specifying which instruments to use, or what notes to play. Maybe I’m being...

Exploratory Data Analysis: Combining Histograms and Density Plots to Examine the Distribution of the Ozone Pollution Data from New York in R

Introduction: This is a follow-up post to my recent introduction of histograms. Previously, I presented the conceptual foundations of histograms and used a histogram to approximate the distribution of the “Ozone” data from the built-in data set “airquality” in R. Today, I will examine this distribution in more detail by overlaying the histogram with parametric

Easier Database Querying with R

July 29, 2013

I have a strong distaste for database connection management. When I query one of our many databases at work, all I want to do is supply the query and get the result back as an R data.frame or data.table. R has many great database connection tools, including but not limited to RPostgreSQL,

analyze the youth risk behavior surveillance system (yrbss) with r

July 29, 2013

the youth risk behavior surveillance system is the high school edition of the behavioral risk factor surveillance system (brfss), a scientific study of good kids who do bad things.  questions are mostly about sex, drugs, rock and roll, and populat...

BCEA 2.0

July 28, 2013

I know that updating a package too often is not good practice, so, given that we released BCEA 1.3-1 only about a month ago, this is way too soon to move forward. But between the last release and now, I've been doing some reading and have made some...

Orthogonal Partial Least Squares (OPLS) in R

July 28, 2013

I often need to analyze and model very wide data (variables >>> samples), and because of this I gravitate to robust yet relatively simple methods. In my opinion, partial least squares (PLS) is a particularly useful algorithm. Simply put, PLS is an extension of principal components analysis (PCA), an unsupervised method for maximizing the variance explained in X,
