Zacks Data on Quandl

November 8, 2014

Data from Zacks Research have just been made available on Quandl. Registered Quandl users have free preview access to these data, which cover the following: Earnings Estimates: forward-looking consensus forecasts; Earnings Surprises: estimated future and actual historical earnings; Earnings Announcements: predictions for earnings announcement dates, parameters, and supplementary data; Sales Estimates: analogous to earnings estimates,

Read more »

6 new R jobs (for November 7th 2014)

November 7, 2014

This is the bimonthly R Jobs post (for 2014-11-07), based on the R-bloggers’ sister website: R-users.com. If you are an employer who is looking to hire people from the R community, please visit this link to post a new R job (it’s free, and registration takes less than 10 seconds). If you are a job seeker, please follow the links below to learn more and apply for your job of interest (or visit previous...

Read more »

Learn about Revolution R Open in live webinar, November 12

November 7, 2014

On Wednesday next week, I'll be presenting a live webinar to introduce Revolution R Open and several other open source projects from Revolution Analytics. In the webinar I'll describe: the enhancements included in Revolution R Open; the Reproducible R Toolkit and the checkpoint package; how to call R from other applications with DeployR Open; how to run R in...

Read more »

Predicting High Yield with SPY–a Two Part Post

November 7, 2014

This post will cover ideas from two individuals: David Varadi of CSS Analytics, with whom I am currently collaborating on some …

Read more »

When to fly to get there on time? Six million flights analyzed.

November 6, 2014

We analyzed over six million flights to help you decide on the best time to travel to avoid delays. The post When to fly to get there on time? Six million flights analyzed. appeared first on Decision Science News.

Read more »

RcppRedis 0.1.2

November 6, 2014

A new release of RcppRedis is now on CRAN. It contains additional commands for hashes and sets, all contributed by John Laing and Whit Armstrong. Changes in version 0.1.2 (2014-11-06): new commands execv, hset, hget, sadd, srem, and smembers...

Read more »

Evaluating BreakoutDetection

November 6, 2014

A couple of weeks ago, Twitter open-sourced their BreakoutDetection package for R, a package designed to determine shifts in time-series data. The Twitter announcement does a great job of explaining the main technique for detection (E-Divisive with Medians), so I won’t rehash that material here. Rather, I wanted to see how this package works relative

Read more »

Seasonal periods

November 6, 2014

I get questions about this almost every week. Here is an example from a recent comment on this blog: I have two large time series data. One is separated by seconds intervals and the other by minutes. The length of each time series is 180 days. I’m using R (3.1.1) for forecasting the data. I’d

Read more »

geocodeHERE 0.1 is on CRAN

November 6, 2014

In my previous blog post, I detailed how I created my first R package called geocodeHERE. This package is a convenient wrapper for Nokia's HERE geocoding API. The cool thing about this API is that it allows for bulk geocoding. So, instead of doing n API calls to geocode n addresses, you can do it with...

Read more »

Introduction to Data Science with R video workshop

November 6, 2014

RStudio has teamed up with O’Reilly Media to create a new way to learn R! The Introduction to Data Science with R video course is a comprehensive introduction to the R language. It’s ideal for non-programmers with no data science experience or for data scientists switching to R from Excel, SAS or other software. Join

Read more »

Faster, easier, and more reliable character string processing with stringi 0.3-1

November 6, 2014

A new release of the stringi package is available on CRAN (please wait a few days for Windows and OS X binary builds). Install it with install.packages("stringi") or update.packages(), then load it with library("stringi"). stringi is an R package providing (but definitely not limited to) equivalents…

Read more »

Looking into a very messy data set

November 6, 2014

By Joseph Rickert. I recently had the opportunity to look at the data used for the 2009 KDD Cup competition. There are actually two sets of files that are still available from this competition. The "large" file is a series of five .csv files that when concatenated form a data set with 50,000 rows and 15,000 columns. The "small"...

Read more »

Improving R Data Visualisations Through Design

November 6, 2014

When I start an R class, one of my opening lines is nea

Read more »

Visualising stranded RNA-seq data with Gviz/Bioconductor

November 6, 2014

Gviz is a really great package for visualising genomics data in R. Recently I have been looking at stranded RNA-seq data, which provides the ability to differentiate sense and antisense expression from a genomic locus thanks to the way in …

Read more »

The reddit Front Page is Not a Meritocracy

November 6, 2014

I was pleasantly surprised when somebody shared my traveling salesman animation to reddit and the post made it all the way to reddit's default front page (i.e. the top 25). The gif racked up over 1.3 million pageviews on Imgur, a testament to reddit's traffic-generating prowess. Before the post made it to the front page, though, it was...

Read more »

Excel (and French people) are such a pain in the…

November 6, 2014

A few days ago, I published a post entitled extracting datasets from excel files in a zipped folder, because I wanted to use datasets that were online, in some (zipped) Excel format. The first difficult part was the folder with a non-standard character (the French é). Because next week I should be using those datasets in a crash course...

Read more »

A Software Carpentry workshop at Northwestern

November 5, 2014

On Friday October 31, 2014, and Saturday November 1, 2014, around thirty-five graduate students and faculty members attended a Software Carpentry workshop. Attendees came primarily from the Economics department and the Kellogg School of Management, w...

Read more »

RPushbullet 0.1.1

November 5, 2014

A minor bugfix release 0.1.1 of the RPushbullet package (interfacing the neat Pushbullet service) landed on CRAN yesterday morning. It cleans up a small issue related to the ability to transfer files between devices via the Pushbullet service where the ability to select a (non-default) target device has now been restored. With that, allow me to borrow...

Read more »

Treasury yield curve from the Volcker era through Greenspan,…

November 5, 2014

2006–2014 using the FRBData package. Treasury yield curve from the Volcker era through Greenspan, Bernanke, and Yellen. require(YieldCurve) data(FedYieldCurve) maturities … I can’t upload a larger view, but if you just run that code with a bigger width & height, you might get a ~10MB gif. I shrank it further with gifsicle. Here’s the animated history of the ECB yield curve,...

Read more »

Travis-CI to Github Pages

I don't remember how I got onto this, but I believe I had a recent Twitter exchange with some people (or saw it fly by) about pushing R package vignettes to the web after building and checking on Travis-CI. Hadley Wickham pointed to using such a scheme to push the web version of his book...

Read more »

High performance JSON streaming in R: Part 1

November 5, 2014

The jsonlite stream_in and stream_out functions implement line-by-line processing of JSON data over a connection, such as a socket, url, file or pipe. Thereby we can construct a data processing pipeline that can handle large (or unlimited) amounts of data with limited memory. This post will walk through some examples...
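The line-by-line idea described here can be sketched outside R as well. The following is a minimal Python illustration, not the jsonlite API: the helper names `stream_in`, `stream_out`, and `run_pipeline` are hypothetical and only borrow the naming from the excerpt, and in-memory `io.StringIO` buffers stand in for a real file, socket, or pipe. Because each record is read, filtered, and written before the next line is touched, memory use stays bounded regardless of the total input size.

```python
import io
import json
from typing import Callable, Iterable, Iterator, TextIO


def stream_in(conn: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of the connection."""
    for line in conn:
        line = line.strip()
        if line:
            yield json.loads(line)


def stream_out(records: Iterable[dict], conn: TextIO) -> int:
    """Write each record as one compact JSON line; return how many were written."""
    count = 0
    for record in records:
        conn.write(json.dumps(record, separators=(",", ":")) + "\n")
        count += 1
    return count


def run_pipeline(source: TextIO, sink: TextIO, keep: Callable[[dict], bool]) -> int:
    """Filter records lazily: only one record is held in memory at a time."""
    return stream_out((r for r in stream_in(source) if keep(r)), sink)


# Hypothetical sample data: three newline-delimited JSON records.
source = io.StringIO(
    '{"id":1,"delay":5}\n'
    '{"id":2,"delay":42}\n'
    '{"id":3,"delay":17}\n'
)
sink = io.StringIO()
written = run_pipeline(source, sink, keep=lambda r: r["delay"] > 10)
```

Swapping the `StringIO` objects for opened files or socket file objects leaves the pipeline unchanged, which is the point of processing over a generic connection.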

Read more »

Net Promoter Mixture Modeling: Can a Single Likelihood Rating Reveal Customer Segments?

November 5, 2014

Net Promoter believes that customers come in one of three forms: promoters (happy yellows), passives (neutral grays), or detractors (angry reds). Cluster identification is relatively easy for all you need to do is ask the "ultimate question" concerning...

Read more »

Easy to use option settings management with the ‘settings’ package

November 5, 2014

Last week I released a new package called settings. It grew out of my frustration built up during several small projects where I'm generating heavily parameterized d3/js output. What I wanted was support to define a whole bunch of option …

Read more »

Computing Power Functions

November 5, 2014

In a recent post I discussed some aspects of the distributions of some common test statistics when the null hypothesis that's being tested is actually false. One of the things that we saw there was that in many cases these distributions are "non-central", with a non-centrality parameter that increases as we move further and further away from the...
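As a rough illustration of how power grows as the truth moves away from the null, here is a minimal Python sketch for the simplest possible case, a one-sided z-test with known standard deviation. This is an assumption chosen for brevity and not the tests discussed in the post: under the alternative, the z statistic is just a normal shifted by `delta * sqrt(n) / sigma`, which plays the role the non-centrality parameter plays for the chi-square, t, and F cases. The function name `z_test_power` and the sample values are hypothetical.

```python
from statistics import NormalDist


def z_test_power(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test of H0: mu = mu0 against H1: mu = mu0 + delta.

    Assumes a known standard deviation sigma and n independent observations.
    """
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1.0 - alpha)  # rejection threshold under H0
    shift = delta * n ** 0.5 / sigma            # mean of the z statistic under H1
    return 1.0 - std_normal.cdf(critical - shift)


# Power at increasing distances from the null (delta = 0 recovers the test size).
powers = [z_test_power(d, sigma=1.0, n=25) for d in (0.0, 0.2, 0.5)]
```

At `delta = 0` the function returns the significance level itself, and the values rise toward 1 as `delta` grows, which is the qualitative behaviour the excerpt describes.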

Read more »

Introducing Revolution R Enterprise V 7.3

November 5, 2014

By Bill Jacobs. Revolution R Enterprise is the industry's first R-based analytics platform that supports a variety of parallel, grid and clustered systems such as Hadoop, Teradata database and Platform LSF Linux grids. Last year, we enhanced Revolution R Enterprise (RRE) to support big data systems, with support for Hadoop. We continued expansion of RRE in 2014, adding support...

Read more »

Update on JGBs versus USTs

November 5, 2014

Given the recent selloff in the Yen, I thought now would be a good time to update my favorite chart from Intended or Unintended Consequences. For a true currency death spiral, rates need to move up rather than down. It appears we are a long way from t...

Read more »

Tidbits from the Books that Defined S (and R)

November 5, 2014

Why R? Because S! R is the open source implementation (and a pun!) of S, a language for statistical computing that was developed at Bell Labs in the late 1970s. After that, the implementation of S underwent a number of major revisions documented in a series of seminal books, often just referred to by the color of their cover: The...

Read more »

How to extract Google Analytics data in R using RGoogleAnalytics

November 5, 2014

I am extremely thrilled to announce that RGoogleAnalytics was released recently on CRAN. R is already a Swiss Army knife for data analysis, largely due to its 6000 libraries. What this means is that digital analysts can now use the analytical capabilities of R to fully explore their Google Analytics data. In this post, we will...

Read more »

Tweeting at #IMGC14 conference

November 4, 2014

The 28th annual International Mammalian Genome Conference was held over the last week in Bar Harbor, ME. For the first time, the official conference hashtag #IMGC14 was introduced. Twitter shares plummeted 9% the next day. Pure coincidence? I do not think so! In total, 79 participants contributed 1546 tweets. Guess who was the Twitter evangelist? The...

Read more »