Resources for Learning Data Manipulation in R, SAS and Microsoft Excel

I had the great pleasure of speaking to the Department of Statistics and Actuarial Science at Simon Fraser University last Friday to share my career advice with its students and professors. I emphasized the importance of learning data manipulation skills during my presentation, and I want to supplement my presentation by posting some...

The Logical-Invest “Universal Investment Strategy”–A Walk Forward Process on SPY and TLT

February 23, 2015

I’m sure we’ve all heard about diversified stock and bond portfolios. In its simplest, most diluted form, it can be …

Launching DataScience.Vegas Blog

February 23, 2015

We are glad to announce the launch of DataScience.Vegas as a blog that aggregates all the events, news and information impacting the Las Vegas data science community. Our community has witnessed the birth and steady growth of several data science meetup groups with a very enthusiastic group of devoted members. We are a community of data scientists, data miners, statisticians, data...

Hadoop and Neo4j

February 23, 2015

Hadoop is being widely used for processing big data and Neo4j is a popular open-source graph database. When doing social network analysis on big data, a “natural” thought is to use them together. Unfortunately, Neo4j cannot work directly on HDFS …

7 new R jobs (2015-02-23)

February 23, 2015

This is the bimonthly post (for 2015-02-23) for new R Jobs from R-users.com. If you are an employer who is looking to hire people from the R community, please visit this link to post a new R job (it’s free and quick). If you are a job seeker, please follow the links below to learn more and apply for your job of interest (or visit previous R jobs posts). Full-Time Machine Learning...

drat Tutorial: Publishing a package

February 22, 2015

Introduction The drat package was released earlier this month, and described in a first blog post. I received some helpful feedback about what works and what doesn't. For example, Jenny Bryan pointed out that I was not making a clear enough distinction between the role of using drat to publish code, and using drat to...

k-means clustering and Voronoi sets

February 22, 2015

In the context of k-means, we want to partition the space of our observations into k classes. Each observation belongs to the cluster with the nearest mean. Here “nearest” is in the sense of some norm, usually the Euclidean norm. Consider the case where we have 2 classes, the means being respectively the 2 black dots. If we partition based...

Priors on odds and probability of success

February 22, 2015

In Bayesian Approaches to Clinical Trials and Health-Care Evaluation (David J. Spiegelhalter, Keith R. Abrams, Jonathan P. Myles), the authors mention that a non-informative prior should be uniform over the range of interest. They combine this with the d...

Export R output to a file

February 21, 2015

Sometimes it is useful to export the output of a long-running R command. For example, you might want to run a time-consuming regression just before leaving work on Friday night, but would like to get the output saved inside your Dropbox folder to take a look at the results before going back to work on...

RcppAPT 0.0.1

February 21, 2015

Over the last few days I put together a new package, RcppAPT, which interfaces the C++ library behind the awesome apt, apt-get, apt-cache, ... commands and their GUI-based brethren. The package currently implements two functions, which permit searching for package information via a regular expression, as well as a (vectorised) package name-based check. More to come, and contributions would...

Using the httr package to retrieve data from APIs in R

February 20, 2015

For a project I’m working on, I needed to access residential electricity rates and associated coordinate information (lat/long) for locations in the US. After a little searching, I found that data.gov offers the rate information in two forms: a static list of approximate rates by region, and an API that returns annual average utility rates ($/kWh) for residential, commercial,...

HOW TO: Package vignettes in plain LaTeX

February 20, 2015

Ever wanted to include a plain-LaTeX vignette in your package and have it compiled into a PDF? The R.rsp package provides a four-line solution for this. But, first, what's R.rsp? R.rsp is an R package that implements a compiler for the RSP markup language. RSP can be used to embed dynamic R code in any...

Making Maps in R with Ryan Peek and Michele Tobias

February 20, 2015

Today, Ryan Peek and Michele Tobias gave an introduction to making maps in R. Here’s the webcast: (Pardon the little scuffle at the beginning and as we switched computers halfway through. Still getting the hang of hangouts.) Resources: Download all of Ryan’s code and HTML files here. See Michele’s slides on Slideshare here. Code for Michele’s example maps...

A Closer Update To David Varadi’s Percentile Channels Strategy

February 20, 2015

So thanks to seeing Michael Kapler’s implementation of David Varadi’s percentile channels strategy, I was able to get a better …

The United States In Two Words

February 20, 2015

Sweet home Alabama, Where the skies are so blue; Sweet home Alabama, Lord, I’m coming home to you (Sweet home Alabama, Lynyrd Skynyrd) This is the second post I have written to show the abilities of the twitteR package and also the second post I have written for KDnuggets. In this case my goal is to gain insight into …

Part 2: Data Preparation

February 20, 2015

In Part 1 I introduced the weather data set we will be using in this series of tutorials. We are now going to prepare the data for the subsequent EDA. We will recode and transform variables, change their types, and perform some basic data ch...

Put the size of countries in perspective by comparing them to US states

February 20, 2015

Wouldn't it be cool for US readers to see how big foreign countries are by comparing them to presumably familiar US states? Wouldn't it be cool for non-US readers to see how big US states are by comparing them to presumably familiar countries?

CRAN: the Granddaddy of Analytical Marketplaces

February 20, 2015

by Bill Jacobs, VP Product Marketing, Revolution Analytics

I had a most interesting exchange recently with an industry analysis firm that suggested that application marketplaces were critical to the success of analytical tools, and that Revolution Analytics was remiss in not creating one. I must say, I was taken aback somewhat. In considering the suggestion, I was left suspecting...

Launching our R Training Path

February 20, 2015

Become a proficient R user in no time via DataCamp’s newly launched guided R Training Path! We’re very excited to announce that we have just launched the R Training Path! This guided training path will take you from getting to know R, the leading open-source programming language in statistics and data science, over manipulation and...

Who Has the Best Fantasy Football Projections? 2015 Update

February 20, 2015

In prior posts, I demonstrated how to download projections from numerous sources, calculate custom projections for your league, and compare the accuracy of different sources of projections (2013, 2014). In... The post “Who Has the Best Fantasy Football Projections? 2015 Update” appeared first on Fantasy Football Analytics.

Jade: a clean, whitespace-sensitive template language for writing HTML

February 19, 2015

Jade is a high-performance template engine heavily influenced by Haml. It is designed for writing HTML pages using a concise, modern syntax without the verbosity of old-fashioned XML-like tags that we all want to forget about. The new rjade package implements convenient bindings from R to this popular JavaScript...

Applied Nonparametric Econometrics

February 19, 2015

Recently, I received a copy of a new econometrics book, Applied Nonparametric Econometrics, by Daniel Henderson and Christopher Parmeter. The title is pretty self-explanatory and, as you'd expect with any book published by CUP, this is a high-quality item. The book's Introduction begins as follows: "The goal of this book is to help bridge the gap between applied economists and theoretical...

Customer segmentation – LifeCycle Grids, CLV and CAC with R

February 19, 2015

We studied a very powerful approach for customer segmentation in the previous post, which is based on the customer’s lifecycle. We used two metrics: frequency and recency. It is also possible and very helpful to add monetary value to our segmentation. If you have customer acquisition cost (CAC) and customer lifetime value (CLV), you can easily...

Some R Conferences in 2015

February 19, 2015

by Joseph Rickert

For the past few years, the Strata + Hadoop World Conference in San Jose has kicked off my personal conference season. With its focus on Data Science, Strata always seems to present some interesting R-related talks, and I am looking forward to the various events over the next couple of days. But, Strata and other...

Host a CRAN mirror using Docker

CRAN mirrors are the backbone of everyday R usage. They provide the R website and most of the R packages today. Currently there are about 104 official CRAN mirrors. Hosting a CRAN mirror is one step to help the R community and is explained here. To ease that process, at BNOSAC, we have created...

The Evolution of LondonR

February 19, 2015

By Liz Matthews – Marketing and Events, UK. There are over 150 R user groups currently running across the globe. From Chiang Mai to Kansas and Melbourne to Cracow, R enthusiasts meet to discuss, share and promote the usage of R. …

amazing Gibbs sampler

February 18, 2015

When playing with Peter Rossi’s bayesm R package during a visit of Jean-Michel Marin to Paris last week, we came up with the above Gibbs outcome. The setting is a Gaussian mixture model with three components in dimension 5 and the prior distributions are standard conjugate. In this case, with 500 observations and 5000 Gibbs...

An update to the checkpoint package

February 18, 2015

by Andrie de Vries

In October 2014 we announced RRT (the Reproducible R Toolkit), which consists of the checkpoint package and the MRAN. In January, David Smith followed up with another post about reproducibility using Revolution R Open. Since then, we've had several requests for new features and enhancements. The development code for checkpoint is available at GitHub. The...

Philippine Infographic: Recapitulation on Incidents Involving Motorcycle Riding in Tandem Criminals for 2011-2013

February 18, 2015

The Philippine government launched Open Data Philippines (data.gov.ph) last year, on January 16, 2014. The data.gov.ph site aims to make national government data searchable, accessible, and useful, with the help of the different agencies of go...
