EARL 2016 talk

September 15, 2016

I spoke on 14th September at the EARL (Effective Applications of the R Language) conference in London. This event is concerned exclusively with the R programming language and it was the first…

2016 Data Science Salary Survey results

September 15, 2016

O'Reilly has released the results of the 2016 Data Science Salary Survey. This survey is based on data from over 900 respondents to a 64-question survey about data-related tasks, tools, and the salary they receive from doing/using them. The median salary reported in the survey was US$87,000; amongst data scientists in the US, the median salary was US$106,000. Appropriately...

Why you need version control

September 15, 2016

I recently had an email exchange with a seasoned, well-respected analytical professional which included the following (from them, not me): “… my versioning is to have multiple versions of files and to use naming conventions… it works really well...

HIBPwned updated on CRAN

September 15, 2016

Haveibeenpwned.com is a fantastic service that helps people find out if they’ve been involved in a data breach. HIBPwned is an R wrapper for that service. Recently, due to abuse of the system, Troy Hunt had to add a limit of one request per 1.5s. The new version published on CRAN last night adds a …
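
A limit of one request per 1.5s is usually respected on the client side by spacing calls out. The following is a minimal Python sketch of that idea, not HIBPwned's actual implementation (which is written in R); the `Throttle` class and its parameters are hypothetical names chosen for illustration.

```python
import time


class Throttle:
    """Hypothetical helper: keeps successive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.5, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the behaviour can be checked without real waiting.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least `min_interval` seconds have passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `wait()` on a shared `Throttle` instance immediately before each request would keep a client under the limit, assuming all requests go through that one instance.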

Data Science 101, now online

September 14, 2016

We are delighted to note that IBM's BigDataUniversity.com has launched the quintessential introductory course on data science, aptly named Data Science 101. The target audience for the course is the uninitiated cohort that is curious about data science and would like to take the first steps toward a career in data and analytics. Needless to say, the course is for...

Monitoring R Applications with RZabbix

September 14, 2016

As R users, we mostly perform analyses, produce reports and create interactive Shiny applications. Those are rather one-time performances. Sometimes, however, the R developer enters the world of real software development, where R applications shoul...

How I made some Pokémon Business Cards

September 14, 2016

As I’m in the industry now I figured I needed some business cards and as it seems the 90s never left us and Japanese monsters are hip again, I decided to make them Pokémon themed. I think they turned out pretty well, and here I’m just going ...

GoodReads: Exploratory data analysis and sentiment analysis (Part 2)

September 14, 2016

After scraping reviews from Goodreads in the first installment of this series, we are now ready to do some exploratory data analysis to get a better sense of the data we have. This will also allow us to create features that we will use in future analyses. Setup and data preparation: we start by loading …

2016-12 ‘DOM’ Version 0.2

September 13, 2016

This report describes changes in version 0.2 of the ‘DOM’ package for R. Version 0.1 of ‘DOM’ allowed HTML content to be added to a web page (or removed or modified); version 0.2 adds the ability to append SVG content …

Forecasting Opportunities

September 13, 2016

The previous post in this series showed a way to identify trading opportunities. The approach I implemented used daily time series data to identify good entry points in terms of risk-reward. The natural next step is to try to make use of these opportunities using machine learning. To refresh: the output of the previous post …

Announcing the simputation package: make imputation simple

September 13, 2016

I am happy to announce that my simputation package has appeared on CRAN this weekend. This package aims to simplify missing value imputation. In particular it offers standardized interfaces that make it easy to define both imputation method and imputation …

New Version of the OpenStreetMap R Package

September 13, 2016

A new version of the OpenStreetMap package has been released to CRAN. OpenStreetMap 0.3.3 contains several minor improvements. I've removed the CloudMade tile set types, as CloudMade seems to have gone out of business. MapQuest has also been removed, as they have moved to a new API. The mapbox type has been updated to use their …

A predictive maintenance solution template with SQL Server R Services

September 13, 2016

by Jaya Mathew, Data Scientist at Microsoft

By using R Services within SQL Server 2016, users can leverage the power of R at scale without having to move their data around. Such a solution is beneficial for organizations that hold large volumes of very sensitive data which cannot be hosted on any public cloud, but that do most of their coding in R...

Independent t test in R

September 13, 2016

The independent t test is used to test if there is any statistically significant difference between two means. Use of an independent t test requires several assumptions to be satisfied. The assumptions are: the variables are continuous and independent; the variables are normally distributed; and the variances in each group are equal. When these …
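
To make the equal-variance assumption concrete, here is a minimal sketch of the pooled-variance (Student's) two-sample t statistic. It is written in Python rather than R and is not taken from the post; the function name `pooled_t_statistic` and the sample values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for two independent samples,
    assuming both groups share the same population variance."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    # Standard error of the difference between the two sample means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, dof


# Illustrative data only, not from the original post.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.5, 4.4, 4.9, 4.2]
t_stat, dof = pooled_t_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The statistic is then compared against a t distribution with the returned degrees of freedom. If the variances in the two groups differ noticeably, the pooled estimate is no longer appropriate and Welch's variant is the usual alternative.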

New features in imager 0.30

September 13, 2016

imager is an R package for image processing, based on CImg. This new release brings many new features, including: support for automatic parallel processing using OpenMP; a new S3 class, imlist, which makes it easy to work with image lists; new functions for interactively selecting image regions (grabRect, grabPoint, grabLine); and experimental support for CImg’s byte-compiled DSL via …

anytime 0.0.1: New package for ‘anything’ to POSIXct (or Date)

September 13, 2016

anytime just arrived on CRAN as a very first release 0.0.1. So why (yet another) package dealing with dates and times? R excels at computing with dates and times. By using a typed representation we not only get all that functionality but also of the a...

Creating an animation using R

September 12, 2016

In this post, I will show you how to create an animation using R and ffmpeg. The idea is pretty simple: generate a number of snapshots, then combine them into a video file using ffmpeg. The best way to learn about the art of animation is by doing it ourselves, so our work …

Shiny 0.14

September 12, 2016

A new Shiny release is upon us! There are many new exciting features, bug fixes, and library updates. We’ll just highlight the most important changes here, but you can browse through the full changelog for details. This will likely be the last release before shiny 1.0, so get out your party hats! To install it,

Volunteer to help improve R’s documentation

September 12, 2016

The R Consortium, in its most recent funding round, awarded a grant of $10,000 to The R Documentation Task Force, whose mission is to design and build the next generation R documentation system. (Microsoft is a Platinum Member of the R Consortium.) The task force has the support and participation of R Core members Duncan Murdoch, Michael Lawrence, and...

Did Wages Detach from Productivity in 1973? An Investigation

September 12, 2016

This is the third and final blog post in my series on income inequality. This post discusses the detachment of compensation from productivity that occurred around 1973. I look at the data and use R to explore this break and why it may have occurred. R code is included with the analysis, in the spirit of reproducible research.

Analysing the Modelled Territorial Authority GDP estimates for New Zealand

September 12, 2016

At the conference of the New Zealand Association of Economists (NZAE) in late June 2016 I gave a paper on Modelled Territorial Authority Gross Domestic Product, a new dataset my team developed last year in my day job. See the official website of the M...

Some insights in soccer transfers using Market Basket Analysis

September 12, 2016

Introduction: Although more than 20 years old, Market Basket Analysis (MBA), or association rules mining, can still be a very useful technique for gaining insights into large transactional data sets. The classical example is transactional data in a supermarket. For …

Telco churn prediction with R+H2O

September 12, 2016

Recently, my friend Wit Jakuczun and I discussed a blog post on Revolution showing the application of SQL Server R Services to build and run a telco churn model. It is a very nice analysis and we thought that it would be interesting to compare the results to H2O, which is a great tool for automated...

Clustering using the ClusterR package

September 11, 2016

This blog post is about clustering, and specifically about ClusterR, my recently released package on CRAN. The following notes and examples are based mainly on the package vignette. Cluster analysis or clustering is the task of grouping a set...

Weapons of Math Destruction – A Data Scientist’s Guide to Disarmament

September 11, 2016

I’ve had this book on pre-order since spring and it finally arrived on Friday. I subsequently devoured it over the weekend. The book lays out a clear and compelling case for how data-driven algorithms can become — in contrast to their promise of amoral objectivism — efficient means for reproducing and even exacerbating social inequalities

Hunspell 2.0: High-Performance Stemmer, Tokenizer, and Spell Checker for R

September 11, 2016

A new version of the ropensci hunspell package has been released to CRAN. Hunspell is the spell checker library used by LibreOffice, OpenOffice, Mozilla Firefox, Google Chrome, Mac OS-X, InDesign, Opera, RStudio and many others. It provides a system for tokenizing, stemming and spelling in almost any language or alphabet. The R package exposes both the high-level spell-checker...

stringdist 0.9.4.2 released

September 11, 2016

stringdist 0.9.4.2 was accepted on CRAN at the end of last week. This release just fixes a few bugs affecting the stringdistmatrix function when called with a single argument. From the NEWS file: bugfix in stringdistmatrix(a): value of p, for …

Data Scientist with a wine hobby (Part I)

September 11, 2016

After high school I made my way from Johannesburg, situated in the northern part of South Africa, to the famous wine country known as Stellenbosch in the south. Here for the first time I got a ton of exposure to wine and the countless varietals that ma...

What is the cost of a progress bar in R?

September 11, 2016

The pbapply R package adds a progress bar to vectorized functions, like lapply. A feature request regarding progress bars for parallel functions has been sitting in the development GitHub repository for a few months. More recently, the author of the pbmcapply package dropped a note about his implementation of forking functionality with a progress bar for Unix/Linux computers, which got me...
