OpenML Workshop 2017

August 31, 2017

What is OpenML? The field of Machine Learning has grown tremendously over the last few years, and is a key component of data-driven science. Data analysis algorithms are being invented and used every day, but their results and experiments are published almost exclusively in journals or separate repositories. However, data by itself has no value. It’s the ever-changing ecosystem surrounding data...

Read more »

Mapping to a ‘t’(map)

August 31, 2017

tmap: More maps of the Highlands? Yep, same as last time, but with no need to install dev versions of anything, since we can get awesome maps courtesy of the tmap package. Get the shapefile from the last post, then: library(tmap); library(tmaptools); library(viridis); scot...

Read more »

Multiplicative Congruential Generators in R

August 31, 2017

Part 2 of 2 in the series Random Number Generation. Multiplicative congruential generators, also known as Lehmer random number generators, are a type of linear congruential generator for generating pseudorandom numbers. The multiplicative congruential generator, often abbreviated as MLCG or MCG, is defined as a recurrence relation similar to... The post Multiplicative Congruential Generators in R appeared first on...

Read more »
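
The recurrence mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes the usual Lehmer form x[n+1] = (a * x[n]) mod m with the well-known "minimal standard" parameters a = 16807 and m = 2^31 - 1; the function name mcg and these parameter choices are illustrative assumptions, not taken from the post.

```r
# Minimal Lehmer (multiplicative congruential) generator sketch.
# Assumption: "minimal standard" parameters a = 16807, m = 2^31 - 1;
# the original post may use different values.
mcg <- function(n, seed = 1, a = 16807, m = 2^31 - 1) {
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    # a * state stays below 2^53, so double arithmetic is exact here
    state <- (a * state) %% m
    x[i] <- state / m  # scale the integer state into (0, 1)
  }
  x
}

u <- mcg(5)
```

Because the state is never zero when the seed is between 1 and m - 1, every value of u lies strictly between 0 and 1.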

Probability functions intermediate

August 31, 2017

In this set of exercises, we are going to explore some of the probability functions in R by using practical applications. Basic probability knowledge is required. In case you are not familiar with the function apply, check the R documentation. Note: We are going to use random number functions and random process functions in R. Related exercise sets: Let's Begin...

Read more »

DEADLINE EXTENDED: Last call for Boston EARL abstracts

August 31, 2017

...

Read more »

Text featurization with the Microsoft ML package

August 31, 2017

Last week I wrote about how you can use the MicrosoftML package in Microsoft R to featurize images: reduce an image to a vector of 4096 numbers that quantify the essential characteristics of the image, according to an AI vision model. You can perform a similar featurization process with text as well, but in this case you have a...

Read more »

Why to use the replyr R package

August 31, 2017

Recently I noticed that the R package sparklyr had the following odd behavior. With dplyr '0.7.2.9000', sparklyr '0.6.2', and dbplyr '1.1.0.9000' loaded and a connection using Spark 2.1.0, both ncol(d) and nrow(d) returned NA for the object d. …

Read more »

Pulling Data Out of Census Spreadsheets Using R

August 31, 2017

In this post, I show a method for extracting small amounts of data from somewhat large Census Bureau Excel spreadsheets, using R. The objects of interest are expenditures of state and local governments on hospital capital in Iowa for the years 2004 to 2014. The data can be found at http://www2.census.gov/govs/local/. The files at the...

Read more »

Community Call – rOpenSci Software Review and Onboarding

August 31, 2017

Are you thinking about submitting a package to rOpenSci's open peer software review? Considering volunteering to review for the first time? Maybe you're an experienced package author or reviewer and have ideas about how we can improve. Join our Community Call on Wednesday, September 13th. We want to get your feedback and we'd love to answer your questions! Agenda Welcome (Stefanie Butland,...

Read more »

Create and Update PowerPoint Reports using R

August 30, 2017

In my sordid past, I was a data science consultant. One thing about data science that they don’t teach you at school is that senior managers in most large companies require reports to be in PowerPoint....

Read more »

Pacific Island Hopping using R and iGraph

August 30, 2017

Use R as your travel guide and plan your next Pacific island hopping holiday with the igraph package. This code analyses flight routes and finds routes. The post Pacific Island Hopping using R and iGraph appeared first on The Devil is in the Data.

Read more »

TensorFlow Estimators

August 30, 2017

The tfestimators package is an R interface to TensorFlow Estimators, a high-level API that provides implementations of many different model types, including linear models and deep neural networks. More models are coming soon, such as state saving recurrent neural networks, dynamic recurrent neural networks, support vector machines, random forests, KMeans clustering, etc. TensorFlow Estimators also provides a flexible framework for...

Read more »

Probably more likely than probable

August 30, 2017

What kind of probability are people talking about when they say something is "highly likely" or has "almost no chance"? The chart below, created by Reddit user zonination, visualizes the responses of 46 other Reddit users to "What probability would you assign to the phrase: " for various statements of probability. Each set of responses has been converted to...

Read more »

IMDB Genre Classification using Deep Learning

August 30, 2017

The Internet Movie Database (IMDb) is a great source of information about movies. Keras provides access to some part of the cleaned dataset (e.g. for sentiment classification). While sentiment classification is an interesting topic, I wanted to see...

Read more »

3-D animations with R

August 30, 2017

R is often used to visualize and animate 2-dimensional data. (Here are just a few examples.) But did you know you can create 3-dimensional animations as well? As Thomas Lin Pedersen explains in a recent blog post, the trick is in using the persp function to translate points in 3-D space into a 2-D projection. This function is normally...

Read more »
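
As a rough illustration of the projection idea in the excerpt above, here is a minimal base R sketch. It relies on persp() returning a viewing transformation matrix and on trans3d() applying that matrix to 3-D points. The surface, the viewing angles, and the helper name project_point are illustrative assumptions and are not taken from the post it summarizes.

```r
# Minimal sketch: persp() draws a surface and returns a 4x4 viewing
# transformation matrix; trans3d() uses that matrix to project 3-D
# points into 2-D plot coordinates.
pdf(NULL)  # null graphics device, so no plot file is written

x <- seq(-2, 2, length.out = 25)
y <- seq(-2, 2, length.out = 25)
z <- outer(x, y, function(a, b) exp(-(a^2 + b^2)))  # illustrative surface

# Hypothetical helper: project one point on the surface for a given
# rotation angle theta (degrees).
project_point <- function(theta) {
  pmat <- persp(x, y, z, theta = theta, phi = 30)
  trans3d(1, 0, exp(-1), pmat)
}

# One projected position per animation frame
frames <- lapply(seq(0, 90, by = 30), project_point)
invisible(dev.off())
```

Each element of frames is a list with x and y components, which could then be drawn frame by frame with any 2-D plotting or animation tool.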

Finding distinct rows of a tibble

August 30, 2017

I’ve been using R or its predecessors for about 30 years, which means I know a lot about R, and also that I don’t necessarily know how to use modern R tools. Lately, I’ve been trying to unlearn some old approaches, and to re-learn them using the ...

Read more »

R in the Data Science Stack at ODSC

August 30, 2017

Register now for ODSC West in San Francisco, November 2-4, and save 60% with code RB60 until September 1st. R continues to hold its own in the data science landscape thanks in no small part to its flexibility. That flexibility allows R to integrate with some of the most popular data science tools available. Given …

Read more »

RStudio 1.1 Preview – I Only Work in Black

August 30, 2017

Today, we’re continuing our blog series on new features in RStudio 1.1. If you’d like to try these features out for yourself, you can download a preview release of RStudio 1.1. I Only Work in Black For those of us that like to work in black or very very dark grey, the dark theme can be enabled from the ‘Global Options’...

Read more »

Layered Data Visualizations Using R, Plotly, and Displayr

August 30, 2017

If you have tried to communicate research results and data visualizations using R, there is a good chance you will have come across one of its great limitations. R is painful when you need to...

Read more »

Web Scraping Influenster: Find a Popular Hair Care Product for You

August 30, 2017

Are you a person who likes to try new products? Are you curious about which hair products are popular and trendy? If you're excited about getting... The post Web Scraping Influenster: Find a Popular Hair Care Product for You appeared first on NYC Data Science Academy Blog.

Read more »

Data wrangling: Cleansing – Regular expressions (2/3)

August 30, 2017

Data wrangling is the process of importing, cleaning and transforming raw data into actionable information for analysis. It is a time-consuming process which is estimated to take about 60-80% of an analyst’s time. In this series we will go through this process. It will be a brief series with the goal of building the reader’s skills on... Related exercise sets: Regular Expressions...

Read more »

The one function call you need to know as a data scientist: h2o.automl

August 30, 2017

Introduction: Two things that recently came to my attention were AutoML (Automatic Machine Learning) by H2O.ai and the Fashion-MNIST data set by Zalando Research. So, as a test, I ran AutoML on the Fashion-MNIST data set. H2O AutoML: As you all …

Read more »

RcppArmadillo 0.7.960.1.2

August 29, 2017

A second fix-up release is needed following the recent bi-monthly RcppArmadillo release and its initial follow-up, as it turns out that OS X / macOS is so darn special that it needs an entirely separate treatment for OpenMP. Namely to turn it...

Read more »

Tidy Time Series Analysis, Part 4: Lags and Autocorrelation

In the fourth part of a series on Tidy Time Series Analysis, we’ll investigate lags and autocorrelation, which are useful in understanding seasonality and form the basis for autoregressive forecast models such as AR, ARMA, ARIMA, SARIMA (basically any forecast model with “AR” in the acronym). We’ll use the tidyquant package along with our tidyverse downloads data obtained from...

Read more »
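
To make the idea of lags and autocorrelation in the excerpt above concrete, here is a minimal base R sketch. The post itself works with tidyquant and real download data; the simulated series and the helper name autocorr below are assumptions made purely for illustration.

```r
# Minimal sketch in base R: lag-k autocorrelation computed by hand
# and compared with stats::acf(). The series is simulated, not the
# tidyverse downloads data used in the post.
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = 0.7), n = 200))

# Hypothetical helper: correlate the series with a copy of itself
# shifted by k steps, using the overall mean and variance.
autocorr <- function(x, k) {
  n <- length(x)
  xc <- x - mean(x)
  sum(xc[1:(n - k)] * xc[(k + 1):n]) / sum(xc^2)
}

r1 <- autocorr(x, 1)
r1_acf <- acf(x, lag.max = 1, plot = FALSE)$acf[2]  # element 1 is lag 0
```

The hand-rolled value r1 matches what acf() reports for lag 1, which shows that an autocorrelation is simply the correlation between a series and a lagged copy of itself.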

New CRAN Package Announcement: splashr

August 29, 2017

I’m pleased to announce that splashr is now on CRAN. (That image was generated with splashr::render_png(url = "https://cran.r-project.org/web/packages/splashr/")). The package is an R interface to the Splash JavaScript rendering service. It works in a similar fashion to Selenium but is far more geared to web scraping and has quite a bit of power under the...

Read more »

Clean or shorten Column names while importing the data itself

August 29, 2017

When it comes to clumsy column headers, namely wide ones with spaces and special characters, I see many people panic and change the headers in the source file, which is an awkward option given the variety of alternatives that exist in R for handling them. One easy way to handle such scenarios is library(janitor), which, as the name suggests, can...

Read more »
