September 2017

Nested Resampling with rsample

September 4, 2017 | Max Kuhn

A typical scheme for splitting the data when developing a predictive model is to create an initial split of the data into a training and test set. If resampling is used, it is executed on the training set where a series of binary splits is created. In rsample, we use ...
[Read more...]

Readability Redux

September 4, 2017 | hrbrmstr

I recently posted about using a Python module to convert HTML to usable text. Since then, a new package has hit CRAN dubbed htm2txt that is 100% R and uses regular expressions to strip tags from text. I gave it a spin so folks could compare some basic output, but ... [Read more...]

Keras for R

September 4, 2017 | JJ Allaire

We are excited to announce that the keras package is now available on CRAN. The package provides an R interface to Keras, a high-level neural networks API developed with a focus on enabling fast experimentation. Keras has the following key features: Allows the same code to run on CPU or ...
[Read more...]

Writing and Publishing my first R package

September 4, 2017 | R Views

Inspired by the Community: One of the themes at useR 2017 in Brussels was “Get involved”. People were encouraged to contribute to the community, even when they did not consider themselves R specialists (yet). This could be by writing a package or a blog post, but also by simply correcting typos ... [Read more...]

A guide to parallelism in R

September 4, 2017 | Florian Privé

In this post, I will talk about parallelism in R. This post will likely be biased towards the solutions I use. For example, I never use mclapply or clusterApply. I prefer to always use foreach. In this post, we will focus on how to parallelize R code on your computer. ... [Read more...]

Analyzing Google Trends Data in R

September 4, 2017 | Jake Hoare

Google Trends shows the changes in the popularity of search terms over a given time (i.e., number of hits over time). It can be used to find search terms with growing or decreasing popularity or to review periodic variations from the past such as seasonality. Google Trends search data ...
[Read more...]

Package GetLattesData

September 4, 2017 | R and Finance

Downloading and reading bibliometric data from Lattes - Lattes is the largest, and a unique, platform for academic curricula. There you can find information about the academic work of ALL Brazilian scholars. It includes institution... [Read more...]

Calculating Marginal Effects Exercises

September 4, 2017 | BC Mullins

A common experience for those in the social sciences migrating to R from SPSS or Stata is that some procedures that happened at the click of a button now require more work, or are so obscured by the unfamiliar language that it is hard to see how to accomplish them. One such procedure that ... [Read more...]

Topic Modeling of New York Times Articles

September 3, 2017 | Susan Li

In machine learning and natural language processing, a “topic” consists of a cluster of words that frequently occur together. A topic model is a type of statistical model for discovering the abstract “topics” that occur in a collection of ...
[Read more...]

It is Needlessly Difficult to Count Rows Using dplyr

September 3, 2017 | John Mount

Question: how hard is it to count rows using the R package dplyr? Answer: surprisingly difficult. When trying to count rows using dplyr or dplyr-controlled data structures (remote tbls such as sparklyr or dbplyr structures), one is sailing between Scylla and Charybdis. The task is to avoid dplyr corner-cases and ...
[Read more...]

Variable Selection with Elastic Net

September 3, 2017 | statcompute

LASSO has been a popular algorithm for variable selection and is extremely effective with high-dimensional data. However, it often tends to “over-regularize”, yielding a model that might be overly compact and therefore under-predictive. The Elastic Net addresses this “over-regularization” by balancing between LASSO and ridge penalties. In particular, a hyper-parameter, ... [Read more...]