## Tips and tricks on using R to query data in Power BI

August 25, 2017
In Power BI, the dashboarding and reporting tool, you can use R to filter, transform, or restructure data via the Query Editor. For example, you could use the mice package to impute missing values, or use the tidytext package to assign sentiment scores to text inputs. As Imke Feldmann explains, there are lots of useful tricks you can accomplish...
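
As a rough illustration of the kind of transformation described above, here is a minimal base R sketch of a Query Editor script. It assumes Power BI's convention of passing the current table to the script as a data frame named `dataset`. The column names, the `output` name, and the median imputation (a simple stand-in for a package such as mice) are hypothetical choices made for this example.

```r
# Stand-in for the data frame that Power BI passes to the script as `dataset`
# (hypothetical columns; inside Power BI this assignment would be omitted).
dataset <- data.frame(
  region = c("North", "South", "East", "West"),
  sales  = c(120, NA, 95, 143)
)

# Replace missing sales figures with the median of the observed values,
# and flag which rows were filled in.
output <- dataset
output$sales_imputed <- is.na(output$sales)
output$sales[output$sales_imputed] <- median(output$sales, na.rm = TRUE)

print(output)
```

Keeping a flag column alongside the imputed values makes it possible to filter or highlight the filled-in rows later in the report.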

## How much will that Texas rain be

August 25, 2017
Putting 35 inches of rain in perspective: We are always interested in putting numbers into perspective, so we were interested in this article, which puts Hurricane Harvey’s rain into perspective. They’re predicting 30-40 inches of rain in a few days in Texas. They asked an expert to put that into perspective and...

## How to prepare and apply machine learning to your dataset

August 25, 2017
Introduction: Dear reader, if you are a newbie in the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting new part of the data science world. This post includes a full machine learning project that will guide you step by step to create a...

## BH 1.65.0-1

August 24, 2017
The BH package on CRAN was updated today to version 1.65.0. BH provides a sizeable portion of the Boost C++ libraries as a set of template headers for use by R, possibly with Rcpp as well as other packages. This release upgrades the version of Boost ...

## Newer to R? rstudio::conf 2018 is for you! Early bird pricing ends August 31.

August 24, 2017
Immersion is among the most effective ways to learn any language. Immersing yourself where new and advanced users come together to improve their use of the R language is a rare opportunity. rstudio::conf 2018 is that time and place! REGISTER TODAY ...

August 24, 2017
Boosting is another famous ensemble learning technique. Unlike bagging, where the aim is to reduce the high variance of learners by averaging many models fitted on bootstrapped samples drawn with replacement from the training data so as to avoid overfitting, boosting is not concerned with reducing the variance of learners. Another...

## Linear Congruential Generator in R

August 24, 2017
Part 1 in the series Random Number Generation. A linear congruential generator (LCG) is a class of pseudorandom number generator (PRNG) algorithms used for generating sequences of random-like numbers. The generation of random numbers plays a large role in many applications ranging from cryptography to Monte Carlo methods. Linear congruential...
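
To make the idea concrete, here is a minimal sketch of an LCG in R, built on the standard recurrence x[n+1] = (a * x[n] + increment) mod m. The function name `lcg` and the default constants (the widely quoted "Numerical Recipes" parameters) are assumptions chosen for illustration; they are not taken from the post.

```r
# Minimal linear congruential generator:
#   state <- (a * state + increment) mod m
# The default constants are illustrative assumptions, not the post's values.
lcg <- function(n, seed = 1, a = 1664525, increment = 1013904223, m = 2^32) {
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    # With these defaults every intermediate value stays below 2^53,
    # so double-precision arithmetic is exact here.
    state <- (a * state + increment) %% m
    x[i] <- state
  }
  x / m  # rescale the integer states to the interval [0, 1)
}

# The same seed always reproduces the same sequence.
u <- lcg(5, seed = 1)
print(u)
```

Because the sequence is fully determined by the seed and the three constants, a generator like this is convenient for reproducible simulations but is not suitable for cryptographic use.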

## Calculating a fuzzy kmeans membership matrix with R and Rcpp

August 24, 2017
by Błażej Moska, computer science student and data science intern. Suppose that we have performed K-means clustering in R and are satisfied with our results, but later we realize that it would also be useful to have a membership matrix. Of course it would be easier to repeat clustering using one of the fuzzy kmeans functions available in...
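
For readers who want a feel for the computation, the sketch below derives a membership matrix from existing cluster centres in plain R. It assumes the standard fuzzy c-means membership formula, u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1)), with fuzzifier m = 2. The function name `fuzzy_membership` is hypothetical, and this is not the Rcpp implementation from the post.

```r
# Membership of each observation in each cluster, given fixed centres.
# Assumes the standard fuzzy c-means formula with fuzzifier m (default 2).
fuzzy_membership <- function(x, centers, m = 2) {
  x <- as.matrix(x)
  centers <- as.matrix(centers)
  # Euclidean distance from every observation (rows) to every centre (columns).
  d <- matrix(
    vapply(seq_len(nrow(centers)), function(j) {
      sqrt(rowSums(sweep(x, 2, centers[j, ])^2))
    }, numeric(nrow(x))),
    nrow = nrow(x)
  )
  d <- pmax(d, .Machine$double.eps)  # avoid dividing by zero at a centre
  w <- d^(-2 / (m - 1))
  w / rowSums(w)                     # each row sums to 1
}

# Usage with an ordinary k-means fit on the built-in iris data.
set.seed(42)
fit <- kmeans(iris[, 1:4], centers = 3)
u <- fuzzy_membership(iris[, 1:4], fit$centers)
print(round(head(u), 3))
```

Each row of `u` holds one observation's degrees of membership across the three clusters, so the largest entry in a row points to the nearest centre.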

August 24, 2017
I needed to clean some web HTML content for a project. I usually use hgr::clean_text() for it, and that generally works pretty well. The clean_text() function uses an XSLT stylesheet to try to remove all non-"main text content" from an HTML document, and it usually does a good job, but there are some pages...

## Big Data analytics with RevoScaleR Exercises

August 24, 2017
In this set of exercises, you will explore how to handle big data with the RevoScaleR package from Microsoft R (previously Revolution Analytics). It comes with Microsoft R Client. You can get it from here. Get the credit card fraud data set from revolutionanalytics and let's get started. Answers to the exercises are available here. Please...

## Introducing ‘powerlmm’ an R package for power calculations for longitudinal multilevel models

August 24, 2017
Over the years I've produced quite a lot of code for power calculations and simulations of different longitudinal linear mixed models. Over the summer I bundled together these calculations for the designs I most typically encounter into an R package. T...

## New R Course: Sentiment Analysis in R – The Tidy Way

August 24, 2017
Hello, R users! This week we're continuing to bridge the gap between computers and human language with the launch of Sentiment Analysis in R: The Tidy Way by Julia Silge! Text datasets are diverse and ubiquitous, and sentiment analysis provides an approa...

## Practical Guide to Principal Component Methods in R

August 24, 2017
Introduction: Although there are several good books on principal component methods (PCMs) and related topics, we felt that many of them are either too theoretical or too advanced. This book provides solid practical guidance to summarize, visu...

August 24, 2017


## H2O.ai: Going for a paddle

August 24, 2017
Owen Jones, Placement Student A quick disclaimer: This post isn’t called H2O.ai: Going for the 100m freestyle world record. I’m not trying to win a Kaggle competition. I’m not carrying out detailed, highly-controlled benchmarking tests. I’m not, in fact, claiming to be doing anything particularly useful at all. This is just me, just playing around with some code, just for...

## A simple function for installing R packages based on a folder with R scripts

August 24, 2017
Whenever I buy a new computer or format an old one, I have the problem of reinstalling my R packages. If you are a heavy user, you will likely have a significant number of packages used by...

## FedData – Getting assorted geospatial data into R

August 24, 2017
The package FedData has gone through software review and is now part of rOpenSci. FedData includes functions to automate downloading geospatial data available from several federated data sources (mainly sources maintained by the US Federal government). Currently, the package enables extraction from six datasets: The National Elevation Dataset (NED) digital elevation models (1 and 1/3 arc-second; USGS) The...

## My experience in switching from Windows 10 to Linux Mint 18.2

August 24, 2017
It has been 8 months since I switched from Windows 10 to Linux Mint. In this post I’ll talk about my experience as a scholar and R user in this transition. My work is, simply put, to...

## Analyzing Google Trends Data in R

August 23, 2017
Google Trends shows the changes in the popularity of search terms over a given time (i.e., number of hits over time). It can be used to find search terms with growing or decreasing popularity or...

## Hard-nosed Indian Data Scientist Gospel Series – Part 1 : Incertitude around Tools and Technologies

August 23, 2017
Before the recession, a commercial tool was popular in the country, so there was little uncertainty around tools and technology. After the recession, however, incertitude (i.e. uncertainty) around tools and technology has preoccupied, and continues to occupy, data sc...

## Digit fifth powers: Euler Problem 30

August 23, 2017
Euler problem 30 is another number-crunching problem that deals with numbers to the power of five. Two other Euler problems dealt with raising numbers to a power. The previous problem looked at permutations of powers and problem 16 asks for … The post Digit fifth powers: Euler Problem 30 appeared first on The Devil is in the...
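
One way to approach the problem in R is sketched below. It assumes the usual statement of Project Euler problem 30 (find the sum of all numbers that equal the sum of the fifth powers of their own digits) and is not the solution from the linked post. The variable names are illustrative.

```r
# Upper bound: a 7-digit number can reach at most 7 * 9^5 = 413343, which has
# only 6 digits, so no solution exceeds 6 * 9^5 = 354294.
limit <- 6 * 9^5
candidates <- 10:limit  # single digits are not sums, so start at 10

# Accumulate the fifth power of each decimal digit, vectorised over all
# candidates: peel off the last digit, add its fifth power, repeat.
digit_power_sum <- numeric(length(candidates))
remaining <- candidates
while (any(remaining > 0)) {
  digit_power_sum <- digit_power_sum + (remaining %% 10)^5
  remaining <- remaining %/% 10
}

# Keep the numbers that equal their own digit power sum.
matches <- candidates[digit_power_sum == candidates]
print(matches)
print(sum(matches))
```

Bounding the search first is the key step: without the 6 * 9^5 limit there would be no obvious place to stop looking.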

## Control Systems Toolbox – System Interconnection

August 23, 2017
Introduction Dynamic systems are usually represented by a model before they can be analyzed computationally. These dynamic systems are systems that change, evolve or have their states altered or varied with time based on a set of defined rules. Dynamic systems could be mechanical, electrical, electronic, biological, sociological, and so on. Many such systems are usually defined by a set...

## Sentiment analysis using tidy data principles at DataCamp

August 23, 2017
I’ve been developing a course at DataCamp over the past several months, and I am happy to announce that it is now launched! The course is Sentiment Analysis in R: the Tidy Way and I am excited that it is now available for you to explore and learn...

## Recreating and updating Minard with ggplot2

August 23, 2017
Minard's chart depicting Napoleon's 1812 march on Russia is a classic of data visualization that has inspired many homages using different time-and-place data. If you'd like to recreate the original chart, or create one of your own, Andrew Heiss has created a tutorial on using the ggplot2 package to re-envision the chart in R: The R script provided in...

## Basics of data.table: Smooth data exploration

August 23, 2017
The data.table package provides perhaps the fastest way to wrangle data in R. The syntax is concise and is made to resemble SQL. After studying the basics of data.table and finishing this exercise set successfully, you will be able to start easing into using data.table for all your data manipulation needs. We will use data...

## Going Bayes #rstats

August 23, 2017
Some time ago I started working with Bayesian methods, using the great rstanarm package. Besides the fantastic package vignettes, and books like Statistical Rethinking or Doing Bayesian Data Analysis, I also found the resources from Tristan Mahr helpful to better understand both Bayesian analysis and rstanarm. This motivated me to implement tools for Bayesian analysis into my

## Rcpp now used by 10 percent of CRAN packages

August 23, 2017
Over the last few days, Rcpp passed another noteworthy hurdle. It is now used by over 10 percent of packages on CRAN (as measured by Depends, Imports and LinkingTo, but excluding Suggests). As of this morning 1130 packages use Rcpp out of a total of...

## Simple practice: data wrangling the iris dataset

August 23, 2017
If you want to work on large data science projects (analyses and machine learning) you need to be able to perform dozens of small tasks ... For example, you'll need to be able to fluently perform dozens of little bits of data wrangling, just like this ... The post Simple practice: data wrangling the iris dataset appeared first on SHARP SIGHT...

## useR!2017 Roundup

August 23, 2017
Organising useR!2017 was a challenge but a very rewarding experience. With about 1200 attendees of over 55 nationalities exploring an interesting program, we believe it is appropriate to call it a success - something the aftermovie only seems to confirm. Behind the Scenes To give you a glimpse behind the scenes of the conference organization, Maxim Nazarov held...