elastic – Elasticsearch for R

August 2, 2017

elastic is an R client for Elasticsearch. elastic has been around since 2013, with the first commit in November 2013. Sidebar: 'elastic' was picked as the package name before the company now known as Elastic changed its name to Elastic. What is Elasticsearch? If you aren't familiar with Elasticsearch, it is a distributed, RESTful search and analytics engine. It's similar to Solr. It falls...

Read more »

F-Test: Compare Two Variances in R

August 2, 2017

The F-test is used to assess whether the variances of two populations (A and B) are equal. Contents: When to use the F-test?; Research questions and statistical hypotheses; Formula of the F-test; Compute the F-test in R; R function; Import and check your data into R; Preliminary test to check F-test assumptions; Compute F-test; Interpretation of the result; Access to the values returned by var.test() function; Infos. When to...
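As a minimal sketch of the test this excerpt describes, base R's var.test() compares the variances of two samples. The data below are simulated for illustration and do not come from the post; the names group_a and group_b are hypothetical.

```r
# Simulated samples (hypothetical data, for illustration only)
set.seed(123)
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 10, sd = 4)

# F-test of H0: var(A) / var(B) = 1
res <- var.test(group_a, group_b, alternative = "two.sided")

# The test statistic is the ratio of the two sample variances
f_manual <- var(group_a) / var(group_b)

res$statistic   # F statistic
res$p.value     # p-value of the test
res$conf.int    # confidence interval for the ratio of variances
```

The F-test is sensitive to departures from normality, so a check such as shapiro.test() on each sample is a common preliminary step.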

Read more »

Singular Value Decomposition (SVD): Tutorial Using Examples in R

August 1, 2017

If you have ever looked with any depth at statistical computing for multivariate analysis, there is a good chance you have come across the singular value decomposition (SVD). It is a workhorse for techniques that decompose data, such as correspondence analysis and principal...
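As a small illustration of the decomposition this excerpt refers to, base R's svd() factors a matrix into singular values and singular vectors. The matrix X below is an arbitrary example, not taken from the tutorial.

```r
# Small example matrix (arbitrary values, for illustration only)
X <- matrix(c(2, 0, 1,
              1, 3, 0,
              0, 1, 4,
              5, 2, 2), nrow = 4, byrow = TRUE)

s <- svd(X)   # list with d (singular values), u and v (singular vectors)

# Reconstruct X from its factors: X = U diag(d) V'
X_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)

# Rank-1 approximation: keep only the largest singular value
X_rank1 <- s$d[1] * s$u[, 1, drop = FALSE] %*% t(s$v[, 1, drop = FALSE])

# Check the reconstruction within floating-point tolerance
all.equal(X, X_rebuilt)
```

Truncating to the leading singular values, as in X_rank1, is the basic move behind low-rank approximations of a data matrix.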

Read more »

Five kinds of weather you’ll meet in America

August 1, 2017

K-means clustering, a workhorse of data science and machine learning. The USA is a large country. How different are people’s experiences of the weather depending on where they live? To look into this question, we downloaded high temperature data for over 1,300 airport weather stations in the contiguous USA for every day...

Read more »

A Postcard from JSM

August 1, 2017

Baltimore has the reputation of being a tough town: hot in the summer and gritty, but the convention center hosting the Joint Statistical Meetings is a pretty cool place to be. There are thousands of people here and so many sessions (over 600) that it’s just impossible to get an overview of all that’s going on. So, here are...

Read more »

Transfer Learning with Keras in R

August 1, 2017

In my last posts (http://flovv.github.io/Logo_detection_deep_learning/ and here), I described how one can detect logos in images with R. The first results were promising and achieved a classification accuracy of ~50%. In this post I will detail h...

Read more »

A modern database interface for R

August 1, 2017

At the useR! conference last month, Jim Hester gave a talk about two packages that provide a modern database interface for R. Those packages are the odbc package (developed by Jim and other members of the RStudio team), and the DBI package (developed by Kirill Müller with support from the R Consortium). To communicate with databases, a common protocol...

Read more »

HebRew (using Hebrew in R)

August 1, 2017

Adi Sarid (Tel Aviv University and Sarid Research Institute LTD.), July 2017. Background: A while back I participated in an R workshop at the annual convention of the Israeli Association for Statistics. I had the pleasure of talking with Tal Galili and Jonathan Rosenblatt, who indicated that a lot of Israeli R users run into difficulties …

Read more »

R⁶ — Reticulating Parquet Files

August 1, 2017

The reticulate package provides a very clean and concise bridge between R and Python, which makes it handy for working with modules that have yet to be ported to R (going native is always better when you can do it). This post shows how to use reticulate to create parquet files directly from R...

Read more »

Let’s Talk Drawdowns (And Affiliates)

August 1, 2017

This post will be directed towards those newer to investing, with an explanation of drawdowns, in my opinion a simple and …

Read more »

Showing Some Respect for Data Munging

August 1, 2017

In this post, I'd like to focus on data munging, i.e. the process of acquiring and arranging data (typically in a tidy manner) prior to data analysis. It's common knowledge that data scientists spend an enormous amount of time munging data, but data analysis, modeling, and visualization get most of the attention at presentations, on blogs and in the...

Read more »

Hacking statistics or: How I Learned to Stop Worrying About Calculus and Love Stats Exercises (Part-5)

August 1, 2017

Statistics are often taught in school by and for people who like mathematics. As a consequence, in those classes emphasis is put on learning equations, solving calculus problems and creating mathematical models instead of building an intuition for probabilistic problems. But if you are reading this, you know a bit of R programming and have access...

Read more »

Building a website with pkgdown: a short guide

August 1, 2017

As promised in my last post, here is a short guide with some tips and tricks for building a documentation website for an R package using pkgdown. In the end, this guide ended up way longer than I was expecting, but I hope you'll find it useful, although it often replicates information already available in the pkgdown documentation! Prerequisites: To build a website using pkgdown, all you need...

Read more »

Plane Crash Data – Part 1: Web Scraping

August 1, 2017

Several months ago I stumbled across the Kaggle data set Airplane Crashes Since 1908. Since I couldn't find the data source, I searched the web for historical plane crash data and quickly found the web page http://www.planecrashinfo.com. On this site you can find various tables inside...

Read more »

What analysis programs drive conservation science?

July 31, 2017

With the International Congress for Conservation Biology on at the end of July, I was wondering: what analysis programs are supporting conservation science? And what programs support spatial analysis ...

Read more »

How to use H2O with R on HDInsight

July 31, 2017

H2O.ai is an open-source AI platform that provides a number of machine-learning algorithms that run on the Spark distributed computing framework. Azure HDInsight is Microsoft's fully-managed Apache Hadoop platform in the cloud, which makes it easy to spin up and manage Azure clusters of any size. It's also easy to run H2O on HDInsight: H2O AI Platform is...

Read more »

Counterfactual estimation on nonstationary data, be careful!!!

July 31, 2017

By Gabriel Vasconcelos. In a recent paper that can be downloaded here, Carvalho, Masini and Medeiros show that estimating counterfactuals in a non-stationary framework (by non-stationary I mean integrated) is a tricky task. It is intuitive that the …

Read more »

15 Jobs for R users (2017-07-31) – from all over the world

July 31, 2017

To post your R job on the next post, just visit this link and post a new R job to the R community. You can post a job for free (and there are also “featured job” options available for extra exposure). Current R jobs. Job seekers: please follow the links below to learn more and apply for your R job of interest. Featured Jobs: Freelance Data Scientists...

Read more »

Machine Learning Explained: Dimensionality Reduction

July 31, 2017

Dealing with a lot of dimensions can be painful for machine learning algorithms. High dimensionality increases the computational complexity, increases the risk of overfitting (as your algorithm has more degrees of freedom), and makes the data sparser. Hence, dimensionality reduction projects the data into a space with fewer dimensions to...

Read more »

Google Vision API in R – RoogleVision

July 31, 2017

Using the Google Vision API in R: Utilizing RoogleVision. After doing my post last month on OpenCV and face detection, I started looking into other algorithms used for pattern detection in images. As it turns out, Google has done a phenomenal job with their Vision API. It’s absolutely incredible the amount of information it can...

Read more »

Upcoming Talk at the Bay Area R Users Group (BARUG)

July 31, 2017

Next Tuesday (August 8) I will be giving a talk at the Bay Area R Users Group (BARUG). The talk is titled Beyond Popularity: Monetizing R... The post Upcoming Talk at the Bay Area R Users Group (BARUG) appeared first on AriLamstein.com.

Read more »

sparklyr 0.6

July 30, 2017

We’re excited to announce a new release of the sparklyr package, available on CRAN today! sparklyr 0.6 introduces new features to: (1) distribute R computations using spark_apply() to execute arbitrary R code across your Spark cluster, so you can now use all of your favorite R packages and functions in a distributed context; (2) connect to external data sources using spark_read_source(), spark_write_source(), spark_read_jdbc() and...

Read more »

Data visualization with googleVis exercises part 9

July 30, 2017

Histogram & Calendar chart. This is part 9 of our series, and we are going to explore the features of two interesting types of charts that googleVis provides: histogram and calendar charts. Read the examples below to understand the logic of what we are going to do and then test your skills with the...

Read more »

Matching, Optimal Transport and Statistical Tests

July 30, 2017

To explain the “optimal transport” problem, we usually start with Gaspard Monge’s “Mémoire sur la théorie des déblais et des remblais”, which poses the problem of transporting a given distribution of matter (a pile of sand for instance) into another (an excavation for instance). This problem is usually formulated using distributions, and we seek the “optimal” transport from one...

Read more »

Scripting for data analysis (with R)

July 30, 2017

Course materials (GitHub). This was a PhD course given in the spring of 2017 at Linköping University. The course was organised by the graduate school Forum scientium and was aimed at people who might be interested in using R for data analysis. The materials developed from a part of a previous PhD course from a...

Read more »

Understanding Overhead Issues in Parallel Computation

July 29, 2017

In my talk at useR! earlier this month, I emphasized that a major impediment to obtaining good speed from parallelizing an algorithm is systems overhead of various kinds, including: contention for memory/network; bandwidth limits (CPU/memory, CPU/network, CPU/GPU); cache coherency problems; contention for I/O ports; OS and/or R limits on number of sockets …

Read more »

Memorable dataviz with the R program, talk awarded people’s choice prize

July 29, 2017

For the past two years Dr Nick Hamilton has invited me to give a talk on creating data visuals with the R program at the wonderful UQ Winterschool in Bioinformatics. This year...

Read more »

Tidy Time Series Analysis, Part 3: The Rolling Correlation

In the third part in a series on Tidy Time Series Analysis, we’ll use the runCor function from TTR to investigate rolling (dynamic) correlations. We’ll again use tidyquant to investigate CRAN downloads. This time we’ll also get some help from the...

Read more »

Forecasting workshop in Perth

July 29, 2017

On 26-28 September 2017, I will be running my 3-day workshop in Perth on “Forecasting: principles and practice” based on my book of the same name. Topics to be covered include seasonality and trends, exponential smoothing, ARIMA modelling, dynamic regression and state space models, as well as forecast accuracy methods and forecast evaluation techniques such as cross-validation. Workshop participants are expected...

Read more »
