Convert hand-drawn equations to LaTeX with the mathpix package

September 29, 2017

Statistics involves a lot of mathematics, so one of the nice things about report-generation systems for R like Rmarkdown is that they make it easy to include nicely formatted equations using LaTeX syntax. So, if we want to include the density function of the Gaussian (Normal) distribution: $$ \frac{1}{{\sigma \sqrt {2\pi } }} e^ { - \frac{ -...

Real time Yelp reviews analysis and response solutions for restaurant owners

September 29, 2017

Motivation

Before trying a new restaurant, we frequently consult review platforms such as Yelp, Zomato, or Google, where we can read comments from previous diners. The post Real time Yelp reviews analysis and response solutions for restaurant owners appeared first on NYC Data Science Academy Blog.

bupaR: Business Process Analysis with R

September 29, 2017

Organizations nowadays store huge amounts of data related to various business processes. Process mining provides different methods and techniques to analyze and improve these processes. This allows companies to gain a competitive advantage. Process mining initiated with the discovery …

Multicollinearity in R

September 29, 2017

One of the assumptions of the Classical Linear Regression Model is that there is no exact collinearity between the explanatory variables. If the explanatory variables are perfectly correlated, you will face these problems: the parameters of the model become indeterminate, and the standard errors of the estimates become infinitely large. However, the case of perfect collinearity is very ...

How to rapidly master data science

September 29, 2017

If you want to rapidly master data science, you need to ... The post How to rapidly master data science appeared first on SHARP SIGHT LABS.

New Package GetITRData

September 29, 2017

Downloading Quarterly Financial Reports from Bovespa - Financial statements of companies traded at B3 (formerly Bovespa), the Brazilian stock exchange, are available on its website. Accessing the data for a single company is strai...

Using Linear Programming (LP) for optimizing bowling change or batting lineup in T20 cricket

September 28, 2017

In my recent post, My travels through the realms of Data Science, Machine Learning, Deep Learning and (AI), I recounted my journey in the domains of Data Science, Machine Learning (ML), and more recently Deep Learning (DL), all of which are useful while analyzing data. Of late, I have come to the realization that …

Rcpp 0.12.13: Updated vignettes, and more

September 28, 2017

The thirteenth release in the 0.12.* series of Rcpp landed on CRAN this morning, following a little delay because Uwe Ligges was traveling and whatnot. We had announced its availability to the mailing list late last week. As usual, a rather substanti...

August 2017 New Package Picks

September 28, 2017

August was a relatively slow month for new R packages; “only” 180 new packages stuck to CRAN. Here are my “Top 40” picks organized into seven categories: Data, Machine Learning, Miscellaneous, Science, Statistics, Utilities and Visualizations. Although they have been written for specialized audiences, I have included the three “Science” packages because, in my layman’s opinion, they not only...

Partial Pooling for Lower Variance Variable Encoding

September 28, 2017

Banaue rice terraces. Photo: Jon Rawlinson

In a previous article, we showed the use of partial pooling, or hierarchical/multilevel models, for level coding high-cardinality categorical variables in vtreat. In this article, we will discuss a little more about the how and why of partial pooling in R. We will use the lme4 package to fit …

How Good is That Random Number Generator?

September 28, 2017

Recently, I saw a reference to an interesting piece from 2013 by Peter Grogono, a computer scientist now retired from Concordia University. It's to do with checking the "quality" of a (pseudo-) random number generator. Specifically, Peter discusses what he calls "The Pickover Test". This refers to the following suggestion that he attributes to Clifford Pickover (1995, Chap. 31): "Pickover describes...

Goodness of Fit in MDS and t-SNE with Shepard Diagrams

September 28, 2017

The goodness of fit for data reduction techniques such as MDS and t-SNE can be easily assessed with Shepard diagrams. A Shepard diagram compares how far apart your data points are before and after you transform...

R 3.4.2 is released

September 28, 2017

The R Core team today announced the release of R 3.4.2. This release fixes a number of minor bugs and also includes a performance improvement to the commonly-used function c when applied to vectors with a names attribute. Like all minor releases, this release is backwards compatible with prior releases in the R 3.4.x series. Binary builds of R...

Gold-Mining – Week 4 (2017)

September 28, 2017

Week 4 Gold Mining and Fantasy Football Projection Roundup now available. Go get that free agent gold! The post Gold-Mining – Week 4 (2017) appeared first on Fantasy Football Analytics.

SODD — StackOverflow Driven-Development

September 28, 2017

I occasionally hang out on StackOverflow and often use an answer as an opportunity to fill a package void for a particular need. docxtractr and qrencoder are two (of many) packages that were birthed from SO answers. I usually try to answer with inline code first, then expand the functionality into a package (if warranted)...

Oneway ANOVA Explanation and Example in R; Part 2

September 28, 2017

Please read the first part, published at DataScience+, if you haven’t already.

Effect sizes and the strength of our prediction

One relatively common question in statistics or data science is: how “big” is the difference or the effect? At this point we can state with some statistical confidence that tire brand matters in predicting tire mileage ...

Marketing Multi-Channel Attribution model based on Sales Funnel with R

This is the last post in the series of articles about using Multi-Channel Attribution in marketing. In the previous two articles (part 1 and part 2), we reviewed a simple and powerful approach based on Markov chains that allows you to effectively attribute marketing channels. In this article, we will review another fascinating approach that marries heuristic ...

RcppZiggurat 0.1.4

September 27, 2017

A maintenance release of RcppZiggurat is now on the CRAN network for R. It switched the vignette to our new pinp package and its two-column pdf default. The RcppZiggurat package updates the code for the Ziggurat generator which provides very fas...

Blockchain & distributed ML – my report from the data2day conference

September 27, 2017

Yesterday and today I attended data2day, a conference about Big Data, Machine Learning and Data Science in Heidelberg, Germany. Talks and workshops covered a range of topics surrounding (big) data analysis and Machine Learning, like Deep Learnin...

CACE closed: EM opens up exclusion restriction (among other things)

September 27, 2017

This is the third, and probably last, of a series of posts touching on the estimation of complier average causal effects (CACE) and latent variable modeling techniques using an expectation-maximization (EM) algorithm. What follows is a simplistic way to implement an EM algorithm in R to do principal strata estimation of CACE.

The EM algorithm

In this approach, we assume...

Featurizing images: the shallow end of deep learning

September 27, 2017

by Bob Horton and Vanja Paunic, Microsoft AI and Research Data Group

Training deep learning models from scratch requires large data sets and significant computational resources. Using pre-trained deep neural network models to extract relevant features from images allows us to build classifiers using standard machine learning approaches that work well for relatively small data sets. In this context,...

New Course! Supervised Learning in R: Classification

September 27, 2017

Hi there! We are proud to launch our latest R & machine learning course, Supervised Learning in R: Classification, by Brett Lantz. This beginner-level introduction to machine learning covers four of the most common classification algorithms. You will ...

Oneway ANOVA Explanation and Example in R; Part 1

September 27, 2017

This tutorial was inspired by this post published at DataScience+ by Bidyut Ghosh. Special thanks also to Dani Navarro, The University of New South Wales (Sydney), for the book Learning Statistics with R (hereafter simply LSR) and the lsr package available through CRAN. I highly recommend it. Let’s load the required R packages: library(ggplot2) ...

Churn Prediction with Automatic ML

September 27, 2017

Sometimes we don’t even realize how common machine learning (ML) is in our daily lives. Various “intelligent” algorithms help us for instance with finding the most important facts (Google), they suggest what movie to watch (Netflix), or influence our shopping decisions (Amazon). The biggest international companies quickly recognized the potential of machine learning and transferred it to business solutions. Nowadays...

rrricanes to Access Tropical Cyclone Data

September 27, 2017

What is rrricanes

Why Write rrricanes?

There is a tremendous amount of weather data available on the internet. Much of it is in raw format and not very easy to obtain. Hurricane data is no different. When one thinks of this data they may be inclined to think it is a bunch of map coordinates with some wind values and not...

What is the appropriate population scaling of the Affordable Care Act Funding?

September 26, 2017

Analysis of the effects of the Graham-Cassidy Bill on the ACA population - I have been trying to decipher for myself what is in the current (well, yesterday’s) Graham-Cassidy health care bill. I saw this image on many news outlets a few da...

Data.Table by Example – Part 2

September 26, 2017

In part one, I provided an initial walk through of some nice features that are available within the data.table package. In particular, we saw how to filter data and get a count of rows by the date. Let us now add a few columns to our dataset on reported crimes in the city of Chicago.

RcppAnnoy 0.0.10

September 26, 2017

A few short weeks after the more substantial 0.0.9 release of RcppAnnoy, we have a quick bug-fix update. RcppAnnoy is our Rcpp-based R integration of the nifty Annoy library by Erik. Annoy is a small and lightweight C++ template header library for ve...
