Running RStudio (1.2) Background Jobs

June 9, 2018

The forthcoming RStudio 1.2 release has a new “Jobs” feature for running and managing background R tasks. I did a series of threaded screencaps on Twitter but that doesn’t do the feature justice. So I threw together a quick ‘splainer on how to run R and Python (despite RStudio not natively supporting Python) code in the...


R:case4base – data aggregation with base R

June 9, 2018

Introduction: In the previous articles of the R:case4base series we discussed and learned how to reshape data with base R to a form that is practical for our use and how to subset data to get the relevant parts of it with base R. In this one, we will look at aggregation techniques using base R’s stats::aggregate generic function, focusing on the method...


Classification from scratch, boosting 11/8

June 8, 2018

Eleventh post of our series on classification from scratch. Today, that should be the last one… unless I forgot something important. So today, we discuss boosting. An econometrician perspective: I might start with a non-conventional introduction. But that’s actually how I understood what boosting was about. And I am quite sure it has to do with my background in...


Unconf18 projects 4: umapr, greta, roomba, proxy-bias-vignette, http caching


For the fourth and last day of project recaps from this year’s unconf, here is an overview of the next five projects. In the spirit of exploration and experimentation at rOpenSci unconferences, these projects are not necessar...


Sharpening The Knives in The data.table Toolbox: Exercises

June 8, 2018

If knowledge is power, then knowledge of data.table is something of a super power, at least in the realm of data manipulation in R. In this exercise set, we will use some of the more obscure functions from the data.table package. The solutions will use set(), inrange(), chmatch(), uniqueN(), tstrsplit(), rowid(), shift(), copy(), address(), setnames().


Download all KEGG pathway KGML files for SPIA analysis

June 8, 2018

Most people know KEGG pathway, but not everyone knows that it costs at least $2000 to subscribe to its database. If you want to save on the cost a bit, you can manually download the KEGG pathway KGML files and install them in SPIA. Here I have a workaround to dow...


Microsoft R Open 3.5.0 now available

June 8, 2018

Microsoft R Open 3.5.0 is now available for download for Windows, Mac and Linux. This update includes the open-source R 3.5.0 engine, which is a major update with many new capabilities and improvements to R. In particular, it includes a major new framework for handling data in R, with some major behind-the-scenes performance and memory-use benefits (and with further...


Why Bother with Shiny?

June 8, 2018

Aimée Gott, Education Practice Lead. For the last week we've been talking on the blog and Twitter about some of the functionality in Shiny and how you can learn it. But, if you haven't already made the leap and started using Shiny, why should you? What is the challenge to be solved? At Mango we define data science as the proactive...


Classification from scratch, bagging and forests 10/8

June 8, 2018

Tenth post of our series on classification from scratch. Today, we’ll see the heuristics of the algorithm inside bagging techniques. Often, bagging is associated with trees, to generate forests. But actually, it is possible to use bagging for any kind of model. Recall that bagging means “bootstrap aggregation”. So, consider a model $m$. Let $\widehat{m}$ denote the estimator of $m$ obtained from...


How To Create a Flexdashboard

June 8, 2018

Introduction: With flexdashboard, you can easily create interactive dashboards for R. What is amazing about it is that with R Markdown, you can publish a group of related data visualizations as a dashboard. Additionally, it supports a wide variety of components, including htmlwidgets; base, lattice, and grid graphics; tabular data; gauges and value boxes and...


pinp 0.0.5: Accommodate pandoc 2.*


Another maintenance release of our pinp package for snazzier one or two column vignettes is getting onto CRAN right now. Everybody's favourite Swiss Army Knife of text processing and document conversion, pandoc, decided in all its wisdom for version ...


Women in R

June 8, 2018

Last week I gave one of the keynote addresses at R/Finance 2018 in Chicago. I considered it an honor and a pleasure to be there, both because of the stimulating intellectual exchange and the fine level of camaraderie and hospitality that prevailed. I mentioned at the start of my talk that the success of this …


.rprofile: Julia Silge


Dr. Julia Silge is a data scientist at Stack Overflow. We talked about why R brings Julia joy, her path to a career in data science and what it was like to co-write a book for O’Reilly Media. This interview occurred on February 3, 2018 at the RStudio Conference in San Diego. KO: What is your name,...


maximal spacing around order statistics [#2]

June 7, 2018

The proposed solution of the riddle from the Riddler discussed here a few weeks ago is rather approximative, in that the distribution of when the n-sample is made of iid Normal variates is (a) replaced with the distribution of one arbitrary minimum and (b) the distribution of the minimum is based on an assumption of


Making World Tile Grid-Grids

June 7, 2018

A colleague asked if I would blog about how I crafted the grid of world tile grids in this post and I accepted the challenge. The technique isn’t too hard as it just builds on the initial work by Jon Schwabish and a handy file made by Maarten Lambrechts. The Premise: For this particular use-case,...


In case you missed it: May 2018 roundup

June 7, 2018

In case you missed them, here are some articles from May of particular interest to R users. The R Consortium has announced a new round of grants for projects proposed by the R community. A look back at the ROpenSci unconference held in Seattle. Video of my European R Users Meeting talk, "Speeding up R with Parallel Programming in...


stringdist 0.9.5.0: now with C API

June 7, 2018

Version 0.9.5.0 of stringdist is accepted on CRAN (binaries for non-linux OSs will be available in a few days). The main new feature, with a huge thanks to our awesome new contributor Chris Muir, is that we made it easy …


Database bulk update and inline editing in a Shiny Application

June 7, 2018

Ava Yang, Data Scientist. There are times when it costs more than it should to leverage javascript, database, html, models and algorithms in one language. Now may be the time to connect some dots, without stretching too much. If you have been developing shiny apps, why not consider letting them sit on one live database instead of manipulating data I/O by hand? If you use...


Polynomial Model in R – Study Case: Exercises

June 7, 2018

It is pretty rare to find something that represents linearity in the environmental system. The Y/X response may not be a straight line; it may instead be humped, asymptotic, sigmoidal or polynomial, and possibly truly non-linear. In this exercise, we will try to take a closer look at how polynomial regression works and practice with a study case.


Testers for RFishBC

June 6, 2018

Back-calculating lengths of fish at previous ages from measurements made on calcified structures (scales, otoliths, etc.) is fairly common practice within some fisheries agencies and institutions. The FishBC software distributed by the Amer...


Classification from scratch, linear discrimination 8/8

June 6, 2018

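Eighth post of our series on classification from scratch. The latest one was on the SVM, and today, I want to get back to some very old stuff, here also with a linear separation of the space, using Fisher’s linear discriminant analysis. Bayes (naive) classifier: Consider the following naive classification rule $y^\star=\arg\max_y \mathbb{P}[Y=y\mid\mathbf{X}=\mathbf{x}]$ or $y^\star=\arg\max_y \mathbb{P}[\mathbf{X}=\mathbf{x}\mid Y=y]\cdot\mathbb{P}[Y=y]$ (where $\mathbb{P}[\mathbf{X}=\mathbf{x}\mid Y=y]$ is the density in the continuous case). In the...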


Unconf18 projects 3: jobstatus, motifator, QcodeR, opencv, trackmd


For day 3 of project recaps from this year’s unconf, here is an overview of the next five projects. Stay tuned for the last recap tomorrow. In the spirit of exploration and experimentation at rOpenSci unconferences, these projects are not necessarily finished products or in scope for rOpenSci packages. Let’s dive into today’s 5 projects in focus! jobstatus Summary: jobstatus helps keep an...


Sentiment Use Across the Course of Pitchfork Music Reviews: A Tidy Text Analysis with R

June 6, 2018

In this post, we'll return to the Kaggle data containing information on Pitchfork music reviews. In a previous post, I used this dataset to cluster music genres. In the current post, I will use R and the tidytext package (and philosophy) to examine the text of the music reviews. Specifically, the goal of the analysis described in this post...


An introduction to machine learning with Keras in R

June 6, 2018

A guest post by @MaxMaPichler, MSc student in the Group for Theoretical Ecology / UR. Artificial neural networks, especially deep neural networks and (deep) convolutional neural networks, have become increasingly popular in recent years, dominating most machine learning competitions since the early 2010s (for reviews about DNN and (D)CNNs see LeCun, Bengio, & Hinton, 2015). In ecology,…


R functions for Bayesian Model Statistics and Summaries #rstats #stan #brms

June 6, 2018

A new update of my sjstats-package just arrived at CRAN. This blog post demonstrates those functions of the sjstats-package that deal especially with Bayesian models. The update contains some new and some revised functions to compute summary statistics of Bayesian models, which are now described in more detail: hdi(), rope(), mcse(), n_eff(), tidy_stan(), equi_test(), mediation() …


Classification from scratch, SVM 7/8

June 6, 2018

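Seventh post of our series on classification from scratch. The latest one was on the neural nets, and today, we will discuss SVM, support vector machines. A formal introduction: Here $y$ takes values in $\{-1,+1\}$. Our model will be $m(\mathbf{x})=\text{sign}[\omega^\top\mathbf{x}+b]$. Thus, the space is divided by a (linear) border $\Delta=\{\mathbf{x}:\omega^\top\mathbf{x}+b=0\}$. The distance from point $\mathbf{x}_i$ to $\Delta$ is $d(\mathbf{x}_i,\Delta)=|\omega^\top\mathbf{x}_i+b|/\|\omega\|$. If the space is linearly separable,...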


Using Countdown Clock Data to Understand the New York City Subway

June 6, 2018

If you’ve been on a New York City subway platform since January 2018, you should have noticed a countdown clock that displayed an estimate of when the next train would arrive. Although there’s no official record of when trains actually stopped at each station, the countdown clock data can be used to approximate it. Over the past 5 months, I’ve...


R trainings & data science talks are coming to Stuttgart

June 6, 2018

It’s happening – and this time in Swabia. eoda is bringing you the popular trainings for R to the South of Germany. More than 1500 satisfied participants have trained their practical data science skills with R trainings by eoda. Offering an Introduction to R and an Introduction to Data Mining with R, professional training …


Intro to Time Series Analysis -Part 1

June 5, 2018

In the exercises below, we will work with time series analysis and see how R can make your life easier when working with time series. This will be a series of exercises, and I urge you to take them in order. Please install the package and load the library before starting.

