shinyalert: Easily create pretty popup messages (modals) in Shiny

February 13, 2018

A brand new Shiny package entered the world yesterday: shinyalert. It does only one thing, but does it well: show a message to the user in a modal (aka popup, dialog, or alert box). Actually, it can do one more thing: shinyalert can also ...

Read more »

Is 10,000 Cells Big?

February 12, 2018

Trick question: is a 10,000 cell numeric data.frame big or small? In the era of "big data" 10,000 cells is minuscule. Such data could fit on fewer than 1,000 punched cards (or less than half a box). The joking answer is: it is small when they are selling you the system, but can be …

Read more »

tfestimators – Package: Embeddings for Categorical Variables

February 12, 2018

In my last posts (here and here) I explored how to use embeddings to represent categorical variables. Furthermore, I showed how to combine those embeddings with other variables to create a more complex model. Both posts focused...

Read more »

Diversity scholarships for upcoming R conferences

February 12, 2018

One of the greatest things about the R community is its diversity. This is largely thanks to organizations like Forwards and R-Ladies, who have been instrumental in welcoming women and other under-represented groups to the world of R. Likewise, conferences in the R community encourage diversity, with open codes of conduct, facilitations like on-site child-care, and by offering scholarships...

Read more »

$15/course: Udemy data science courses in R (etc.)

February 12, 2018

Udemy is offering readers of R-bloggers access to its global online learning marketplace for $15 per course! This deal (a 50%-90% discount) covers hundreds of their courses, including many on R programming, data science, machine learning, etc. Click here to browse ALL (R and non-R) courses. Advanced R courses:  Regression modelling Comprehensive Linear Modeling with R (15 Hours of video) Linear regression in R...

Read more »

How to Access Datasets in R

February 12, 2018

Have you spent hours pulling your hair out trying to figure out how to access datasets in R? Once imported to a variable, columns from a dataset (e.g., a CSV file) can be very tricky to access. Sometimes columns contain spaces, funky characters or other inconsistencies. Here are some examples on how to access the data from CSV and JSON datasets. CSV: We’ll...
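
As a rough illustration of the kind of column access the excerpt describes, here is a minimal base-R sketch. The inline CSV text and its column names ("first name", "total $") are made up for this example and are not taken from the original post.

```r
# Inline CSV text stands in for a real file; the column names are invented.
csv_text <- "first name,total $,age\nAda,10.5,36\nGrace,20.25,45"

# check.names = FALSE keeps the headers exactly as written in the data.
people <- read.csv(text = csv_text, check.names = FALSE,
                   stringsAsFactors = FALSE)

# Names that are not valid R identifiers need backticks when used with `$` ...
first_names <- people$`first name`

# ... or can be passed as plain strings to [[ ]].
totals <- people[["total $"]]

# With the default check.names = TRUE, R rewrites the headers via make.names(),
# so the space in "first name" becomes a dot.
cleaned <- read.csv(text = csv_text, stringsAsFactors = FALSE)
cleaned_names <- names(cleaned)
```

Keeping the original headers and quoting them is one option; letting R sanitise them is another. Which is preferable probably depends on whether the names need to round-trip back to the source file.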

Read more »

Love Machine: Automating the romantic songwriting process

February 12, 2018

...

Read more »

How to maraaverickfy a blog post without even reading it

February 12, 2018

Steph is currently out of the office, teaching people cool Data Science stuff on a cruise at Tech Outbound. She counts on her team to keep the company’s Twitter account afloat in the meantime, so I had to think of a way to contribute. What about advertising existing content from her blog in the style of her Twitter role...

Read more »

Parametric Portfolio Policies

February 11, 2018

By Gabriel Vasconcelos. Overview: There are several ways to do portfolio optimization out there, each with its advantages and disadvantages. We already discussed some techniques here. Today I am going to show another method to perform portfolio optimization that works …

Read more »

Summer interns

February 11, 2018

We are excited to announce the first formal summer internship program at RStudio. The goal of our internship program is to enable RStudio employees to collaborate with current students to create impactful and useful applications that will help both RStudio users and the broader R community, and help ensure that the community of R developers is representative of the...

Read more »

tidygraph 1.1 – A tidy hope

February 11, 2018

I am very pleased to tell you that the next version of tidygraph (1.1) is now available on CRAN. This is not a bug-fix release, nor a change-it-all release, but rather a more-of-it-all release, and in this post I’m going to tell you all about it. The idea of tidygraph: Before we enter the goldmine of new features that make this...

Read more »

R Interfaces to Python Keras Package

February 11, 2018

Keras is a popular Python package for prototyping deep neural networks with multiple backends, including TensorFlow, CNTK, and Theano. Currently, there are two R interfaces that allow us to use Keras from R through the reticulate package. While the keras R package is able to provide a flexible and feature-rich API, the...

Read more »

Predicting job search by training a random forest on an unbalanced dataset

February 10, 2018

In this blog post, I am going to train a random forest on census data from the US to predict the probability that someone is looking for a job. To this end, I downloaded the US 1990 census data from the UCI Machine Learning Repository. Having a background in economics, I am always quite interested in such datasets. I...

Read more »

«smooth» package for R. Common ground. Part IV. Exogenous variables. Advanced stuff

February 10, 2018

Previously we covered the basics of exogenous variables in smooth functions. Today we will go slightly crazy and discuss automatic variable selection. But before we do that, we need to look at a Santa’s little helper function implemented in . It is called . It is useful in cases when you think that your exogenous...

Read more »

Version 0.6-9 of NIMBLE released

February 9, 2018

We’ve released the newest version of NIMBLE on CRAN and on our website. Version 0.6-9 is primarily a maintenance release with various bug fixes and fixes for CRAN packaging issues. New features include: dimensions in a model will now be determined from either ‘inits’ or ‘data’ if not otherwise available; one can now specify “nBootReps

Read more »

Visualising an ethnicity statistical classification by @ellis2013nz

February 9, 2018

Official statistical classifications can be big, complex things. For example, the complete version of the “International Standard Industrial Classification of All Economic Activities, Rev.4” comes as either a 300 page PDF or a 1.3GB Microsoft Acce...

Read more »

Phylogeny and species traits predict bird detectability

February 9, 2018

It all started with this paper in Methods in Ecol. Evol. where we looked at detectability of many species. So we wanted to use life history traits to validate our results. But we had to cut the manuscript, and there was this leftover with some neat patterns, but without much focus. It took a few years, and the most positive peer-review experience ever, and...

Read more »

.rprofile: Julia Stewart Lowndes

Dr. Julia Stewart Lowndes is the Science Program Lead for the Ocean Health Index and works at the National Center for Ecological Analysis and Synthesis. She and Sean Kross discussed how data science, open science, and community can help reproducibility in research. SK: I’m Sean Kross, I’m the CTO of...

Read more »

A comparison between spaCy and UDPipe for Natural Language Processing for R users

February 8, 2018

In the last few years, Natural Language Processing (NLP) has become more and more an open multi-lingual task instead of being held back by language, country and legal boundaries. With the advent of commonly used open data regarding natural language processing tasks as available at http://universaldependencies.org one can now relatively easily compare different toolkits which perform natural language processing....

Read more »

DataExplorer: Fast Data Exploration With Minimum Code

February 8, 2018

By Boxuan Cui, Data Scientist at Smarter Travel. Once upon a time, there was a joke: "In Data Science, 80% of time spent prepare data, 20% of time spent complain about need for prepare data." (Big Data Borat, @BigDataBorat, February 27, 2013) According to a Forbes article, cleaning and organizing data is the most time-consuming and least enjoyable...

Read more »

How to Install and Include an R Package

February 8, 2018

We get a lot of questions about the usage of R libraries. The most common question is “can I use all the R libraries in your notebooks/consoles?” Yes! You can use any of the libraries that have been published to the R package repository (CRAN). Open up your notebook/console: 1. Install: install.packages("ggplot2"). This will install the package if it hasn’t already been installed. 2....
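
One way to make the "install only if missing, then load" pattern explicit is sketched below. The helper name ensure_package is hypothetical and does not come from the original post; it relies only on base R's requireNamespace(), install.packages(), and library().

```r
# Hypothetical helper: install a CRAN package only when it is not yet
# available, then attach it to the search path.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs network access and a configured CRAN mirror
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call downloads nothing.
ensure_package("stats")

# For a CRAN package the call would look like this (left commented out so
# the sketch runs offline):
# ensure_package("ggplot2")
```

The guard keeps repeated runs of a notebook cheap, since the download happens at most once per library location.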

Read more »

RcppEigen 0.3.3.4.0

February 7, 2018

A new minor release 0.3.3.4.0 of RcppEigen hit CRAN earlier today, and just went to Debian as well. It brings Eigen 3.3.4 to R. Yixuan once again did the leg-work of bringing the most recent Eigen release in along with the small set of patches we hav...

Read more »

Apply to attend rOpenSci unconf 2018!

For a fifth year running, we are excited to announce the rOpenSci unconference, our annual event loosely modeled on Foo Camp. rOpenSci unconferences have a rich history. You can get a feel for them by reading collected stories about people and projects from unconf17. We’re organizing unconf18 to bring together scientists, developers, and open data enthusiasts from academia, industry, government,...

Read more »

Calculating Beta in the Capital Asset Pricing Model

February 7, 2018

Today we will continue our portfolio fun by calculating the CAPM beta of our portfolio returns. That will entail fitting a linear model and, when we get to visualization next time, considering the meaning of our results from the perspective of asset returns. By way of brief background, the Capital Asset Pricing Model (CAPM) is a model, created by William...
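
To make the "fitting a linear model" step concrete, here is a small sketch on synthetic data. The return series, the seed, and the "true" beta of 1.2 are all invented for illustration and are not results from the original post. The CAPM beta is read off as the slope of a regression of portfolio returns on market returns, which coincides with the covariance-over-variance formula.

```r
set.seed(42)

# Synthetic monthly returns (assumed values, for illustration only).
market_returns    <- rnorm(60, mean = 0.01, sd = 0.04)
portfolio_returns <- 0.002 + 1.2 * market_returns + rnorm(60, sd = 0.01)

# Beta as the slope coefficient of a simple linear model.
capm_fit <- lm(portfolio_returns ~ market_returns)
beta_lm  <- unname(coef(capm_fit)["market_returns"])

# The same quantity from the textbook formula cov(Rp, Rm) / var(Rm).
beta_cov <- cov(portfolio_returns, market_returns) / var(market_returns)
```

Both routes give the same number up to floating-point error; the regression route additionally yields standard errors and an intercept via summary(capm_fit).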

Read more »

Analysing Digital Water Meter Data using the Tidyverse

February 7, 2018

Many water utilities are implementing or considering digital metering. This article describes analysing digital water meter data using the data science Tidyverse library. The post Analysing Digital Water Meter Data using the Tidyverse appeared first on The Devil is in the Data.

Read more »

2018-02 An ISCC-NBS Colour List for ‘roloc’

February 7, 2018

This report describes the development of a colour list and colour metric for the ‘roloc’ package that is based on the ISCC-NBS System of Color Designation. Paul Murrell

Read more »

The plots thicken

February 7, 2018

Data science tells a story through visualisation. Both story and visualisation rely on a good plot. And an abundance of those has evolved over time. Many have their own dedicated Wikipedia page! Which generate the most interest? How is the interest in each trending over time? The post The plots thicken appeared first on thinkr.

Read more »

In case you missed it: January 2018 roundup

February 7, 2018

In case you missed them, here are some articles from January of particular interest to R users. Josh Katz and Peter Aldhous used R to analyze the content and presentation of the most recent State of the Union speech from the US president. Slides for my presentation "Speeding up R with Parallel Processing in the Cloud", with applications of...

Read more »

Additional Thoughts on Estimating LGD with Proportional Odds Model

February 6, 2018

In my previous post (https://statcompute.wordpress.com/2018/01/28/modeling-lgd-with-proportional-odds-model), I discussed how to use Proportional Odds Models in LGD model development. In particular, I mentioned that we would estimate a sub-model, which can be Gamma or Simplex regression, to project the conditional mean for LGD values in the (0, 1) range. However, it is worth pointing out

Read more »
