Gabriel and Hugo discuss Gabriel's role in helping to make the BBC more data-informed.

February 11, 2019

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Gabriel Straub, the Head of Data Science and Architecture at the BBC. Here is the podcast link. Introducing Gabriel Straub Hugo: Hi there Gabriel, and welcome to DataFramed. Gabriel: ...

Direct Optimization of Hyper-Parameter

February 10, 2019

In the previous post (https://statcompute.wordpress.com/2019/02/03/sobol-sequence-vs-uniform-random-in-hyper-parameter-optimization), it was shown how to identify the optimal hyper-parameter in a General Regression Neural Network using the Sobol sequence and the uniform random generator, respectively, through N-fold cross-validation. While the Sobol sequence yields slightly better performance, outcomes from both approaches are very similar.

How did Axios rectangle Trump’s PDF schedule? A try with R

February 10, 2019

Last week, Axios published a very interesting piece reporting on Trump’s private schedule thanks to an insider’s leak. The headlines were all about Trump spending more than 60% of his time in “executive time”, which admittedly was the most important aspect of the story. I, however, also got curious about Axios’ work to go from the PDF schedules to the spreadsheet they made public....

A Major Upgrade of the V8 package

This week, version 2.0 of the V8 package was released to CRAN. Go get it now: install.packages("V8"). The V8 package provides an embedded JavaScript engine that can be used inside of R. You can use it interactively as a JavaScript console, but it is mostly useful for wrapping JavaScript libraries in R packages. Some cool examples include jsonld, jsonvalidate, and...
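
As a rough illustration of the embedded-engine idea, the sketch below creates a JavaScript context from R, evaluates some JavaScript in it, and moves values in both directions. It assumes the V8 package is installed; the variable names (total, scores) are made up for this example and do not come from the post.

```r
library(V8)

# Create a fresh JavaScript context (an isolated JS environment inside R)
ct <- v8()

# Run some JavaScript that defines a variable in the context
ct$eval("var total = [1, 2, 3].reduce(function(a, b) { return a + b; }, 0);")

# Pull the JavaScript value back into R (values are converted via JSON)
total <- ct$get("total")

# Push an R vector into the context; it becomes a JavaScript array
ct$assign("scores", c(10, 20, 30))

# eval() returns the result of the expression as a character string
n_scores <- ct$eval("scores.length")
```

Wrapping a JavaScript library in an R package follows the same pattern: load the library's source into a context once, then expose thin R functions that call into it.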

Multilevel Modelling in R: Analysing Vendor Data

February 10, 2019

One of the main limitations of regression analysis arises when one needs to examine changes in data across several categories. This problem can be resolved by using a multilevel model, i.e. one that varies at more than one level and allows for variation between different groups or categories. This dataset from data.ok.gov contains information...

stringfix : adding transcoder shiny app

February 9, 2019

Adding quotes to a character list. Transcoder: a Shiny app to transpose lists to columns (and vice versa), plus formatting tricks. Often I have to take a character list or column and put it in a vector, which means I first have to add quote...

Manipulating strings with the {stringr} package

This blog post is an excerpt of my ebook Modern R with the tidyverse that you can read for free here. This is taken from Chapter 4, in which I introduce the {stringr} package. Manipulate strings with {stringr} {stringr} contains functions to manipulate strings. In Chapter 10, I will teach you about regular expressions, but the functions contained in {stringr} allow you to already...

Quick Hit: Speeding Up a Slow/Mundane Task with a Little Rcpp

February 9, 2019

Over at $DAYJOB’s blog I’ve queued up a post that shows how to use our new opendata package to work with our Open Data portal’s API. I’m not super-sure when it’s going to be posted so keep an RSS reader fixed on https://blog.rapid7.com/ if you’re interested in seeing it (I may make a small note...

Inserting “Edit on GitHub” Buttons in a Single R Markdown Document

February 9, 2019

As the R Markdown ecosystem becomes larger, users may now encounter situations where they have to decide which R Markdown output format to use. One may find none of the formats suitable – the features essential to the output document one wants may be scattered across different output formats of R Markdown. Here is a real example I encountered....

Where the German Companies Are

Last week, the German NGO Open Knowledge Foundation Deutschland e.V. made German Trade Register data available via the project OffeneRegister.de, together with the British NGO opencorporates. While the data from the German Trade Register is publicly available in principle, retrieving the data is a case-by-case activity and is very cumbersome (try for yourself if you like). The data provided...

Benchmarking cast in R from long data frame to wide matrix

February 8, 2019

In my daily work I often have to transform a long table to a wide matrix to accommodate some function. At some stage in my life I came across the reshape2 package, and I have been with that philosophy ever since – I find it makes data wrangling easy and straightforward. I particularly like …
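
For readers unfamiliar with the long-to-wide transformation mentioned above, here is a minimal sketch using reshape2::acast. The toy data frame and its column names (sample, gene, value) are invented for illustration and are not taken from the post's benchmark.

```r
library(reshape2)

# A small long-format table: one row per (sample, gene) pair
long <- data.frame(
  sample = c("s1", "s1", "s2", "s2"),
  gene   = c("g1", "g2", "g1", "g2"),
  value  = c(1, 2, 3, 4)
)

# Cast to a wide matrix: rows are samples, columns are genes
wide <- acast(long, sample ~ gene, value.var = "value")
```

The result is a numeric matrix with row and column names, so a cell can be looked up as wide["s2", "g1"].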

Deploying an R Shiny App With Docker

February 8, 2019

If you haven’t heard of Docker, it is a system that allows projects to be split into discrete units (i.e. containers) that each operate within their own virtual environment. Each container has a blueprint written in its Dockerfile that describes all of the operating parameters including operating system and package dependencies/requirements. Docker images are easily …

NSERC – Discovery Grants Program, over the past 5 years

February 7, 2019

In a previous post, I discussed how it was possible to scrape the NSERC website to get stats about discovery grants. Since we just got the new 2018 figures, I thought it would be a good opportunity to update my graphs: library(XML) library(stringr) url="http://www.nserc-crsng.gc.ca/NSERC-CRSNG/FundingDecisions-DecisionsFinancement/ResearchGrants-SubventionsDeRecherche/ResultsGSC-ResultatsCSS_eng.asp" download.file(url,destfile = "GSC.html") library(XML) tables=readHTMLTable("GSC.html") GSC=tables]$V1 GSC=as.character(GSC) namesGSC=tables]$V2 namesGSC=as.character(namesGSC) Correction = function(x) as.numeric(gsub('', '', x))...

“Correlation is not causation”. So what is?

February 7, 2019

Intro: Machine learning applications have been growing rapidly in volume and scope over the last few years. What’s causal inference, how is it different from plain good ole’ ML, and when should you consider using it? In this report I try giving a...

Chester (sorry, Liverpool) is the Most Popular City in the World (relative to use as password per inhabitant)

Update 2018-02-17: The title of this article has changed, reflecting new information I have received since publishing. For more information, I refer to the last paragraph. A treasure trove of leaked passwords: The API of pwnedpasswords.com is quite remarkable. It not only allows you to fetch the results generally obtained by typing in your e-mail into the browser interface and finding...

Liverpool is the Most Popular City in the World (relative to use as password per inhabitant)

A treasure trove of leaked passwords The API of pwnedpasswords.com is quite remarkable. It not only allows you to fetch the results generally obtained by typing in your e-mail into the browser interface and finding out whether or not you've been pwned from the comfort of your shell. It further allows you to very simply check whether a certain password...

Introducing olsrr

February 7, 2019

I am pleased to announce the olsrr package, a set of tools for improved output from linear regression models, designed with beginner and intermediate R users in mind. The package includes: comprehensive regression output; variable selection procedures; heteroskedasticity, collinearity diagnostics and measures of influence; various plots and underlying data. If you know how to build models using lm(), you will find olsrr very useful. Most of the functions use...

Launching codecentric.AI Bootcamp course!

February 7, 2019

Today, I am happy to announce the launch of our codecentric.AI Bootcamp! This bootcamp is a free online course for everyone who wants to learn hands-on machine learning and AI techniques, from basic algorithms to deep learning, computer vision and NLP....

Create data visualizations like BBC News with the BBC’s R Cookbook

February 7, 2019

If you're looking for a guide to making publication-ready data visualizations in R, check out the BBC Visual and Data Journalism cookbook for R graphics. Announced in a BBC blog post this week, it provides scripts for making line charts, bar charts, and other visualizations like those used in the BBC's data journalism. The cookbook is based around the...

Statswars

February 7, 2019

I am stuck at home sick today, so I decided to provide a relational analysis of the Stats Package Wars that have been bubbling away for the past week. True in all its details. If you want something slightly more constructive, consider The Plain Person’s Guide to Plain-Text Social Science.

An absolute beginner’s guide to creating data frames for a Stack Overflow [r] question

February 7, 2019

For better or worse I spend some time each day at Stack Overflow, reading and answering questions. If you do the same, you probably notice certain features in questions that recur frequently. It’s as though everyone is copying from one source – perhaps the one at the top of the search results. And it …

Are you leaking h2o? Call plumber!

February 7, 2019

Create a predictive model with the h2o package. H2O is a fantastic open source machine learning platform with many different algorithms. There is a graphical user interface, a Python interface and an R interface. Suppose you want to create a predictive …

Investigating words distribution with R – Zipf’s law

February 7, 2019

Hello again! Typically I would start by describing a complicated problem that can be solved using machine or deep learning methods, but today I want to do something different: I want to show you some interesting probabilistic phenomena! Have you heard of Zipf’s law? I hadn’t until recently. Zipf’s law is an empirical law that...

Le Monde puzzle [#1083]

February 6, 2019

A Le Monde mathematical puzzle that seems hard to solve without the backup of a computer (and just simple enough to code on a flight to Montpellier): Given the number N=2,019, find a decomposition of N as a sum of non-trivial powers of integers such that (a) the number of integers in the sum is

PDSwR2: New Chapters!

February 6, 2019

We have two new chapters of Practical Data Science with R, Second Edition online and available for review! The newly available chapters cover: Data Engineering And Data Shaping – Explores how to use R to organize or wrangle data into a shape useful for analysis. The chapter covers applying data transforms, data manipulation packages, and …

Mapping multiple trends with confidence

February 6, 2019

A tutorial to compute trends by groups and plot/map the results. We will use tidyr::nest to create a list-column and apply a model (with purrr::map) to each row, then extract each slope and its p-value with map and broom::tidy. Setup. Data: map data (départements polygons from OSM) and population data by département, 1990-2008.
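
The nest / map / tidy workflow described above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (three hypothetical groups A, B, C with a yearly population series), the column names are invented, and the post's real data are the département populations it mentions.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Simulated stand-in data: yearly population for three hypothetical groups
set.seed(1)
pop <- expand.grid(dept = c("A", "B", "C"), year = 1990:2008)
true_slope <- c(A = 5, B = -3, C = 0)
pop$population <- 1000 +
  true_slope[as.character(pop$dept)] * (pop$year - 1990) +
  rnorm(nrow(pop), sd = 2)

trends <- pop %>%
  # One row per group; the yearly observations go into a list-column
  nest(data = c(year, population)) %>%
  # Fit a linear trend per group, then tidy each model into a coefficient table
  mutate(
    model = map(data, ~ lm(population ~ year, data = .x)),
    coefs = map(model, tidy)
  ) %>%
  select(dept, coefs) %>%
  unnest(coefs) %>%
  # Keep only the slope (the coefficient on year) and its p-value
  filter(term == "year") %>%
  select(dept, slope = estimate, p.value)
```

The resulting table has one row per group, which makes it straightforward to join back onto polygons for mapping.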

Visualizing New York City WiFi Access with K-Means Clustering

February 5, 2019

Visualization has become a key application of data science in the telecommunications industry. Specifically, telecommunication analysis is highly dependent on the use of geospatial data. This is because telecommunication networks in themselves are geographically dispersed, and analysis of such dispersions can yield valuable insights regarding network structures, consumer demand, and availability. Data To illustrate this...
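
To give a feel for clustering geospatial points, here is a minimal base-R sketch of k-means on longitude/latitude pairs. The coordinates are simulated around two made-up centres and are not the New York City WiFi data used in the post; the number of clusters is likewise an assumption.

```r
# Simulated hotspot locations around two hypothetical centres
set.seed(42)
hotspots <- data.frame(
  lon = c(rnorm(50, -73.98, 0.01), rnorm(50, -73.90, 0.01)),
  lat = c(rnorm(50,  40.75, 0.01), rnorm(50,  40.85, 0.01))
)

# Partition the points into 2 clusters; nstart repeats the random
# initialisation several times and keeps the best solution
fit <- kmeans(hotspots[, c("lon", "lat")], centers = 2, nstart = 10)

# Plot the points coloured by cluster, with cluster centres marked
plot(hotspots$lon, hotspots$lat, col = fit$cluster, pch = 19,
     xlab = "Longitude", ylab = "Latitude")
points(fit$centers[, "lon"], fit$centers[, "lat"], pch = 4, cex = 2, lwd = 2)
```

For real coordinates spanning a larger area, it may be worth projecting to a planar coordinate system first, since k-means uses Euclidean distance.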

R for trial and model-based cost-effectiveness analysis

February 5, 2019

9 July 2019, University College London. Training event (8 July): Torrington (1-19) B07 - Teal Room in Torrington Place, 1-19, University College London, United Kingdom. Main workshop (9 July): Anatomy G29 J Z Young Lecture Theatre, UCL Medical Sciences and Anatomy (https://goo.gl/maps/biryoFc9CiL2), University College London, United Kingdom. Background and objectives: It is our pleasure to...
