Pretty histograms with ggplot2

May 10, 2017

@drsimonj here to make pretty histograms with ggplot2! In this post you’ll learn how to create histograms like this:

The data

Let’s simulate data for a continuous variable x in a data frame d:

    set.seed(070510)
    d <- data.frame(x = rnorm(2000))
    head(d)
    #>   x
    #> 1 ...

Euler Problem 20: Large Integer Factorials

May 10, 2017

A proposed solution in the R language to Euler Problem 20: find the sum of the digits in the factorial of 100 (100 × 99 × ... × 3 × 2 × 1). The post Euler Problem 20: Large Integer Factorials appeared first on The Devil is in the Data.
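As a rough illustration of the problem statement (not the linked post's solution), the sketch below computes the digit sum in base R. Because 100! is far too large for R's double-precision numbers to hold exactly, it stores the number as a vector of decimal digits and multiplies by hand. The function names multiply_digits and factorial_digit_sum are hypothetical, chosen only for this example.

```r
# Multiply a big number, stored as a vector of decimal digits
# (least significant digit first), by a small positive integer k.
multiply_digits <- function(digits, k) {
  carry <- 0
  for (i in seq_along(digits)) {
    prod <- digits[i] * k + carry
    digits[i] <- prod %% 10   # digit that stays in this position
    carry <- prod %/% 10      # amount carried to the next position
  }
  # Append whatever carry is left, one digit at a time.
  while (carry > 0) {
    digits <- c(digits, carry %% 10)
    carry <- carry %/% 10
  }
  digits
}

# Sum of the decimal digits of n! (hypothetical helper name).
factorial_digit_sum <- function(n) {
  digits <- 1                 # start from 0! = 1
  for (k in seq_len(n)) {
    digits <- multiply_digits(digits, k)
  }
  sum(digits)
}

factorial_digit_sum(10)   # 10! = 3628800, so the digit sum is 27
factorial_digit_sum(100)  # 648
```

A package for arbitrary-precision integers would shorten this considerably; the digit-vector approach is used here only to keep the example free of dependencies.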

Mapping Quandl Macroeconomic Data

May 9, 2017

In previous posts, we built a map to access global ETFs and a simple Shiny app to import and forecast commodities data from Quandl. Today, we will begin a project that combines those previous apps. Our end goal is to build an interactive map to access macroeconomic data via Quandl, allowing the user to choose an economic...

niceOverPlot, or when the number of dimensions does matter

Hi there! Over the last few months, my lab-mate Irene Villa (see more of her work here!) and I have been discussing ecological niche overlap. The niche concept dates back to ideas first proposed by ornithologist J. Grinnell (1917). Later on, G.E. Hutchinson (1957) defined the ecological niche of a species as the...

Predicting Hospital Length of Stay using SQL Server R Services

May 9, 2017

Last week, my Microsoft colleagues Bharath Sankaranarayan and Carl Saroufim presented a live webinar showing how you can predict a patient's length of stay at a hospital using SQL Server R Services. The recorded webinar is available for on-demand viewing now. (Registration is required to view.) The webinar is based on the Machine Learning Solution Template Predicting Length of...

Better block sampling in MCMC with the Automated Factor Slice Sampler

May 9, 2017

One nice feature of NIMBLE’s MCMC system is that a user can easily write new samplers from R, combine them with NIMBLE’s samplers, and have them automatically compiled to C++ via the NIMBLE compiler. We’ve observed that block sampling using a simple adaptive multivariate random walk Metropolis-Hastings sampler doesn’t always work well in practice, so

Looking for a Programming or Statistics High Level Course? MIT Open Course Ware.

Although MIT OCW has been operating for more than 15 years, I consider it important to write this post as there are still many people who do not know about its existence. MIT OpenCourseWare (OCW) is a web-based publication of virtually all MIT course co...

Shiny Applications Layouts Exercises (Part-6)

May 9, 2017

Shiny Applications Layouts – Absolutely-positioned panel. In the sixth part of our journey through Shiny App Layouts we will meet the absolutely-positioned panels. These are panels that you can place wherever you want in the interface and, optionally, make draggable. Moreover, you can put anything in them, including inputs and outputs. This part can be...

Load a Python/pandas data frame from an HDF5 file into R

May 9, 2017

The title is self-descriptive, so I will not dwell on the issue at length before showing the code. Just a small note: to my knowledge, there is only one public snippet out there that addresses this particular problem. It uses the Bioc package rhdf5 and you can find it here. The main problem is that it only works when…

datasauRus now on CRAN

May 9, 2017

datasauRus is a package storing the datasets from the paper Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. It’s a useful package for: having a dinosaur dataset; showing a dinosaur-related variant of... The post datasauRus now on CRAN appeared first on Locke Data. Locke Data are a data...

Reports or Newspapers – The Two Sides of Healthcare Priorities

May 9, 2017

Both the World Health Organization’s statistical profile of Qatar and the much more detailed Annual Health Report of the Department of Epidemiology and Medical Statistics of the State of Qatar show beyond the shadow of a doubt that cardiovascular diseases, diabetes, hypertension, obesity and other metabolic/noncommunicable diseases are the...

Studying CRAN package names

May 9, 2017

Setting a name for a CRAN package is an intimate process. Out of an infinite range of possibilities, an idea comes for a package and you spend at least a couple of days writing up and testing your code...

Travis-CI Flaw Exposed Some ‘Secure’ Environment Variable Contents

May 8, 2017

Tagging this as #rstats-related since many R coders use Travis-CI to automate package builds (and other things). Security researcher Ivan Vyshnevskyi did some ++gd responsible disclosure to the Travis-CI folks, letting them know they were leaking the contents of “secure” environment variables in the build logs. The TL;DR on “secure” environment variables is that they...

Video Introduction to Bayesian Data Analysis, Part 3: How to do Bayes?

May 8, 2017

This is the last video of a three-part introduction to Bayesian data analysis, aimed at those who aren’t necessarily that well-versed in probability theory but who do know a little bit of programming. If you haven’t watched the other parts yet, I re...

Jobs for R users – from all over the world (2017-05-08)

May 8, 2017

To post your R job on the next post: just visit this link and post a new R job to the R community. You can post a job for free (and there are also “featured job” options available for extra exposure). Current R jobs: job seekers, please follow the links below to learn more and apply for your R job of interest. Featured Jobs: Full-Time Data Analyst,...

Machine Learning Pipelines for R

May 8, 2017

Building machine learning and statistical models often requires pre- and post-transformation of the input and/or response variables, prior to training (or fitting) the models. For example, a model may require training on the logarithm of the response and input variables. As a consequence, fitting and then generating predictions from these models requires repeated application of …
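To make the log-transform example concrete, here is a minimal base R sketch. It is not the pipeline approach the linked post goes on to describe; the simulated data and the helper names to_model_scale and predict_original_scale are assumptions made purely for illustration. It shows why the same transformation has to be applied once when fitting and again, together with its inverse, when predicting.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated training data (an assumption for this sketch):
# y is roughly 2 * x^1.5 with multiplicative noise.
train <- data.frame(x = runif(200, min = 1, max = 10))
train$y <- 2 * train$x^1.5 * exp(rnorm(200, sd = 0.1))

# Pre-transformation of the input variable (hypothetical helper).
to_model_scale <- function(df) {
  data.frame(log_x = log(df$x))
}

# Fit on the log scale: both the input and the response are transformed.
model_data <- to_model_scale(train)
model_data$log_y <- log(train$y)
fit <- lm(log_y ~ log_x, data = model_data)

# Prediction has to repeat the input transformation and then undo the
# response transformation (hypothetical helper). Note that exp() of a
# predicted log value ignores any retransformation bias correction.
predict_original_scale <- function(model, newdata) {
  exp(predict(model, newdata = to_model_scale(newdata)))
}

new_obs <- data.frame(x = c(2, 5))
preds <- predict_original_scale(fit, new_obs)
preds
```

Keeping the transformation in one function, used by both the fitting and the prediction code, is one simple way to avoid the two steps drifting apart.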

From Points to (Messy) Lines

May 8, 2017

A week or so ago, I came up with a new chart type – race concordance charts – for looking at a motor circuit race from the on-track perspective of a particular driver. Here are a couple of examples from the 2017 F1 Grand Prix: The gap is the time to the car on track

Machine Learning. Regression Trees and Model Trees (Predicting Wine Quality)

We will develop a forecasting example using model tree and regression tree algorithms. The exercise was originally published in "Machine Learning with R" by Brett Lantz, Packt Publishing 2015 (open source community experience distilled). The example we will develop is about predicting...

Installing Packages without Internet

May 8, 2017

Graham Parsons
At Mango we often give R training in locations where a reliable WiFi connection is not guaranteed, so if we need trainees to download packages from CRAN it can be a show-stopper. Here are a couple of …

Graphical Presentation of Missing Data; VIM Package

May 8, 2017

Missing data is a problem that challenges data analysis methodologically and computationally in medical research. Patients in clinical trials and cohort studies may drop out of the study and therefore generate missing data. The data could be missing at random when participants who drop out of the study are not different from those who remained...

Trading Strategy: 52-Weeks High Effect in Stocks

May 8, 2017

By Milind Paradkar. In today’s algorithmic trading, having a trading edge is one of the most critical elements. It’s plain and simple: if you don’t have an edge, don’t trade! Hence, as a quant, one is always on the lookout for good trading ideas. One of the good resources for trading strategies that have been... The post Trading...

ggplot2 style plotting in Python

May 8, 2017

R is my language of choice for data science, but a good data scientist should have some knowledge of all of the great tools available to them. Recently, I have been gleefully using Python for machine learning problems (specifically pandas and the wonderful scikit-learn). However, for all its greatness, I couldn’t help but feel it…

R Quick Tip: parameter re-use within rmarkdown YAML

May 8, 2017

Ever wondered how to make an rmarkdown title dynamic? Maybe you wanted to use a parameter in multiple locations? Maybe you wanted to pass through a publication date? Advanced use of YAML headers can help! Normally, when we write rmarkdown, we might... The post R Quick Tip: parameter re-use within rmarkdown YAML appeared first on Locke Data. Locke...

GSoC 2017 : Integrating biodiversity data curation functionality

May 7, 2017

By Thiloshon Nagarajah
URL of project idea page: https://github.com/rstats-gsoc/gsoc2017/wiki/Integrating-biodiversity-data-curation-functionality
Introduction: The importance of data in biodiversity research has been repeatedly stressed in recent times, and various organizations have come together and followed each other to provide data for advancing biodiversity research. But that is exactly where the main hiccup of biodiversity research lies. Since

Turning kindle notes into a tidy data

May 7, 2017

It is my dream to do everything with R. And we aRe almost there. We can write blogs in blogdown or bookdown, write reports in RMarkdown (thank you Yihui Xie!), and create interactive webpages with Shiny (thank you Winston Chang). Control our lifx light...

Shiny Application Layouts Exercises (Part-5)

May 7, 2017

Shiny Application Layouts – Vertical Layout. In the fifth part of our series we will apply the kmeans() function to the iris dataset to create a shiny application. The difference is that now we will display its result vertically. This part can be useful for you in two ways. First of all, you can see different ways...
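For readers who have not used kmeans() on iris before, the clustering step on its own might look like the sketch below. This covers only the modelling part, not the Shiny layout from the exercise set, and the choice of three clusters, the four numeric columns, the seed and nstart are all assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed so the cluster assignment is reproducible

# Cluster the four numeric measurement columns of the built-in iris data.
# centers = 3 is an assumption (iris happens to contain three species).
fit <- kmeans(iris[, 1:4], centers = 3, nstart = 20)

# Cross-tabulate the cluster labels against the known species.
cluster_table <- table(cluster = fit$cluster, species = iris$Species)
cluster_table

# One row of coordinates per cluster centre, one column per measurement.
fit$centers
```

In a Shiny app, objects such as cluster_table or a plot coloured by fit$cluster would typically be what the outputs render.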

Know your data structures!

May 7, 2017

Just a few days ago I stated the following on Twitter: “Just reduced the runtime of an algorithm from 9 hours to 3 min. by using a different data structure… Know you data structures 🙂 #rstats” (Verena Haunschmid, @ExpectAPatronum, May 1, 2017). Since my tweet has been liked and shared a lot, I thought …

Plot the Vote: Making U.S. Senate & House Cartograms in R

May 7, 2017

Political machinations are a tad insane in the U.S. these days & I regularly hit up @ProPublica & @GovTrack sites (& sub to the GovTrack e-mail updates) as I try to be an informed citizen, especially since I’ve got a Senator and Representative who seem to be in the sway of 🍊. I’ve always appreciated...

RInside 0.2.14

A new release 0.2.14 of RInside is now on CRAN and in Debian. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by Rcpp. It has been nearly two years since the last release, and a number of nice extensions,...
