Articles by John

How to Remove Outliers in R

January 19, 2020 | John

Statisticians often come across outliers when working with datasets and it is important to deal with them because of how significantly they can distort a statistical model. Your dataset may have values that are distinguishably … The post How to Remove Outliers in R appeared first on ProgrammingR. [Read more...]

Validate Me! Simple Test vs. Holdout Samples in R

January 8, 2020 | John

In statistics, it is often necessary to not only model data but test that model as well. To do this, you need to randomly separate the data into two groups ensuring even samples regardless of … The post Validate Me! Simple Test vs. Holdout Samples in R appeared first on ProgrammingR. [Read more...]

Finished Data Science Specialization of Coursera

May 22, 2019 | John

In 2018 I restarted the Johns Hopkins University Data Science Specialization on Coursera, with certificates. In May 2019 I finished the 10-course program, which covers the Data Science process from data collection to the production of Data Science products, all done in R. The 10 courses are: Course 1: The Data Scientist’... [Read more...]

Data Science Capstone – Milestone Report

April 14, 2019 | John

Executive Summary This is the Milestone Report for the Coursera Data Science Capstone project. The goal of the capstone project is to create a predictive text model using a large text corpus of documents as training data. Natural language processing techniques will be used to perform the analysis and build ... [Read more...]

Developing Data Products

March 3, 2019 | John

I had to develop an RStudio Shiny application as part of the final project in the Developing Data Products course in the Coursera Data Science Specialization track. The Shiny application calculates your maximum Heart Rate and the target Heart Rate zones for the chosen age. It includes: It's built with ... [Read more...]

Zen and The Art of Competing Against MBA’s

January 24, 2019 | John

“I appreciate your ambition, but we’re looking for an MBA…” My senior manager smiled and indicated the topic was closed. Despite the fact I was effectively running our direct mail program in the absence of … The post Zen and The Art of Competing Against MBA’s appeared first on ProgrammingR. [Read more...]

The First Date with your Data in R

June 3, 2018 | John

The First Date with your Data in R So you have your data, now what? With a little R code, you can quickly get to [...] The post The First Date with your Data in R appeared first on ProgrammingR. [Read more...]

How To Make Your Data Analyst Resume Stand Out

February 5, 2017 | John

To the typical reader, most technical resumes sound alike and share none of the unique personality behind the paper. For example, you may know that [...] The post How To Make Your Data Analyst Resume Stand Out appeared first on ProgrammingR. [Read more...]

Simple Anagram Finder Using R

November 27, 2016 | John

One of my early programming projects in Python was a word game solver (example: word jumble solver) – the early version was a simple script, which grew [...] The post Simple Anagram Finder Using R appeared first on ProgrammingR. [Read more...]

Webscraping with rvest: So Easy Even An MBA Can Do It!

November 6, 2016 | John

This is the fourth installment in our series about web scraping with R. This includes practical examples for the leading R web scraping packages, including: RCurl package [...] The post Webscraping with rvest: So Easy Even An MBA Can Do It! appeared first on ProgrammingR. [Read more...]

Resume & Interview Tips For R Programmers

July 6, 2016 | John

Speaking as a hiring manager, it doesn’t take much to stand out as a candidate for a statistical programming job. We just finished hiring the [...] The post Resume & Interview Tips For R Programmers appeared first on ProgrammingR. [Read more...]

Data Mining the California Solar Statistics with R: Part V

June 8, 2015 | John

Building a Shiny App to explore the model and the data About the Shiny App In my previous post I built several models to try to predict the amount of residential solar installed per county by quarter as a function of solar insolation, price of solar electricity, county population and ... [Read more...]

Data Mining the California Solar Statistics with R: Part III

May 11, 2015 | John

Data Mining the California Solar Statistics with R: Part III Today I want to combine the California solar statistics with information about the annual solar insolation in each county as well as information about the population and median income. These can then be used as predictors in the models I'll ... [Read more...]

Data Mining the California Solar Statistics with R: Part I

April 24, 2015 | John

Data Mining the California Solar Statistics with R: Part I Intro Today I’m taking a look at the data set from California Solar Statistics, available at https://www.californiasolarstatistics.ca.gov/. This data set lists all the applications for state incentives for both residential and commercial systems, it ... [Read more...]

Automatic drug utilization reports with R and ggplot2

September 18, 2012 | John

This program takes a data set of drug utilisation of 4 fictional drugs in 10 fictional hospitals and plots each time-series with a locally weighted regression (Lowess) trend line. It also places a time-series trend of the usage for each … [Read more...]

R script to manipulate health data

June 3, 2012 | John

Here is the code that fixed up the World Bank data export for use in Tableau. The databank spits out everything in an untidy format for grouping and aggregating. The reshape2 and plyr packages make it easy to manipulate the whole set … [Read more...]