Articles by Eric Cai - The Chemical Statistician

Exploratory Data Analysis: Conceptual Foundations of Histograms – Illustrated with New York’s Ozone Pollution Data

July 9, 2013 | Eric Cai - The Chemical Statistician

Introduction Continuing my recent series on exploratory data analysis (EDA), today’s post focuses on histograms, which are very useful plots for visualizing the distribution of a data set. I will discuss how histograms are constructed and use histograms to assess the distribution of the “Ozone” data from the built-in “... [Read more...]

Exploratory Data Analysis – Kernel Density Estimation and Rug Plots on Ozone Data in New York and Ozonopolis

June 30, 2013 | Eric Cai - The Chemical Statistician

For the sake of brevity, this post has been created from the second half of a previous long post on kernel density estimation. This second half focuses on constructing kernel density plots and rug plots in R. The first half focused on the conceptual foundations of kernel density estimation. Introduction ... [Read more...]

Exploratory Data Analysis: 2 Ways of Plotting Empirical Cumulative Distribution Functions in R

June 25, 2013 | Eric Cai - The Chemical Statistician

Introduction Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R. (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density ... [Read more...]

Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

June 24, 2013 | Eric Cai - The Chemical Statistician

Introduction Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and ... [Read more...]

Exploratory Data Analysis: Combining Box Plots and Kernel Density Plots into Violin Plots for Ozone Pollution Data

June 16, 2013 | Eric Cai - The Chemical Statistician

Introduction Recently, I began a series on exploratory data analysis (EDA), and I have written about descriptive statistics, box plots, and kernel density plots so far. As previously mentioned in my post on box plots, there is a way to combine box plots and kernel density plots. This combination results ... [Read more...]

Exploratory Data Analysis: Kernel Density Estimation in R on Ozone Pollution Data in New York and Ozonopolis

June 9, 2013 | Eric Cai - The Chemical Statistician

Introduction Recently, I began a series on exploratory data analysis; so far, I have written about computing descriptive statistics and creating box plots in R for a univariate data set with missing values. Today, I will continue this series by analyzing the same data set with kernel density estimation, a ... [Read more...]

Exploratory Data Analysis: Variations of Box Plots in R for Ozone Concentrations in New York City and Ozonopolis

May 26, 2013 | Eric Cai - The Chemical Statistician

Introduction Last week, I wrote the first post in a series on exploratory data analysis (EDA). I began by calculating summary statistics on a univariate data set of ozone concentration in New York City in the built-in data set “airquality” in R. In particular, I talked about how to calculate ... [Read more...]

When Does the Kinetic Theory of Gases Fail? Examining its Postulates with Assistance from Simple Linear Regression in R

May 19, 2013 | Eric Cai - The Chemical Statistician

Introduction The Ideal Gas Law, PV = nRT, is a very simple yet useful relationship that describes the behaviours of many gases pretty well in many situations. It is “Ideal” because it makes some assumptions about gas particles that make the math and the physics easy to work with; in fact, the simplicity ... [Read more...]

Exploratory Data Analysis – Computing Descriptive Statistics in R for Data on Ozone Pollution in New York City

May 19, 2013 | Eric Cai - The Chemical Statistician

Introduction This is the first of a series of posts on exploratory data analysis (EDA). This post will calculate the common summary statistics of a univariate continuous data set – the data on ozone pollution in New York City that is part of the built-in “airquality” data set in R. This ... [Read more...]

How to Calculate a Partial Correlation Coefficient in R: An Example with Oxidizing Ammonia to Make Nitric Acid

May 5, 2013 | Eric Cai - The Chemical Statistician

Introduction Today, I will talk about the math behind calculating partial correlation and illustrate the computation in R with an example involving the oxidation of ammonia to make nitric acid using a built-in data set in R called stackloss. In a separate post, I will also share an R function ... [Read more...]

Using the Golden Section Search Method to Minimize the Sum of Absolute Deviations

April 28, 2013 | Eric Cai - The Chemical Statistician

Introduction Recently, I introduced the golden section search method – a special way to save computation time by modifying the bisection method with the golden ratio, and I illustrated how to minimize a cusped function with this script. I also wrote an R function to implement this method and an R script ... [Read more...]

Scripts and Functions: Using R to Implement the Golden Section Search Method for Numerical Optimization

April 22, 2013 | Eric Cai - The Chemical Statistician

In an earlier post, I introduced the golden section search method – a modification of the bisection method for numerical optimization that saves computation time by using the golden ratio to set its test points. This post contains the R function that implements this method, the R functions that contain the 3 ... [Read more...]

The Golden Section Search Method: Modifying the Bisection Method with the Golden Ratio for Numerical Optimization

April 22, 2013 | Eric Cai - The Chemical Statistician

Introduction The first algorithm that I learned for root-finding in my undergraduate numerical analysis class (MACM 316 at Simon Fraser University) was the bisection method. It’s very intuitive and easy to implement in any programming language (I was using MATLAB at the time). The bisection method can be easily adapted ... [Read more...]

Checking the Goodness of Fit of the Poisson Distribution in R for Alpha Decay by Americium-241

April 14, 2013 | Eric Cai - The Chemical Statistician

Introduction Today, I will discuss the alpha decay of americium-241 and use R to model the number of emissions from a real data set with the Poisson distribution. I was especially intrigued to learn about the use of Am-241 in smoke detectors, and I will elaborate on this clever application. ... [Read more...]

How do Dew and Fog Form? Nature at Work with Temperature, Vapour Pressure, and Partial Pressure

March 31, 2013 | Eric Cai - The Chemical Statistician

In the early morning, especially here in Canada, I often see dew – water droplets formed by the condensation of water vapour on outside surfaces, like windows, car roofs, and leaves of trees. I also sometimes see fog – water droplets or ice crystals that are suspended in air and often blocking ... [Read more...]

Checking for Normality with Quantile Ranges and the Standard Deviation

March 31, 2013 | Eric Cai - The Chemical Statistician

Introduction I was reading Michael Trosset’s “An Introduction to Statistical Inference and Its Applications with R”, and I learned a basic but interesting fact about the normal distribution’s interquartile range and standard deviation that I had not learned before. This turns out to be a good way to ... [Read more...]

Estimating the Decay Rate and the Half-Life of DDT in Trout – Applying Simple Linear Regression with Logarithmic Transformation

March 24, 2013 | Eric Cai - The Chemical Statistician

This blog post uses a function and a script written in R that were displayed in an earlier blog post. Introduction This is the second of a series of blog posts about simple linear regression; the first was written recently on some conceptual nuances and subtleties about this model. In ... [Read more...]

My Own R Function and Script for Simple Linear Regression – An Illustration with Exponential Decay of DDT in Trout

March 24, 2013 | Eric Cai - The Chemical Statistician

Here is the function that I wrote for doing simple linear regression, as alluded to in my blog post about simple linear regression on log-transformed data on the decay of DDT concentration in trout in Lake Michigan. My goal was to replicate the 4 columns of the output from applying summary() ... [Read more...]

Discovering Argon with the 2-Sample t-Test

March 10, 2013 | Eric Cai - The Chemical Statistician

I learned about Lord Rayleigh’s discovery of argon in my 2nd-year analytical chemistry class while reading “Quantitative Chemical Analysis” by Daniel Harris. (William Ramsay was also responsible for this discovery.) This is one of my favourite stories in chemistry; it illustrates how diligence in measurement can lead to an ... [Read more...]

Adding Labels to Points in a Scatter Plot in R

March 2, 2013 | Eric Cai - The Chemical Statistician

What’s the Scatter? A scatter plot displays the values of 2 variables for a set of data, and it is a very useful way to visualize data during exploratory data analysis, especially (though not exclusively) when you are interested in the relationship between a predictor variable and a target variable. ... [Read more...]


R-bloggers

R news and tutorials contributed by hundreds of R bloggers

