Seven Ways You Can Use A Linear, Polynomial, Gaussian, & Exponential Line Of Best Fit

March 18, 2015

(This article was first published on Plotly Blog, and kindly contributed to R-bloggers)

A line of best fit lets you model, predict, forecast, and explain data. This post shows how you can use a line of best fit to explain college tuition, rats, turkeys, burritos, and the NHL draft. Read on or see our tutorials for more. Contact us if you’re interested in a trial of plotly on-premise. Developers, scroll down to see Python and R.

1. A Linear Fit For College Tuition

How many hours would you have to work at minimum wage to pay for one credit hour of college? In 1979 it was around 8 hours. In 2013 it was 59 hours. A linear fit picks the slope and y-intercept of the straight line that best matches the data, showing us the trend.

… to Pay for One University Credit Hour

Plotly calculates the mean squared error, the fit parameters (slope and y-intercept), and R², also known as the coefficient of determination. R² is a value from 0 to 1 that shows how closely the fit models the data; the closer it is to 1, the better the fit. In this case, R² is 0.9504, a close fit. Hover your mouse to see data; click and drag your mouse to zoom.
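For readers who want to see where these numbers come from, here is a minimal sketch of an ordinary least-squares line in plain Python. It is not Plotly's implementation. The function name `linear_fit` and the five data points are made up for illustration (they are not the tuition data), and the mean squared error is assumed here to be the sum of squared residuals divided by the number of points.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line. Returns slope, intercept, MSE and R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Slope = covariance(x, y) / variance(x); the line passes through the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals are the vertical distances between the data and the line.
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)

    return {
        "slope": slope,
        "intercept": intercept,
        "mse": ss_res / n,              # assumption: divide by n
        "r_squared": 1 - ss_res / ss_tot,
    }


# Illustrative points only (hypothetical, not the tuition data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fit = linear_fit(xs, ys)
print(fit)
```

An R² near 1 means the residuals are small compared with the overall spread of the data, which is why the 0.9504 above counts as a close fit.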

2. NHL Players & Burritos With Gaussian Fit

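A Gaussian fit is a bell curve, described by its center (the mean), its width (the standard deviation), and its height. It suits data that cluster around a central value and thin out symmetrically on either side.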

The first histogram we're fitting (see our histogram tutorial for more) shows the heights of NHL players from the 2013 draft. Each bar shows how many players fall in that bin, with bins running from 64.5 to 79.5 inches (our boundaries). For example, there are 36 players who are 71 inches tall. The fit adds a bell curve to the distribution.

Col1 vs Col1 - fit

We’ve also applied a Gaussian fit to study burritos (and burrito bowls) at Chipotle. The data shows what percentage of meals contain a given number of calories, with a Gaussian fit added to the plot.

At Chipotle, How Many Calories Are You Consuming?
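One simple way to put a bell curve on a histogram like the two above is to estimate the mean and standard deviation straight from the bin counts, then scale the curve so its area matches the histogram's. The sketch below does that in plain Python. This is a method-of-moments estimate, not necessarily the fitting routine Plotly uses, and the function names and bin counts are invented for illustration (they are not the NHL or Chipotle data).

```python
import math


def gaussian_from_histogram(centers, counts, bin_width):
    """Estimate (amplitude, mean, sigma) of a bell curve from binned counts."""
    total = sum(counts)
    mean = sum(c * n for c, n in zip(centers, counts)) / total
    variance = sum(n * (c - mean) ** 2 for c, n in zip(centers, counts)) / total
    sigma = math.sqrt(variance)
    # Scale the curve so its area equals the histogram's area (total * bin_width).
    amplitude = total * bin_width / (sigma * math.sqrt(2 * math.pi))
    return amplitude, mean, sigma


def gaussian(x, amplitude, mean, sigma):
    """Height of the bell curve at x."""
    return amplitude * math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))


# Hypothetical bin centers and counts, chosen to be symmetric.
centers = [70, 71, 72, 73, 74, 75, 76]
counts = [1, 6, 15, 20, 15, 6, 1]

amplitude, mean, sigma = gaussian_from_histogram(centers, counts, bin_width=1.0)
for center, count in zip(centers, counts):
    print(center, count, round(gaussian(center, amplitude, mean, sigma), 2))
```

Printing the curve's height next to each bin count gives a quick check of how well the bell shape tracks the bars.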

3. Polynomial Fits & Turkeys

The plot below models turkey growth. The researchers determined that a fourth-degree polynomial model is best for estimating the growth of the native Mexican turkey. A polynomial fit captures a curved (nonlinear) relationship between x and y, and we can specify the degree of the fit (e.g., 4th).

Native Mexican Turkey's Growth
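To make "specifying the degree" concrete, here is a small sketch of a least-squares polynomial fit in plain Python. It solves the normal equations with Gaussian elimination, which is fine for a handful of points and a low degree but is not what a production library would do. The function name `polyfit` and the data are assumptions for illustration: the points are generated from a known quartic, not taken from the turkey study.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit. Returns [c0, c1, ..., c_degree]."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (aug[r][n] - known) / aug[r][r]
    return coeffs


def true_curve(x):
    """Known quartic used to generate the synthetic points."""
    return 0.5 * x ** 4 - x ** 3 + 2 * x ** 2 + 3 * x + 1


# Synthetic data: the fit should recover the quartic's coefficients.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [true_curve(x) for x in xs]

quartic = polyfit(xs, ys, degree=4)
print([round(c, 6) for c in quartic])
```

Changing `degree` is the only thing needed to try a quadratic or cubic on the same data. A lower degree will leave larger residuals on points that truly follow a quartic.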

4. Rat Populations

An exponential fit models exponential growth or decay. Rat populations, which can double every 47 days, are an example. The graph below estimates the size of a rat colony living in optimal conditions after three years, assuming it starts from a single pair of rats.

Rat population growth under optimal conditions
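The doubling rule above can be written as N(t) = N0 · 2^(t / 47), with t in days. The sketch below, in plain Python, uses that formula to generate a population curve and then recovers the growth rate with an exponential fit, done here as a straight-line fit to the logarithm of the data. The function names, the 365-day year, and the 73-day sampling interval are assumptions for illustration, and this is not necessarily how Plotly computes its exponential fit.

```python
import math

DOUBLING_DAYS = 47  # from the text: the population can double every 47 days


def population(days, initial=2, doubling_days=DOUBLING_DAYS):
    """Population after `days`, starting from `initial` rats (a single pair)."""
    return initial * 2 ** (days / doubling_days)


def exponential_fit(ts, ys):
    """Fit y = a * exp(b * t) by least squares on ln(y). Returns (a, b)."""
    logs = [math.log(y) for y in ys]
    n = len(ts)
    mean_t = sum(ts) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(ts, logs))
    b = stl / stt
    a = math.exp(mean_l - b * mean_t)
    return a, b


# Sample the model every 73 days over three years (assuming 365-day years).
days = list(range(0, 3 * 365 + 1, 73))
sizes = [population(d) for d in days]

a, b = exponential_fit(days, sizes)
doubling_time = math.log(2) / b   # convert the fitted rate back to a doubling time
final_size = population(3 * 365)

print(round(a, 3), round(doubling_time, 3), round(final_size))
```

Under these assumptions the colony reaches roughly 20 million rats after three years, and the fit returns the 47-day doubling time that generated the data.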

We’re plotting the fit over a specific x range, one of Plotly’s advanced features.

5. Plotting With Plotly’s APIs

Plotly’s APIs let you build plots and add fits with Python, R, and MATLAB. The plot below shows the distribution of student grades with a Gaussian fit, and was made in an IPython Notebook.

course-grade-distribution

We can also add fits with Plotly’s R API. You can copy and paste the code below to make a plot with R in Plotly.

# Install devtools, then the plotly package from GitHub
install.packages("devtools")
devtools::install_github("ropensci/plotly")
library(plotly)
library(ggplot2)

# Open a plotly connection with a username and API key
py <- plotly(username="r_user_guide", key="mw5isa4yqp")

# Scatter plot of the built-in mtcars data with a smoothed fit
p <- ggplot(mtcars, aes(qsec, wt))
p + stat_smooth() + geom_point()

# Send the ggplot to Plotly
py$ggplotly()

We’re @plotlygraphs and would love to hear your thoughts and feedback.
