A line of best fit lets you model, predict, forecast, and explain data. This post shows how you can use a line of best fit to explain college tuition, rats, turkeys, burritos, and the NHL draft. Read on or see our tutorials for more. Contact us if you’re interested in a trial of plotly on-premise. Developers, scroll down to see Python and R.
1. A Linear Fit For College Tuition
How many hours would you have to work at minimum wage to pay for one credit hour of college? In 1979 it was around 8 hours. In 2013 it was 59 hours. Our linear fit picks the slope and y-intercept that best match the data, which shows us the trend.
Plotly calculates the mean squared error, the fit parameters (slope and y-intercept), and R², also known as the coefficient of determination. R² is a value from 0 to 1 showing how closely the fit models the data. In this case, R² is 0.9504, a close fit. Hover your mouse to see data; click and drag your mouse to zoom.
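If you're curious what those numbers mean, here is a minimal sketch of an ordinary least-squares line fit in plain Python. It is not Plotly's own code: the function name `linear_fit` and the sample points are ours, made up for illustration, and we're assuming the mean squared error is the sum of squared residuals divided by the number of points (some tools divide by a different count).

```python
# Ordinary least-squares line fit, written out by hand.
# The (x, y) pairs below are hypothetical illustration data, not the tuition dataset.

def linear_fit(xs, ys):
    """Return slope, intercept, mean squared error, and R^2 for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are the vertical gaps between each point and the fitted line.
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    mse = ss_res / n                  # assumption: divide by n
    r_squared = 1 - ss_res / ss_tot   # 1.0 means the line passes through every point
    return slope, intercept, mse, r_squared


xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept, mse, r2 = linear_fit(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} mse={mse:.4f} r2={r2:.4f}")
```

An R² near 1 means the residuals are small compared with the overall spread of the y values, which is what "a close fit" means in practice.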
2. NHL Players & Burritos With Gaussian Fit
A Gaussian fit looks like a bell curve. It shows how observations cluster around a central value and thin out toward either extreme.
The first histogram we’re fitting (click here for a histogram tutorial) shows the height of NHL players from the 2013 draft. Each bin shows how many players have a given height between 64.5 and 79.5 inches (our boundaries). For example, there are 36 players who are 71 inches tall. The fit adds a bell curve to the distribution.
We’ve also applied a Gaussian fit to study burritos (and burrito bowls) at Chipotle. The data shows what percentage of meals contain a given number of calories, with a Gaussian fit added to the plot.
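For a feel of how a bell curve can be matched to a histogram, here is a small sketch in plain Python. It uses the method of moments (take the mean and standard deviation of the binned data, then draw the matching Gaussian), which is a simple stand-in and not necessarily the procedure Plotly uses. The function names, bin centers, and counts are hypothetical and are not the draft or burrito data.

```python
import math


def gaussian_from_histogram(centers, counts):
    """Estimate mean and standard deviation from binned counts (method of moments)."""
    total = sum(counts)
    mean = sum(c * k for c, k in zip(centers, counts)) / total
    var = sum(k * (c - mean) ** 2 for c, k in zip(centers, counts)) / total
    return mean, math.sqrt(var)


def gaussian_curve(x, mean, sigma, total, bin_width):
    """Expected count for a bin centered at x under the fitted bell curve."""
    density = math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return total * bin_width * density


# Hypothetical bin centers (inches) and counts, for illustration only.
centers = [69, 70, 71, 72, 73, 74, 75]
counts = [5, 12, 20, 26, 20, 12, 5]

mean, sigma = gaussian_from_histogram(centers, counts)
fitted = [gaussian_curve(c, mean, sigma, sum(counts), 1.0) for c in centers]

for c, k, f in zip(centers, counts, fitted):
    print(f"{c} in: observed {k:3d}, fitted {f:6.2f}")
```

Scaling the density by the total count and the bin width puts the curve on the same vertical scale as the histogram bars, so the two can be overlaid directly.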
3. Polynomial Fits & Turkeys
The plot below models turkey growth. The researchers determined that a fourth-degree polynomial model is best for estimating the growth of the native Mexican turkey. A polynomial fit is a type of nonlinear fit (the fitted curve is not a straight line), and we can specify the degree of the fit (e.g., 4th).
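To show what "specifying the degree" involves, here is a sketch of a least-squares polynomial fit in plain Python. It builds the normal equations and solves them with Gaussian elimination. The helper names `polyfit` and `polyval` are ours, the data is synthetic (generated from a known fourth-degree polynomial, not the turkey measurements), and this approach is only reasonable for small degrees and modest x ranges; numerical libraries use more stable methods.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ..., a_degree] via the normal equations."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y], where A[i][j] = xs[i] ** j.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))]
           for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        acc = aug[i][m] - sum(aug[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = acc / aug[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate a0 + a1*x + a2*x**2 + ... at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Synthetic data from a known quartic, so we can check that the fit recovers it.
true_coeffs = [1.0, -2.0, 0.5, 0.3, -0.1]
xs = [-2 + 0.5 * i for i in range(9)]
ys = [polyval(true_coeffs, x) for x in xs]

fit = polyfit(xs, ys, degree=4)
print([round(a, 6) for a in fit])
```

Raising the degree always lowers the error on the points you fit, so the degree is a modeling choice rather than something the fit picks for you.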
4. Rat Populations
An exponential fit models exponential growth or decay. Rat populations, which can double every 47 days, are an example. The graph below estimates the size of a rat colony living in optimal conditions after three years, starting from a single pair of rats.
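The arithmetic behind that estimate can be sketched in a few lines of Python. We assume uninterrupted doubling every 47 days from two rats, and we fit the curve by running a least-squares line through the logarithm of the population, a common way to do an exponential fit but not necessarily the one Plotly uses. The function names and sampling interval are ours.

```python
import math

DOUBLING_DAYS = 47      # doubling time quoted in the text
START_POPULATION = 2    # a single pair of rats


def population(days):
    """Population after `days` of unchecked doubling every DOUBLING_DAYS days."""
    return START_POPULATION * 2 ** (days / DOUBLING_DAYS)


def exponential_fit(ts, ns):
    """Fit n = a * exp(b * t) by least squares on log(n); returns (a, b)."""
    logs = [math.log(n) for n in ns]
    k = len(ts)
    mean_t = sum(ts) / k
    mean_l = sum(logs) / k
    b = (sum((t - mean_t) * (l - mean_l) for t, l in zip(ts, logs))
         / sum((t - mean_t) ** 2 for t in ts))
    a = math.exp(mean_l - b * mean_t)
    return a, b


days = list(range(0, 3 * 365 + 1, 73))   # sample points across three years
sizes = [population(d) for d in days]

a, b = exponential_fit(days, sizes)
doubling_time = math.log(2) / b           # recover the doubling time from the fit

print(f"after three years: {population(3 * 365):,.0f} rats")
print(f"fitted start: {a:.2f}, fitted doubling time: {doubling_time:.1f} days")
```

Three years is a little over 23 doublings, so the colony ends up in the tens of millions. Real colonies run out of food and space long before that, which is why the text says "optimal conditions."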
We’re plotting the fit over a specific x range, one of Plotly’s advanced features:
5. Plotting With Plotly’s APIs
Plotly’s APIs let you build plots and add fits with Python, R, and MATLAB. The plot below shows the distribution of student grades with a Gaussian fit, and was made in an IPython Notebook.
We can also add fits with Plotly’s R API. You can copy and paste the code below to make a plot with R in Plotly.
install.packages("devtools")
library("devtools")
install_github("ropensci/plotly")  # install the plotly R package from GitHub
library(plotly)
py <- plotly(username="r_user_guide", key="mw5isa4yqp")  # open plotly connection
c <- ggplot(mtcars, aes(qsec, wt))  # scatter of quarter-mile time vs. weight
c + stat_smooth() + geom_point()  # add a smoothed fit on top of the points
py$ggplotly()  # send the last ggplot to plotly
We’re @plotlygraphs and would love to hear your thoughts and feedback.