Blog Archives

Comparing a MySQL Query with a Data Table in R

December 2, 2016

Data tables are becoming an increasingly popular way of working with data sets in R. The syntax can become rather complex, but the framework is much faster and more flexible than other methods. The basic structure of a data table, though, is fairly intuitive, as it corresponds quite nicely with SQL queries in relational database

Read more »

How to Summarize a Data Frame by Groups in R

November 30, 2016

Sometimes, when you’re analyzing a data set and you want to get a complete picture of it, you want to calculate the metrics on all the observations for each variable. Let’s say, for example, that you run a small zoo and want to inventory the cost of all your animals. To calculate this in a spreadsheet,

Read more »
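A minimal sketch of the kind of grouped summary the excerpt above describes, using base R's `aggregate()`. The `zoo` data frame, its column names, and its values are invented for illustration and are not taken from the post.

```r
# Hypothetical zoo inventory; animals and costs are made up for this sketch.
zoo <- data.frame(
  animal = c("lion", "lion", "zebra", "zebra", "zebra"),
  cost   = c(5000, 4500, 1200, 1100, 1300)
)

# Total and mean cost per animal type, one row per group.
total_by_animal <- aggregate(cost ~ animal, data = zoo, FUN = sum)
mean_by_animal  <- aggregate(cost ~ animal, data = zoo, FUN = mean)

print(total_by_animal)
print(mean_by_animal)
```

The formula `cost ~ animal` reads as "summarize cost within each animal"; swapping `FUN` changes the metric without touching the grouping.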

Create, Interpret, and Use a Linear Regression Model in R

November 29, 2016

In my last post, we looked at how to create a correlation matrix in R. Specifically, we used data pulled from the web to see which variables were most highly correlated with an automobile’s fuel economy. Suppose, however, that we are trying to guess the fuel economy of a new car without actually having driven

Read more »
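As a rough sketch of the idea in the excerpt above (predicting fuel economy for a car that has not been driven), here is a one-predictor linear model in base R. It assumes the built-in `mtcars` data set as a stand-in; the post's own data and choice of predictors may differ.

```r
# mtcars ships with R; using it here is an assumption, not the post's data.
# Model fuel economy (mpg) as a linear function of weight (wt, in 1000 lbs).
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficients, standard errors, and R-squared for interpretation.
print(summary(fit))

# Predict mpg for a hypothetical new car weighing 3,000 lbs.
new_car <- data.frame(wt = 3.0)
predicted_mpg <- predict(fit, newdata = new_car)
print(predicted_mpg)
```

The column name in `new_car` has to match the predictor name used in the formula, otherwise `predict()` cannot line the new data up with the model.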

Examine a Data Frame in R with 7 Basic Functions

November 29, 2016

When I first started learning R, it seemed way more complicated than what I was used to when looking at spreadsheets in Microsoft Excel. When I started working with data frames in R, it didn’t seem quite as easy to know what I was looking at. I’ve since come to see the light. While there is

Read more »

5 Ways to Subset a Data Frame in R

November 29, 2016

Often, when you’re working with a large data set, you will only be interested in a small portion of it for your particular analysis. So, how do you sort through all the extraneous variables and observations and extract only those you need? Well, R has several ways of doing this in a process it calls

Read more »

Nesting Functions in R with the Piping Operator

November 29, 2016

Some of the most useful (and most popular) tools in R are the functions available in the dplyr package. With functions like select, filter, arrange, and mutate, you can restructure a data set to get it looking just the way you want it. The problem is that doing so can take multiple steps. As a

Read more »

Create a Correlation Matrix in R

November 21, 2016

So, in my last post, I showed how to create two histograms from a certain data set and then how to plot the two variables to see if there is any relationship. Visually, it was easy to tell that there was a negative relationship between the weight of an automobile and the fuel economy of

Read more »
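The negative relationship between weight and fuel economy mentioned in the excerpt above can be checked numerically with `cor()`. This sketch assumes the built-in `mtcars` data set and an arbitrary handful of columns; the post's data set and variable selection are not shown in the excerpt.

```r
# mtcars ships with R; the column selection here is an illustrative assumption.
vars <- mtcars[, c("mpg", "wt", "hp", "disp")]

# Pairwise Pearson correlations, rounded for readability.
cor_matrix <- round(cor(vars), 2)
print(cor_matrix)
```

Each cell holds the correlation between the row variable and the column variable, so the matrix is symmetric with ones on the diagonal.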

Create Histograms and Scatter Plots in R for Exploratory Data Analysis

November 20, 2016

No matter how sophisticated you get with your statistical analysis, you’ll usually start off exploring your data the same way. If you’re looking at a single variable, you’ll want to create a histogram to look at the distribution. If you’re trying to compare two variables to see if there is a relationship between them, you’ll

Read more »
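A short sketch of the two plots the excerpt above mentions, using base R graphics. The built-in `mtcars` data set and the chosen variables are assumptions made for illustration.

```r
# mtcars ships with R; it stands in for whatever data set the post uses.
# Histogram: distribution of a single variable.
mpg_hist <- hist(mtcars$mpg,
                 main = "Distribution of fuel economy",
                 xlab = "Miles per gallon")

# Scatter plot: relationship between two variables.
plot(mtcars$wt, mtcars$mpg,
     main = "Weight vs. fuel economy",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")
```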

Create a Function in R to Calculate the Subtotal After Discounts and Taxes

November 19, 2016

One of the coolest things you can do in R is write custom functions to solve your own unique problems. I’m not sure I’m brave enough to try my hand at more complex functions with loops and conditionals and such but, for now, I thought I’d share something simple. Suppose you have a list of

Read more »
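For a sense of what such a custom function might look like, here is a minimal sketch. The function name `subtotal_after`, its arguments, and the choice to apply the discount before the tax are all assumptions for illustration, not the post's actual code.

```r
# Hypothetical helper: apply a fractional discount, then add sales tax.
# discount and tax are proportions (0.10 means 10%).
subtotal_after <- function(price, discount = 0, tax = 0) {
  discounted <- price * (1 - discount)
  discounted * (1 + tax)
}

# Made-up list of item prices; the function is vectorized over them.
prices <- c(100, 40, 25)
results <- subtotal_after(prices, discount = 0.10, tax = 0.08)
print(results)
```

Because R arithmetic is vectorized, the same function handles a single price or a whole column of prices without a loop.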

Use R to Combine Multiple Columns of Data into a Single Column Spread Out Across Rows

November 18, 2016

I work a lot with Pivot Tables in Microsoft Excel. A problem I often encounter is trying to analyze a spreadsheet in which data from a single variable has been spread out over many columns. In particular, this happens rather frequently with longitudinal data. If you are trying to look at data spread out across multiple

Read more »
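One way to sketch the wide-to-long restructuring described in the excerpt above is base R's `reshape()`. The data frame, its column names, and the values below are invented, and the post itself may use a different function or package.

```r
# Hypothetical longitudinal data: one row per subject, one column per year.
wide <- data.frame(
  id    = c(1, 2),
  y2014 = c(10, 20),
  y2015 = c(11, 21),
  y2016 = c(12, 22)
)

# Stack the yearly columns into a single "value" column,
# recording which year each value came from.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("y2014", "y2015", "y2016"),
  v.names   = "value",
  timevar   = "year",
  times     = c(2014, 2015, 2016),
  idvar     = "id"
)

# Tidy up row order and row names for display.
long <- long[order(long$id, long$year), ]
rownames(long) <- NULL
print(long)
```

The result has one row per subject per year, which is the layout a pivot table or a grouped summary expects.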
