Articles by Unknown

K is for Keep or Drop Variables

April 13, 2020 | Unknown

A few times in this series, I've wanted to display only part of a dataset, such as key variables like Title, Rating, and Pages. The tidyverse lets you easily keep or drop variables, either temporarily or permanently, with the select function. For inst...
[Read more...]

J is for Join

April 11, 2020 | Unknown

Today, we'll start digging into the wonderful world of joins! The tidyverse offers several different types of joins between two datasets, X and Y: left_join - keeps all rows from X and adds columns from Y to any that match cases in X; if there is no matching record ...
[Read more...]

I is for I Want to Learn More

April 10, 2020 | Unknown

This could easily have been a post about a function beginning with the letter I. But I wanted to take the opportunity to share some of the resources that really helped me learn R as well as I have. Obviously, practice and looking things up on stackoverflow and github as I ...
[Read more...]

H is for haven

April 9, 2020 | Unknown

The tidyverse includes many packages meant to make importing, wrangling, analyzing, and visualizing data easier. The haven package allows you to import files from other statistical software, such as SPSS, SAS, and Stata. I learned SPSS in college an...
[Read more...]

G is for group_by

April 8, 2020 | Unknown

For the letter G, I'd like to introduce a very useful function: group_by. This function lets you group data by one or more variables. By itself, it may not seem very useful, but it's great when you start manipulating and summarizing data. That's becaus...
[Read more...]

F is for filter

April 7, 2020 | Unknown

For the letter F - filters! Filters are incredibly useful, especially when combined with the main pipe %>%. I frequently use filters along with ggplot functions, to chart a specific subgroup or remove missing cases or outliers. As one example, I could use a filter to chart only fiction books from ...
[Read more...]

E is for Exposition Pipe

April 6, 2020 | Unknown

For the letter E, I want to talk about a set of operators provided by the tidyverse (specifically the magrittr package) that make for much prettier, easier-to-read code: pipes. The main pipe %>% pushes the object to the left of it forward into function...
[Read more...]

D is for dummy_cols

April 4, 2020 | Unknown

For the letter D, I'm going to talk about the dummy_cols function, which isn't actually part of the tidyverse, but hey: my posts, my rules. This function is incredibly useful for creating dummy variables, which are used in a variety of ways, including...
[Read more...]

C is for coalesce

April 3, 2020 | Unknown

For the letter C, we'll talk about the coalesce function. If you're familiar with SQL, you may have seen this function before. It combines two or more variables into a single column, and is a way to deal with missing data. When you give it a list of va...
[Read more...]

B is for bind_rows

April 2, 2020 | Unknown

Moving on to the letter B, today we'll talk about merging datasets that contain the same variables but add new cases. This is easily done with bind_rows. Let's say I realized I forgot to log some of the books I read last year, and I wanted to merge tho...
[Read more...]

A is for arrange

April 1, 2020 | Unknown

The arrange function allows you to sort a dataset by one or more variables, in either ascending or descending order. This function is especially helpful if you plan on aggregating your data with summarize (which we'll get to later), so you can select specific r...
[Read more...]

Blogging A to Z: The A to Z of tidyverse

March 31, 2020 | Unknown

Announcing my theme for this year's blogging A to Z! The tidyverse is a set of R packages for data science. The big thing about the tidyverse is making sure your data are tidy. What does that mean? Each row is an observation; each column is a variable; each ...
[Read more...]

Visualizing the Tallest Building in Each State

February 13, 2020 | Unknown

Via Digg: This data visualization, put together by takeasecond on Reddit, shows the tallest building in all 50 states in 2020. As the graph demonstrates, the current tallest building in America is New York's One World Trade Center at 1,776 feet tall. In...
[Read more...]

Statistics Sunday: Mixed Effects Meta-Analysis

July 8, 2018 | Unknown

As promised, how to conduct mixed effects meta-analysis in R: Code used in the video is available here. And I'd recommend the following posts to provide background for this video: What is meta-analysis?; Introduction to effect sizes; Variance and weights in ...
[Read more...]