{Long Vs. Wide} Data Frames

July 24, 2015

(This article was first published on R - Data Science Heroes Blog, and kindly contributed to R-bloggers)


This is an excellent resource for understanding the two data frame formats: long and wide.

  • Just take a look at figure 1 inside the article

1) Long format: ggplot2 needs this format in certain scenarios (generally grouped plots).
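As a rough sketch of what "long" means in practice, the snippet below stacks a small wide table into long form with melt() from the reshape2 package discussed further down. The data frame and the column names (customer, q1, q2, quarter, sales) are made up for illustration, and reshape2 is assumed to be installed.

```r
library(reshape2)

# Hypothetical wide data: one row per customer, one column per quarter.
wide <- data.frame(
  customer = c("a", "b"),
  q1 = c(10, 20),
  q2 = c(15, 25)
)

# melt() keeps the id column and stacks the remaining columns
# into one (quarter, sales) pair per row.
long <- melt(
  wide,
  id.vars = "customer",
  variable.name = "quarter",
  value.name = "sales"
)

print(long)
```

With the data in this shape, a grouped plot can map the stacked column to an aesthetic, for example aes(x = quarter, y = sales, fill = customer) in ggplot2.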

2) Wide format: On the other hand, transactional data usually arrives in long format, and you need to convert it to wide format in order to create a predictive model.

Here, each row represents a case, and each column an attribute/variable. This is the classical input for building a clustering or predictive model.
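The conversion from transactional (long) data to one row per case can be sketched with dcast() from reshape2. The transactions table, its column names, and the choice to sum amounts per customer and product are assumptions made for this example, not something the original article prescribes.

```r
library(reshape2)

# Hypothetical transactional data: one row per purchase.
transactions <- data.frame(
  customer = c("a", "a", "b", "b", "b"),
  product  = c("book", "pen", "book", "book", "pen"),
  amount   = c(12, 3, 10, 8, 2)
)

# One row per customer, one column per product.
# Repeated customer/product pairs are aggregated with sum().
wide_tx <- dcast(
  transactions,
  customer ~ product,
  value.var = "amount",
  fun.aggregate = sum
)

print(wide_tx)
```

Each row of wide_tx is now a single customer with one numeric column per product, which is the shape most clustering and modelling functions expect.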

R Library

The most commonly used library for this is “reshape2”. How does it differ from “reshape”?

The package author said:

“Reshape2 is a reboot of the reshape package. It’s been over five years since the first release of the package” … “reshape2 uses that knowledge to make a new package for reshaping data that is much more focused and much much faster.”

Happy transforming!

Data Science Heroes
