How to convert contingency tables to data frames with R

[This article was first published on Rronan » R, and kindly contributed to R-bloggers.]

I wanted to write contingency tables in HTML with hwrite(). I realized that hwrite() has no method for table objects. I could convert the table to a data frame directly, but the table produced is non-intuitive. A search on R-bloggers quickly turned up a function that solves my problem.

The contingency table

A contingency table is a display format used to record and analyse the relationship between two categorical variables. As an example, we use two variables from the state data sets included in R (see ?state). The two variables are \(x\) (state.division) and \(y\) (state.region).

state.division
state.region
nlevels(state.division)
nlevels(state.region)

These two variables have \(r = 9\) and \(s = 4\) levels, respectively. The contingency table therefore contains \((r + 1) \times (s + 1) - 1 = 49\) informative cells.
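The count above can be checked directly in R. This is a minimal sketch; it assumes one reading of the formula, namely that the header row and header column are counted and the empty top-left corner cell is not. The name `informative_cells` is mine.

```r
# Number of levels of each factor (both ship with base R's datasets package)
r <- nlevels(state.division)  # divisions
s <- nlevels(state.region)    # regions

# Header row and header column included, minus the empty top-left corner cell
informative_cells <- (r + 1) * (s + 1) - 1
informative_cells
```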

The contingency table will show the number of times each combination of state.division and state.region appears.

(MyTable <- table(state.division, state.region))
##                     state.region
## state.division       Northeast South North Central West
##   New England                6     0             0    0
##   Middle Atlantic            3     0             0    0
##   South Atlantic             0     8             0    0
##   East South Central         0     4             0    0
##   West South Central         0     4             0    0
##   East North Central         0     0             5    0
##   West North Central         0     0             7    0
##   Mountain                   0     0             0    8
##   Pacific                    0     0             0    5

Contingency tables in R are objects of class table. They are not handled the same way as objects of class data.frame, and some methods available for data.frame are not available for table (e.g. hwrite()). In fact, converting a contingency table to a data frame directly gives non-intuitive results:
state.division      state.region   Freq
New England         Northeast         6
Middle Atlantic     Northeast         3
South Atlantic      Northeast         0
East South Central  Northeast         0
West South Central  Northeast         0
East North Central  Northeast         0
West North Central  Northeast         0
Mountain            Northeast         0
Pacific             Northeast         0
New England         South             0
Middle Atlantic     South             0
South Atlantic      South             8
East South Central  South             4
West South Central  South             4
East North Central  South             0
West North Central  South             0
Mountain            South             0
Pacific             South             0
New England         North Central     0
Middle Atlantic     North Central     0
South Atlantic      North Central     0
East South Central  North Central     0
West South Central  North Central     0
East North Central  North Central     5
West North Central  North Central     7
Mountain            North Central     0
Pacific             North Central     0
New England         West              0
Middle Atlantic     West              0
South Atlantic      West              0
East South Central  West              0
West South Central  West              0
East North Central  West              0
West North Central  West              0
Mountain            West              8
Pacific             West              5

Here, the same information is presented in a table of \(3 \times r \times s = 108\) cells. Each level of \(x\) is written \(s\) times, and each level of \(y\) is written \(r\) times.
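The long layout can be reproduced as follows. This is a sketch that assumes the direct conversion meant here is base R's as.data.frame(), which dispatches to the method for class table; the name `LongTable` is mine.

```r
MyTable <- table(state.division, state.region)

# as.data.frame() on a table returns one row per combination of levels,
# with the count stored in a column named Freq
LongTable <- as.data.frame(MyTable)

class(MyTable)      # "table"
dim(LongTable)      # r * s rows and 3 columns
head(LongTable, 3)
```

Multiplying the two dimensions of `LongTable` gives the 108 cells mentioned above.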

To convert a table to a data.frame while keeping its original structure, you must use a different conversion function. This is probably the only situation in which this obscure function would be used.
                    Northeast  South  North Central  West
New England                 6      0              0     0
Middle Atlantic             3      0              0     0
South Atlantic              0      8              0     0
East South Central          0      4              0     0
West South Central          0      4              0     0
East North Central          0      0              5     0
West North Central          0      0              7     0
Mountain                    0      0              0     8
Pacific                     0      0              0     5
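One base R function that produces this wide layout is as.data.frame.matrix(). That it is the function intended here is my assumption. The sketch below calls it directly on the table; the name `WideTable` is mine.

```r
MyTable <- table(state.division, state.region)

# Calling the matrix method directly treats the two-way table as a matrix:
# row levels become row names, column levels become columns
WideTable <- as.data.frame.matrix(MyTable)

class(WideTable)               # "data.frame"
dim(WideTable)                 # one row per division, one column per region
WideTable["Pacific", "West"]   # count for a single cell
```

Because the result is an ordinary data frame, it can be passed to any function that expects one.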

If you are fussy, you might notice that the variable names do not appear in contingency tables written with hwrite(). This can cause problems if the levels do not have explicit names (e.g., a variable encoded \(1, 2, \ldots, r\)). In that case, remember to identify your variables by adding a caption to your table.
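A small sketch of the problem, using hypothetical integer-coded data (`x`, `y` and every other name below are made up for illustration). It assumes the wide conversion is done with as.data.frame.matrix(), and it only builds the caption text; how the caption is attached to the HTML table is left out.

```r
# Hypothetical data: two variables encoded as bare integers
x <- c(1, 1, 2, 3, 3, 3)
y <- c(1, 2, 2, 1, 1, 2)
CodedTable <- table(x, y)

# The wide data frame keeps only the level labels "1", "2", ...
WideCoded <- as.data.frame.matrix(CodedTable)
names(WideCoded)

# The variable names are still stored in the table's dimnames,
# so a caption can be built from them before they are lost
vars <- names(dimnames(CodedTable))
caption <- sprintf("Rows: %s, columns: %s", vars[1], vars[2])
caption
```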
