Some useful bar plots using R

September 14, 2010

(This article was first published on stattler.com - R, and kindly contributed to R-bloggers)

In this article I show how to produce bar plots using R. Many of my friends think SPSS is the most useful software for producing plots and keep using it (some of them even use Excel!).
My goal is to show that R can produce every type of graph that commercial software can. In fact, it does much better than simple point-and-click packages, because R gives us much finer control over our analysis.

The data I will use are:

   sex income   district
female     21      dhaka
  male     11      dhaka
  male     43 chittagong
female     22      dhaka
  male     56    barisal
female     23    barisal
female     66      dhaka
  male     76      dhaka
female     11 chittagong
female     89      dhaka

This is not real data; I made it up just to experiment with R code.
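The article does not show how the data were loaded, so here is a minimal sketch of one way to recreate them. It assumes the data frame is called `data`, the name the later code uses; the values are copied from the table above, and the use of `factor()` for the two grouping columns is my own choice rather than something the article states.

```r
# Recreate the example data set as a data frame.
# The name `data` is assumed from the article's later code.
data <- data.frame(
  sex = factor(c("female", "male", "male", "female", "male",
                 "female", "female", "male", "female", "female")),
  income = c(21, 11, 43, 22, 56, 23, 66, 76, 11, 89),
  district = factor(c("dhaka", "dhaka", "chittagong", "dhaka", "barisal",
                      "barisal", "dhaka", "dhaka", "chittagong", "dhaka"))
)

# Quick look at the structure: 10 rows, two factors and one numeric column
str(data)
```

With the data in this form, `data$income`, `data$sex` and `data$district` can be used directly in the commands that follow.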

Now, I want to produce a bar plot where 'sex' is the category axis and the clustered bars represent the mean and median 'income', i.e. the plot that SPSS produces with the command:

graph
/bar=mean(income) median(income) by sex.

So, first I calculate the mean and median 'income' for males and females, and then plot them side by side.

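# mean and median income by sex
m1<-tapply(data$income,data$sex,mean)
m2<-tapply(data$income,data$sex,median)
# rows = statistic, columns = sex
r<-rbind(m1,m2)
# beside=TRUE draws clustered bars; b stores the bar midpoints
b<-barplot(r,col=c("green","blue"),ylim=c(0,65),beside=TRUE)
legend("topleft",c("mean","median"),col=c("green","blue"),pch=15)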

Then, if I want to print the value each bar represents above it, the code is:

text(x=b,y=c(r),labels=round(c(r),2),pos=3,col="black",cex=1.25)

And the plot is: plot1.png
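The labelling step works because `barplot()` returns the x positions of the bar midpoints, and with `beside=TRUE` those come back as a matrix with the same layout as the input. The sketch below checks this on the example data; the data frame is rebuilt from the table above so the block runs on its own, and the row names `mean` and `median` are my own labels, not the article's.

```r
# Rebuild the two columns needed here (values copied from the article's table)
data <- data.frame(
  sex = c("female", "male", "male", "female", "male",
          "female", "female", "male", "female", "female"),
  income = c(21, 11, 43, 22, 56, 23, 66, 76, 11, 89)
)

# 2 x 2 matrix: rows = statistic, columns = sex
r <- rbind(mean = tapply(data$income, data$sex, mean),
           median = tapply(data$income, data$sex, median))

# With beside = TRUE, barplot() returns a matrix of bar midpoints
b <- barplot(r, beside = TRUE, ylim = c(0, 65))

# Both matrices have one cell per bar, in the same column-major order
print(dim(r))
print(dim(b))

# So the i-th midpoint lines up with the i-th bar height
text(x = as.vector(b), y = as.vector(r),
     labels = round(as.vector(r), 2), pos = 3)
```

Because the two matrices line up cell by cell, there is no need to index the bars one at a time when adding labels.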

Next, I want a more complex plot: a bar plot showing the mean income for each of the three districts, separately for males and females, i.e. the plot that SPSS produces with the command:

graph
/bar=mean(income) by sex by district.

For the required summary statistics I used the 'Epi' package, which produces a very useful summary table with the following command:

library(Epi)
s<-stat.table(list(district,sex),contents=list(mean(income)),data=data)

The resulting table is:

 ----------------------------- 
             -------sex------- 
 district      female    male  
 ----------------------------- 
 barisal        23.00   56.00  
 chittagong     11.00   43.00  
 dhaka          49.50   43.50  
 -----------------------------

Of course, these statistics can also be computed with a few lines of base R code instead of 'stat.table', but I used it to keep the code short.
Then I ran the following commands:

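# the first three cells of s hold the female means, the last three the
# male means (districts in alphabetical order)
female<-c(s[1],s[2],s[3])
male<-c(s[4],s[5],s[6])
r<-cbind(female,male)
row.names(r)<-c('barisal','chittagong','dhaka')
# one cluster of bars per sex, one colour per district
b<-barplot(r,col=c("red","green","yellow"),beside=TRUE,ylim=c(0,60))
legend("topleft",c("barisal","chittagong","dhaka"),col=
c("red","green","yellow"),pch=15,bty="n")
# print each mean above its bar
text(x=b,y=c(r[1:6]),labels=c(r[1:6]),cex=1.25,pos=3)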

And the plot is: hmmm.jpeg
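As mentioned earlier, the same summary can be computed without 'stat.table'. The following is a sketch of one base R alternative, not the code used for the plot above: `tapply()` with two grouping factors returns the district-by-sex matrix of means directly, so the cells do not have to be picked out by position. The data frame is rebuilt from the table at the top of the article so that the block runs on its own, and the name `cols` is introduced here for illustration.

```r
# Rebuild the example data (values copied from the article's table)
data <- data.frame(
  sex = c("female", "male", "male", "female", "male",
          "female", "female", "male", "female", "female"),
  income = c(21, 11, 43, 22, 56, 23, 66, 76, 11, 89),
  district = c("dhaka", "dhaka", "chittagong", "dhaka", "barisal",
               "barisal", "dhaka", "dhaka", "chittagong", "dhaka")
)

# Mean income for every district/sex combination:
# rows = district, columns = sex
r <- tapply(data$income, list(data$district, data$sex), mean)
print(r)

# Same clustered bar plot as before, with labels taken from the matrix itself
cols <- c("red", "green", "yellow")
b <- barplot(r, col = cols, beside = TRUE, ylim = c(0, 60))
legend("topleft", legend = rownames(r), col = cols, pch = 15, bty = "n")
text(x = as.vector(b), y = as.vector(r), labels = as.vector(r),
     cex = 1.25, pos = 3)
```

Taking the legend labels from `rownames(r)` keeps them in step with the bars if the set of districts changes.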

I hope this code will be useful to those who want to do every type of statistical work (including producing graphs) in R.
