Spring Cleaning Data: 2 of 6 - Changing Column Names and Adding a Column

April 9, 2013

(This article was first published on OutLie..R, and kindly contributed to R-bloggers)

In the first post (found here), we downloaded the data and imported it into R using the gdata package. In this post we will change the column names to make them more reasonable and add a quarter variable. The column names need changing because those in the dw.2010.q1 file are messed up due to the formatting done in Excel. If I was going to have to change one, I might just as well change them all, so I did.
The first chunk of code defines the labels I am going to use as c.label. Then I used the colnames() function to rename the columns of each file.

#Defining the new labels
c.label <- c('loan.date', 'mat.date', 'term',
             'repay.date', 'district', 'borrower', 'city',
             'state', 'ABA', 'type.credit', 'i.rate',
             'amount', 'outstanding.credit',
             'total.outstanding', 'collateral',
             'commercial', 'residential.morg',
             'comm.real', 'consumer', 'treasury',
             'municipal', 'corp', 'mbs.cmo',
             'mbs.cmo.other', 'asset.backed',
             'internat', 'tdfd')
#Changing the column names
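The renaming step itself is not shown in the listing above, so here is a minimal sketch of what it might look like. It uses a tiny made-up data frame (toy, with three hypothetical columns) and a shortened label vector (toy.label) rather than the actual files, so treat it as an illustration of colnames() and not as the original code.

```r
# Toy stand-in for one imported quarterly file (hypothetical data);
# the X1, X2, X3 names mimic what Excel formatting can leave behind
toy <- data.frame(X1 = c("2010-07-01", "2010-07-02"),
                  X2 = c("2010-07-02", "2010-07-06"),
                  X3 = c(1, 4))

# Shortened label vector for the toy example; the real c.label has 27 entries
toy.label <- c('loan.date', 'mat.date', 'term')

# The assignment needs exactly one label per column, so check first
stopifnot(length(toy.label) == ncol(toy))

# Replace all column names in one step
colnames(toy) <- toy.label

# Inspect the result
colnames(toy)
```

With the real data, the same assignment would presumably be repeated once per file, along the lines of colnames(dw.2010.q1) <- c.label, provided each file has the same 27 columns in the same order.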

I also like to add a few extra variables when I see a potential need for them. At this point the files are still separate, so adding a quarter variable now might be helpful later. Sure, I could write a loop to create the new column based on the month of the date, but I like to keep things as simple as possible. Why add complexity when there is no reason to? I used ABA to define the length of the new column because it did not have any missing values, while other columns did. The new column is named qtr, and the rep() function repeats the quarter number once for each entry in the ABA column.

#Defining a quarter variable for future use, so I can
#isolate quarters to compare and contrast
dw.2010.q3$qtr <- rep(3, length(dw.2010.q3$ABA))
dw.2010.q4$qtr <- rep(4, length(dw.2010.q4$ABA))
dw.2011.q1$qtr <- rep(1, length(dw.2011.q1$ABA))
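Two small variations on this step, sketched on a made-up data frame (toy and its values are hypothetical). First, length() counts NA entries too, so any column has the same length as the number of rows and nrow() can be used directly. Second, the quarter can be derived from the date without a loop, assuming the date column has already been parsed as a Date.

```r
# Toy data frame (hypothetical values) with one missing ABA entry
toy <- data.frame(
  loan.date = as.Date(c("2010-07-01", "2010-08-15", "2010-09-30")),
  ABA = c(101, NA, 103)
)

# length() includes NA entries, so it matches the row count
length(toy$ABA) == nrow(toy)

# Same effect as rep(3, length(toy$ABA)), using the row count directly
toy$qtr <- rep(3, nrow(toy))

# Vectorised alternative: compute the quarter from the month.
# as.POSIXlt()$mon is zero-based (January is 0), hence the + 1
toy$qtr.from.date <- as.POSIXlt(toy$loan.date)$mon %/% 3 + 1

toy
```

The fixed-number approach in the post stays the simplest option while each file holds exactly one quarter; the date-based version only becomes useful once the files are combined.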
