Spring Cleaning Data: 2 of 6 - Changing Column Names and Adding a Column
In the first post (found here) we downloaded the data and imported it into R using the gdata package. In this post we will change the column names to make them more reasonable and add a quarter variable. The column names need changing because those in the dw.2010.q1 file are messed up by the formatting done in Excel. If I was going to have to change one file's names, I might just as well change them all, so I did.
The first chunk of code defines the labels I am going to use as c.label. Then I used the colnames() function to rename the columns of each file.
#Defining the new labels
c.label<-c('loan.date', 'mat.date', 'term',
'repay.date', 'district', 'borrower', 'city',
'state', 'ABA', 'type.credit', 'i.rate',
'comm.real', 'consumer', 'treasury',
'municipal', 'corp', 'mbs.cmo',
#Changing the column names
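As a minimal sketch of the renaming step, assume a small made-up data frame, dw.toy, standing in for one of the imported files, and a shortened label vector, toy.label. Both names and all values are hypothetical and are not the actual discount window data; the same pattern would apply to each real file with the full c.label vector.

```r
# Toy stand-in for one imported quarterly file; the messy names mimic
# what an Excel-formatted header might produce (hypothetical data).
dw.toy <- data.frame(
  "Loan date"     = c("2010-01-04", "2010-01-05"),
  "Maturity date" = c("2010-01-05", "2010-01-06"),
  "Term"          = c(1, 1),
  check.names = FALSE
)

# Shortened label vector for this sketch only; the real c.label needs
# one entry per column of the full data set.
toy.label <- c('loan.date', 'mat.date', 'term')

# colnames<- expects exactly one label per column, so check first.
stopifnot(length(toy.label) == ncol(dw.toy))

# Replace all column names in one assignment.
colnames(dw.toy) <- toy.label
print(colnames(dw.toy))
```

Checking the label count against ncol() before assigning is a cheap way to catch a file whose layout differs from the others.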
I also like to add a few extra variables when I see a potential need for them. At this point the files are still separate, so adding a quarter variable to each one might be helpful. Sure, I could write a loop to create the new column based on the month of the date, but I like to keep things as simple as possible. Why add complexity when there is no reason to? I used ABA to define the length of the data set because it did not have any missing values, while other columns did. The new column is named qtr, and the function rep() is used to repeat the quarter number along the length of the ABA column.
#defining a quarter variable for future use, so I can
#isolate quarters to compare and contrast
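A minimal sketch of the quarter column, assuming a made-up data frame dw.toy.q1 in place of the real first-quarter file. The data frame name and its values are hypothetical; only the qtr column name, the ABA column, and the use of rep() come from the description above.

```r
# Toy stand-in for the first-quarter file (hypothetical values).
dw.toy.q1 <- data.frame(
  ABA    = c("000000001", "000000002", "000000003"),
  i.rate = c(0.5, NA, 0.75)
)

# Repeat the quarter number once per row, taking the length from ABA.
dw.toy.q1$qtr <- rep(1, length(dw.toy.q1$ABA))

print(dw.toy.q1)
```

The same line would be repeated for the other files with the quarter number changed. Note that nrow(dw.toy.q1) would give the same count, since the length of a column includes any missing values.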
To leave a comment for the author, please follow the link and comment on their blog: OutLie..R