So far the individual files have been kept separate. It is now time to combine them with the rbind() function, which is simple enough after all we have done so far, followed by a quick check with summary().
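As a rough illustration of the combining step, here is a minimal sketch. The data frames dw.2010 and dw.2011 and the values in them are made up for this example; only the name dw and the column names come from this post. rbind() expects every piece to have the same column names.

```r
# Hypothetical per-file data frames (assumed names and values, for illustration only)
dw.2010 <- data.frame(loan.date = c("Jan 05 2010", "Feb 17 2010"),
                      type.credit = c("Primary Credit", "Seasonal Credit"),
                      stringsAsFactors = FALSE)
dw.2011 <- data.frame(loan.date = "Mar 02 2011",
                      type.credit = "Secondary Credit",
                      stringsAsFactors = FALSE)

# Stack the rows; the column names must match across the pieces
dw <- rbind(dw.2010, dw.2011)

# Quick check of the combined data
summary(dw)
nrow(dw)
```

With many files, the same idea can be written as do.call(rbind, list.of.frames), where list.of.frames is a list holding one data frame per file.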
#Changing the date variables, then
#isolating the year variable for later use
library(stringr)
dw$loan.date<-as.Date(dw$loan.date, '%b %d %Y')
dw$mat.date<-as.Date(dw$mat.date, '%b %d %Y')
dw$repay.date<-as.Date(dw$repay.date, '%b %d %Y')
The code below assumes the dates have already been changed to the R default of YYYY-MM-DD. For the year, I selected the first 4 characters using the str_sub() function and turned the result into a numerical value with as.numeric(). The year and month variable follows a similar process, except that I keep the first 7 characters to get both, and I made it a factor for easier sorting and categorizing.
#Create a year variable
dw$year<-as.numeric(str_sub(dw$loan.date, start=1, end=4))
#Create a year and month variable
dw$year.month<-as.factor(str_sub(dw$loan.date, start=1, end=7))
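To see what this kind of extraction produces, here is a small sketch on made-up dates. The vector x is hypothetical, and the sketch uses base R's substr() and format() in place of str_sub(), so it runs without stringr; it is an alternative route to the same result, not the code used in this post.

```r
# Made-up dates, already in R's default YYYY-MM-DD form
x <- as.Date(c("2010-01-05", "2010-02-17", "2011-03-02"))

# Characters 1 to 4 of the printed date are the year
yr <- as.numeric(substr(as.character(x), 1, 4))

# Characters 1 to 7 are the year and month together
ym <- as.factor(substr(as.character(x), 1, 7))

# Same pieces without counting characters: format the Date directly
yr.fmt <- as.numeric(format(x, "%Y"))
ym.fmt <- as.factor(format(x, "%Y-%m"))

# Both routes should agree
identical(yr, yr.fmt)
identical(ym, ym.fmt)
```

The format() version does not depend on character positions, which may be easier to read later on.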
The next step is to change the credit type to something simpler for tables and graphs. I used gsub(), one of the most interesting and fun functions I never knew existed until I did this. Basically, it looks for a pattern in a string and replaces every match with another string. For this data I wanted to replace "Primary Credit" with "primary", and likewise for the seasonal and secondary types, because it makes things so much easier for graphs and tables. Then I changed the variable to a factor.
#Changing the type of credit to one word
dw$type.credit<-with(dw, gsub("Primary Credit", 'primary', type.credit))
dw$type.credit<-with(dw, gsub("Seasonal Credit", 'seasonal', type.credit))
dw$type.credit<-with(dw, gsub("Secondary Credit", 'secondary', type.credit))
#change to factor
dw$type.credit<-as.factor(dw$type.credit)
summary(dw)
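Since all three labels end in " Credit", the three replacements could probably be collapsed into one step. The sketch below is an alternative, not the code used above, and it runs on a made-up vector rather than the real data. It assumes every value ends in " Credit"; any value that does not would only be lower-cased.

```r
# Made-up values using the three labels from the data
type.credit <- c("Primary Credit", "Seasonal Credit",
                 "Secondary Credit", "Primary Credit")

# Drop the trailing " Credit", then lower-case what is left
short <- tolower(sub(" Credit$", "", type.credit))

# Convert to a factor and count each type
short <- as.factor(short)
table(short)
```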