# Lending Club – naive data analysis

November 8, 2011

(This article was first published on DataPunks.com » R, and kindly contributed to R-bloggers)

Dataspora recently analyzed Lending Club’s data geographically, using the data distributed by the site.

Lending Club describes itself this way: “Lending Club is an online financial community that brings together creditworthy borrowers and savvy investors so that both can benefit financially. We replace the high cost and complexity of bank lending with a faster, smarter way to borrow and invest.”

Lending Club’s returns are very attractive (from the lenders’ point of view), and, at the same time, the Club lets borrowers avoid the higher interest rates that banks charge for similar loans. There is obviously some risk associated with the high returns (such as the cost of recovering money after a payment default), and one can ask whether that risk is properly reflected in the terms of each loan.

A few obvious things to note from the cute box plots:

• Interest Rates get higher with worse Credit Ratings
• Revolving Line Utilization seems to be higher for worse Credit Ratings
• Interest Rates get higher with higher existing Revolving Line Utilization

Nothing really surprising so far. Some data on defaults is also available, and one could continue digging into the provided data to see if any pattern emerges.
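The trends read off the box plots can also be checked with a couple of numbers. What follows is only a minimal sketch, not part of the original analysis: it uses a small made-up data frame called `toy` (hypothetical values, not Lending Club records) with the same `rate`, `rul`, and `grade` columns that the script in this post builds, then computes the median rate per grade and a Spearman rank correlation between utilization bucket and rate.

```r
# Hypothetical toy data, NOT real Lending Club records; it only mimics
# the shape of the cleaned data frame (rate in %, rul in 10% buckets).
toy <- data.frame(
  grade = c("A", "A", "B", "B", "C", "C"),
  rate  = c(6.0, 7.0, 10.0, 11.0, 13.0, 14.0),
  rul   = c(1, 2, 3, 5, 6, 8)
)

# Median interest rate per credit grade (the centre line of each box).
med_rate <- aggregate(rate ~ grade, data = toy, FUN = median)
print(med_rate)

# Spearman rank correlation: positive when rate rises with utilization.
rho <- cor(toy$rul, toy$rate, method = "spearman")
print(rho)
```

On the real data, the same two calls could be pointed at the cleaned data frame instead of `toy`. A rank correlation seems a reasonable choice here because the utilization buckets are ordinal rather than continuous.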

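    # Load the Lending Club export, skipping the first line of the file.
    loans <- read.csv("data/lclub.csv", header = TRUE, skip = 1)

    # Keep only the fields needed for the plots, with "%" stripped.
    o <- data.frame(id = loans$Loan.ID)
    # Utilization is bucketed into bins of 10 percentage points.
    o$rul <- floor(as.numeric(gsub("%", "", loans$Revolving.Line.Utilization)) / 10)
    o$rate <- as.numeric(gsub("%", "", loans$Interest.Rate))
    o$grade <- loans$CREDIT.Grade

    # Rate by grade, rate by utilization bucket, utilization by grade.
    boxplot(rate ~ grade, data = o)
    boxplot(rate ~ rul, data = o)
    boxplot(rul ~ grade, data = o)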
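To make the cleanup step concrete, here is a small sketch of what the `gsub`/`as.numeric`/`floor` chain does to individual values. The sample strings are made up for illustration (the script's `gsub` call only implies that the columns carry a percent sign), and the helper name `pct_to_num` is hypothetical, not something from the original script.

```r
# Hypothetical helper: strip the percent sign and convert to a number.
# fixed = TRUE treats "%" as a literal character rather than a regex.
pct_to_num <- function(x) as.numeric(gsub("%", "", x, fixed = TRUE))

# Made-up sample values written as text with a trailing percent sign.
sample_util <- c("5.2%", "37.80%", "99.9%")

util <- pct_to_num(sample_util)   # 5.2 37.8 99.9
bucket <- floor(util / 10)        # 0 3 9: each bucket spans 10 percentage points

print(data.frame(sample_util, util, bucket))
```

Note that a utilization of exactly 100% would land in a bucket of its own (10), since `floor(100 / 10)` is 10.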