Scary Derivatives and Scary XML in R

July 7, 2011

(This article was first published on Timely Portfolio, and kindly contributed to R-bloggers)

I need some new R skills, and there is no better motivation to learn XML in R than one of the scariest financial datasets out there—the US Department of the Treasury Office of the Comptroller of the Currency (OCC) Quarterly Derivatives Report. I’ll move to the even scarier global BIS derivatives data in a later post to avoid inducing nightmares and sleepless nights.

The massive damage caused in 2008-2009 by the sliver of derivatives called credit default swaps seems like a faint warning siren when we see that they only represent < 7% of total derivatives exposure.  Interest rate and currency derivatives, also where I think the next disaster occurs, are more than 10 times larger than these credit contracts at $226 Trillion.  While some take comfort in

“Bilateral Netting: A legally enforceable arrangement between a bank and a counterparty that creates a single legal obligation covering all included individual contracts. This means that a bank‟s receivable or payable, in the event of the default or insolvency of one of the parties, would be the net sum of all positive and negative fair values of contracts included in the bilateral netting arrangement.”

bilateral netting does not work so well when your counterparty fails like Lehman Brothers.

From TimelyPortfolio

I know there must be a better way to read Microsoft Excel XML spreadsheets in R, but these were my two most elegant methods.  Please let me know how I could have done better.

R code (click to download):

#read xml derivatives data from the
#US Treasury OCC Quarterly Derivatives Report
#2 methods of as far as I can tell
#the first published example of how to read
#Microsoft Excel xml workbooks   require(XML)
require(ggplot2)   url = ""
#url = "dq410-xml.xml"   doc = xmlInternalTreeParse("dq410-xml.xml")
#define namespaces
#figuring this out took hours
#but using getNodeSet was much cleaner than the
#next method
namespaces = c(o="urn:schemas-microsoft-com:office:office",
ns <- getNodeSet(doc,"/ss:Workbook/ss:Worksheet[@ss:Name='Table 3']/ss:Table/ss:Row",namespaces)[[44]]
pct <- df <-, xmlValue))[5:8])
ns <- getNodeSet(doc,"/ss:Workbook/ss:Worksheet[@ss:Name='Table 3']/ss:Table/ss:Row",namespaces)[[8]]
lab <- as.character(xmlSApply(ns, xmlValue)[8:11])
df <- cbind(lab,df)
#jpeg(filename="derivatives by type.jpg",quality=100,
# width=6.25, height = 8, units="in",res=96)
barplot(df[,2],names.arg = df[,1],main="US Bank Derivatives by Type
Q4 2010"
,ylab="% of Total")
mtext("Source: US Dept of Treasury OCC Quarterly Derivatives Report",
side=1,line=3,cex=0.8,adj=0)       #here is the original very hacked brute force method
doc = xmlRoot(xmlTreeParse(url))
#set xmlNodeList to Table 3
#hacked with brute force
table3 <- doc[6]$Worksheet
df <-
xmlApply(table3[[1]],function(x) xmlApply(x,xmlValue))[53]$Row[5:8])),
df <- cbind(
as.character(xmlApply(table3[[1]],function(x) xmlApply(x,xmlValue))[17]$Row[c(8:11)]),
colnames(df) <- c("DerivativeType","PctOfTotal")
barplot(df[,2],names.arg = df[,1],main="US Bank Derivatives by Type
Q4 2010"
,ylab="% of Total")
mtext("Source: US Dept of Treasury OCC Quarterly Derivatives Report",

Created by Pretty R at

To leave a comment for the author, please follow the link and comment on their blog: Timely Portfolio. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: ,

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Dommino data lab

Quantide: statistical consulting and training



CRC R books series

Six Sigma Online Training

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)