Download and Parse NAREIT Data

[This article was first published on tradeblotter » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This is the first post of a series that describes how to download and parse specific data sets into R. These kinds of scripts can be functionalized further, but I doubt that these will ever find their way into a formal package. They are intended to be helpful to those facing similar tasks, but as demonstration scripts they will not be supported.

This is a script for downloading and parsing a monthly total return series of the FTSE NAREIT U.S. Real Estate Index. The spreadsheet can be downloaded from their website manually, but as you will see later, we don’t have to. Download the spreadsheet if you want to look at how it is structured.

This script can be found in the /inst/parser directory of FinancialInstrument on R-Forge.

After I parse the data, I want to store it and make it available for other R sessions to use. To do that, I’m going to use FinancialInstrument to register the location of the data so that I can retrieve it with getSymbols, much like I might do to retrieve data from Yahoo.

Let’s take a look at the script…

First, we load the required packages:


Warning: this requires the quantmod version to be a development build after revision 560 or so.
Next, I’ll set the storage directory, where I’ll store the data and access it through getSymbols.
filesroot = "~/Data/NAREIT"

Create and set the storage directory if it doesn’t exist.
if (!file.exists(filesroot))
  dir.create(filesroot, mode="0777")
if (!file.exists(paste(filesroot, "/NAREIT.IDX", sep="")))
  dir.create(paste(filesroot, "/NAREIT.IDX", sep=""), mode="0777")

The read.xls function from the package gdata will download the xls workbook directly from the web site.
x = read.xls("", pattern="Date", sheet="Data")

Let’s take a look at what we retrieved:
> head(x)
    Date Return  Index Return.1 Index.1 Return.2 Yield  X Return.3 Index.2 Return.4
1 Dec-71     NA 100.00       NA  100.00       NA    NA NA       NA  100.00       NA
2 Jan-72   1.22 101.22     0.33  100.33     0.89  6.51 NA     1.22  101.22     0.33
3 Feb-72   0.95 102.18     0.92  101.25     0.03  6.39 NA     0.95  102.18     0.92
4 Mar-72   0.25 102.44    -0.44  100.81     0.69  6.32 NA     0.25  102.44    -0.44
5 Apr-72   0.25 102.70    -0.39  100.41     0.65  6.52 NA     0.25  102.70    -0.39
6 May-72  -1.51 101.15    -1.78   98.63     0.27  7.10 NA    -1.51  101.15    -1.78
  Index.3 Return.5 Yield.1 X.1 Return.6 Index.4 Return.7 Index.5 Return.8 Yield.2 X.2
1  100.00       NA      NA  NA       NA      NA       NA      NA       NA      NA  NA
2  100.33     0.89    6.51  NA       NA      NA       NA      NA       NA      NA  NA
3  101.25     0.03    6.39  NA       NA      NA       NA      NA       NA      NA  NA
4  100.81     0.69    6.32  NA       NA      NA       NA      NA       NA      NA  NA
5  100.41     0.65    6.52  NA       NA      NA       NA      NA       NA      NA  NA
6   98.63     0.27    7.10  NA       NA      NA       NA      NA       NA      NA  NA
  Return.9 Index.6 Return.10 Index.7 Return.11 Yield.3 X.3 Return.12 Index.8 Return.13
1       NA  100.00        NA  100.00        NA      NA  NA        NA  100.00        NA
2     0.00  100.00     -1.35   98.65      1.35    6.13  NA      0.00  100.00     -1.35
3     1.74  101.74      1.74  100.37      0.00    5.74  NA      1.74  101.74      1.74
4    -0.32  101.42     -0.74   99.63      0.42    5.92  NA     -0.32  101.42     -0.74
5     3.84  105.31      3.24  102.85      0.60    5.72  NA      3.84  105.31      3.24
6    -9.37   95.44     -9.75   92.83      0.38    6.53  NA     -9.37   95.44     -9.75
  Index.9 Return.14 Yield.4 X.4 Return.15 Index.10 Return.16 Index.11 Return.17 Yield.5
1  100.00        NA      NA  NA        NA   100.00        NA   100.00        NA      NA
2   98.65      1.35    6.13  NA      1.24   101.24      0.87   100.87      0.38    6.10
3  100.37      0.00    5.74  NA      0.52   101.76      0.52   101.39      0.00    6.07
4   99.63      0.42    5.92  NA     -1.59   100.15     -2.74    98.61      1.16    6.02
5  102.85      0.60    5.72  NA      0.37   100.52     -0.10    98.51      0.47    6.45
6   92.83      0.38    6.53  NA      1.36   101.88      1.22    99.71      0.14    7.49
1  NA
2  NA
3  NA
4  NA
5  NA
6  NA

I’ll focus on the first three columns: the date, the total return, and the total return index. The next columns divide it into price, income and dividend yield components, and after that are the individual sectors. If you are interested, it would be trivial to loop through those additional columns and create symbols for the rest of the columns in the spreadsheet. But for now, I just want to parse the total return index.

First, I’m going to parse the dates. In the spreadsheet, these are formatted as month and year (strptime format is “%b-%y”), so they are not associated with a specific day. I want to transform these to reflect the last day of the month so that they are easier to align with other data I might use in a subsequent analysis.

To do that, I first use as.yearmon to parse the dates from “%b-%y”. Then I can add the last day of the month by transforming that using as.Date with frac=1. It’s worth knowing that a value of zero would select the first day of the month – that comes in handy every once in a while.

x.dates = as.Date(as.yearmon(x[,1], format="%b-%y"), frac=1)

I then transform the returns from whole numbers to percentages.
x.returns = xts(x[,2]/100, = x.dates)

I want to capture the price index as well, but I need to get rid of the commas in the string so that they are recognized as numbers.
x.price = as.numeric((sub(",","", x[,3], fixed=TRUE)))
x.price = xts(x.price, = x.dates)

Now I combine the data into a single object…
x.xts = cbind(x.price, x.returns)
colnames(x.xts) = c("Close", "Returns")

… and save it into an rda file on the filesystem.
save(x.xts, file=paste(filesroot,"NAREIT.IDX/NAREIT.IDX.rda", sep="/"))

Next, I want to add some metadata to the data, so that I remember its attributes, such as the currency it is denominated in (actually, that’s the only required one).

Create the currency the index is demoninated in, if it hasn’t been done already (FinancialInstrument does create some major currencies when loaded, but it doesn’t hurt to do it again).


Now I can describe the metadata for the index, using instrument.
instrument("NAREIT.IDX", currency="USD", multiplier=1, tick_size=.01, start_date="1971-12-31", description="FTSE NAREIT U.S. Real Estate Index", data="CR", source="", assign_i=TRUE)

Now, whenever you start a new R session you need to register the instruments. This might be a line you put into .Rprofile ( along with the packages and the filesroot definition) so that it happens automatically:
setSymbolLookup.FI(base_dir="~/Data/NAREIT", split_method='common')

Once you’ve done that you should be able to read that data from your local storage using quantmod‘s getSymbols:
chartSeries(Cl(NAREIT.IDX), theme="white")

That gives output that looks something like this:
            Close Returns
1971-12-31 100.00      NA
1972-01-31 101.22  0.0122
1972-02-29 102.18  0.0095
1972-03-31 102.44  0.0025
1972-04-30 102.70  0.0025
1972-05-31 101.15 -0.0151

The graph would normally go here

NAREIT Price Series

To leave a comment for the author, please follow the link and comment on their blog: tradeblotter » R. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)