Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This post introduces a useful R code snippet for transforming the long format data to the wide format. We occasionally encounter the long format data such as yield curve data since yield curve has two dimensions : maturity and date. For this end, we can use reshape() R built-in function.

# Long and Wide formats

What is the long and wide format? A picture paints a thousand words.

We want to transform the long format data to the wide format data for a panel of time series analysis such as a term structure of interest rates. As financial data is usually extracted from database system, we occasionally encounter the long format data. For example, the data in the above figure is a sample of Euro area yield curve which has the long format. To facilitate an empirical analysis, the wide format is appropriate.

Transforming between the long and wide format can be carried out by using reshape() R function. No further explanation is needed. Let’s see the R code below.

### R code

The following R code read sample data and transform the long format to the wide format and vice versa. When using the reshape() function, we need to set the direction argument as “long” or “wide”. In particular, we need to add new column name with some delimitator (., _, etc) to the wide format data when we transform it the long format.

 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103 #========================================================## Quantitative ALM, Financial Econometrics & Derivatives # ML/DL using R, Python, Tensorflow by Sang-Heon Lee ## https://kiandlee.blogspot.com#——————————————————–## Long to wide format and vice versa for yield data#========================================================# graphics.off(); rm(list = ls()) # sample data : ECB zero yieldsstr_data <– “term    date    rate            3M    2021-01-29    -0.625            3M    2021-02-26    -0.612            3M    2021-03-31    -0.636            3M    2021-04-30    -0.628            3M    2021-05-31    -0.632            3M    2021-06-30    -0.650            3M    2021-07-30    -0.663            3M    2021-08-31    -0.676            3M    2021-09-30    -0.712            3M    2021-10-29    -0.736            3M    2021-11-30    -0.895            3M    2021-12-31    -0.731            3Y    2021-01-29    -0.771            3Y    2021-02-26    -0.648            3Y    2021-03-31    -0.711            3Y    2021-04-30    -0.684            3Y    2021-05-31    -0.666            3Y    2021-06-30    -0.672            3Y    2021-07-30    -0.813            3Y    2021-08-31    -0.760            3Y    2021-09-30    -0.677            3Y    2021-10-29    -0.537            3Y    2021-11-30    -0.766            3Y    2021-12-31    -0.620            10Y    2021-01-29    -0.512            10Y    2021-02-26    -0.246            10Y    2021-03-31    -0.279            10Y    2021-04-30    -0.180            10Y    2021-05-31    -0.146            10Y    2021-06-30    -0.203            10Y    2021-07-30    -0.440            10Y    2021-08-31    -0.393            10Y    2021-09-30    -0.170            10Y    2021-10-29    -0.069            10Y    2021-11-30    -0.350            10Y    2021-12-31    -0.188            20Y    2021-01-29    -0.176            20Y    2021-02-26    0.103            20Y    2021-03-31    0.142            20Y    2021-04-30    0.252            20Y    2021-05-31    0.287            20Y    2021-06-30    0.201            20Y    2021-07-30    -0.059            20Y    2021-08-31    -0.033            20Y    2021-09-30    0.195            20Y    2021-10-29    0.103            20Y    2021-11-30    -0.115            20Y    2021-12-31    0.056″ #==========================================# Read a sample of ECB zero coupon yields#==========================================df_long <– read.table(text = str_data, header = TRUE) #==========================================# Transform LONG to WIDE format#========================================== # using “wide” optiondf_wide <– reshape(df_long, direction = “wide”,                    idvar = “date”,                    timevar = “term”)df_wide # initialize row namesrownames(df_long) <– NULL # delete a unnecessary prefix in column namescolnames(df_wide) <– gsub(“rate.”,“”, colnames(df_wide))df_wide #==========================================# Transform WIDE to LONG format#==========================================df_wide2 <– df_wide # need to add new column name as a prefixcolnames(df_wide2)[–1] <–     paste0(“term.”, colnames(df_wide)[–1]) # using “long” optiondf_long2 <– reshape(df_wide2, direction = “long”,                     idvar=“date”,                     varying = colnames(df_wide2)[–1],                     sep = “.”) # initialize row namesrownames(df_long2) <– NULLdf_long2 Colored by Color Scripter cs

Running the above R code produces the following wide format of the yield curve data.

We can also transform the wide format data to the long format conversely.