How to fit a copula model in R
I have been working on this topic for a long time and, to be honest, I find the R documentation not as user-friendly as the documentation for most Python modules. The fact that copulas are not the easiest models to grasp has contributed to further delays too. But the biggest obstacle was the lack of examples and of users of these models. Then again, I might have looked in the wrong places; if you have any good resources to suggest, please feel free to leave a comment. At the bottom of this page I’ll post some links that I found very useful.
If you are new to copulas, perhaps you’d like to start with an introduction to the Gumbel copula in R here.
The package I am going to use is the copula package, a great tool for working with copulas in R. You can easily install it through RStudio.
The dataset
For the purpose of this example I used a simple dataset of returns for stock x and y (x.txt and y.txt). You can download the dataset by clicking here. The dataset is given merely for the purpose of this example.
First of all we need to load the data and convert it into matrix format. Optionally, one can plot the data. Remember to load the copula package with library(copula).
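The loading step might look roughly like the sketch below. I am assuming that x.txt and y.txt each hold a single column of returns without a header; if the files are not in the working directory, the sketch falls back to made-up placeholder data (not the real dataset) so that it still runs.

```r
library(copula)

# Assumption: each file contains one column of returns and no header row.
if (file.exists("x.txt") && file.exists("y.txt")) {
  x <- read.table("x.txt")[, 1]
  y <- read.table("y.txt")[, 1]
} else {
  # Placeholder data, only so the sketch runs without the real files.
  set.seed(42)
  x <- rnorm(100, sd = 0.01)
  y <- 0.7 * x + rnorm(100, sd = 0.007)
}

# Bind the two series into a two-column matrix, one row per observation.
mat <- cbind(x, y)

# Optional scatter plot of the raw returns.
plot(mat[, 1], mat[, 2], main = "Returns", xlab = "x", ylab = "y", col = "blue")
```

Using cbind() avoids filling the matrix element by element and keeps the column names x and y.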
The plot of the data
Now that we have our data loaded, we can clearly see that there is some kind of positive correlation.
The next step is the fitting. In order to fit the data we need to choose a copula model. The model should be chosen based on the structure of the data and other factors. As a first approximation, we may say that our data shows a mild positive correlation, so a copula which can replicate such mild correlation should be fine. Be aware that it is easy to go wrong with copula models and that this visual approach is not always the best option. Anyway, I chose to use a normal copula from the package. The fitting process is identical for the other types of copula.
Let’s fit the data
Note that the data must be fed through the function pobs(), which converts the real observations into pseudo-observations in the unit square [0,1] × [0,1].
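To get a feeling for what pobs() does, here is a small sketch on placeholder data (random numbers, not the returns dataset). As far as I understand it, with default arguments each column is replaced by its ranks divided by n + 1, so the values end up strictly between 0 and 1; the manual computation below is my own check of that.

```r
library(copula)

set.seed(42)
# Placeholder data: two columns of random numbers, standing in for the returns.
mat <- cbind(x = rnorm(100), y = rnorm(100))

# Pseudo-observations as computed by the package.
u <- pobs(mat)

# Manual version: column-wise ranks scaled by (n + 1).
manual <- apply(mat, 2, rank) / (nrow(mat) + 1)

print(range(u))
print(all.equal(u, manual, check.attributes = FALSE))
```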
Note also that we are using the “ml” method (maximum likelihood); other methods are available, such as “itau” (inversion of Kendall's tau).
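A minimal sketch of the fitting step follows. The data here are placeholders generated with a known positive correlation, so the estimate will not reproduce the value obtained on the real dataset; the variable names are my own.

```r
library(copula)

set.seed(42)
# Placeholder data with positive dependence (not the real returns).
n <- 100
x <- rnorm(n)
y <- 0.7 * x + sqrt(1 - 0.7^2) * rnorm(n)
mat <- cbind(x, y)

# Bivariate normal (Gaussian) copula with the parameter left unspecified.
normal_cop <- normalCopula(dim = 2)

# Fit on the pseudo-observations by maximum likelihood.
fit_ml <- fitCopula(normal_cop, pobs(mat), method = "ml")
rho_ml <- coef(fit_ml)

# The same model fitted by inversion of Kendall's tau, for comparison.
fit_itau <- fitCopula(normal_cop, pobs(mat), method = "itau")
rho_itau <- coef(fit_itau)

print(rho_ml)
print(rho_itau)
```

To try another family, swapping normalCopula() for, say, gumbelCopula() should be the only change needed in the call.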
The parameter of the fitted copula, rho, is in our case equal to 0.7387409. Let’s simulate some pseudo-observations.
By plotting the pseudo-observations and the simulated observations, we can see how well the simulation from the copula matches the pseudo-observations.
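The simulation can be sketched as below, plugging the fitted value of rho directly into a normal copula. The sample size of 500 and the seed are arbitrary choices of mine.

```r
library(copula)

set.seed(42)
# Fitted parameter reported above.
rho <- 0.7387409
fitted_cop <- normalCopula(rho, dim = 2)

# Draw 500 points from the copula; each row is a pair in the unit square.
sim <- rCopula(500, fitted_cop)

print(head(sim))

# Sample Kendall's tau of the simulated points, as a rough sanity check.
print(cor(sim[, 1], sim[, 2], method = "kendall"))
```

For a normal copula, Kendall's tau equals (2 / pi) * asin(rho), so the sample value should land close to that.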
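A side-by-side comparison could be drawn along these lines. Once more the "observed" data are placeholders, and I simulate as many points as there are observations so that the two panels are comparable.

```r
library(copula)

set.seed(42)
# Placeholder data standing in for the returns.
x <- rnorm(100)
y <- 0.7 * x + sqrt(1 - 0.7^2) * rnorm(100)
u <- pobs(cbind(x, y))

# Simulate the same number of points from the fitted normal copula.
sim <- rCopula(nrow(u), normalCopula(0.7387409, dim = 2))

# Two panels: pseudo-observations on the left, simulation on the right.
old_par <- par(mfrow = c(1, 2))
plot(u[, 1], u[, 2], main = "Pseudo-observations",
     xlab = "u1", ylab = "u2", col = "blue")
plot(sim[, 1], sim[, 2], main = "Simulated",
     xlab = "u1", ylab = "u2", col = "red")
par(old_par)
```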
This particular copula might not be the best choice, since the simulated points appear more tightly clustered in the tails than our data are; however, it’s a start.
Optionally, at the beginning we could have plotted the data together with the distribution of each random variable, as below.
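One way to build such a figure in base R is a scatter plot with a histogram of each variable along the margins, arranged with layout(). This is only a sketch on placeholder data; the number of breaks and the panel proportions are arbitrary choices of mine.

```r
set.seed(1)
# Placeholder returns, standing in for the real x and y series.
x <- rnorm(100, sd = 0.01)
y <- 0.7 * x + rnorm(100, sd = 0.007)

# Compute the histograms without drawing them yet.
xhist <- hist(x, breaks = 30, plot = FALSE)
yhist <- hist(y, breaks = 30, plot = FALSE)
top <- max(c(xhist$counts, yhist$counts))

old_par <- par(no.readonly = TRUE)

# Panel 1: scatter (bottom left), panel 2: x histogram (top),
# panel 3: y histogram (right); the top-right cell stays empty.
layout(matrix(c(2, 0, 1, 3), 2, 2, byrow = TRUE),
       widths = c(3, 1), heights = c(1, 3))

par(mar = c(4, 4, 1, 1))
plot(x, y, col = "blue")

par(mar = c(0, 4, 1, 1))
barplot(xhist$counts, axes = FALSE, ylim = c(0, top), space = 0)

par(mar = c(4, 0, 1, 1))
barplot(yhist$counts, axes = FALSE, xlim = c(0, top), space = 0, horiz = TRUE)

# Restore the previous graphical settings.
par(old_par)
```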
And get this beautiful representation of our original dataset
Now for the useful documentation:
Copula package official documentation:
http://cran.r-project.org/web/packages/copula/copula.pdf
R-bloggers article on copulas:
https://www.r-bloggers.com/copulas-made-easy/
An interesting question on CrossValidated:
http://stats.stackexchange.com/questions/90729/generating-values-from-copula-using-copula-package-in-r
A paper on copulas and the copula package:
http://www.jstatsoft.org/v21/i04/paper
That’s all for now.