Changes to Internet Connectivity in R on Windows
This week we released version 3.0 of the curl R package to CRAN. You may have never used this package directly, but curl provides the foundation for most HTTP infrastructure in R, including httr, rvest, and all packages that build on it. If R packages need to go online, chances are traffic is going via curl.
This release introduces an important change for Windows users: we are switching from OpenSSL to Secure Channel on Windows 7 / 2008-R2 and up. Let me explain this in a bit more detail.
Why Switch SSL Backends
The libcurl C library requires an external crypto library to provide the SSL layer (the S in HTTPS). On Linux and macOS, libcurl is included with the OS, so we don’t have to worry about this. On Windows, however, we ship our own build of libcurl, so we can choose whether to build against OpenSSL or against the native Windows SSL API called Secure Channel, also referred to as just “WinSSL”.
Thus far we have always used libcurl with OpenSSL, which works consistently on all versions of Windows. However, OpenSSL requires that we provide our own CA bundle, which is not ideal. In particular, users on corporate / government networks have reported difficulty connecting to the internet in R. The reason is often that their enterprise gateway / proxy uses custom certificates which are installed in the Windows certificate manager, but are not present in R’s bundle.
Moreover, shipping our own CA bundle can be a security risk. If a CA gets hacked, the corresponding certificate needs to be revoked immediately. Operating systems can quickly push a security update to all users, but we cannot do this in R.
Switching to WinSSL
If we build libcurl against Windows native Secure Channel, it automatically uses the same SSL certificates as Internet Explorer. Hence we do not have to ship and maintain a custom CA bundle. Earlier this year I tried to switch the curl package to WinSSL, and everything seemed to work great on my machine.
However, when we started checking reverse dependencies on CRAN WinBuilder, many packages depending on curl started to fail! It turned out that Windows versions before Windows 7 do not support TLS 1.1 and 1.2 by default. Because TLS 1.2 is used by the majority of HTTPS servers today, WinSSL is basically useless on these machines. Unfortunately this also includes CRAN WinBuilder, which runs Windows 2008 (the server edition of Vista).
So we had no choice but to roll back to OpenSSL in order to keep everything working properly on CRAN. Bummer.
Towards Dual SSL
I had almost given up on this when a few weeks ago Daniel Stenberg posted the following announcement on the libcurl mailing list:
Hi friends! As of minutes ago, libcurl has the ability to change SSL backend dynamically at run-time – if built with the support enabled. That means that the choice does no longer only have to happen at build-time.
This new feature gives us exactly the flexibility we need. We can take advantage of native Secure Channel on Windows 7 and up, which covers almost all users. At the same time, we can keep things working on legacy systems by falling back on OpenSSL on these machines, including the CRAN WinBuilder.
So this is where we are. Version 3.0 of the curl R package uses the latest libcurl 7.56.0 and automatically switches to native SSL on Windows 7 and up. If all goes well, nobody should notice any changes, except those people on enterprise networks where things will, hopefully, magically start working.
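To see which backend a given machine ended up with, the SSL string reported by `curl::curl_version()` can be inspected. The snippet below is only a rough sketch: `active_ssl_backend()` is a hypothetical helper, not part of the curl package, and it assumes that a libcurl build with several SSL backends lists the inactive ones in parentheses (for example `"(OpenSSL/1.0.2l) WinSSL"`). The exact string on your system may differ.

```r
# Hypothetical helper (not part of the curl package): guess the active SSL
# backend from the string reported by libcurl.
# Assumption: in a multi-backend build, inactive backends appear in
# parentheses, e.g. "(OpenSSL/1.0.2l) WinSSL".
active_ssl_backend <- function(ssl_version = curl::curl_version()$ssl_version) {
  # Remove every parenthesised entry, then trim the leftover whitespace
  active <- gsub("\\([^)]*\\)", "", ssl_version)
  trimws(active)
}

# Illustrative input only; this sample string is an assumption
active_ssl_backend("(OpenSSL/1.0.2l) WinSSL")
```

Calling `active_ssl_backend()` without arguments reads the string from the installed curl package instead. On a build with a single SSL backend the string should come back unchanged.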
Feedback
Because each Windows network seems to have a different setup, testing and debugging these things is often difficult. We are interested to hear from Windows users whether updating to curl 3.0 has improved the situation, or whether any unexpected side effects arise. Please open an issue on GitHub if you run into problems.