So you want to get statistical? Nowadays one of the ways to go is to use R, mostly in combination with ggplot2 for generating the plots. These plots and graphs need data, however, and for that we use data sources. There are a lot of data sources available, and every company and consumer has its own opinion about which one to use, which can even differ per type of usage, application or website. Therefore a lot of database adapters exist for R; one of these is RMySQL, distributed as the RMySQL package.
This post describes the installation of the RMySQL package from beginning to end, assuming you have a 32-bit Windows 7 machine ready.
First off we need a working MySQL installation. Head over to http://www.mysql.com/ and download the server of your choice, or use, like me, mysql-5.5.17-win32.msi. Follow its installation instructions and make sure the checkbox next to “Client tools” is checked. I installed it to “C:\Program Files\MySQL\MySQL Server 5.5\”; if you choose another location, remember to change each of the MySQL paths that follow. After the installation press the “Windows” key and enter “System”, which opens your system’s settings screen. Click on “Advanced settings” and choose “Environment Variables”. When the screen pops up, add a new environment variable for all users and name it “MYSQL_HOME” with a value of “C:\Program Files\MySQL\MySQL Server 5.5\”.
That is not all yet! Create the directory “opt” within “C:\Program Files\MySQL\MySQL Server 5.5\lib” and copy “C:\Program Files\MySQL\MySQL Server 5.5\lib\libmysql.lib” to “C:\Program Files\MySQL\MySQL Server 5.5\lib\opt\”. Also copy “C:\Program Files\MySQL\MySQL Server 5.5\lib\libmysql.dll” to “C:\Program Files\R\R-2.13.2\bin\i386\”; that target directory only exists once R is installed, so this second copy has to wait until after the next step.
After the installation of MySQL it’s time to install R from http://cran.r-project.org/bin/windows/base/; the version I use is 2.13.2. Again follow the installation instructions, choosing a “Full installation” and a customized startup at the “Startup options” screen to set up your preferred way of display. For me that was “MDI”, “HTML help” and “Internet2”, with “C:\Program Files\R\R-2.13.2” chosen as installation directory.
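With R installed, the MySQL preparation from the earlier steps can be double-checked, or repeated, from the R console. The following is only a minimal sketch and not part of the original setup: the helper name prepare_mysql_libs is made up for illustration, and the paths are the ones used in this post, so change them if you installed elsewhere. Writing into “C:\Program Files” usually requires an R session started as administrator.

```r
# Hypothetical helper: repeats the manual copy steps for libmysql.lib and libmysql.dll.
prepare_mysql_libs <- function(mysql_home, r_bin) {
  lib_dir <- file.path(mysql_home, "lib")
  opt_dir <- file.path(lib_dir, "opt")

  # Create lib\opt if it does not exist yet.
  dir.create(opt_dir, showWarnings = FALSE, recursive = TRUE)

  # file.copy() returns TRUE for each file that was copied successfully.
  c(
    lib = file.copy(file.path(lib_dir, "libmysql.lib"), opt_dir, overwrite = TRUE),
    dll = file.copy(file.path(lib_dir, "libmysql.dll"), r_bin, overwrite = TRUE)
  )
}

# Paths as used in this post (assumed; adjust them to your installation).
mysql_home <- "C:/Program Files/MySQL/MySQL Server 5.5"
r_bin      <- "C:/Program Files/R/R-2.13.2/bin/i386"

# Only touch the file system when the MySQL directory is really there.
if (file.exists(mysql_home)) {
  print(Sys.getenv("MYSQL_HOME"))               # should show the MySQL directory
  print(prepare_mysql_libs(mysql_home, r_bin))  # TRUE for each successful copy
}
```

If one of the printed values is FALSE, the corresponding file was not copied, most likely because of a wrong path or missing write permissions.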
Now, that was easy, we already installed MySQL and R, what’s next?
For the more difficult part of this post we go to http://www.murdoch-sutherland.com/Rtools/ and install the Rtools for R 2.13.x, i.e. Rtools213.exe. This allows us to build packages for R under Windows. Note that there are other packages and tools available for doing this, but in my experience this was the easiest one, so we stick with it for now.
Install Rtools to “c:\Rtools” and go for a “Custom installation” by checking “Extras to build 32 bit R: TCL/TK, bitmap code, internationalization”. At the “Select R Source Home Directory” screen, paste “C:\Program Files\R\R-2.13.2\src” in the text field. Click on “Next >” and choose to “Edit the system PATH” by ticking the checkbox to the left of it. The rest speaks for itself.
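After restarting R you can check from the R console whether the PATH edit took effect. This is a small sketch, assuming gcc and make are among the tools that Rtools puts on the PATH; the helper name missing_tools is mine and not part of R or Rtools.

```r
# Hypothetical helper: returns the names of the tools that are NOT found on the PATH.
missing_tools <- function(tools) {
  found <- Sys.which(tools)      # "" for every tool that cannot be located
  names(found)[!nzchar(found)]
}

# Assumed tool names; an empty result means both were found.
missing_tools(c("gcc", "make"))
```

If a name is returned here, the PATH most likely does not contain the Rtools directories yet; logging off and on again, or re-running the installer with the PATH option ticked, would be the first things to try.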
Download Rcmd.bat from http://code.google.com/p/batchfiles/source/browse/trunk/Rcmd.bat into “C:\Program Files\R\R-2.13.2\bin”.
One application that is very useful to have, though not necessary, is RStudio, which can be downloaded from http://rstudio.org/. With this application we have direct access to the R command line tools, a plots window and more. I recommend downloading it not only for the purpose of this tutorial but also for upcoming daily(?) usage.
Make sure you download “RStudio 0.94.110 – Windows XP/Vista/7” and choose “C:\Program Files\RStudio” as its installation directory.
Now hit the Windows key and enter “RStudio”, and when the program “RStudio” comes up, click it. At the console window type
and hit Enter. Choose the CRAN mirror in the country closest to you for the fastest download; mine is 47.
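If you would rather skip the mirror menu, the repository can be set before installing. A minimal sketch; the URL below is the main CRAN site and is only an assumption on my part, any mirror near you works the same way.

```r
# Set the CRAN repository up front so install.packages() does not ask for a mirror.
options(repos = c(CRAN = "http://cran.r-project.org"))

# Confirm the setting.
getOption("repos")
```

The setting only lasts for the current R session, so it has to be repeated after a restart.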
Note: When the message:
“Warning in install.packages :
‘lib = “C:/Program Files/R/R-2.13.2/library”’ is not writable”
shows up, head over to the directory and grant modification rights to the current user (or the RStudio user when installed under its own user profile) and execute the