EC2 AMI for scientific computing in Python and R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Like many people who crunch numbers frequently, I have increasingly been integrating Amazon’s cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop for pennies on the dollar.
One drawback that I have found, however, is that there are no obvious pre-configured images, called AMIs (Amazon Machine Images), designed for scientific computing in the languages I use most: Python and R. The best public AMI I could find was an Ubuntu 10 image provided by the good people at MIT’s STARDEV Project, which comes with several useful libraries pre-installed, including optimized versions of core scientific Python libraries. This AMI is great, but it was still missing several Python packages I use on a regular basis (NetworkX, scikits.learn, sympy, etc.), and it had an old version of R with only the base packages installed. This would simply not do.
Thus began the odyssey of modifying the StarCluster AMI to more fully support scientific computing in Python and R. I have now uploaded and made public the resulting image, which includes several hundred Python and R packages for scientific computing, statistics, machine learning, data mining and visualization. To access the AMI you can either search for the source name:
Source: aws.drewconway.com/starcluster-scientific-python-r.manifest.xml
Or, access it directly with the AMI ID:
AMI ID: ami-84bd41ed
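For anyone who prefers to script the launch rather than click through the console, here is a minimal sketch of how the AMI ID above might be used from Python. It relies on several assumptions that are not part of this post: it uses boto3 (Amazon's current Python SDK), the region `us-east-1` and the instance type `m1.large` are guesses on my part, `my-keypair` is a hypothetical key pair name, and the image must still be publicly available (AMI IDs are region-specific). The helper names `build_launch_request` and `launch` are purely illustrative.

```python
AMI_ID = "ami-84bd41ed"  # the AMI ID given in this post


def build_launch_request(ami_id=AMI_ID, instance_type="m1.large", key_name=None):
    """Build keyword arguments for EC2's RunInstances call (a single instance).

    instance_type is an assumed default, not a recommendation from the post.
    """
    request = {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
    }
    if key_name is not None:
        # Name of an existing EC2 key pair, needed to SSH into the instance.
        request["KeyName"] = key_name
    return request


def launch(region="us-east-1", **kwargs):
    """Start one instance and return its instance ID.

    Requires boto3 and configured AWS credentials; the region is an assumption.
    """
    import boto3  # imported here so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**build_launch_request(**kwargs))
    return response["Instances"][0]["InstanceId"]
```

Calling `launch(key_name="my-keypair")` would start a billable instance, so it is worth inspecting the output of `build_launch_request(key_name="my-keypair")` first and terminating the instance when you are done.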
This AMI will only interest those who have AWS accounts and do scientific computing in these languages, but I hope that for those of you in that niche it is a useful convenience. For those unfamiliar with EC2, I highly recommend this tutorial, and this more detailed set of instructions for working with EC2 on the command line. Also, Amazon is very generous with research grants for teachers and students at all levels, so if cost is a barrier you should consider applying for an educational grant.
