Life is short, use Python
I started playing with Python two weeks ago because of R's limitations in handling large data. A friend of mine suggested I try Python since I have to do data massaging frequently: "Python is the best choice, trust me," he said. Although I was reluctant to learn yet another piece of software, I couldn't bear the low efficiency of R (or of my own work) on large data. You can picture my learning curve as: an excellent free CSV splitter --> MySQL + RMySQL package --> several R packages including bigmemory and ff. But to be honest, none of them satisfied me, either because of the limitations of the method (slow and prone to malfunction) or of my own computer (short of memory).
After nearly two weeks, I am struck by Python's power and easy-to-use design; dealing with a 10GB CSV has never been so easy. More importantly, you can access R from Python almost seamlessly with the package RPY. To get started, I would like to recommend the following readings to all Python newbies like me:
1, commands dictionary Matlab vs R vs Python;
2, free ebook Dive Into Python;
3, a textbook Machine Learning: An Algorithmic Perspective by Prof. Stephen Marsland.
The third book is especially useful for data analysis, as it contains lots of Python code examples. The code and datasets are available for download at the author's website, http://www-ist.massey.ac.nz/smarsland/MLBook.html; take a look before deciding to add it to your shelf.
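To give a flavour of why a 10GB CSV stops being a memory problem in Python, here is a minimal sketch of the streaming approach using only the standard library. It is not the exact code I ran: the column names `region` and `sales` and the helper `sum_by_key` are made up for illustration, and a small in-memory string stands in for the big file.

```python
import csv
import io
from collections import defaultdict


def sum_by_key(lines, key_col, value_col):
    """Total value_col per distinct key_col, reading one row at a time.

    `lines` can be any iterable of text lines (an open file, for example),
    so only the current row and the running totals are held in memory.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)


# Hypothetical sample data standing in for a large CSV file.
sample = io.StringIO("region,sales\nnorth,10.5\nsouth,4\nnorth,2.5\n")
result = sum_by_key(sample, "region", "sales")
print(result)
```

For a real file you would pass an open file object instead of the sample, e.g. `with open("big.csv", newline="") as f: sum_by_key(f, "region", "sales")`, where `big.csv` is a placeholder name. Because rows are consumed one by one, memory use depends on the number of distinct keys rather than on the size of the file.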
Tags - python , r