Python and R: Is Python really faster than R?
[This article was first published on Analysis with Programming, and kindly contributed to R-bloggers.]
A friend of mine asked me to code the following in R:
- Generate a sample of size 10 from a normal distribution with $\mu = 3$ and $\sigma^2 = 5$;
- Compute $\bar{x}$ and the interval $\bar{x}\mp z_{\alpha/2}\displaystyle\frac{\sigma}{\sqrt{n}}$ at the 95% confidence level;
- Repeat the process 100 times; then
- Compute the percentage of the confidence intervals containing the true mean.
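The four steps can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the original code from this post; the function name `simulate_ci` and its parameters (`n`, `mu`, `sigma2`, `reps`, `conf`, `seed`) are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_ci(n=10, mu=3.0, sigma2=5.0, reps=100, conf=0.95, seed=None):
    """Simulate `reps` z-intervals for the mean and report their coverage.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)                     # the steps give the variance, not sigma
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}, about 1.96 at 95%
    half_width = z * sigma / math.sqrt(n)

    rows = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        lower, upper = xbar - half_width, xbar + half_width
        contained = 1 if lower <= mu <= upper else 0
        rows.append((xbar, lower, upper, contained))

    # Percentage of intervals that contain the true mean.
    coverage = 100.0 * sum(row[3] for row in rows) / reps
    return rows, coverage


if __name__ == "__main__":
    rows, coverage = simulate_ci(seed=1)
    print(f"{coverage:.1f}% of the intervals contain the true mean")
```

Each row holds $\bar{x}$, the lower limit, the upper limit, and a 0/1 indicator, so with many repetitions the reported percentage should settle near 95.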
Staying with the default values, one would obtain
The output is a list of Matrix and Decision. In the first element (Matrix), the first column is the computed $\bar{x}$; the second and third columns are the lower and upper limits of the confidence interval, respectively; and the fourth column is an indicator, with a one if the true mean is contained in the interval and a zero if it is not.
Now, how fast would it be if I were to code this in Python?
I had prior knowledge that Python beats R in terms of speed (confirmed by Nathan’s post), but out of curiosity I wasn’t satisfied with taking that as given, which led me to the following Python equivalent:
Computing the elapsed time, we have
and Python,
It gets even worse! 64 seconds versus 7 seconds? That’s a huge difference. I don’t know what is happening here; I did my best to translate the R code literally to Python, and yet R comes out faster?
Any thoughts, guys, especially from the Python gurus?
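For anyone who wants to reproduce this kind of measurement on the Python side, elapsed time can be taken with the standard library's `time.perf_counter`. The sketch below is not the benchmark from this post: `workload` is a hypothetical stand-in for the simulation, and `time_call` is a helper name chosen for this example.

```python
import random
import time


def workload(reps=1000, n=10, seed=0):
    """Hypothetical stand-in for the simulation: average `reps` sample means."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += sum(rng.gauss(3.0, 5.0 ** 0.5) for _ in range(n)) / n
    return total / reps


def time_call(func, *args, repeats=3, **kwargs):
    """Return (best elapsed seconds over `repeats` runs, result of the last run)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best, result


if __name__ == "__main__":
    elapsed, value = time_call(workload, reps=10_000)
    print(f"best of 3 runs: {elapsed:.4f} s (mean of sample means: {value:.3f})")
```

Taking the best of several runs reduces noise from other processes on the machine.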