“We need to regard statistical intuition with proper suspicion and replace impression formation by computation wherever possible” “We are pattern seekers, believers in a coherent world” “The hot hand is entirely in the eyes of the beholders, who are consistently too quick to perceive order and causality in randomness. The hot hand is a massive
Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. The idea is: we sacrifice some of the flexibility and composability inherent to ggplot2 in R for a menu of prescribed …
I’ve had the great privilege to be a small part of the R open source community, contributing packages like broom, gganimate, fuzzyjoin, and ggfreehand. In the process I’ve become friends and colleagues with brilliant statisticians and data scientists and learned to engage with data in powerful ways.
But there’s one thing that my colleagues and I...
In much the same way that the IBM Data Scientist Workbench seeks to provide some level of integration between analysis tools such as Jupyter notebooks and data access and storage, Azure Machine Learning Studio also provides a suite of tools for accessing and working with data in one location. Microsoft’s offering is new to me, but
With the recent releases of R 3.2.4 and OpenBLAS 0.2.17, I decided it was time to re-benchmark R's speed. I’ve settled on a particular set of tests, based on my experience as well as some of Simon Urbanek’s work, which I separated into two groups: those focusing on BLAS-heavy operations and those that do not.
It was just over a year ago that Dason Kurkiewicz and I released pacman to CRAN. We have been developing the package on GitHub over the past 14 months and are pleased to announce that these changes have made their way …
Per a suggestion, I’m going to try to find a neat data set (prbly one from @jsvine) to feature each week and toss up some sample code (99% of the time prbly in R) and offer up a vis challenge. Just reply in the comments with a link to a gist/repo/rpub/blog/etc (or post directly, though
Wes McKinney, Software Engineer, Cloudera; Hadley Wickham, Chief Scientist, RStudio. This past January, we (Hadley and Wes) met and discussed some of the systems challenges facing the Python and R open source communities. In particular, we wanted to see if there were some opportunities to collaborate on tools for improving interoperability between Python, R, and
Organize your data manipulation tasks in a standard way, write clean and efficient code, and build reproducible data management processes, using the most modern R tools: tidyr, dplyr and lubridate.
The post "Efficient Data Manipulation with R" Course | April 11-12 Milan appeared first on MilanoR.