Why R AND Python?
From the very beginning, two key ideas have driven the work we do at RStudio:
- It’s better for everyone if the tools used for data science are free and open. This enhances the production and consumption of knowledge and facilitates collaboration and reproducible research in science, education and industry.
- Coding is the most powerful and efficient path to tackle complex, real-world data science challenges. It gives data scientists superpowers to tackle the hardest problems because code is flexible, reusable, inspectable, and reproducible.
Some data scientists, and even some organizations, believe they have to choose between R and Python. This turns out to be a false choice. In talking with our many customers and others in the data science field, and in the surveys we’ve done of the data science community, we’ve seen that many data science teams today are bilingual, using both R and Python in their work. And while each language has unique strengths, these teams frequently struggle to use them together.
Common Objections to Using R and Python Together
We’ve heard three common criticisms from data science teams about using R and Python together:
- Data science leaders are often concerned that multilingual teams will have a harder time collaborating and sharing work than a team standardized on one language.
- Individual data scientists may worry that using two languages together will incur a higher cost of project organization and maintenance.
- IT organizations are often concerned that enabling two languages will mean doubling their effort, requiring them to maintain, manage, and scale separate environments for R and Python.
Contrary to these concerns, in talking with many data science teams, we’ve found that:
- Modern tooling allows R and Python programmers to share their work and build on one another’s contributions. Additionally, data science team leads find it easier to recruit talent when they can reach into both the R and Python communities.
- Many data scientists find that combining R and Python allows them to use each language for its strengths, and improvements in data science tools like RStudio eliminate the additional overhead.
- IT organizations find that common infrastructure and best practices can support both languages, enabling all the benefits without additional cost. One example of this common infrastructure is RStudio Team, a single centralized infrastructure for bilingual teams using R and Python.
Many of the potential concerns about using two languages are addressed through better tooling. In line with our ongoing mission to support the open source data science ecosystem, we’ve invested heavily in creating the best platform for data science using both R AND Python. This effort includes many features in the products that make up RStudio Team. We have also made significant investments in our open source offerings to make it easier than ever to combine R and Python in a single data science project.
New Python Features in RStudio Products
In our open source products, we have invested in a number of features over the past year, including:
- Continuing to invest in the reticulate package to make it easy for R users to access Python capabilities.
- Providing native access from R to torch, one of the most widely used deep learning frameworks.
- Investing in Ursa Labs for the development of cross-language capabilities.
- Expanding capabilities for native Python coding in the RStudio IDE, including a Python environment and object explorer.
In RStudio Server Pro, which provides collaboration, centralized management, and security for data science teams developing in R and Python, we’ve added beta support for the VSCode IDE. This work is in addition to our existing support for Jupyter Notebooks and JupyterLab. These enhancements make RStudio Server Pro a true workbench for open source data science.
RStudio Connect provides a centralized platform where data science teams can operationalize the work they create in R and Python. We’ve solved the same challenges for Python users that have made Connect so popular with R users, including:
- Publishing enhancements in Connect 1.8.0 that make it easier to share Jupyter Notebooks and mixed R and Python content.
- Support for Dash, Bokeh and Streamlit, allowing users to share a full suite of Python applications. See the announcements for Connect 1.8.4 and 1.8.6 for more details.
- The ability, added in Connect 1.8.2, to share Python APIs built with Flask.
Finally, in RStudio Package Manager, which helps organize, manage and centralize packages across a team or an entire organization, we recently added beta support for PyPI, giving users access to full documentation, automatic syncs, and historic snapshots of Python packages.
To Learn More
If you’d like to learn more about the many ways that RStudio provides a single home for teams using both R and Python, we encourage you to register for our upcoming webinar on February 3rd and explore the information at R & Python: A Love Story.
We’ve also discussed R & Python in several previous blog posts, including:
- Why RStudio supports Python, which reviewed survey data from the data science community about the use of R and Python for data science.
- Debunking R and Python Myths, which answered questions from a recent joint webinar with our partner, Lander Analytics.
- Delivering Maximum value using R and Python, which provided multilingual best practices from Dan Chen of Lander Analytics.
- Wild-caught R and Python applications, which highlighted several bilingual applications suggested by the data science community.
- Why RStudio focuses on code-based data science, which recapped a recent podcast featuring RStudio’s Lou Bajuk and the Outcast’s Michael Lippis.