3 Ways to Expand Your Data Science Compute Resources

[This article was first published on RStudio | Open source & professional software for data science teams on RStudio, and kindly contributed to R-bloggers.]

Photo by Richard Gatley on Unsplash

Data science leaders have embraced the work-from-home era created by COVID-19. Most data science teams have continued their work either using their company laptops or server-based IDEs such as RStudio Server. However, these home workers often run into the limitations of their laptops when they:

Embrace Server-Based Data Science Development

The key to freeing data scientists from laptop limitations is to embrace server-based development, as we noted in a prior post, Equipping Work From Home Data Science Teams. Providing data scientists with access to a server-based IDE like RStudio Server can give them more processors, cores, memory, and architecture options than would be available on their laptops. Additionally, with RStudio Server Pro, data scientists can go even further by launching interactive or batch sessions on SLURM and Kubernetes clusters.

As shown in Figure 1, RStudio offers three ways for data scientists to take advantage of centralized resources and escape the limitations of their laptops:

| | RStudio Server | Interactive Launcher Sessions on RStudio Server Pro | Launcher Jobs on RStudio Server Pro |
|---|---|---|---|
| Typical RAM | Tens to hundreds of gigabytes | Multiple terabytes | Multiple terabytes |
| Typical processor cores | Tens | Hundreds to thousands | Hundreds to thousands |
| Typical jobs | Routine analyses | Interactive tasks requiring large compute, GPUs, or RAM, such as exploratory data analysis | Batch tasks like parameter tuning, ETL, or model training and scoring |
| Setup required | RStudio Server install | RStudio Server Pro + Cluster add-in | RStudio Server Pro + Cluster add-in |
| Limitations | Server resources | Best for interactive work, not parallel tasks | Jobs kicked off manually, limited job feedback |

Figure 1: Three Ways to Expand Data Science Computational Resources Using RStudio Pro and Launcher.
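
To make the "batch tasks like parameter tuning" entry in the comparison above more concrete, here is a minimal sketch of how a tuning grid breaks down into independent work items that could each run as a separate batch job. It is written in plain Python and does not call any RStudio or Launcher API; the function names (`build_job_grid`, `chunk`), the parameter names, and the worker count are all hypothetical and chosen only for illustration.

```python
from itertools import product


def build_job_grid(param_grid):
    """Expand a dict of parameter lists into one settings dict per job."""
    names = sorted(param_grid)  # fixed order so the result is reproducible
    value_lists = (param_grid[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


def chunk(jobs, n_workers):
    """Deal jobs round-robin into n_workers lists of similar size."""
    return [jobs[i::n_workers] for i in range(n_workers)]


# Hypothetical tuning grid: 3 learning rates x 2 tree depths = 6 jobs.
grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [3, 6]}
jobs = build_job_grid(grid)

# Assumed worker count; on a cluster this would be the number of
# batch jobs you are willing to submit at once.
batches = chunk(jobs, n_workers=4)

for worker_id, batch in enumerate(batches):
    print(worker_id, batch)
```

Because no job depends on another job's result, each batch could be handed to a separate cluster job, which is what makes this kind of workload a better fit for batch submission than for a single interactive session.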

Central Servers Improve Data Scientist Productivity

Data scientists benefit from using RStudio Server and RStudio Server Pro for their analysis because:

For More Information About Background and Cluster Jobs

To learn more about the new Launcher capabilities built into RStudio:

If you’d like to try out RStudio Server Pro for your team, you can learn how to download an evaluation copy from the RStudio Server Pro product page.
