Articles by Javier Luraschi

Training ImageNet with R

August 23, 2020 | Javier Luraschi

The MirroredStrategy can help us scale up to about 8 GPUs per compute instance; however, we are likely to need 16 instances with 8 GPUs each to train ImageNet in a reasonable time (see Jeremy Howard’s post on Training Imagenet in 18 Minutes). So where do we go from here? Welcome to MultiWorkerMirroredStrategy: ...
[Read more...]

pins 0.4: Versioning

May 28, 2020 | Javier Luraschi

A new version of pins is available on CRAN today, which adds support for versioning your datasets and DigitalOcean Spaces boards! As a quick recap, the pins package allows you to cache, discover and share resources. You can use pins in a wide range of situations, from downloading a dataset ...
[Read more...]

pins 0.4: Versioning

April 12, 2020 | Javier Luraschi

A new release of pins is available on CRAN today. This release adds support for time travel across dataset versions, which improves collaboration and protects your code from breaking when remote resources change unexpectedly.
[Read more...]

sparklyr 1.1: Foundations, Books, Lakes and Barriers

January 28, 2020 | Javier Luraschi

Today we are excited to share that sparklyr 1.1 is now available on CRAN! In a nutshell, you can use sparklyr to scale datasets across computing clusters running Apache Spark. For this particular release, we would like to highlight the following new features: Delta Lake enables database-like properties in Spark. Spark 3.0 ...
[Read more...]

pins 0.3: Azure, GCloud and S3

November 27, 2019 | Javier Luraschi

A new version of pins is available on CRAN! pins 0.3 comes with many improvements and the following major features: Support for new cloud boards to pin resources in Azure, GCloud and S3 storage. Retrieve pin information with pin_info() including properties particular to each board. You can install this new ...
[Read more...]

pins: Pin, Discover and Share Resources

September 8, 2019 | Javier Luraschi

Today we are excited to announce the pins package is available on CRAN! pins allows you to pin, discover and share remote resources, locally or in remote storage. If you find yourself using download.file() or asking others to download files before running your R code, use pin() to achieve ...
[Read more...]

sparklyr 1.0: Apache Arrow, XGBoost, Broom and TFRecords

March 14, 2019 | Javier Luraschi

With much excitement built over the past three years, we are thrilled to share that sparklyr 1.0 is now available on CRAN! The sparklyr package provides an R interface to Apache Spark. It supports dplyr, MLlib, streaming, extensions and many other features; however, this particular release enables the following new features: ...
[Read more...]

sparklyr 0.9

September 30, 2018 | Javier Luraschi

Today we are excited to share that a new release of sparklyr is available on CRAN! This 0.9 release enables you to: Create Spark structured streams to process real-time data from many data sources using dplyr, SQL, pipelines, and arbitrary R code. Monitor connection progress with upcoming RStudio Preview 1.2 features ...
[Read more...]

sparklyr 0.6

July 30, 2017 | Javier Luraschi

We’re excited to announce a new release of the sparklyr package, available on CRAN today! sparklyr 0.6 introduces new features to: Distribute R computations using spark_apply() to execute arbitrary R code across your Spark cluster. You can now use all of your favorite R packages and functions in a ...
[Read more...]

sparklyr 0.5

January 24, 2017 | Javier Luraschi

We’re happy to announce that version 0.5 of the sparklyr package is now available on CRAN. The new version comes with many improvements over the first release, including: Extended dplyr support by implementing: do() and n_distinct(). New functions including sdf_quantile(), ft_tokenizer() and ft_regex_tokenizer(). Improved compatibility, ...
[Read more...]
