salesforcer 0.2.2 – Relationship Queries and the Reports API

[This article was first published on stevenmortimer.com, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The latest version of the {salesforcer} package (v0.2.2) is now available on CRAN and is ready to help you have better access to data in your Salesforce Org. Along with a host of bug fixes this release has three big features:

  1. Experimental Functions for the Reports and Dashboards REST API (jump to section) – Now you have programmatic access to executing and managing reports, dashboards, and analytics notifications in your Org. You can familiarize yourself with the Salesforce documentation HERE. Not all functions have been implemented yet, but your questions, comments, and feedback are welcome!

  2. Support for Bulk 2.0 API Queries (jump to section) – In Salesforce version 47.0 (Winter ’20) query jobs in the Bulk API 2.0 were added. Now you can leverage this resource from the {salesforcer} package in addition to the queries via the REST, SOAP, and Bulk 1.0 APIs.

  3. Support for Relationship Queries (jump to section) – In previous versions of the package the child-to-parent and nested parent-to-child queries did not work or returned a jumbled mess of results that were not parsed correctly. This version fixes those issues in both the REST and SOAP APIs with better test coverage on a variety of query types.

For a complete list of updates, please review the release notes from v0.2.0 onwards listed on the {salesforcer} pkgdown site here: https://stevenmmortimer.github.io/salesforcer/news/index.html.

Experimental Functions for Reports and Dashboards REST API

Salesforce has rich support for Reports in your Salesforce Org. Sometimes reports are a better way to collaborate with other users because they can create reports in the GUI or you can create one for them so they always have access to the most current recordset meeting your report criteria. The challenge comes when trying to access this data programmatically. Fortunately, Salesforce provides the Reports and Dashboards REST API as a means to not only execute Reports, but to also manage them in your Org.

In Salesforce there is a dedicated page to displaying the list of reports in your Org. It typically follows the pattern: https://na1.salesforce.com/00O/o (replace na1 with your server instance). When you click on a report in the GUI you should see the report’s results. Below is a screenshot of how a report may look in your Org (note the report Id in the URL bar):

The report Id above ("00O3s000006tE7zEAE") is the only information needed to pull those same results from an R session, like so:

my_report_id <- "00O3s000006tE7zEAE"
results <- sf_run_report(my_report_id)
results
#> # A tibble: 14 x 8
#>   `Contact ID` `First Name` `test number` `Contact Owner` `Account ID`
#>   <chr>        <chr>                <dbl> <chr>           <chr>       
#> 1 0036A000002… Edna                    NA Steven Mortimer 0016A000003…
#> 2 0036A000002… Liz                     NA Steven Mortimer 0016A000003…
#> 3 0036A000002… Tom                     NA Steven Mortimer 0016A000003…
#> 4 0036A000002… Sean                    NA Steven Mortimer 0016A000003…
#> 5 0036A000002… Rose                    NA Steven Mortimer 0016A000003…
#> # … with 9 more rows, and 3 more variables: `Account Name` <chr>, `Billing
#> #   City` <chr>, `Account Owner` <chr>

Currently, all of the report related functionality in the Reports and Dashboards REST API has been ported into the {salesforcer} package and you can do some pretty neat stuff like on-the-fly filtering and sorting:

# filter records that was created before this month
filter1 <- list(column = "CREATED_DATE",
                        operator = "lessThan", 
                        value = "THIS_MONTH")

# filter records where the account billing address city is not empty
filter2 <-  list(column = "ACCOUNT.ADDRESS1_CITY",
                        operator = "notEqual", 
                        value = "")

# combine filter1 and filter2 using 'AND' which means that records must meet both filters
results_using_AND <- sf_run_report(my_report_id, 
                                   report_boolean_logic = "1 AND 2",
                                   report_filters = list(filter1, filter2))
results_using_AND
#> # A tibble: 14 x 8
#>   `Contact ID` `First Name` `test number` `Contact Owner` `Account ID`
#>   <chr>        <chr>                <dbl> <chr>           <chr>       
#> 1 0036A000002… Rose                    NA Steven Mortimer 0016A000003…
#> 2 0036A000002… Sean                    NA Steven Mortimer 0016A000003…
#> 3 0036A000002… Jack                    99 Steven Mortimer 0016A000003…
#> 4 0036A000002… Tim                     NA Steven Mortimer 0016A000003…
#> 5 0036A000002… John                    23 Steven Mortimer 0016A000003…
#> # … with 9 more rows, and 3 more variables: `Account Name` <chr>, `Billing
#> #   City` <chr>, `Account Owner` <chr>

# combine filter1 and filter2 using 'OR' which means that records must meet one 
# of the filters but also throw in a row limit based on a specific sort order
results_using_OR <- sf_run_report(my_report_id, 
                                  report_boolean_logic = "1 OR 2",
                                  report_filters = list(filter1, filter2), 
                                  sort_by = "Contact.test_number__c", 
                                  decreasing = TRUE, 
                                  top_n = 3)
results_using_OR
#> # A tibble: 3 x 8
#>   `Contact ID` `First Name` `test number` `Contact Owner` `Account ID`
#>   <chr>        <chr>                <dbl> <chr>           <chr>       
#> 1 0033s000012… KEEP                  1000 Steven Mortimer 0013s00000z…
#> 2 0033s000012… KEEP                  1000 Steven Mortimer 0013s00000z…
#> 3 0033s000012… KEEP                  1000 Steven Mortimer 0013s00000z…
#> # … with 3 more variables: `Account Name` <chr>, `Billing City` <chr>, `Account
#> #   Owner` <chr>

For more detail on how to take advantage of this new functionality please see the pkgdown website https://stevenmmortimer.github.io/salesforcer and, more specifically, the Working with Reports vignette which provides a soft introduction to these concepts. Finally, keep an eye out as more dashboard and analytics notifications functionality is also added.

Support for Bulk 2.0 API Queries

In Salesforce version 47.0 (Winter ’20) query functionality added to the Bulk 2.0 API. In the overview of this feature Salesforce emphasizes the consistency with the REST APIs and the ease of use (e.g. “Automatic File Batching”), but does not mention any claims in terms of speed compared to the Bulk 1.0 query functionality. In {salesforcer 0.2.2} the default API when using sf_run_bulk_query() or sf_query_bulk() is now the Bulk 2.0 API, assuming it is better than the Bulk 1.0 API. However, You can easily switch between the APIs just as you did before in previous {salesforcer} releases by specifying it in the api_type argument. Please note that, because of additional standardization on the column ordering and arguments to guess types, the queries below will all return the same exact format of results. For example we prioritize the following fields in queries alphabetically within this prioritization waterfall:

  1. First, the sObject field (indicates the record’s object if multiple objects returned in the results)
  2. Second, the Id field (Id, id, sf__Id)
  3. Third, record success status (Success, success, sf_Success)
  4. Fourth, record created status (Created, created, sf__Created)
  5. Fifth, record error(s) status (Error, error, errors, errors.statusCode, errors.fields, errors.message, sf__Error)
  6. Sixth, all other fields from the target object (e.g. Name, Phone, etc.)
  7. Seventh, relationship fields (fields from a parent or child of the target). For example, anything typically containing a dot like Account.Id, Owner.Name, etc.

In short, Bulk 2.0 now has query functionality and it is consistent with the other API’s query functionality. I recommend checking to see for yourself which API works well. Below is a simple example comparing a single run of the REST, Bulk 1.0, and Bulk 2.0 APIs. Consider using the {microbenchmark} package to run more precise performance tests.

soql <- "SELECT Id, LastName, Account.Id, Account.Name, Owner.Id
         FROM Contact
         LIMIT 1000"
system.time(sf_query(soql, api_type = "REST"))
#>    user  system elapsed 
#>   4.033   0.032   5.065
system.time(sf_query(soql, object_name = "Contact", api_type = "Bulk 1.0"))
#>    user  system elapsed 
#>   0.186   0.018   8.345
system.time(sf_query(soql, api_type = "Bulk 2.0"))
#>    user  system elapsed 
#>   0.046   0.009   7.541

Support for Relationship Queries

One upgrade for {salesforcer 0.2.2} is better support for relationship queries, both child-to-parent lookups using the dot notation and parent-to-child nested queries. In prior releases the results were not parsed consistently and presented themselves in a variety of hard to debug issues on GitHub that were brought up in #19, #35, #38, and #54. This release finally aims to address some of those bugs through more consistent parsing methods for both the XML returned by the SOAP API and the JSON returned by the REST API. However, I would strongly recommend testing in your Org with your own queries to see the impact before deploying to a production environment. If any unexpected behavior crops up, then please file an issue on GitHub using the query issue template so we can get it resolved.

Old Nested Query Behavior (v0.1.4 or earlier)

sf_query("SELECT Name, (SELECT LastName FROM Contacts) FROM Account", api_type="SOAP")
#> # A tibble: 24 x 4
#>    Id    Name                        Contacts         LastName
#>    <lgl> <chr>                       <list>           <chr>   
#>  1 NA    GenePoint                   <named list [4]> <NA>    
#>  2 NA    <NA>                        <NULL>           Frank   
#>  3 NA    United Oil & Gas, UK        <named list [4]> <NA>    
#>  4 NA    <NA>                        <NULL>           James   
#>  5 NA    United Oil & Gas, Singapore <named list [5]> <NA> 
#> # … with 19 more rows

New Query Behavior (v0.2.2)

sf_query("SELECT Name, (SELECT LastName FROM Contacts) FROM Account", api_type="SOAP")
#> # A tibble: 16 x 4
#>    Name                        Contact.LastName
#>    <chr>                       <chr>
#>  1 GenePoint                   Frank
#>  2 United Oil & Gas, UK        James
#>  3 United Oil & Gas, Singapore D'Cruz
#>  4 United Oil & Gas, Singapore Ripley
#>  5 Edge Communications         Forbes
#> # … with 11 more rows

A new vignette has been included with this release that covers the types of queries currently supported by the package and is available here: Supported Queries. I highly recommend reviewing for guidance and inspiration on how to what types of queries are possible running against your Org.

Additional Information

For a complete listing of all changes made in recent releases of {salesforcer} please view the Changelog (aka NEWS.md) file. Bug reports and feature requests are welcome on GitHub in the repository issues section.

Thank you for your continued support!

To leave a comment for the author, please follow the link and comment on their blog: stevenmortimer.com.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)