# Modernizing sugar in Rcpp11

May 27, 2014

I’m in the process of modernizing the implementation of sugar in Rcpp11. Previous work already improved the performance of sugar by letting the sugar classes themselves decide how to apply themselves to their target vector. For example, the sugar class SeqLen leverages std::iota instead of a manual for loop.

```
template <typename Target>
inline void apply( Target& target ) const {
    // fill the target with 1, 2, ..., n in a single pass
    std::iota( target.begin(), target.end(), 1 ) ;
}
```
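Outside of Rcpp11, the same idea can be tried on a plain std::vector. The sketch below is only an illustration under that assumption: `seq_len_vector` is a hypothetical name, not part of Rcpp11, and it expects a non-negative `n`.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-in for seq_len(n), for illustration only:
// returns a vector holding 1, 2, ..., n (n is assumed to be >= 0).
inline std::vector<int> seq_len_vector( int n ){
    std::vector<int> target( n ) ;
    // std::iota writes consecutive values starting at 1, no manual loop needed
    std::iota( target.begin(), target.end(), 1 ) ;
    return target ;
}
```

Calling `seq_len_vector(5)` yields the values 1 to 5, which mirrors what `seq_len(5)` gives in R.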

sugar is based on the expression templates technique, a very popular pattern in C++ libraries that reduces the use of temporaries. Consider the following R code:

```
y <- exp(abs(x))
```

To calculate y in R, we must first create a temporary vector to hold the result of abs(x) and then create a new vector to hold the result of exp(abs(x)). The vector we allocated to hold abs(x) is then no longer referenced, so it becomes a candidate for garbage collection. In other words, we allocated something just to throw it away right after.
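A rough plain C++ analogue of this eager evaluation might look like the sketch below. It uses std::vector instead of R vectors, and the helper names `eager_abs`, `eager_exp` and `eager_exp_abs` are hypothetical, chosen only to show where the temporary comes from.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Eager evaluation: each step returns a freshly allocated vector.
inline std::vector<double> eager_abs( const std::vector<double>& x ){
    std::vector<double> out( x.size() ) ;
    for( std::size_t i=0; i<x.size(); i++ ) out[i] = std::fabs( x[i] ) ;
    return out ;
}

inline std::vector<double> eager_exp( const std::vector<double>& x ){
    std::vector<double> out( x.size() ) ;
    for( std::size_t i=0; i<x.size(); i++ ) out[i] = std::exp( x[i] ) ;
    return out ;
}

// Equivalent of y <- exp(abs(x)) evaluated step by step.
inline std::vector<double> eager_exp_abs( const std::vector<double>& x ){
    std::vector<double> tmp = eager_abs( x ) ;  // temporary, destroyed on return
    return eager_exp( tmp ) ;                   // second allocation for the result
}
```

Two vectors of length n are allocated and the data is traversed twice, even though only one result is kept.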

If we were to implement this manually in C++, we would typically write a for loop.

```
int n = x.size() ;
NumericVector y(n) ;
for( int i=0; i<n; i++ ){
    y[i] = exp( abs( x[i] ) ) ;
}
```
There are other ways to do it as well. For example using std::transform

```
NumericVector y(n) ;
std::transform( x.begin(), x.end(), y.begin(), [](double a){ return exp(abs(a)) ; }) ;
```

or using Rcpp::transform :

```
NumericVector y = transform( x.begin(), x.end(), []( double a){ return exp(abs(a)); }) ;
```

But sugar definitely gives us the most expressive solution, and the one closest to R:

```
NumericVector y = exp(abs(x)) ;
```

Here exp and abs operate on the entire vector, not just the scalar double.

As with other expression templates libraries, sugar delays the actual work as much as possible. The expression exp(abs(x)) itself does not create a vector but creates an object that can be assigned to a NumericVector :

```
auto y = exp(abs(x)) ;
Rprintf( "type(y) = %s\n", DEMANGLE(decltype(y)) ) ;
// type(y) = Rcpp::sugar::Sapply<14, true, Rcpp::sugar::Sapply<14, true, Rcpp::Vector<14, Rcpp::PreserveStorage>, double (*)(double)>, double (*)(double)>
```

The expression exp(abs(x)) has created the same object as the expression sapply(sapply(x,::abs),::exp).

The first benefit from this is that the code for abs and for exp is exactly the same. After all, we just vectorize a function that operates on scalar values.

The second benefit is that we can identify that we want function composition here. We don’t want to first sapply abs over x and then sapply exp over the previous result; what we really want is to sapply the composition of the abs and exp scalar functions. That’s exactly how it is implemented in Rcpp11. We can even retrieve that function:

```
auto fun = y.fun ;
Rprintf( "type(fun) = %s\n", DEMANGLE(decltype(fun)) ) ;
// type(fun) = Rcpp::functional::Compose<double (*)(double), double (*)(double)>
```

And the Compose class looks like this:

```
template <typename F1, typename F2>
class Compose : public Functoid<Compose<F1,F2>> {
private:
    F1 f1 ;
    F2 f2 ;

public:
    Compose( F1 f1_, F2 f2_ ) : f1(f1_), f2(f2_){}

    // apply f1 first, then f2 on its result
    template <typename... Args>
    inline auto operator()( Args&&... args ) const -> decltype( f2( f1( std::forward<Args>(args)... ) ) ) {
        return f2( f1( std::forward<Args>(args)... ) );
    }
} ;
```

And we can just as easily compose lambda functions:

```
#include <Rcpp.h>
using namespace Rcpp ;

// [[export]]
NumericVector test(NumericVector x){
    auto square = []( double a ){ return a*a ; } ;
    auto twice = []( double a ){ return a*2 ; } ;

    auto y = sapply( sapply(x, square), twice ) ;
    Rprintf( "type(y) = %s\n", DEMANGLE(decltype(y)) ) ;

    auto fun = y.fun ;
    Rprintf( "type(fun) = %s\n", DEMANGLE(decltype(fun)) ) ;

    double val = fun(3.0);
    Rprintf( "val = %5.3f\n", val ) ;

    NumericVector res = y ;
    return res ;
}

/*** R
test(1:10)
*/
```

which gives:

```
$ Rcpp11Script /tmp/exp.cpp

> test(1:10)
type(y) = Rcpp::sugar::Sapply<14, true, Rcpp::sugar::Sapply<14, true, Rcpp::Vector<14, Rcpp::PreserveStorage>, test(Rcpp::Vector<14, Rcpp::PreserveStorage>)::$_0>, test(Rcpp::Vector<14, Rcpp::PreserveStorage>)::$_1>
type(fun) = Rcpp::functional::Compose<test(Rcpp::Vector<14, Rcpp::PreserveStorage>)::$_0, test(Rcpp::Vector<14, Rcpp::PreserveStorage>)::$_1>
val = 18.000
 [1]   2   8  18  32  50  72  98 128 162 200
```

It becomes even more interesting when we consider how missing values should be treated. Let’s now consider that the expression is used on an integer vector.

```
auto square = []( int a ){ return a*a ; } ;
auto twice = []( int a ){ return a*2 ; } ;
auto y = sapply( sapply(x, square), twice ) ;
```

In an iteration based implementation of sugar (e.g. the implementation in Rcpp), to be correct we would have no choice but to check for NA twice, because the two functions operate somewhat independently from one another. So the iteration based implementation in Rcpp would lead to code equivalent to this:

```
for( int i=0; i<n; i++ ){
    // first NA check, around square
    int tmp = ( x[i] == NA_INTEGER ) ? NA_INTEGER : square( x[i] ) ;
    // second NA check, around twice
    y[i] = ( tmp == NA_INTEGER ) ? NA_INTEGER : twice( tmp ) ;
}
```

With the new composition based approach, we only have to check for NA once, which leads to code much closer to what we would intuitively write manually:

```
for( int i=0; i<n; i++ ){
    // single NA check, then the composed function
    y[i] = ( x[i] == NA_INTEGER ) ? NA_INTEGER : twice( square( x[i] ) ) ;
}
```

The composition based approach is still a work in progress, but I believe it will be yet another way to achieve performance improvements for the modernized version of sugar.

We can also generalize the composition approach to several input vectors, via mapply.
Consider the expression : x + exp(y) + abs(sin(z)). The challenge is to identify actual vectors we want to iterate over: x, y and z and generate the appropriate function composition. Should be fun.
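
The single NA check discussed in this post can be tried out in plain C++ without Rcpp11. The following is a minimal sketch under stated assumptions, not the Rcpp11 implementation: `sapply_composed` and `NA_INT` are hypothetical names, std::vector stands in for an integer vector, and `NA_INT` is set to `INT_MIN`, the value R uses for integer NA.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Stand-in for R's integer NA (R represents it as INT_MIN).
const int NA_INT = INT_MIN ;

// Hypothetical helper: applies f2(f1(.)) to each element,
// testing for NA once per element instead of once per function.
template <typename F1, typename F2>
std::vector<int> sapply_composed( const std::vector<int>& x, F1 f1, F2 f2 ){
    std::vector<int> out( x.size() ) ;
    for( std::size_t i=0; i<x.size(); i++ ){
        // NA propagates untouched; otherwise call the composed function
        out[i] = ( x[i] == NA_INT ) ? NA_INT : f2( f1( x[i] ) ) ;
    }
    return out ;
}
```

With `square` and `twice` written as the int lambdas from this post, `sapply_composed( x, square, twice )` keeps NA elements as NA and maps every other element `a` to `2*a*a`. Like the loop it mirrors, the sketch does not guard against a result that happens to overflow.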