Monthly Archives: January 2016

Data Cleaning Part 1 – NYC Taxi Trip Data, Looking For Stories Behind Errors

January 31, 2016

Summary: Data cleaning is a cumbersome but important task in real-world data science projects. This post discusses my approach to cleaning the NYC Taxi Trip data, which draws heavily on domain knowledge, common sense, and business thinking.

Read more »

Using RcppNT2 to Compute the Sum

January 31, 2016

Introduction: The Numerical Template Toolbox (NT2) is a collection of header-only C++ libraries that makes it possible to explicitly request the use of SIMD instructions when possible, while falling back to regular scalar operations when not. NT2 itself is powered by Boost, alongside two proposed Boost libraries: Boost.Dispatch, which provides a mechanism for efficient tag-based dispatch for functions, and Boost.SIMD, which provides a framework for the implementation of algorithms that...
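To make the "SIMD when possible, scalar otherwise" idea concrete, here is a minimal sketch in plain C++. It does not use NT2, Boost.SIMD, or RcppNT2; the function name `chunked_sum` and the pack width of 4 are assumptions chosen purely for illustration. It accumulates in four independent lanes, the way a 4-wide SIMD register would, and then finishes the leftover elements with an ordinary scalar loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration (not the NT2 API): sum a vector using four
// independent accumulators ("lanes"), then fall back to a scalar loop
// for the elements that do not fill a complete pack.
double chunked_sum(const std::vector<double>& x) {
    constexpr std::size_t kLanes = 4;  // assumed pack width, for illustration only
    double lanes[kLanes] = {0.0, 0.0, 0.0, 0.0};

    const std::size_t n = x.size();
    const std::size_t vec_end = n - (n % kLanes);  // end of the last full pack

    // "Vectorized" part: each lane accumulates every fourth element.
    for (std::size_t i = 0; i < vec_end; i += kLanes) {
        for (std::size_t j = 0; j < kLanes; ++j) {
            lanes[j] += x[i + j];
        }
    }

    // Combine the lanes into a single value.
    double total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);

    // Scalar fallback for the remaining elements.
    for (std::size_t i = vec_end; i < n; ++i) {
        total += x[i];
    }
    return total;
}
```

An optimizing compiler may or may not turn the inner loop into real SIMD instructions; libraries such as the one described in this post exist so that the request can be made explicitly.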

Read more »

Connecting Religion and Demographics

January 31, 2016

I have my second guest post up today at Ari Lamstein’s blog where I conclude my exploration of the Religious Congregations and Membership Study at the ARDA. In this post I show how we can look at the relationships between a data set like the religion census and demographic data to gain context and understanding. Go over there to...

Read more »

Using RcppNT2 to Compute the Variance

January 31, 2016

Introduction: The Numerical Template Toolbox (NT2) is a collection of header-only C++ libraries that makes it possible to explicitly request the use of SIMD instructions when possible, while falling back to regular scalar operations when not. NT2 itself is powered by Boost, alongside two proposed Boost libraries: Boost.Dispatch, which provides a mechanism for efficient tag-based dispatch for functions, and Boost.SIMD, which provides a framework for the implementation of algorithms that...

Read more »

Introduction to RcppNT2

January 31, 2016

Modern CPUs are built with new, extended instruction sets that optimize for certain operations. One class of these, called Single Instruction / Multiple Data (SIMD) instructions, allows for vectorized operations. Although modern compilers will use these instructions when possible, they are often unable to reason about whether or not a particular block of code can be executed using SIMD instructions. The Numerical Template Toolbox (NT2) is a...

Read more »

Shiny Developer Conference

January 31, 2016

Really enjoying RStudio's Shiny Developer Conference | Stanford University | January 2016. Winston Chang just demonstrated profvis, really slick. You can profile code just by wrapping it in a profvis({}) block and the results are exported as interactive HTML widgets. For example, running the R code below: if(!('profvis' %in% rownames(installed.packages()))) { devtools::install_github('rstudio/profvis') } library('profvis') nrow … Continue reading Shiny...

Read more »

R Tagosphere!

January 31, 2016

This post explores the inter-relationships of Stack Overflow tags for R-related questions. I grabbed all the questions tagged with “r”, took the other tags in each question, and made some network charts that show how often each tag is seen with the other tags. The point is to see the empirical relationships …

Read more »

Pitfall of XML package: issues specific to cp932 locale, Japanese Shift-JIS, on Windows

January 31, 2016

The CRAN package XML has problems parsing HTML pages encoded in cp932 (Shift-JIS). In this report, I will show these issues and also their solutions, which are workable at …

Read more »
