I’ll admit I didn’t really know who Bill Nye was before yesterday. His name sounds a bit like Bill Nighy’s, that’s all I knew. But, well, science is all around, and quite often scientists on Twitter start interesting campaigns. Remember the #actua...

In the exercises below we cover the basics of functional programming in R (part 1 of a two-part series of exercises on functional programming). We consider recursion in R, the apply family of functions, and higher-order functions such as Map, Reduce, and Filter. Answers to the exercises are available here. If you obtained
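As a rough illustration of the topics listed above (not taken from the exercise set itself), the following base R sketch shows one recursive function, one member of the apply family, and the three higher-order functions. The helper name `fact` and all example values are made up for this sketch.

```r
# Recursion: a factorial defined in terms of itself.
fact <- function(n) if (n <= 1) 1 else n * fact(n - 1)

# apply family: sapply() calls a function on each element and simplifies
# the result to a vector.
squares <- sapply(1:5, function(x) x^2)

# Higher-order functions from base R.
evens <- Filter(function(x) x %% 2 == 0, 1:10)   # keep elements passing a test
sums  <- Map(function(a, b) a + b, 1:3, 4:6)     # element-wise over two inputs, returns a list
total <- Reduce(`+`, 1:5)                        # fold a vector into a single value
```

Note that Map() always returns a list, so `unlist(sums)` is needed to get back a plain vector.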

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft) The Azure Data Science Virtual Machine (DSVM) is a curated VM which provides commonly-used tools and software for data science and machine learning, pre-installed. AzureDSVM is a new R package that enables seamless interaction with the DSVM from a local R session, by providing functions...

If you weren't able to make it to Chicago for R/Finance, the annual conference devoted to applications of R in the financial industry, don't fret: the entire conference is being livestreamed (with thanks to the team at Microsoft). You can watch the proceedings at aka.ms/r_finance, and recordings will be available at the same link after the event. Check out...

In this post, I describe the latest iteration of my automatic document production with R. It improves upon the methods used in Rtraining, and previous work on this topic can be read by going to the auto deploying R documentation tag. The post Improving automatic document production with R appeared first on Locke Data. Locke Data are...

This document describes the ‘vwline’ package, which provides an R interface for drawing variable-width curves. The package provides functions to draw line segments through a set of locations, or a smooth curve relative to a set of control points, with the width of the line allowed to vary along the length of the line. Paul …

I have used on.exit() for several years, but it was not until the other day that I realized a very weird thing about it: you’d better follow the default positions of its arguments expr and add, i.e., the first argument has to be expr and the second has to be add. The signature is on.exit(expr = NULL, add = FALSE). If you do...
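For readers unfamiliar with the two arguments, here is a minimal base R sketch of what add does when the arguments are kept in their default positions. It does not reproduce the oddity the post goes on to describe, since the excerpt is cut off before that point. The function names `trace_exit_handlers`, `record`, `appended`, and `replaced` are hypothetical.

```r
trace_exit_handlers <- function() {
  events <- character()
  # Hypothetical helper: append a message to the log kept in this function.
  record <- function(msg) events <<- c(events, msg)

  appended <- function() {
    on.exit(record("appended: first"))
    # add = TRUE keeps the earlier handler and runs this one after it.
    on.exit(record("appended: second"), add = TRUE)
    invisible(NULL)
  }

  replaced <- function() {
    on.exit(record("replaced: first"))
    # add defaults to FALSE, so this call overwrites the earlier handler.
    on.exit(record("replaced: second"))
    invisible(NULL)
  }

  appended()
  replaced()
  events
}

exit_events <- trace_exit_handlers()
```

Both handlers registered in `appended` run on exit, while in `replaced` only the last one survives.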

While the Riddler puzzle this week was anticlimactic, as it meant filling all digits in the above division towards a null remainder, it came as an interesting illustration of how different division is taught in the US versus France: when I saw the picture above, I had to go and check an American primary school

This set of exercises will help you improve your skills with character functions in R. Most of the exercises are related to text mining, a technique that analyses text using statistics. If you find them interesting I would suggest checking the tm library, which includes functions designed for this task. There
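To give a flavour of the kind of character functions involved, here is a small base R sketch (it does not use tm and is not drawn from the exercises). The sample strings are made up.

```r
sentence <- "Text mining analyses text using statistics"

# Lower-case the sentence and split it into words on single spaces.
words <- strsplit(tolower(sentence), " ", fixed = TRUE)[[1]]

# Number of characters in each word.
word_lengths <- nchar(words)

# Word frequencies, a basic building block of text mining.
freq <- table(words)

# Strip everything except lower-case letters and spaces.
cleaned <- gsub("[^a-z ]", "", tolower("Hello, World!"))
```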

Shinydashboard 0.6.0 is now on CRAN! This release of shinydashboard was aimed at both fixing bugs and also bringing the package up to speed with users’ requests and Shiny itself (especially fully bringing bookmarkable state to shinydashboard’s sidebar). In addition to bug fixes and new features, we also added a new “Behavior” section to the

Get the most out of the EARL Conference with the new phone app, available on iTunes and Google Play. View the agenda, speakers, and sponsors in the palm of your hand. Bookmark the sessions you’d like to attend and rate …

In this article I will discuss array indexing, operators, and composition in depth. If you work through this article you should end up with a very deep understanding of array indexing and the deep interpretation available when we realize indexing is an instance of function composition (or an example of permutation groups or semigroups: some …
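The idea that indexing behaves like function composition can be sketched in a few lines of base R. This is an illustration under made-up values, not the article's own code: indexing a vector by p and then by q gives the same result as indexing once by the composed index p[q].

```r
x <- c(10, 20, 30, 40)
p <- c(3, 1, 4, 2)   # a permutation of 1:4
q <- c(4, 3, 2, 1)   # another permutation of 1:4

# Indexing twice versus indexing once by the composed index.
two_steps <- x[p][q]
one_step  <- x[p[q]]

# order(p) is the inverse permutation of p, so it undoes the first indexing.
restored <- x[p][order(p)]
```

Here `two_steps` and `one_step` are the same vector, and `restored` is `x` again, which is the group structure (composition and inverses) the excerpt alludes to.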

Unsupervised learning refers to data science approaches that involve learning without prior knowledge about the classification of sample data. In Wikipedia, unsupervised learning has been described as “the task of inferring a function to describe hidden structure from ‘unlabeled’ data (a classification or categorization is not included in the observations)”. The overarching objectives of
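As one concrete instance of learning from unlabeled data, here is a minimal k-means sketch using kmeans() from base R's stats package. The simulated data, the seed, and the choice of k-means are assumptions made for illustration; the post may discuss other methods.

```r
set.seed(42)

# Two well-separated groups of 20 points each; no labels are given to the algorithm.
group_a <- matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2)
group_b <- matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
obs <- rbind(group_a, group_b)

# Ask for two clusters; nstart = 10 tries several random starts and keeps the best.
fit <- kmeans(obs, centers = 2, nstart = 10)

# Cluster sizes recovered from the unlabeled data.
cluster_sizes <- table(fit$cluster)
```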

A few days ago, Joel Courtheyn posted the following issue in the errors package repository on GitHub: Experimenting with the new package I detected a difference in calculation of the error depending on the way a formula was written. Originally I tried to calculate the error for z1 <- (x^3 - 2y)/x^0.5 but this gave me…

Commonly, studies that use US Census data focus on topics at the scale of the metropolitan area. However, subsetting Census geographic data by metropolitan area is not always straightforward. Such a workflow for Census tracts might...

A few months ago Jenny wanted me (and Karthik, if I remember correctly) to share some experience with GIFs. I have been busy with writing the blogdown book recently and don’t really have much time, so I’m going to write a quick post just to take a short break. I may expand this post in the future. First...

The Consumer Data Research Centre, the UK-based organization that works with consumer-related organisations to open up their data resources, recently published a new course online: An Introduction to Spatial Data Analysis and Visualization in R. Created by James Cheshire (whose blog Spatial.ly regularly features interesting R-based data visualizations) and Guy Lansley, both of University College London Department of Geography,...

Neural networks have been a very important area of scientific study that has evolved through different disciplines such as mathematics, biology, psychology, computer science, etc. The study of neural networks leapt from theory to practice with the emergence of computers. Training a neural network by adjusting the weights of the connections is computationally very expensive so its...

Copulas are a powerful statistical tool commonly used in the finance sector to generate samples from a given multivariate joint distribution. The principal advantage of using this type of function over other methods is that copulas describe the multivariate joint distribution in terms of its margins and the dependence structure between them, which gives the user the
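The separation of margins from dependence can be sketched with a Gaussian copula in base R. This is only one possible construction, chosen here for illustration; the correlation value, sample size, and marginal distributions are all made up, and the exercise set may use a dedicated package instead.

```r
set.seed(1)
n   <- 1000
rho <- 0.7
sigma <- matrix(c(1, rho, rho, 1), nrow = 2)

# Step 1: correlated standard normals. chol() returns R with t(R) %*% R = sigma,
# so multiplying independent normals by R gives covariance sigma.
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(sigma)

# Step 2: the normal CDF maps each column to uniform margins on (0, 1)
# while keeping the dependence between columns. This is the copula sample.
u <- pnorm(z)

# Step 3: plug the uniforms into any quantile functions to choose the margins.
x1 <- qexp(u[, 1], rate = 2)     # exponential margin
x2 <- qgamma(u[, 2], shape = 3)  # gamma margin
```

The margins in step 3 can be swapped freely without touching the dependence built in steps 1 and 2.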

We are ready for the third R-Lab, the monthly appointment where we work together on a real data science problem using R. This time the R-Lab is promoted by none other than the Assessorato alla Partecipazione, Cittadinanza Attiva e Open Data of the Comune di Milano! We will access their municipality budget data, and use one day of joint work...

Even though the data.frame object is one of the core objects to hold data in R, you'll find that it's not really efficient when you're working with time series data. You'll find yourself wanting a more flexible time series class in R that offers a variety of methods to manipulate your data. xts or the Extensible Time Series is one of...

Looks like I’ll be diving into some Bayesian analyses using JAGS. This post is primarily intended as a collection of links to useful information, but also includes a few initial thoughts (I might update it occasionally with new links). In terms of R packages, a very brief play suggests that R2jags is more user

By Ashwin Agrawal. URL of the Project Idea: https://github.com/rstats-gsoc/gsoc2017/wiki/Biodiversity-data-cleaning. Introduction: There are an increasing number of scientists using R for their data analyses; however, the skill set required to handle biodiversity data in R varies considerably. Since users need to retrieve, manage and assess high-volume data with complex structure (Darwin Core standard, DwC), only