Using Markdown and Pandoc for Publication
The other day I was involved in an editing job in which I was supposed to edit 18 articles written in Microsoft Word (doc/docx format) and convert them into PDF format (for printing as a book) and HTML format (for web publishing). Manuscripts written by people ... [Read more...]
There are a number of online efforts to register victims of shootings. Shootingtracker tries to register all mass shootings, those with four or more victims. Slate had the gun death tally (GDT), gun deaths starting at Newtown, running thro... [Read more...]
Quantifying spatial data (e.g. land cover) around points can be done in a variety of ways, some of which require considerable amounts of patience, clicking around, and/or cash for a license.
Here’s a bit of code that I cobbled together to quickly extract land cover data from ... [Read more...]
by Yanchang Zhao, RDataMining.com “Data is becoming a powerful and most valuable commodity in 21st century. It is leading to scientific insights and new ways of understanding human behaviour. Data can also make you rich. Very rich.” — SBS documentary … Continue reading →
I was reading this interesting post about how to build a multi-lingual Shiny app. I’m also building a multi-lingual Shiny app and came up with a slightly different take on it. First, I don’t use a function for finding the translation, … Continue reading →
As we come up to the final two races of the 2014 Formula One season, the double points mechanism for the final race means that two drivers are still in with a shot at the Drivers’ Championship: Lewis Hamilton and Nico Rosberg. As James Allen describes in Hamilton closes in on ...
Data from Zacks Research have just been made available on Quandl. Registered Quandl users have free preview access to these data, which cover the following: Earnings Estimates: forward-looking consensus forecasts; Earnings Surprises: estimated future and actual historical earnings; Earnings Announcements: predictions for earnings announcement dates, parameters, and supplementary data; Sales ... [Read more...]
This is the bimonthly R Jobs post (for 2014-11-07), based on the R-bloggers’ sister website: R-users.com. If you are an employer who is looking to hire people from the [Read more...]
On Wednesday next week, I'll be presenting a live webinar to introduce Revolution R Open and several other open source projects from Revolution Analytics. In the webinar I'll describe:
The enhancements included in Revolution R Open
The Reproducible R Toolkit and the checkpoint package
How to call R from other ... [Read more...]
You may have seen that we recently open sourced SmartDataCenter and Manta. Because Joyent pioneered the use of infrastructure containers, you would be forgiven for asking the question: does this mean that SDC somehow competes with Docker? The answer is emphatically not — and in fact, to the c… [Read more...]
This post will cover ideas from two individuals: David Varadi of CSS Analytics with whom I am currently collaborating on some … Continue reading →
One of the drawbacks of R has been its limitations with big datasets. It stores everything in RAM, so once you have more than 100K records your PC really starts to slow down. However, since AWS allows you to use any size machine, you could now consider using R for ... [Read more...]
We analyzed over six million flights to help you decide on the best time to travel to avoid delays.
The post When to fly to get there on time? Six million flights analyzed. appeared first on Decision Science News.
[Read more...]
A new release of RcppRedis is now on CRAN. It contains additional commands for hashes and sets, all contributed by John Laing and Whit Armstrong.
Changes in version 0.1.2 (2014-11-06)
New commands execv, hset, hget, sadd, srem, and smembers... [Read more...]
A couple of weeks ago, Twitter open-sourced their BreakoutDetection package for R, a package designed to determine shifts in time-series data. The Twitter announcement does a great job of explaining the main technique for detection (E-Divisive with Medians), so I won’t rehash that material here. Rather, I wanted to ... [Read more...]
I get questions about this almost every week. Here is an example from a recent comment on this blog: I have two large time series data. One is separated by seconds intervals and the other by minutes. The length of each time series is 180 days. I’m using R (3.1.1) for ... [Read more...]
If you like to make nice-looking documents using LaTeX, I highly recommend using the 'xtable' package. In most instances, it works quite well for producing a reasonable-looking table from an R object. However, I recently wanted a LaTeX … Continue reading →
Use cases
Public reports.
Public data sharing, e.g. the R package download logs from CRAN's RStudio mirror (cran-logs.rstudio.com) mask IP addresses.
Reports or data sharing for an external vendor.
Development work can operate on anonymized production data.
Manually or semi-manually populated data can often bring some new ... [Read more...]
In my previous blog post, I detailed how I created my first R package called geocodeHERE. This package is a convenient wrapper for Nokia's HERE geocoding API. The cool thing about this API is that it allows for bulk geocoding. So, instead of doing n API calls to geocode n ... [Read more...]