Categories: Visualizing Data
Tags: R Programming, Text Mining, WordCloud
With a state election coming soon in Victoria (the Australian state where I live), I decided to run a quick analysis comparing what politicians from the major Australian parties post on their Twitter accounts. I have collected the Twitter handles of Australian Parliament members (...
Recently, I came up with Thoen’s law. It is an empirical law, based on several years of doing data science projects in different organisations. Here it is: The probability that you have worked on a data science project that failed approaches one very quickly as the number of projects ... [Read more...]
I am pleased to announce that the RFishBC package has been released to CRAN. This package is intended to help fisheries scientists gather age and measurement data from digital images of calcified structures and, possibly, back-calculate p... [Read more...]
Week 12 Gold Mining and Fantasy Football Projection Roundup now available. Go get that free agent gold!
The post Gold-Mining Week 12 (2018) appeared first on Fantasy Football Analytics.
This post is dedicated to the beautiful chaos created by double pendulums. I have seen
a great variety of animated versions, implemented with different tools but never in R.
Thanks to the amazing package gganimate, it is actually not that hard to produce them in R.
OpenCPU provides a mature and robust system for hosting R-based services. The server exposes a simple HTTP API for calling R functions and scripts and for managing data. The Cloud Server is completely free and scales up to many concurrent users. T...
A Le Monde mathematical puzzle from after the competition: A sequence of five integers can only be modified by subtracting an integer N from two neighbours of an entry and adding 2N to the entry. Given the configuration below, what is the minimal number of steps to reach non-negative entries ...
Ruth Thomson, Interim Director of Strategic Innovation, sat down with Jelena, one of Mango’s machine learning experts. Thanks, Jelena, for your time. It is an absolute pleasure to have this opportunity to discuss machine learning today. Tell me about your background with machine learning. I’ve been using machine ... [Read more...]
A few improvements were recently made to several functions in the fuzzySim package. Mainly, the modelTrim function is now more independent (it used to require “attach” sometimes); multTSA allows either “AIC” or “significance” as a backward stepwise selection criterion, and provides …
library(broom)
library(cluster)
library(dplyr)
library(ggplot2)
library(ggdendro)
In the first part of this blog series, we examined the theoretical foundations of cluster analysis. In the following article we put the theory into practice using R. For the analysis in R, we will use the variables mpg (fuel ...
How does driving an electric car compare to driving a similar gasoline car in terms of total pollution damages? While some comments about the future of mobility may suggest that electric cars are already clearly much more environmentally friendly, the an... [Read more...]
While more and more data is available in structured formats (CSV, JSON) through initiatives like OpenData, sometimes nicely formatted data is still not publicly available. I decided to conduct a little study of what Australian politicians from the major parties post on Twitter, so I decided to find a list…
I like both Python and R, and teach them both, but for data science R is the clear choice. When asked why, I always note (a) written by statisticians for statisticians, (b) built-in matrix type and matrix manipulations, (c) great graphics, both base and CRAN, (d) excellent parallelization facilities, etc. ...
Preparing The Raspberry Pi
Setting Up The Server
Installing Shiny Server
Installing RStudio Server
Extra Steps
Final Comments
I recently participated in a topic on RStudio Community where @jladata asked whether a Raspberry Pi 3B+ could serve as a viable Shiny server. I currently use a Raspberry ... [Read more...]
I recently introduced sensitivity and specificity as performance measures for model selection. Besides these measures, there are also the notions of recall and precision. Precision and recall originate from information retrieval but are also used in machine learning settings. However, the use of precision and recall can be problematic ...
Having a 360-degree view of the entire job seeking and matching landscape has always been the dream of every labour economist. Just imagine, a dataset of CVs and job seekers matched with job advertisements and openings! The potential of such a dataset to answer existing questions on ...
Inspired by David Schoch’s blog post,
Traveling Beerdrinker Problem.
Check out his blog, he has some amazing posts!
Introduction
Luxembourg, like any proper European country, is full of castles. According to Wikipedia,
“By some optimistic estimates, there are as many as 130 castles in Luxembourg but more realistically
there are ... [Read more...]
Imagine you are a fish ecologist who compiled a list of fish species for your country.
Your list could be useful to others, so you publish it as a supplementary file to an article or in a research repository. That is fantastic, but it might be difficult for others to ... [Read more...]
This post contains an adapted R script based on prep_n_run_GOplot.pl from Trinity, the de novo transcriptome assembler, for the times R cannot read in the produced EC.* files.
prep_n_run_GOplot.pl is used to produce GOplot visualizing differential expression, sorted by GO terms (see http://... [Read more...]
On November 7th, Uwe Friedrichsen and I gave our talk from the JAX conference 2018: Deep Learning - a Primer again at the W-JAX in Munich.
A few weeks before, I gave a similar talk at two events about Demystifying Big Data and Deep Learning (and how to... [Read more...]