Monthly Archives: June 2012

Facts About R Packages (2)

June 6, 2012

Are All R Packages Well Maintained? There are so many R packages: can they all be trusted, and are they well maintained? To answer this question, we just need to take a look at their archive histories. If a package has many versions, we can take that as a sign that the authors spent a lot of time perfecting it; these...

Read more »

R-NOLD 2012-06-06 03:18:00

June 6, 2012

While traveling across the Visayas, I encountered barangays (villages) with the same name as my last name. Using R and map data from gadm.org, I searched for and mapped other villages in the country named “Salvacion”.

Read more »

Facts About R Packages (1)

June 6, 2012

R Packages Growth Curve. Why is R so popular? There are many reasons: it is easy to learn and convenient to use, it has an active community, it is open source, and so on. Another important reason is the large number of contributed packages. As of yesterday, there were 3854 R packages on CRAN. The following figure shows the growth curve of R packages:

Managing the deluge of DNA data

June 5, 2012

The explosion in DNA sequencing capacity has shifted the experimental bottleneck from sequencing to analyzing and interpreting sequences. The Bioconductor package cummeRbund uses ggplot as part of its tool set for organizing, exploring and visualizing ...

Read more »

Constants and ARIMA models in R

June 5, 2012

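This post is from my new book Forecasting: principles and practice, available freely online at OTexts.com/fpp/. A non-seasonal ARIMA model can be written as $(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d y_t = c + (1 + \theta_1 B + \cdots + \theta_q B^q) e_t$ (1) or equivalently as $(1 - \phi_1 B - \cdots - \phi_p B^p)(1 - B)^d (y_t - \mu t^d / d!) = (1 + \theta_1 B + \cdots + \theta_q B^q) e_t$ (2) where $B$ is the backshift operator, $c = \mu(1 - \phi_1 - \cdots - \phi_p)$, and $\mu$ is the mean of $(1 - B)^d y_t$. R uses the parametrization of equation (2). Thus, the inclusion of a constant in a non-stationary ARIMA...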

Read more »

Quasi-Random Number Generation in R

Random number generation is a core topic in numerical computer science. There are many efficient algorithms for generating random (strictly speaking, pseudo-random) variates from different probability distributions. The figure below shows a sampling of 1000 two-dimensional random variates from the …

Read more »

F-test to find UECLs

June 5, 2012

I have fixed the link to the video "Removing Y outliers from the validation set", and it's time to see what the next step for the function could be. As we know, the RMSEP combines the explained error (BIAS) and the unexplained error (SEP). We also get the SEP...

Read more »

Example 9.34: Bland-Altman type plot

June 5, 2012

The Bland-Altman plot is a visual aid for assessing differences between two ways of measuring something. For example, one might compare two scales this way, or two devices for measuring particulate matter. The plot simply displays the difference between the measures against their average. Rather than a statistical test, it is intended...
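The quantities behind such a plot are easy to compute by hand. The following is a minimal sketch in Python rather than R (the post's own code is not part of this excerpt), assuming two hypothetical lists of paired readings, `scale_a` and `scale_b`. The function name `bland_altman` is also made up for illustration, and the 95% limits of agreement (bias ± 1.96 standard deviations) are a conventional addition that the teaser does not mention.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return the plot coordinates and summary lines for paired measurements.

    x-axis: average of each pair; y-axis: difference of each pair.
    """
    if len(a) != len(b):
        raise ValueError("both measurement lists must have the same length")
    averages = [(x + y) / 2 for x, y in zip(a, b)]
    differences = [x - y for x, y in zip(a, b)]
    bias = mean(differences)                 # mean difference (horizontal center line)
    spread = 1.96 * stdev(differences)       # assumption: conventional 95% limits
    return {
        "averages": averages,
        "differences": differences,
        "bias": bias,
        "lower": bias - spread,
        "upper": bias + spread,
    }


# Hypothetical readings from two scales weighing the same four objects.
scale_a = [10.0, 12.0, 14.0, 16.0]
scale_b = [9.0, 13.0, 13.0, 17.0]
result = bland_altman(scale_a, scale_b)
```

Plotting `result["differences"]` against `result["averages"]`, with horizontal lines at `bias`, `lower` and `upper`, gives the usual picture; the plotting step is left out to keep the sketch free of third-party dependencies.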

Read more »

NBA Playoff Predictions Update 3 (4-2)

June 5, 2012

This is my third update to my original post on predicting the NBA playoffs with an algorithm. Here are updates 1 and 2. The algorithm correctly predicted a Boston win, but missed on the Spurs/Thunder game, so it is currently 4-2. Haven't had any time...

Read more »

Digitize linear and (semi-)log scale graphs with multiple point sets

June 5, 2012

While working on a paper, I ran into the problem of needing data from a graph that was not mine, and for which no underlying table was published. With today's software packages, however, it is not very difficult to digitize a figure yourself. I remembered rea...

Read more »