In my last post I described how to produce a multivariate choropleth map with R. Now I will show …Continue reading »

We know the words but what do they mean? Some definitions Here are some definitions of “passive investment management”. Investopedia says: A style of management associated with mutual and exchange-traded funds (ETF) where a fund’s portfolio mirrors a market index. Wikipedia says: Passive management (also called passive investing) is a financial strategy in which an investor (or … Continue reading...

It is pretty easy to monitor the progress of a long loop in R using the original txtProgressBar function in the utils package. It works like this:

    mypb <- txtProgressBar()
    m <- sapply(1:1000, function(x) {
      setTxtProgressBar(mypb, x/1000)
      mean(rnorm(x))
    })
    close(mypb)

You could even get a GUI-type output using tkProgressBar from the tcltk package, or winProgressBar. Or you could build your own....

This video shows how to obtain and install R on the Windows (PC) platform. It also shows a few basic functions in R, such as how to install packages in R and load them for use.

SAS is much touted for its ability to read in huge datasets, and rightly so. However, that ability comes at a cost: for smaller datasets, since files remain on the disk rather than in memory (as is the case with Stata and R), it is potentially less fa...

If you are reading this via R-Bloggers, then you know how good R, LaTeX, and Sweave are for generating reports and/or conducting reproducible research. It has been particularly valuable for me in Institutional Research where there are many reports that I need to prepare on a regular basis (some monthly, some quarterly, some annually). However, one issue

Another problem generated by X’validated (on which I spent much too much time!): given an unbiased coin that produced M heads in the first M tosses, what is the expected number of additional tosses needed to get N (N>M) consecutive heads? Consider the preliminary question of getting a sequence of N heads out of k
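The excerpt stops before the derivation, so here is only a minimal sketch of one standard route, not necessarily the author's. Writing E_m for the expected number of additional tosses when the current run is m heads, a first-step argument gives E_m = 1 + E_{m+1}/2 + E_0/2 with E_N = 0, which solves to E_m = 2^(N+1) - 2^(m+1). The function names below are made up for illustration, and the simulation is just a sanity check on the closed form.

```r
# Closed form from the first-step recurrence (fair coin assumed):
# expected additional tosses to reach N consecutive heads,
# starting from a current run of M heads.
expected_additional <- function(M, N) {
  stopifnot(N > M, M >= 0)
  2^(N + 1) - 2^(M + 1)
}

# Monte Carlo check (hypothetical helper): start with a run of M heads,
# toss until the run reaches N, and average the number of tosses used.
simulate_additional <- function(M, N, reps = 20000) {
  one_run <- function() {
    run <- M
    tosses <- 0
    while (run < N) {
      tosses <- tosses + 1
      # heads extends the run, tails resets it to zero
      if (runif(1) < 0.5) run <- run + 1 else run <- 0
    }
    tosses
  }
  mean(replicate(reps, one_run()))
}

set.seed(1)
exact <- expected_additional(2, 4)   # 2^5 - 2^3 = 24
approx <- simulate_additional(2, 4)  # should land close to 24
```

With M = 0 the closed form reduces to the familiar 2^(N+1) - 2 for N heads in a row from scratch.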

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers) This is another pretreatment used quite often in Near Infrared to remove the scatter. It is applied to every spectrum individually. The average and standard deviation of all the data points for that spectrum are calculated. The mean is subtracted from every data point of the spectrum and...
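The description matches what is usually called the standard normal variate (SNV) transform. The excerpt is cut off before the scaling step, so dividing by the standard deviation is an assumption here, as are the function name `snv` and the toy data. A minimal sketch with spectra stored as rows of a matrix:

```r
# SNV sketch: centre each spectrum (row) by its own mean and
# scale it by its own standard deviation.
snv <- function(spectra) {
  spectra <- as.matrix(spectra)
  row_means <- rowMeans(spectra)
  row_sds <- apply(spectra, 1, sd)
  centred <- sweep(spectra, 1, row_means, "-")  # subtract row mean
  sweep(centred, 1, row_sds, "/")               # divide by row sd
}

set.seed(42)
# Hypothetical toy data: 5 spectra x 100 wavelengths
X <- matrix(rnorm(500, mean = 2, sd = 0.3), nrow = 5)
X_snv <- snv(X)

# After the transform every spectrum has mean 0 and sd 1
round(rowMeans(X_snv), 10)
round(apply(X_snv, 1, sd), 10)
```

Because the statistics are computed per row, the result for one spectrum does not depend on any other spectrum in the matrix.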

Open source is amazing! I cannot even start to imagine the amount of work invested in R, in the Firefox browser (Mozilla), or the RStudio IDE, all of which are used extensively around the globe, free. Not free as in: free sample … Continue reading →

The new RcppArmadillo release 0.2.35 now supports the Rcpp::Rcout output stream device. Based on a contributed Rcpp patch by Jelmer Ypma, the Rcpp::Rcout output stream gets redirected to R's buffered output. In other words, R's own output and that e...

The Winter 2012 edition of Sybase's Financial Services Viewpoint newsletter includes two articles related to R. The "Industry Insight" article "R is Hot" (written by yours truly) is a one-page summary of the R phenomenon: what it is, how it's used, and how it's revolutionizing data analysis. The "Industry Insight" article by SAP's Melinda Wilson, "R swings the pendulum...

Now that Rcpp 0.9.10 is released and on CRAN, other packages can take advantage of a small change needed to make use of the quasi-output stream Rcpp::Rcout. So the new release 0.2.35 of RcppArmadillo does just that---and input/output from Armadillo...

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers) Mean spectrum calculation is important: To center a matrix of spectra, we subtract the mean spectrum from every spectrum in the matrix. There are also many options to use the mean spectrum, like average subsamples. Let's calculate and plot the mean spectra for the Yarn NIR Data:...
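The centering step can be sketched in a few lines of base R. The matrix below is random stand-in data, not the Yarn NIR set used in the post, and the variable names are made up for illustration.

```r
set.seed(7)
# Hypothetical stand-in for a spectra matrix: 6 samples (rows) x 50 wavelengths (columns)
spectra <- matrix(runif(300), nrow = 6)

# Mean spectrum: the average over samples at each wavelength
mean_spectrum <- colMeans(spectra)

# Centering: subtract the mean spectrum from every row
centered <- sweep(spectra, 2, mean_spectrum, "-")

# The same result via scale() without the variance scaling
centered_alt <- scale(spectra, center = TRUE, scale = FALSE)
```

After centering, every column of `centered` averages to zero. A call such as `plot(mean_spectrum, type = "l")` would draw the mean spectrum against the wavelength index.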

I’ve largely avoided “time” in R to date, but following a chat with @mhawksey at #dev8d yesterday, I went down a rathole last night exploring a few ways of visualising a Twitter user timeline and as a result also had a quick initial play with some time handling features of R, such as timeseries objects,

A new release 0.9.10 of Rcpp is now on CRAN and in Debian. This is mostly an internal release with a little bit of code reorganization (some of which will be used by a forthcoming RcppArmadillo release), some changes to make R CMD check happy, and one or ...

This post will consider some useful functions for dealing with data frames during data processing and validation. Consider an artificial data set created using the expand.grid function where there are duplicate rows in the data frame. > des = expand.grid(A = c(2,2,3,4), B = c(1,3,5,5,7)) > des A B 1 2 1 2 2 1
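The excerpt ends before naming the functions it goes on to use, so the following is only a guess at the kind of check involved. It rebuilds the same `des` data frame and uses base R's `duplicated()` and `unique()` to find and drop the repeated rows.

```r
# Same artificial design as in the excerpt: repeated levels in A and B
# produce duplicate rows.
des <- expand.grid(A = c(2, 2, 3, 4), B = c(1, 3, 5, 5, 7))
n_total <- nrow(des)            # 4 * 5 = 20 rows

# Logical flag: TRUE for every row that repeats an earlier row
dup <- duplicated(des)
n_dup <- sum(dup)               # 8 repeated rows

# Keep one copy of each distinct row (same as des[!dup, ])
des_unique <- unique(des)
n_unique <- nrow(des_unique)    # 3 distinct A values * 4 distinct B values = 12
```

Note that `duplicated()` marks only the second and later copies, so the first occurrence of each row is always kept.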

Many ecologists are R users, but we vary in our understanding of the math and statistical theory behind models we use. There is no clear consensus on what should be the basic mathematical training of ecologists. To learn what the community thinks, we ...