The announcement below just went to the R-SIG-Finance list. More information is, as usual, at the R / Finance page: Now open for registrations: R / Finance 2013: Applied Finance with R May 17 and 18, 2013 Chicago, IL, USA The registration for R/Fin...

I wanted to play a bit with Julia, the new language for technical computing, but no binaries were available yet for current versions of Ubuntu. So I decided to try and build them myself by backporting the julia 0.1.2 source package available in Sid and Raring. On Quantal, the packages built out of the box. ...

UPDATE: The day was great! There are many people doing really amazing things with open data and it was amazing to meet them. Here are my slides from the panel talk. Next Saturday, I’ll be sitting on a panel discussing future avenues for open data at ODX13. From the odx13 site: Odx13 is a mini-conference

An announcement for an Icelandic meeting next September, a meeting I would have loved to attend (darn!)… This meeting is sponsored by the BayesComp section, of course!!! We are pleased to announce that the University of Iceland will host the 3rd Workshop on Bayesian Inference for Latent Gaussian Models with Applications (LGM). The workshop will be

So, as discussed in this post, I will be investigating the different members of the 'apply function family' in R. This post starts with the most basic one, called apply(). The R manual states the following: apply(X, MARGIN, FUN, ...), with the following arguments: X, an array, including a matrix; MARGIN, a vector giving the subscripts which

I have been playing with R for a little over a year now. Not very intensively, but once in a while I start up RStudio and do some coding and analysis. But I am still far, far away from becoming an R-Pro. If you talk to or read some of the posts of the more seaso...

Business dashboards are available in many shapes and sizes. They are useful for creating an overview of the key performance indicators (KPIs) important for the business strategy and/or operations. There are many flavours of dashboard frameworks and apps available, ranging in price from thousands of dollars to open-source implementations. Apparently ...

I’m working on a one-hour ggplot2 lecture for the San Diego R users group, which I will post here when I’m done. I think there are many great intro to R data visualization resources out there so I’ll only share working examples on my blog. A retail chain client employs a few hundred field agents who perform

JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical and categorical fields.

Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale-invariant distributions. Heuristically, scale (or unit) invariance means that the density of the measure (or probability function) should be proportional to...
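The unit-invariance claim above can be checked numerically. The following is only a minimal sketch in Python, not the post's own code: it assumes a log-uniform sample spanning exactly six decades (density proportional to 1/x), and the conversion factor 2.54 and the helper names `first_digit`, `digit_frequencies` and `benford` are illustrative choices of mine. It compares the first-digit frequencies before and after a change of units with the Benford probabilities log10(1 + 1/d).

```python
import math
import random


def first_digit(x):
    """Return the leading non-zero decimal digit of a positive number."""
    # Scientific notation always starts with the leading digit, e.g. "9.87...e+02".
    return int(f"{x:.16e}"[0])


def digit_frequencies(values):
    """Relative frequency of each first digit 1..9 in a list of positive numbers."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    n = len(values)
    return [counts[d] / n for d in range(1, 10)]


def benford(d):
    """Benford probability that the first digit equals d."""
    return math.log10(1 + 1 / d)


random.seed(0)
# Assumption: log-uniform sample over six full decades (density proportional to 1/x).
sample = [10 ** random.uniform(0, 6) for _ in range(100_000)]
# Hypothetical change of units, e.g. inches to centimetres.
rescaled = [2.54 * v for v in sample]

freq = digit_frequencies(sample)
freq_rescaled = digit_frequencies(rescaled)

print("digit  benford  original  rescaled")
for d in range(1, 10):
    print(f"{d:5d}  {benford(d):.4f}   {freq[d - 1]:.4f}    {freq_rescaled[d - 1]:.4f}")
```

Both frequency columns should stay close to the Benford column, up to sampling noise. A sample that is not log-uniform (for instance, uniform on an interval) would generally not show this invariance.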

by Joseph Rickert When talking with data scientists and analysts — who are working with large scale data analytics platforms such as Hadoop — about the best way to do some sophisticated modeling task it is not uncommon for someone to say, "We have all of the data. Why not just use it all?" This sort of comment often...

R2 is a useful tool for determining how strong the relationship between two variables is. Unfortunately, the definition of R2 for mixed effects models is difficult – do you include the random variable or just the fixed effects? Including just the fixed effects is essentially a standard linear model, while including the random effects could

clickme is an amazing R package. I was not sure what to expect when I first saw Nacho Caballero's announcement. I actually was both skeptical and intimidated, but neither reaction was justified. The examples prove its power, and his wiki tutorials ease...

As you may (or may not) be aware, R 3.0.0 is scheduled to be released on April 3rd. Since this is a major release and there may be some growing pains (but I hope not) in the move to 3.0.0, here is some information about how I will handle R 3.0.0 on CR...

This morning, Mathieu had a nice experience in his course on computational method in actuarial science. But let us start with some mathematical formal definitions. First, recall that is – somehow – a standard expression. No one should be surprised to see such an expression. Generally (as explained in http://en.wikipedia.org/… ), this function is defined only when . The...

The presentation below by Carlos Somohano (founder of Data Science London) provides the best description of a Data Scientist that I've seen in some time. Highlights include: on Slide 14, a history of Data Science; on Slide 22, the essential skills of data scientists (and a platypus); on Slide 26, 10 things data scientists do; on Slide 27,...

(Note: this was initially posted on my other blog at Glacial Till, but there were some good bits of information that I wanted to share with the Paleoposse.) Last week I attended my first science conference: The Lunar and Planetary Science Conference in Houston, TX. If you followed me on Twitter, then (for better or for worse)

You want your analytics to just work… with extremely large data sets as nimbly as small ones. You don’t want to have to think about parallelism, data formatting, and memory management. Paradigm4 presents a webinar about SciDB-R, an R package that lets you remain an R programmer, but expands R’s power with SciDB, the massively

SciDB-R, a package for R that lets R programmers perform massive-scale data-management and analytical tasks from inside R programs, is now available. You can download the package from GitHub here. It is also available on The Comprehensive R Archive Net...

by Thomas Dinsmore This is the third in a series of posts highlighting new features in Revolution R Enterprise Release 6.2, which is scheduled for General Availability April 22. This week's post features our new Stepwise Regression capability. The Stepwise process starts with a specified model and then sequentially adds into or removes from the model the variable that...

Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; …