March 2013

Open Data Exchange 2013, April 6. Montreal

March 29, 2013 | Corey Chivers

UPDATE: The day was great! There are many people doing really amazing things with open data and it was amazing to meet them. Here are my slides from the panel talk. Next Saturday, I’ll be sitting on a panel discussing future avenues for open data at ODX13. From the ... [Read more...]

latent Gaussian model workshop in Reykjavik

March 28, 2013 | xi'an

An announcement for an Icelandic meeting next September, a meeting I would have loved to attend (darn!)… This meeting is sponsored by the BayesComp session, of course!!! We are pleased to announce that the University of Iceland will host the 3rd Workshop on Bayesian Inference for Latent Gaussian Models with Applications (... [Read more...]

The apply function in R

March 28, 2013 | Geoffrey

As discussed in this post, I will be investigating the different members of the 'apply function family' in R. This post starts with the most basic one, apply(). The R manual gives its signature as apply(X, MARGIN, FUN, ...), with the following arguments: X, an array, including a matrix. ... [Read more...]
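To make the signature quoted in this excerpt concrete, here is a minimal sketch of apply() on a small matrix. The matrix `m` and the variable names are made up for illustration and do not come from the original post.

```r
# A small 2 x 3 matrix, filled column by column (hypothetical example data)
m <- matrix(1:6, nrow = 2)

# MARGIN = 1 applies FUN to each row, MARGIN = 2 to each column
row_sums  <- apply(m, 1, sum)
col_means <- apply(m, 2, mean)

# Arguments after FUN are passed on to FUN through `...`
row_sums_na <- apply(m, 1, sum, na.rm = TRUE)

print(row_sums)
print(col_means)
```

For plain sums and means, rowSums() and colMeans() are the usual shortcuts; apply() is more useful when FUN is something those helpers do not cover.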

Swimming in a sea of code

March 28, 2013 | Dimiter Toshkov

If you are looking for code here, move on. __ In the beginning, there was only the relentless blinking of the cursor. With the maddening regularity of waves splashing on the shore: blink, blink, blink, blink…Beyond the cursor, the white wasteland … Continue reading → [Read more...]

Creating a Business Dashboard in R

March 28, 2013 | Bart

Business dashboards are available in many shapes and sizes. Business dashboards are useful to create an overview of key performance indicators (KPIs) important for the business strategy and/or operations. There are many flavours of dashboard frameworks and apps available, ranging in price from thousands of dollars to open-source implementations. ... [Read more...]

Data visualization with R and ggplot2

March 28, 2013 | Kevin Davenport

I’m working on a one-hour ggplot2 lecture for the San Diego R users group, which I will post here when I’m done. I think there are many great intro to R data visualization resources out there so I’ll only share working examples on my blog. A retail ... [Read more...]

Generalized Pairs Plot: It’s about time!

March 28, 2013 | BioStatMatt

JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical ... [Read more...]

Benford law and lognormal distributions

March 28, 2013 | arthur charpentier

Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale invariant distributions. Heuristically, scale (or unit) invariance ... [Read more...]
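The unit-invariance claim in this excerpt can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's code: the lognormal parameters (`sdlog = 3`), the sample size, and the rescaling factor 2.54 are arbitrary choices, and `first_digit()` and `digit_freq()` are hypothetical helper names.

```r
set.seed(1)

# Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1, ..., 9
benford <- log10(1 + 1 / (1:9))

# First significant digit of strictly positive numbers
first_digit <- function(x) {
  d <- floor(x / 10^floor(log10(x)))
  pmin(pmax(d, 1), 9)  # guard against floating-point edge cases
}

# Relative frequency of each first digit, 1 through 9
digit_freq <- function(x) {
  as.numeric(table(factor(first_digit(x), levels = 1:9))) / length(x)
}

# Assumed sample: a lognormal with a wide spread (sdlog = 3 is arbitrary)
x <- rlnorm(1e5, meanlog = 0, sdlog = 3)

freq_original <- digit_freq(x)
freq_rescaled <- digit_freq(x * 2.54)  # a change of units, e.g. inches to centimetres

print(round(rbind(benford, freq_original, freq_rescaled), 3))
```

With a spread this wide, both rows of empirical frequencies should sit close to the Benford probabilities; a much smaller `sdlog` would be expected to break the agreement.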

Lots of data != "Big Data"

March 28, 2013 | Joseph Rickert

by Joseph Rickert When talking with data scientists and analysts — who are working with large scale data analytics platforms such as Hadoop — about the best way to do some sophisticated modeling task it is not uncommon for someone to say, "We have all of the data. Why not just use ... [Read more...]

Rencontres R, Lyon 27-28 June

March 28, 2013 | Martyn

Last year, the first French-speaking R conference, “Rencontres R”, was held in Bordeaux. The meeting was a great success, and a second one will be held in Lyon on 27 and 28 June 2013. The abstract submission deadline of 7 … Continue reading → [Read more...]

Mixed model R2 (UPDATED)

March 28, 2013 | aghaynes

R2 is a useful tool for determining how strong the relationship between two variables is. Unfortunately, the definition of R2 for mixed effects models is difficult – do you include the random variable or just the fixed effects? Including just the fixed effects is essentially a standard linear model, while including ... [Read more...]

Moving to R 3.0.0 on Ubuntu

March 27, 2013 | The Ubuntu R Blog

As you may (or may not) be aware, R 3.0.0 is scheduled to be released on April 3rd. Since this is a major release and there may be some growing pains (but I hope not) in the move to 3.0.0, here is some information about how I will handle R 3.0.0 on CR... [Read more...]

Rationality, and MS Excel (and other calculators)

March 27, 2013 | arthur charpentier

This morning, Mathieu had a nice experience in his course on computational methods in actuarial science. But let us start with some formal mathematical definitions. First, recall that is – somehow – a standard expression. No one should be surprised to see such an expression. Generally (as explained in http://en.wikipedia.... [Read more...]

What does a data scientist do?

March 27, 2013 | David Smith

The presentation below by Carlos Somohano (founder of Data Science London) provides the best description of a Data Scientist that I've seen in some time. Highlights include: on Slide 14, a history of Data Science; on Slide 22, the essential skills of data scientists (and a platypus); on Slide 26, 10 things data ... [Read more...]

Build a search engine in 20 minutes or less

March 27, 2013 | Ben Ogorek

…or your money back.
author = "Ben Ogorek"
Twitter = "@baogorek"
email = paste0(sub("@", "", Twitter), "@gmail.com")

Setup

Pretend this is Big Data:

doc1 <- "Stray cats are running all over the place. I see 10 a day!"
doc2 <- "Cats are killers. They kill billions of animals a year."
doc3 <- "The best food in Columbus, OH is   the North Market."
doc4 <- "Brand A is the best tasting cat food around. Your cat will love it."
doc5 <- "Buy Brand C cat food for your cat. Brand C makes healthy and happy cats."
doc6 <- "The Arnold Classic came to town this weekend. It reminds us to be healthy."
doc7 <- "I have nothing to say. In summary, I have told you nothing."

and this is the Big File System:

doc.list <- list(doc1, doc2, doc3, doc4, doc5, doc6, doc7)
N.docs <- length(doc.list)
names(doc.list) <- paste0("doc", c(1:N.docs))

You have an information need that is expressed via the following text query:

query <- "Healthy cat food"
How will you meet your information need amidst all this unstructured text? Jokes aside, we're going ... [Read more...]
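The excerpt stops before the author's approach is described, so the following is only one possible way to rank the seven documents against the query, sketched in base R. The TF-IDF weighting with cosine similarity, the crude tokenizer (no stemming, so "cats" and "cat" are different terms), and the helper names `tokenize()`, `tf()`, and `cosine()` are all assumptions made for illustration and are not taken from the post. The documents and query are copied from the excerpt.

```r
# Documents and query copied from the excerpt
docs <- c(
  doc1 = "Stray cats are running all over the place. I see 10 a day!",
  doc2 = "Cats are killers. They kill billions of animals a year.",
  doc3 = "The best food in Columbus, OH is   the North Market.",
  doc4 = "Brand A is the best tasting cat food around. Your cat will love it.",
  doc5 = "Buy Brand C cat food for your cat. Brand C makes healthy and happy cats.",
  doc6 = "The Arnold Classic came to town this weekend. It reminds us to be healthy.",
  doc7 = "I have nothing to say. In summary, I have told you nothing."
)
query <- "Healthy cat food"

# Lowercase, replace anything that is not a letter or space, split on spaces
tokenize <- function(text) {
  words <- strsplit(gsub("[^a-z ]", " ", tolower(text)), " +")[[1]]
  words[nzchar(words)]
}

doc_tokens <- lapply(docs, tokenize)
vocab <- sort(unique(unlist(doc_tokens)))

# Term-frequency vector over the shared vocabulary
tf <- function(tokens) as.numeric(table(factor(tokens, levels = vocab)))

tf_docs <- sapply(doc_tokens, tf)  # rows: terms, columns: documents
rownames(tf_docs) <- vocab

# Inverse document frequency: terms found in fewer documents weigh more
idf <- log(length(docs) / rowSums(tf_docs > 0))

doc_vecs  <- tf_docs * idf               # idf is recycled down each column
query_vec <- tf(tokenize(query)) * idf

# Cosine similarity between two weighted term vectors
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

scores  <- apply(doc_vecs, 2, cosine, b = query_vec)
ranking <- sort(scores, decreasing = TRUE)
print(round(ranking, 3))
```

Documents that share no term with the query score exactly zero. Adding stemming or stop-word removal would change the ranking, which is one reason to treat this as a sketch.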