Making Data Visually Appealing

December 16, 2012

I’ve recently been considering the graphical presentation of data. I get the feeling that we, ecologists/scientists, could be better at data presentation. Graphs must be informative, but they don’t have to be ugly. I think that making visually appealing charts …

Building R packages: missing path to pdflatex

December 15, 2012

Recently, while trying to build an R package for generalized estimating equation model selection (QICpack on GitHub), I was getting an error related to LaTeX creating the PDF package manuals. It seems like this is a relatively common problem on …

Data Science, Data Analysis, R and Python

The October 2012 issue of Harvard Business Review prominently features the words “Getting Control of Big Data” on the cover, and the magazine includes these three related articles: “Big Data: The Management Revolution,” by Andrew McAfee and Erik Brynjolfsson, pages 61–68; “Data Scientist: The Sexiest Job of the 21st Century,” by Thomas H. Davenport and D.J. Patil, pages...

Text analysis made too easy with the tm package

December 15, 2012

Today’s Gist takes the CNN transcript of the Denver Presidential Debate, converts paragraphs into a document-term matrix, and does the absolute most basic form of text analysis: a raw word count. There are actually quite a few steps in this proc...
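As a rough illustration of the two steps named in this excerpt (a document-term matrix, then a raw word count), here is a minimal base R sketch. It is not the Gist's code and deliberately does not use the tm package. The `paragraphs` vector is a hypothetical stand-in for the debate transcript, and the tokenizer (split on anything that is not a letter) is an assumption.

```r
# Hypothetical stand-in for the transcript: one string per paragraph.
paragraphs <- c(
  "The economy needs jobs and the economy needs growth",
  "Jobs come from small business",
  "Growth comes from education and jobs"
)

# Lower-case each paragraph and split it into words on non-letter characters.
tokens <- lapply(tolower(paragraphs), function(p) {
  words <- strsplit(p, "[^a-z]+")[[1]]
  words[nzchar(words)]  # drop empty strings left by the split
})

# Vocabulary: every distinct word, sorted alphabetically.
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per paragraph, one column per word,
# each cell holding how often that word occurs in that paragraph.
dtm <- t(vapply(
  tokens,
  function(w) as.integer(table(factor(w, levels = vocab))),
  integer(length(vocab))
))
colnames(dtm) <- vocab

# Raw word count: sum each column across all paragraphs.
word_counts <- sort(colSums(dtm), decreasing = TRUE)
print(head(word_counts, 5))
```

With a real transcript, the same column sums would usually be taken after removing stop words, a step this sketch skips.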

Le Monde puzzle (#800)

December 14, 2012

Here is the mathematical puzzle of the weekend edition of Le Monde: Consider a sequence where the initial number is between 1 and 10³, and each term in the sequence is derived from the previous term as follows: if the last digit of the previous term is between 6 and 9, multiply it by 9;

Predictive models in R: a new book in Polish

December 14, 2012

Together with Mateusz Zawisza I have just published a new book in Polish on building predictive models in GNU R. It can be bought at Oficyna Wydawnicza SGH. The book presents complete examples of basic data mining processes. Although the book is in Poli...

d3, Shiny, and R Reporting Performance

December 14, 2012

I thought it would be interesting to offer a slightly different example of how we can use d3, R, and RStudio Shiny. This time we will offer a simple example to report portfolio or index performance. Just as a test of my progress, I also threw...

2D MODPATH particle tracking animations with R and ImageMagick

December 14, 2012

The PMPATH particle tracking output, with a file format similar to the pathline output mode of MODPATH (see above), can be transformed easily into a GIF animation using R and ImageMagick (see below for a simple example). First of all, you...

Revolution Newsletter: December 2012

December 14, 2012

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full December edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Tell us what you're looking for in R training. 2013 is the International Year...

R Journal Volume 4/2, December 2012

December 14, 2012

The 'Winter edition' of the R Journal is out! Get it from here.

What is Correctness for Statistical Software?

December 14, 2012

Introduction: A few months ago, Drew Conway and I gave a webcast that tried to teach people about the basic principles behind linear and logistic regression. To illustrate logistic regression, we worked through a series of progressively more complex spam detection problems. The simplest data set we used was the following: This data set has

How I learned to stop worrying and really love lists

December 14, 2012

One of the first weird things to get used to in R is unlearning some of the things that you think you know. As often happens, this reminds me of a quote I once read about Zen, which went about like this (I’m paraphrasing), “When I knew nothing of Zen, mountains were mountains, rivers were

Let it snow!

December 14, 2012

A couple days ago I noticed a fun piece of R code by Allan Roberts, which lets you create a digital snowflake by cutting out virtual triangles. Go give it a try. Roberts inspired me to create a whole night sky of snowflakes. I tried to make the snowfall look as organic as possible. There

Computing for Data Analysis Returns

December 14, 2012

I'm happy to announce that my course Computing for Data Analysis will return to Coursera on January 2nd, 2013. While I had previously announced that the course would be presented again right here, it made more sense to do it …

When R, or any other language, is not enough

December 14, 2012

This post is tangential to R, although R has a fair share of the issues I mention here, which include research reproducibility, open source, paying for software, multiple languages, salt and pepper. There is an increasing interest in the reproducibility …

Sending commands from Notepad++ to a remote R session

December 14, 2012

If you have your working environment set up in a Windows operating system, it can be a bit of a hassle to work with R sessions on remote Linux servers. I use WinSCP + Notepad++ to handle my projects and PuTTY + screen to handle the R sessions. It become...

Everything is a Network, featuring the sna package

December 14, 2012

We’ve gotten some requests, through the Ask us anything page, to do some plotting of networks. We may come back to this later, but today’s Gist shows how you can plot pretty much literally anything as a network. First, we go back to our

R pitfalls #4: redefining the basics

December 13, 2012

I try to be economical when writing code; for example, I tend to use single quotes over double quotes for characters because it saves me one keystroke. One area where I don’t do that is when typing TRUE and FALSE …
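The excerpt is cut off, so the following is not the author's example. It is a minimal sketch of one well-known reason to spell out TRUE and FALSE, consistent with the post's title: in R, `T` and `F` are ordinary variables that can be reassigned, whereas `TRUE` and `FALSE` are reserved words. The variable names used here are hypothetical.

```r
# T and F are ordinary variables that start out equal to TRUE and FALSE.
stopifnot(identical(T, TRUE), identical(F, FALSE))

# Nothing stops a script from reassigning the shorthand.
T <- FALSE
shorthand_result <- isTRUE(T)  # now FALSE

# TRUE is a reserved word, so the equivalent assignment raises an error.
assign_error <- tryCatch(
  eval(parse(text = "TRUE <- FALSE")),
  error = function(e) conditionMessage(e)
)

# Remove the local T so the base definition is visible again.
rm(T)
restored <- identical(T, TRUE)
```

Code that relies on `T` and `F` can therefore change behavior depending on what an earlier line or another script assigned to them.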

Predictive Modeling using R and the OpenScoring-Engine – a PMML approach

December 13, 2012

On November 27th, a special post caught my interest. Scott Mutchler presented a small framework for predictive analytics based on PMML (Predictive Model Markup Language) and a Java-based REST interface. PMML is an XML-based standard for the description and exchange of analytical models. The idea is that every piece of software which supports the corresponding...

Multisite, multivariate genetic analysis: simulation and analysis

December 13, 2012

The email wasn’t a challenge but a simple question: Is it possible to run a multivariate analysis in multiple sites? I was going to answer yes, of course, and leave it there, but it would be a cruel, unsatisfying answer. …

Trading with SVMs: Performance

December 13, 2012

To get a feeling for SVM performance in trading, I ran different setups on the S&P 500 historical data from … the 50s. The main motive behind using this decade was to decide what parameters to vary and what to keep steady prior to running the most important tests. Treat it as an “in-sample” test

Shiny, R, d3 Adaptation of Mike Bostock’s Calendar

December 13, 2012

The idea behind all the posts at http://timelyportfolio.blogspot.com/search/label/shiny was to learn both d3 and Shiny by iterating through multiple experiments. This example adaptation was my quickest yet, at about 30 minutes. Mike Bostock had d...

Mapping GPS Tracks in R

December 13, 2012

This is an explanation of how I used R to combine all my GPS cycling tracks from my Garmin Forerunner 305.

Converting to CSV

You can convert pretty much any GPS data to .csv by using GPSBabel. For importing directly from my Garmin, I used the comman...

Testing Assumption Testing

December 13, 2012

I’ve been doing a lot of linear modeling this year. That’s not much different than any ordinary year, but now I’m doing it in R. I had spent a bit of time in recent years trying to look at loss reserving as a multivariate regression. Excel is happy to do that, but testing various predictor

Political revolutions on Twitter, visualized with R

December 13, 2012

Twitter has become a powerful medium for organizing and communicating with factions during popular uprisings: the crisis in Egypt, the uprising in Syria, the revolution in Iran, and other conflicts all around the world. Twitter's effectiveness relies on the ability of the various factions to self-organize and to fight the information battle in social media. Esteban Moro Egido, a...

"R" : Identifying peaks

December 13, 2012

The "identify" function in R is very useful for checking a spectrum for peaks or areas of interest. I use it here to see the wavelength with the highest variability in the Shootout-2012 Calibration Set. This wavelength has a high variabi...

Community detection algorithm with igraph and R – (2)

December 13, 2012

In the last post I presented a slightly modified LPA algorithm from the igraph wiki. This algorithm needed around 40s for 10,000,000 edges and 1000 unique vertices. I promised that one could do much better. Here you go!

Iterators in igraph

To understand...

Nataniele Argento

December 13, 2012

Effectively, the Italian election campaign is already in full swing, so I tried to start collecting some data to see what we are about to face. If I have time and manage to get some good data, I'll try to replicate the analysis I made for the US election (surely Italy needs its own version of 

Fuzzy clustering with fanny()

December 13, 2012

This is kind of a fun example, and you might find the fuzzy clustering technique useful, as I have, for exploratory data analysis. In this Gist, I use the unparalleled breakfast dataset from the smacof package, derive dissimilarities from breakfast it...
