And Now I Blog Again

August 4, 2012

One of my goals for 2012 has been to blog more. Much more. When I first set this goal, I had great aspirations of posting frequently. However, I had a Ph.D. to complete, and quite frankly, it demanded much higher priority. Now that I have submitted my ...

Getting Started Using R, Part 1: RStudio

August 4, 2012

Despite my preference for SAS over R, there are some add-ons to “basic” R that I’ve found that have made my learning process way easier. While I’m still in my infancy in learning R, I feel like once I found …

Getting Started Using R, Part 1: RStudio is an article from randyzwitch.com,...

Discriminating Between Iris Species

August 4, 2012

The Iris data set is famous for its use in comparing unsupervised classifiers. The goal is to use information about flower characteristics to accurately classify the 3 species of Iris. We can look at scatter plots of the 4 variables in the data set and see that no single variable nor bivariate combination can achieve this. One approach to improve the separation

Feature Comparison of Sweave (R+LaTeX) Tools: TeXmaker vs RStudio

August 4, 2012

link to the document

Transformation of axes in R

August 4, 2012

As a general rule, you should not transform your data to try to fit a linear model. But proportions can be tricky. If the proportion data do not arise from a binomial process (e.g., proportion of a leaf consumed by a caterpillar), then transformation is still the best option. In an excellent paper, David Warton*

Surveys continue to rank R #1 for Data Mining

August 3, 2012

KDnuggets recently posted its annual poll on data mining software, and the R language retains its #1 ranking as the most commonly-used software for data mining: R is now used by 52.5% of poll respondents, compared with 45% last year. Donnie Berkholz provides an analysis of the year-on-year trends for Redmonk. He provides the chart below, and notes "the...

Horizon Plots in Base Graphics

August 3, 2012

For background, please see the prior posts More on Horizon Charts, Application of Horizon Plots, Horizon Plot Already Available, and Cubism Horizon Charts in R. There are three primary graphics routes in R (base graphics, lattice, and ggplot2), and each have...

2012 Olympics Swimming – 100m Butterfly Men Finals prediction

August 3, 2012

Author: Matt Malin

Inspired by mages’ blog with predictions for 100m running times, I’ve decided to perform some basic modelling (loess and linear modelling) on previous Olympic results for the 100m Butterfly Men’s medal winning results.

Code setup

    library(XML)
    library(ggplot2)
    swimming_path <- "http://www.databasesports.com/olympics/sport/sportevent.htm?sp=SWI&enum=200"
    swimming_data <- readHTMLTable( readLines(swimming_path), which = 3, stringsAsFactors...

R training: Visualization, Big Data, Data Mining, and Marketing Analytics

August 2, 2012

Revolution Analytics is hosting several live and online courses over the next couple of months that will be of interest to R users looking to hone their skills: Visualization in R with ggplot2. Garrett Grolemund and Winston Chang teach how to use the ggplot2 package to make, format, label and adjust graphs using R. (August 28, Redwood City, CA.)...

plotting raster data in R: adjusting the labels and colors of a classified raster

August 2, 2012

Thanks to Andrej, who wrote this comment: “Is it possible to color the resulting 12 clusters within your original image to get a feel for visual separation?” You can do so. But how do you get values at a location? You will need these values to determine whether the defined class is representing a water

Who wants to maintain pgfSweave?

August 2, 2012

So the time has come for me to face the fact that I have no time to maintain pgfSweave. It was recently archived because I didn’t make necessary changes to comply with some CRAN policies. So, I need someone to step up to the plate to make some tweaks, put it back up on CRAN

Spacing of multi-panel figures in R

August 2, 2012

In a previous post, I showed how to keep text and symbols at the same size across figures that have different numbers of panels. The figures in that post were ugly because they used the default panel spacing associated with the mfrow argument of the par( ) function. Below I will walk through how to

How do you say “We Will Do Whatever It Takes” in Thai?

August 2, 2012

As the market has already started to poke holes in Draghi’s promise, I thought it would be good to continue the series of posts that I began with the British version “We Will Do Whatever it Takes” with my favorite article written during the Asia ...

Data Parallelism Using Oracle R Enterprise

August 2, 2012

Modern computer processors are adequately optimized for many statistical calculations, but large data operations may require hours or days to return a result.  Oracle R Enterprise (ORE), a set of R packages designed to process large data computations in Oracle Database, can run many R operations in parallel, significantly reducing processing time. ORE supports parallelism through the transparency layer,...

Multivariate Data Analysis Work Flow

August 2, 2012

Here is an example of a data analysis work flow supported in imDEV. This network visualization was made using CmapTools.

Units and metadata

August 2, 2012

Handling metadata is not natural in R, or in any traditional rectangular data storage system. There are several tricks and packages which attempt to solve this problem, with Hmisc using the attribute feature and the IRanges package having its o...

CFP: AusDM 2012, deadline extended to 31 August 2012

August 2, 2012

The Tenth Australasian Data Mining Conference (AusDM 2012)
Sydney, Australia, 5-7 December 2012
http://ausdm12.togaware.com/
Deadline extended to 31 August 2012

The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data …

unsupervised classification of a Landsat image in R: the whole story or part two

August 1, 2012

The main question when using remotely sensed raster data, as we do, is how to treat NaN values. Many R functions accept an option like na.rm=TRUE to handle these missing values. In our case, the kmeans function in R cannot use such a parameter. After reading the tif-files and creating

More on Horizon Charts

August 1, 2012

For background, please see the prior posts Application of Horizon Plots, Horizon Plot Already Available, and Cubism Horizon Charts in R. Some feedback has led me to think that I might have been a little ambitious with my last post on horizon charts. I though...

Analytics for Marketing online training 25 – 28 September 2012

August 1, 2012

I am excited to be giving the Analytics for Marketing online training course on 25-28 September 2012. Sign up before 25 August 2012 for the early bird discount. Our friends at Revolution Analytics will provide the infrastructure to host the event. Update: For clarification, this is an online, instructor-led training course. We are...

Genetic algorithms: a simple R example

August 1, 2012

A genetic algorithm (GA) is a search heuristic. GAs can generate a vast number of possible model solutions and use these to evolve towards an approximation of the best solution of the model. In this way it mimics evolution in nature. A GA generates a population; the individuals in this population (often called chromosomes) have a given state. Once the population is generated, the state of these individuals is evaluated...

Bio7 1.6 for Windows and Linux released!

August 1, 2012

01.08.2012 Finally I released a new version of Bio7 with many improvements and new features. Updated tutorials are available, too. The new Bio7 1.6 release can be downloaded here. Please also download the examples *.zip file from the SourceForge website, which contains new examples for Bio7 1.6 (e.g. an example to cluster an image folder with

Hadley Wickham’s ggplot2 basics

August 1, 2012

If you haven't taken the plunge yet into making R graphics with Hadley Wickham's ggplot2 package, his "ggplot2 basics" slides (from the recent Introduction to Data Visualization and Analysis course at JSM) are a good place to start. Once you get the hang of the "grammar of graphics" notation, you'll be building beautiful data visualizations like this or this...

Creating a text grob that automatically adjusts to viewport size

August 1, 2012

I recently wanted to construct a dashboard widget that contains some text and other elements using the grid graphics system. The size available for the widget will vary. When the sizes for the elements of the grobs in the widget are specified in Normalised Parent Coordinates, the size adjustments happen automatically. Text does not automatically adjust, though. The

Olympic body match and 1:1 BMI

August 1, 2012

In my morning attempt to read the whole internet before beginning work, I came across a program on the BBC website which allows you to see which Olympic athletes are your body doubles. Or rather, which athletes share your height and weight, and therefore your body mass index. Being a Canadian, I exist in an

Building a presentation, report or paper in R

August 1, 2012

If you need to build a presentation, you obviously have the following options: PowerPoint-like presentations, online engines, and LaTeX. The first two are beloved by business people and the third one is widely used in academia. The objective of the first group is a shiny presentation, contrary to the second where asceticism and demand for automation are
