painful truncnorm

April 8, 2013

As I wanted to simulate truncated normals in a hurry, I coded the inverse cdf approach instead of using my own accept-reject algorithm. Poor shortcut, as the method fails when a and b are too far from μ. So I introduced a control (and ended up wasting more time than if I had used my...
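As a rough illustration of the inverse cdf approach mentioned in this excerpt (a sketch, not the author's code): to draw from a normal with mean μ and standard deviation σ restricted to [a, b], map a uniform draw into the interval between the cdf values at a and b, then apply the normal quantile function. The function name rtnorm_inv and its arguments are hypothetical; only base R's pnorm, qnorm and runif are used.

```r
# Hypothetical helper: inverse-cdf sampler for N(mu, sigma^2) truncated to [a, b].
rtnorm_inv <- function(n, mu = 0, sigma = 1, a = -Inf, b = Inf) {
  pa <- pnorm(a, mean = mu, sd = sigma)  # cdf at the lower bound
  pb <- pnorm(b, mean = mu, sd = sigma)  # cdf at the upper bound
  u  <- runif(n)                         # uniform draws on (0, 1)
  # Rescale u into (pa, pb), then invert the normal cdf.
  qnorm(pa + u * (pb - pa), mean = mu, sd = sigma)
}

set.seed(1)
x <- rtnorm_inv(1000, a = -1, b = 2)
range(x)  # stays inside [-1, 2]

# Far from mu the method breaks down: pnorm(10) and pnorm(11) both
# round to 1 in double precision, so every draw comes back as Inf.
rtnorm_inv(3, a = 10, b = 11)
```

The second call shows the kind of failure the excerpt describes when a and b are too far from μ. The control the author added is not shown in the excerpt; one plausible guard would be to check whether pb - pa is numerically zero and fall back to an accept-reject sampler in that case.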

Read more »

Instructions for Installing & Using R on Amazon EC2

April 8, 2013

If you’re an R user, you’ve surely heard all the hype around ‘big data’ and how R is commonly used to analyze these volumes of data. One thing that’s often missing from the discussion is HOW to work around issues using big data and R, specifically how to deal with the fact that R stores...

Read more »

Use foursquare to locate a twitter user using R

April 8, 2013

I've been doing some work with Twitter data. In much of this work, my life would be so much easier if we could geographically locate the origin of the tweets. There are some ways to do this using the twitter APIs. For example, if a user has geo-locatio...

Read more »

Visualize large data sets with the bigvis package

April 8, 2013

Creating visualizations of large data sets is a tough problem: with a limited number of pixels available on the screen (or just with the limited visual acuity of the human eye), massive numbers of symbols on the page can easily result in an uninterpretable mess. On Friday we shared one way of tackling the problem using Revolution R Enterprise:...

Read more »

Halo Effects vs. Intention-Laden Ratings: Separating Baby and Bathwater

April 8, 2013

Are halo effects real or illusory? Much has been written arguing that rating scales contain extensive amounts of measurement bias. Some tell us to avoid ratings altogether (What do customers really want?). Others warn against the use of rating scales without major adjustments (e.g., overcoming scale usage heterogeneity with the R package bayesm). Obviously, by including the...

Read more »

More variables, spinoff projects, and RuPaul’s Drag Race season 5 predictions: episode 10

April 8, 2013

Last week, Alyssa got the boot and Jinkx kept her place. And I totally called it with my first model that accounted for the proportional hazards assumption. I think the model is having a little more success as the season plods on. Before I get to the predictions for episode 10, there are two really interesting…

Read more »

Spring Cleaning Data: 1 of 6 - Downloading the Data & Opening Excel Files

April 8, 2013

With spring in the air, I thought it would be fun to do a series on (spring) cleaning data. The posts will follow my efforts to download the data, import it into R, clean it up, merge the different files, add columns of information created, and then ...

Read more »

Starting Analysis and Visualisation of Spatial Data with R

April 8, 2013

Last week I ran an introductory workshop on the analysi

Read more »

Dynamic Wrapping and Recursion with Rcpp

April 8, 2013

We can leverage small parts of R’s C API in order to infer the type of objects directly at the run-time of a function call, and use this information to dynamically wrap objects as needed. We’ll also present an example of recursing through a list. To get a basic familiarity with the main functions exported from the R API, I...

Read more »

Next Kölner R User Meeting: 12 April 2013

April 8, 2013

Quick reminder: The next Cologne R user group meeting is scheduled for this Friday, 12 April 2013. We will discuss cluster analysis and shiny. Further details and the agenda are available on our KölnRUG Meetup site. Please sign up if you would like to come along. Notes from the last Cologne R user group meeting are...

Read more »

analyze the pesquisa nacional por amostra de domicilios (pnad) with r

April 7, 2013

think of the pesquisa nacional por amostra de domicilios (pnad) as the brazilian census for off-years - the ones that don't end in zero.  the principal household survey for the nation of brazil, pnad measures general education, labor, income, and ...

Read more »

Dirichlet Process, Infinite Mixture Models, and Clustering

April 7, 2013

The Dirichlet process provides a very interesting approach to understanding group assignments and models for clustering effects. Oftentimes we encounter the k-means approach. However, it requires a fixed number of clusters, and we often encounter situations where we don’t know how many clusters we need. Suppose we’re trying to identify...

Read more »

A quick guide to non-transitive Grime Dice

April 7, 2013

A very special package that I am rather excited about arrived in the mail recently. The package contained a set of 6-sided dice. These dice, however, don’t have the standard numbers one to six on their faces. Instead, they have assorted numbers between zero and nine. Here’s the exact configuration: Aside from maybe making for

Read more »

Venue Recommendation – A Simple Use Case Connecting R and Neo4j

April 7, 2013

Last month I attended the CeBIT trade fair in Hannover. Besides the so-called “shareconomy”, there was another main topic across all exhibition halls: Big Data. This subject is not completely new, and I think that a lot of you also have experience with some of the tools associated with Big Data. But due to the great...

Read more »

Mastering Matrices

April 7, 2013

R has many ways to store information.  Most of the time, our data comes in the form of a dataset, which we bring into R as a data.frame object. However, there are times when we want to use matrices as well. This post will show you how matrices can...

Read more »

Sync

April 7, 2013

I am listening to the audiobook Sync: How Order Emerges from Chaos in the Universe, Nature, and Daily Life by Steven Strogatz, which I got from Audible. Obviously a mathematical book is not ideal to listen to, but lacking illustrations I can ma...

Read more »

Travis CI for R?

April 7, 2013

I'm always worried about CRAN: a system maintained by FTP and emails from real humans (basically one of Uwe, Kurt or Prof Ripley). I'm worried for two reasons: the number of R packages is growing exponentially; time and time again I see frustrations ...

Read more »

Guide to accessing MS SQL Server and MySQL server on Mac OS X

April 6, 2013

Native GUI client access to MS-SQL and MySQL: we can use Oracle SQL Developer with the jTDS driver to access Microsoft SQL Server. Note: jTDS version 1.3.0 did not work for me; I had to use version 1.2.6. Detailed instructions can be found here. We can use MySQL Workbench to access MySQL server. Setup is...

Read more »

Mortality after paediatric heart surgery using public domain data

April 6, 2013

This post comes with some big health warnings. The recent events in Leeds highlight the difficulties faced in judging the results of surgery by individual hospital. A clear requirement is timely access to data in a form easily digestible by the public. Here I’ve scraped the publicly available data from the central cardiac audit database...

Read more »

Retirement : simulating wealth with random returns, inflation and withdrawals – Shiny web application

April 6, 2013

Today, I want to share the Retirement : simulating wealth with random returns, inflation and withdrawals – Shiny web application (code at GitHub). This application was developed and contributed by Pierre Chretien; I only made minor updates. This application is a great example of how easy it is to convert your R script into...

Read more »

Worry about correctness and repeatability, not p-values

April 5, 2013

In data science work you often run into cryptic sentences like the following: Age adjusted death rates per 10,000 person years across incremental thirds of muscular strength were 38.9, 25.9, and 26.6 for all causes; 12.1, 7.6, and 6.6 for cardiovascular disease; and 6.1, 4.9, and 4.2 for cancer (all P < 0.01 for linear...

Read more »

Reconstructing Principal Component Analysis Matrix

April 5, 2013

PCA is a widely used method for finding patterns in high-dimensional data. Whether you use it to compress a large matrix or to remove one of the principal components in biological datasets, you’ll end up with the task of performing a series of …
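A minimal sketch of what reconstructing a data matrix from a PCA might look like, assuming base R's prcomp and a made-up random matrix. The post's own procedure is not shown in this excerpt and may differ; the helper name reconstruct is hypothetical.

```r
set.seed(42)
X <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)  # made-up data for illustration

pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Hypothetical helper: rebuild the matrix from a chosen subset of
# components, then add the column means back to undo the centering.
reconstruct <- function(pca, comps) {
  scores   <- pca$x[, comps, drop = FALSE]
  loadings <- pca$rotation[, comps, drop = FALSE]
  sweep(scores %*% t(loadings), 2, pca$center, "+")
}

X_full   <- reconstruct(pca, 1:4)  # all components: recovers X up to rounding
X_rank2  <- reconstruct(pca, 1:2)  # compression: keep the first two components
X_no_pc1 <- reconstruct(pca, -1)   # removal: drop the first component only

# Sanity check that the full reconstruction matches the original data.
isTRUE(all.equal(X_full, X, check.attributes = FALSE))
```

Keeping a subset of components gives a lower-rank approximation of the data, while a negative index drops a single component, which corresponds to the two uses named in the excerpt.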

Read more »

Organise your data

April 5, 2013

Use R to specify factors, recode variables and begin by-group analyses. The data file (lap_hernia) contains pain scores after laparoscopic vs. open hernia repair. Age, gender and primary/recurrent hernia are also included. The ultimate aim here is to work out which of these factors are associated with more pain after this operation.

Read more »

Properly “internationalized” regular expressions in R

April 5, 2013

We should pay special attention to writing truly portable code that works in the same fashion under different locales and character encodings. Currently, R has two regex engines, ERE (via TRE) and PRE (via PCRE). Surprisingly, they…

Read more »

Security in R: RAppArmor package & paper updates

April 5, 2013

This week version 0.8.3 of RAppArmor appeared on CRAN. RAppArmor is a package to dynamically enforce security policies and hardware restrictions in R on Linux systems. It currently supports Ubuntu 12.04+, Debian 7 and OpenSuse 12.1+. The readme page has more info, and helpful video tutorials to get you started. One important change in the ...

Read more »

Multiple pairwise comparisons for categorical predictors

April 5, 2013

Dale Barr (@datacmdr) recently had a nice blog post about coding categorical predictors, which reminded me to share my thoughts about multiple pairwise comparisons for categorical predictors in growth curve analysis. As Dale pointed out in his post, the R default is to treat the reference level of a factor as a...

Read more »

Interview by DecisionStats

April 5, 2013

Ajay Ohri interviewed me on his popular DecisionStats blog. Topics discussed ranged widely from Fellows Statistics, to Deducer, to statnet, to Poker A.I., to Big Data.    

Read more »

Extending RevoScaleR for Mining Big Data – Hexbins

April 5, 2013

by Derek McCrae Norton, Senior Sales Engineer. It is my job to help potential clients see that the tasks they are used to completing can be completed on big data in Revolution R Enterprise (and that it is easy). Honestly, this is my dream job, and in my eyes it is sort of like playing and getting paid for...

Read more »

Import/Export data to and from xlsx files

April 5, 2013

As I've already written, getting data into R from your precious xlsx files is really handy. No need to clutter up your computer with txt or csv files. In the previous post I wrote about the gdata package for importing data from xlsx files and was pointed to, among others, the xlsx package. xlsx seems to...

Read more »
