June 2013

Using Metadata to find Paul Revere

June 9, 2013 | Kieran Healy

London, 1772. I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the new-fangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty's subjects. This is in connection with ... [Read more...]

Why are Birds Dinosaurs?

June 9, 2013 | Patrick

Month after month, one of the most popular posts on the Paleocave blog is the How to Read a Cladogram post I did some time ago. I always intended to follow it up with more cladistic fun. So, hold onto your butts, we’re going to let the dinosaurs loose. ... [Read more...]

Improve The Efficiency in Joining Data with Index

June 9, 2013 | statcompute

When managing big data with R, many people like to use the sqldf() package for its friendly interface or choose the data.table() package for its lightning speed. However, very few pay special attention to small details that might significantly boost the efficiency of these packages by adding index to ... [Read more...]

Mahout for R Users

June 9, 2013 | simonraper

I have a few posts coming up on Apache Mahout so I thought it might be useful to share some notes. I came at it as primarily an R coder with some very rusty Java and C++ somewhere in the back of my head so that will be my point ... [Read more...]

How to read quickly large dataset in R?

June 9, 2013 | G-Tch

Here and there, I have read about many techniques for importing a large dataset into R. The read.table or read.csv option doesn't work anyway because, as discussed here, R loads the data into memory. And sometimes, when we try to load a big dataset, we get this message: Warning messages:  1: Reached total allocation ... [Read more...]

Medal Allocations at the Comrades Marathon

June 9, 2013 | andrew

Following up on my previous post regarding attrition rates at Comrades Marathon 2013, here are the statistics I have gathered for medal allocations. There is some interesting history behind the Comrades Marathon medals. For reference, the medals are allocated as follows: Gold medals to the first ten finishers in the men’... [Read more...]

Quartiles, Deciles, and Percentiles

June 9, 2013 | Al-Ahmadgaid Asaad

The measures of position such as quartiles, deciles, and percentiles are available in the quantile function. This function has a usage, where: x - the data points; prob - the location to measure; na.rm - if FALSE, NA (Not Available) data points are not ignored; na... [Read more...]

Estimating Finite Mixture Models with Flexmix Package

June 9, 2013 | statcompute

In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I showed how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with the FMM and NLMIXED procedures in SAS. Today, I am going to demonstrate how to achieve the same results with the flexmix package ... [Read more...]

Quick and Simple D3 Network Graphs from R

June 8, 2013 | Christopher Gandrud

Sometimes I just want to quickly make a simple D3 JavaScript directed network graph with data in R. Because D3 network graphs can be manipulated in the browser–i.e. nodes can be moved around and highlighted–they're really nice for data exploration. They're also really nice in HTML presentations. ... [Read more...]

Mean and Median

June 8, 2013 | Al-Ahmadgaid Asaad

The mean in R is computed using the function mean. Consider the scores of 20 MSU-IIT students in a Stat 101 exam with a hundred items: 70, 78, 66, 65, 50, 53, 48, 88, 95, 80, 85, 84, 81, 63, 68, 73, 75, 84, 49, and 77. Compute and interpret the mean and medi... [Read more...]

Bulk search for domain names using R

June 8, 2013 | Francis Smart

# There are several domain name servers that allow
# for bulk searching of domain names.
# http://www.godaddy.com/bulk-domain-search.aspx
# http://www.namestation.com/bulk-domain-search
# However, they do not provide any wildcard support
# and instead exp... [Read more...]

Matrix Operations

June 8, 2013 | Al-Ahmadgaid Asaad

Matrix manipulations in R are very useful in linear algebra. Below is a list of common yet important functions for matrix operations: Transpose - t; Multiplication - %*%; Determinant - det; Inverse - solve, or ginv from the MASS library; Eigenvalues and ... [Read more...]

R and MongoDB

June 7, 2013 | statcompute

MongoDB is a document-based NoSQL database. Unlike relational databases, which store data in tables with rigid schemas, MongoDB stores data in documents with dynamic schemas. In the demonstration below, I am going to show how to extract data from MongoDB with R. Before starting the R session, we ... [Read more...]

Hey, I Just did a Significance Test!

June 7, 2013 | Wesley

I’ve seen it happen quite often. The sig test. Somebody simply needs to know the p-value and that one number will provide all of the information about the study that they need to know. The dataset is presented and the client/boss/colleague/etc invariably asks the question “is ... [Read more...]

Robust logistic regression

June 7, 2013 | andrew

Corey Yanofsky writes: In your work, you’ve robustificated logistic regression by having the logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with proportion of outliers ... [Read more...]