Sentiment Analysis on Twitter with Viralheat API

September 2, 2013

Hi there! Some time ago I published a post about doing sentiment analysis on Twitter. I used two wordlists to do so: one with positive and one with negative words. For a first try at sentiment analysis, this is surely a good way to start, but if you want to receive more accurate …

Poll: R top language for data science three years running

September 2, 2013

KDnuggets has completed its annual poll of top languages for analytics, data mining and data science, and just as in the prior two years, the R language is ranked the most popular. R is used by almost 61% of respondents, and its usage grew year over year as well, up 16% compared to the 2012 poll. By contrast, the rate...

Showing results from Cox Proportional Hazard Models in R with simPH

September 2, 2013

Effectively showing estimates and uncertainty from Cox Proportional Hazard (PH) models, especially for interactive and non-linear effects, can be challenging with currently available software. So researchers often simply display a results table. Such tables are pretty useless for Cox PH models. It is difficult to decipher a simple linear variable’s estimated effect and basically impossible to understand time...

Passing-Bablok Regression: R code for SAS users

September 2, 2013

While at the Joint Statistical Meetings a few weeks ago, I was talking to a friend about various aspects of clinical trials. He indicated that no current R package was able to reproduce Passing-Bablok (PB) regression so that it exactly matched SAS. He ultimately wrote a couple of functions and kindly shared them with

Easy 3-Minute Guide to Making apply() Parallel over Distributed Grids and Clusters in R

September 1, 2013

Last week I attended a workshop on how to run highly parallel distributed jobs on the Open Science Grid (OSG). There I met Derek Weitzel, who has made an excellent contribution to advancing R as a high-performance computing language by developing BoscoR. BoscoR greatly facilitates the use of the already existing package “GridR” by...

Latent Variable Analysis with R: Getting Setup with lavaan

September 1, 2013

Getting Started with Structural Equation Modeling: Part 1. Introduction. For the analyst familiar with linear regression, fitting structural equation models can at first feel strange. In the R environment, fitting structural equation models involves learning new modeling syntax, new plotting...

Fair weather fans, redux

September 1, 2013

Or, a little larger small sample. On August 11, the Victoria HarbourCats closed out their 2013 West Coast League season with a 4-3 win over the Bellingham Bells. In an earlier...

Mixed models exercise 2. Repeated measurements

September 1, 2013

Continuing my exploration of mixed models, I now understand what is happening in the second SAS/STAT example for proc mixed (page 5007 of the SAS/STAT 12.3 Manual). It is all about correlation between the time-points within subjects. The data as suc...

Win Your Fantasy Football Snake Draft with this Shiny App in R

August 31, 2013

In a previous post, I showed how to determine the best starting lineup to draft in an auction draft using an optimizer tool. In this post, I use a Shiny app in R to determine the best possible players to pick in a fantasy... Originally posted on Fantasy Football Analytics.

StarCluster and R

August 31, 2013

“StarCluster is a utility for creating and managing distributed computing clusters hosted on Amazon's Elastic Compute Cloud (EC2). StarCluster utilizes Amazon's EC2 web service to create and destroy clusters of Linux virtual machines on demand.” (Justin Riley, StarCluster documentation, http://star.mit.edu/cluster/docs/latest/index.html) StarCluster provides a convenient way to quickly set up a cluster of machines to run some data parallel jobs using a distributed memory framework. Install...

Introducing ‘propagate’

August 31, 2013

With this post, I want to introduce the new ‘propagate’ package on CRAN. It has one single purpose: propagation of uncertainties (“error propagation”). There is already one package on CRAN available for this task, named ‘metRology’ (http://cran.r-project.org/web/packages/metRology/index.html). ‘propagate’ has some additional functionality that some may find useful. The most important functions are: * propagate: A

GitHub Package Ideas I Stole

August 31, 2013

One of my favorite sources of good ideas is looking at the GitHub repositories of others and modeling my repos after the good ideas I see there. Here's Steve Jobs on stealing ideas: In the past few weeks I've …

MLB Rankings Using the Bradley-Terry Model

August 31, 2013

Today, I take my first shots at ranking Major League Baseball (MLB) teams. I see my efforts at prediction and ranking as an ongoing process, so that my models improve, the data I incorporate are more meaningful, and ultimately my predictions are largely accurate. For the first attempt, let’s rank MLB teams using the Bradley-Terry (BT) model. Before we discuss the rankings, we need...

The Dutch Dataverse Network: a host for the ChEMBL-RDF v13.5 data, and some thoughts in workflow integration

August 31, 2013

Last Thursday, there was a UM library network drink. And as I see a library where knowledge is found, and libraries still rarely think of knowledge as ever being able to be stored outside books and papers, I was happy to see the library promoting the D...

Visualising Shrinkage

August 31, 2013

A useful property of mixed effects and Bayesian hierarchical models is that lower-level estimates are shrunk towards the more stable estimates further up the hierarchy. To use a time-honoured example, you might be modelling the effect of a new teaching method on performance at the classroom level. Classes of 30 or so students …

Encouraging citation of software – introducing CITATION files

August 30, 2013

Summary: Put a plaintext file named CITATION in the root directory of your code, and put information in it about how to cite your software. Go on, do it now – it’ll only take two minutes! Software is very important in science – but good software takes time and effort that could be used to do

The joy and martyrdom of trying to be a Bayesian

August 30, 2013

Some of my fellow scientists have it easy. They use predefined methods like linear regression and ANOVA to test simple hypotheses; they live in the innocent world of bivariate plots and lm(). Sometimes they notice that the data have odd histograms and they use glm(). The more educated ones use …

Tutorial: Parallel programming with foreach

August 30, 2013

Exegetic Analytics extols the wonders of the foreach package for iterative operations that go beyond the standard "for" loop in R. For example, here's a neat (if not optimally efficient) construct using filters to calculate the primes less than 100:

    foreach(n = 1:100, .combine = c) %:% when (isPrime(n)) %do% n

The open-source team at Revolution Analytics created the foreach...

ECVP tutorial on classification images

August 30, 2013

The slides for my ECVP tutorial on classification images are available here. Try this alternative version if the equations look funny. (image from Mineault et al. 2009) The slides are in HTML and contain some interactive elements. They’re the result of experimenting with R Markdown, D3 and pandoc. You write the slides in R Markdown,

Making regex examples work for you!

August 30, 2013

One of the most frequently used string recognition algorithms out there is regex, and R implements regex. However, users can often be frustrated that, despite taking examples verbatim from sources such as Stack Overflow, they do not seem to ...

Knitr/Markdown OpenCPU App

August 30, 2013

A new little OpenCPU app allows you to knit and markdown in the browser. It has a fancy pants code editor which automatically updates the output after 3 seconds of inactivity. It uses the Ace web editor with mode-r.js (thanks to RStudio for making the latter available). Like all OpenCPU apps, the source package lives in the opencpu app...

Drafting the Best Starting Lineup in Fantasy Football by Taking into Account Uncertainty in the Projections: An Optimization Simulation

August 29, 2013

In a previous post, I showed how to determine the best starting lineup to draft using an optimizer tool. The optimizer identifies the players that maximize your projected points within your risk tolerance. The optimizer does not take i... Originally posted on Fantasy Football Analytics.

Plot Weekly or Monthly Totals in R

August 29, 2013

When plotting time series data, you might want to bin the values so that each data point corresponds to the sum for a given month or week. This post will show an easy way to use cut and ggplot2's stat_summary to plot month totals in R wi...

A simple amortization function

August 29, 2013

I was working on a project yesterday where I needed to amortize out a bunch of loans to calculate the total interest a borrower would pay if he or she paid the minimum monthly payment for the full term of the loan. I couldn’t find any package in R that already contained the necessary math,

R and Linear Algebra

August 29, 2013

by Joseph Rickert. I was recently looking through upcoming Coursera offerings and came across the course Coding the Matrix: Linear Algebra through Computer Science Applications, taught by Philip Klein from Brown University. This looks like a fine course, but why use Python to teach linear algebra? I suppose this is a blind spot of mine: MATLAB I can see....

New Video: Credit Scoring & R: Reject inference, nested conditional models, & joint scores

August 29, 2013

This post shares the video from the talk presented in August 2013 by Ross Gayler on Credit Scoring and R at Melbourne R Users. Credit scoring tends to involve the balancing of mutually contradictory objectives spiced with a liberal dash …
