More Command-Line Text Munging Utilities

May 19, 2011

In a previous post I linked to gcol as a quick and intuitive alternative to awk. I just stumbled across yet another set of handy text file manipulation utilities from the creators of the BEAGLE software for GWAS data imputation and analysis. In additio...

Hadley Wickham’s R Development Master Class coming to SF

May 19, 2011

Hadley Wickham, the Rice professor and prolific R hacker best known as the author of the ggplot2 graphics package, will be coming to San Francisco June 8-9 to deliver his new R Development Master Class (in conjunction with Revolution Analytics). This course will build on the skills of basic R programmers with instruction in advanced R programming techniques, development...

Converting vectors to numeric in mixed-type dataframe

May 19, 2011

Coercing variables of character and numeric type into a single dataframe causes all vectors to be defined as factors: all <- data.frame(cbind(site, year, model, x, y, z)). The following converts selected variables from “factor” back to “numeric”: all$x <- as.numeric(x) …
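As a minimal sketch of the conversion described above (the data values and the name all_data are made up for illustration, and stringsAsFactors = TRUE is set explicitly because R 4.0.0 and later no longer create factors by default), note that calling as.numeric() directly on a factor returns its internal level codes, so going through as.character() first is the safer route:

```r
# Hypothetical example data; only the column names follow the excerpt.
site <- c("a", "b", "c")
x <- c(10, 2.5, 300)

# cbind() on mixed types builds a character matrix, so every column of the
# resulting data frame becomes a factor when stringsAsFactors = TRUE.
all_data <- data.frame(cbind(site, x), stringsAsFactors = TRUE)

# Pitfall: as.numeric() on a factor returns the level codes, not the values.
codes <- as.numeric(all_data$x)

# Converting through character recovers the original numbers.
all_data$x <- as.numeric(as.character(all_data$x))
```

Assigning from the original vector, as the excerpt does, also works as long as that vector is still in the workspace and in the same row order.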

Bar Graphs in ggplot2

May 19, 2011

As part of my continuing fun and games getting to grips with ggplot2's vast multitude of functions, here I give …

Applying PDQ in R to Load Testing

May 19, 2011

PDQ is a library of functions that helps you to express and solve performance questions about computer systems using the abstraction of queues. The queueing paradigm is a natural choice because, whether big (a web site) or small (a laptop), all computer systems can be represented as a network or circuit of buffers and a buffer is a...

R-bloggers – a recommendation

May 18, 2011

As I decided to try and blog a little more often now, and touch on "R" every now and then, I decided to take R-bloggers up on their standing offer to include R-related feeds at their site. So, everything I tag with "rstats" (you can guess where that ca...

Mapping locations in R with the Data Science Toolkit

May 18, 2011

Pete Warden's Data Science Toolkit (which we mentioned briefly last week) is an open-source information server that provides an API you can query for information useful for building data science applications, like identifying proper names in unstructured text, or converting IP addresses to lat/long coordinates. You can make queries via the Web interface or by direct interface to the...

RStudio the missing link between your brain and statistics

May 18, 2011

RStudio is a graphical user interface for R. Or, as the developers put it: RStudio™ is a new integrated development environment (IDE) for R. RStudio combines an intuitive user interface with powerful coding tools to help you get the most out of R. While there have been a few projects (e.g. RCommander, RkWard, JaguaR)

Wonderful New Blog TimeSeriesIreland

May 18, 2011

I returned from Scotland to find a wonderful new blog from Ireland, http://timeseriesireland.wordpress.com. To highlight his work, I thought I would apply his most recent post, "AIB Stock Price, EGARCH-M, and rgarch", to the S&P 500. Clearly...

Fractional Factorial Designs using FrF2

May 18, 2011

The FrF2 package for R can be used to create regular and non-regular fractional factorial 2-level designs. It is reasonably straightforward to use. The first step is to install the package, then make it available for use in the current session: require(FrF2). A basic call to the main function FrF2 specifies the number of runs in

Vehicle Routing Problem

May 18, 2011

This is a follow-up to a previous question on VRP. I investigated R libraries and several other options to solve VRP and decided to build a custom desktop application using open source libraries from COIN-OR. Screenshots attached below.

Stata-like Marginal Effects for Logit and Probit Models in R [2]

May 18, 2011

My thanks to those who emailed comments and suggestions for my ‘mfx’ function, I’m happy that I could fill a void for some people. I also received a request/suggestion from Tony Cookson, along with a helpful fix for a bug in the code, to include an option that would allow the user to specify values

The RDSTK Presentation at Denver R Users Group

May 18, 2011

Last night I presented a talk at the DRUG introducing the R wrapper for the Data Science Toolkit.  Lots of good questions, good forking, and good beer afterwards at Freshcraft.  The slides are given below.

New R User Groups in Turin, Belgrade

May 18, 2011

Two new local R user groups in Europe to announce this week. For R users in Serbia, there's a new group based in Belgrade. You can find more information about the group and upcoming meetings at the Croatian-language blog Sav tar R. And for R users in northern Italy, there's a new group based in Turin: Torino R net...

Resources for Learning R

May 17, 2011

The information below will be periodically updated at the following permanent link: http://www.backsidesmack.com/r-resources/ Searching for information on R sucks. Not only is the language name a letter of the alphabet (an ignominy it shares with C and some less well known languages), there is Pearson’s r and the coefficient of determination, R squared! If you…

Stata-like Marginal Effects for Logit and Probit Models in R

May 17, 2011

Although this blog’s primary focus is time series, one feature I missed from Stata was the simple marginal effects command, ‘mfx compute’, for cross-sectional work, and I could not find an adequate replacement in R. To bridge this gap, I’ve written a (rather messy) R function to produce marginal effects readout for logit and probit
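For readers unfamiliar with the idea, here is a rough sketch of marginal effects evaluated at the sample means for a logit model, using only base R. It is not the author's ‘mfx’ function: the simulated data and every variable name below are assumptions made for illustration. The marginal effect of regressor j at the means is the logistic density at the mean linear predictor multiplied by coefficient j.

```r
# Simulated data; all names and coefficient values here are hypothetical.
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + x1 - 0.5 * x2))

# Fit a logit model with base R's glm().
fit <- glm(y ~ x1 + x2, family = binomial(link = "logit"))

# Linear predictor evaluated at the column means of the design matrix.
xbar <- colMeans(model.matrix(fit))
xb <- sum(xbar * coef(fit))

# Marginal effects at the means: logistic density times each slope.
# For a probit fit, dnorm(xb) would replace dlogis(xb).
mfx_at_means <- dlogis(xb) * coef(fit)[-1]
```

This sketch omits standard errors and treats every regressor as continuous; dummy variables would normally be handled as discrete changes instead.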

Simulating Win/Loss streaks with R rle function

May 17, 2011

The following script allows you to simulate sample runs of Win, Loss, Breakeven streaks based on a random distribution, using the run length encoding function, rle in R. Associated probabilities are entered as a vector argument in the sample function. Y...
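The approach described above can be sketched in a few lines of base R. This is not the post's original script: the seed, the number of trials, and the probabilities are arbitrary values chosen for illustration.

```r
# Arbitrary seed, trial count, and probabilities (illustrative assumptions).
set.seed(42)
n_trials <- 100
outcomes <- sample(c("Win", "Loss", "Breakeven"),
                   size = n_trials, replace = TRUE,
                   prob = c(0.5, 0.4, 0.1))

# rle() collapses consecutive identical outcomes into (length, value) pairs,
# so each pair is one streak.
streaks <- rle(outcomes)

# Longest streak observed for each outcome type.
longest <- tapply(streaks$lengths, streaks$values, max)
```

Here streaks$lengths holds the length of each run and streaks$values the outcome it belongs to, so a table of the two gives the full distribution of streak lengths.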

A survey of the [60’s] Monte Carlo methods [2]

May 17, 2011

The 24 questions asked by John Halton in the conclusion of his 1970 survey are: Can we obtain a theory of convergence for random variables taking values in Fréchet spaces? Can the study of Monte Carlo estimates in separable Fréchet spaces give a theory of global approximation? When sampling functions, what constitutes a representative sample

How to do a quantitative literature review in R

May 17, 2011

In the early stages of a literature review, you may have hundreds of papers and not know how to even begin sorting through them. In this post, I show you how to perform a two-stage clustering analysis with R so that you can identify the main groups within your data, based on key attributes of each paper.

Gifts from BAC ML and the Federal Reserve

May 17, 2011

Bank of America Merrill Lynch and the Federal Reserve Bank of St. Louis continue to surprise me with even more gifts. This time they added Emerging Market Bond Indexes with history back to 1998 (cannot see Asia Pacific Crisis of 1997-1998 but...

A simple function for plotting phylogenies in ggplot2

May 17, 2011

I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a 3-species tree right now, as I haven't figured out how to generalize the approach to N species. Any ideas on how to improve this?

Russell Napier, ASIP in FT Says Emerging Market Currencies

May 17, 2011

Clearly I have succumbed to confirmation bias, since my second favorite presentation from the CFA Institute Annual Conference this year came from Scotland native Russell Napier, ASIP who shares my views nearly completely http://video.ft.com/v/946244201...

R tips from around the web…

May 17, 2011

The great thing about R is that the number of available resources on the web is increasing dramatically. If you cannot afford expensive books, or you are looking for straight answers or some tutorials, you will find almost anything you need. Here's my short...

AIB Stock Price, EGARCH-M, and rgarch

May 17, 2011

This post examines conditional heteroskedasticity models in the context of daily stock price data for Allied Irish Banks (AIB), specifically how to test for conditional heteroskedasticity in a series, how to approach model specification and estimation when time-varying volatility is present, and how to forecast with these models; all of this is done in R,

In case you missed it: April Roundup

May 17, 2011

In case you missed them, here are some articles from April of particular interest to R users. The Heritage Health Prize, a competition to build predictive models for hospitalization with USD$3.2M in prizes, is open. The Inside-R.org community site now provides the ability to search and view the help files for CRAN packages. Revolution R Enterprise 4.3 released: R...

TreeBASE in R: a first tutorial

May 16, 2011

My TreeBASE R package is essentially functional now.  Here’s a quick tutorial on the kinds of things it can do.  Grab the treebase package here, install and load the library into R. TreeBASE provides two APIs to query the database, one which searches by the metadata associated with different publications (called OAI-PMH), and another which

A survey of [the 60’s] Monte Carlo methods

May 16, 2011

“The only good Monte Carlos are the dead Monte Carlos” (Trotter and Tukey, quoted by Halton) When I presented my history of MCM methods in Bristol two months ago, at the Julian Besag memorial, Christophe Andrieu mentioned a 1970 SIAM survey by John Halton on A retrospective and prospective survey of the Monte Carlo

My first ‘R’ plot

May 16, 2011

Started learning 'R'. My first attempt was to plot data from the Forbes 1000 list (refer to the exercise posted by Prasoon Sharma). Here is a bubble chart showing the Forbes top 25 companies by market capitalization. Source code: ## read the csv file FORBES...

Day #38-39 Data-manipulation Part 1

Last week I created some plots, each for a single feature. Today I started working on the full script that creates all these plots, one per feature. This means using for loops in R. Let’s see how this is going to work out. Today I mostly worked on data...
