Hello all of you Stata loving statistical analysts out there! I have great news. I am finally nearly done with the package I have been working on which provides the mechanism for Stata users to seamlessly move from Stata to R through use of ...

R news and tutorials contributed by (552) R bloggers

During a lunchtime discussion among recent GCaP class attendees, the topic of weather came up and I casually mentioned that the weather in Melbourne, Australia, can be very changeable because the continent is so old that there is very little geographical relief to moderate the prevailing winds coming from the west. In general, Melbourne...

I had no intention to blog this, but @jayjacobs convinced me otherwise. I was curious about the recent (end of March, 2014) California earthquake “storm” and did a quick plot for “fun” and personal use using ggmap/ggplot. I used data from the Southern California Earthquake Center (that I cleaned up a bit and that you

R is a statistical analysis package based on writing short scripts or programs (versus being based on GUIs like spreadsheets or directed workflow editors). I say “writing short scripts” because R’s programming language (itself called S) is a bit of an oddity that you really wouldn’t be using except it gives you access to superior...

by Seth Mottaghinejad, Analytic Consultant for Revolution Analytics. You may have heard before that R is a vectorized language, but what do we mean by that? One way to read that is to say that many functions in R can operate efficiently on vectors (in addition to singletons). Here are some examples: > log(1) # input and output are...
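As a rough illustration of what "vectorized" means in the excerpt above, here is a minimal sketch in base R. The vector `x` and its values are made up for this example and are not taken from the original post.

```r
# A single number is just a vector of length one
log(1)

# The same kind of function applied to a longer vector works element by element
x <- c(1, 10, 100)   # hypothetical example values
log10_x <- log10(x)
log10_x

# Arithmetic operators are vectorized as well, so no explicit loop is needed
doubled <- x * 2
doubled
```

The point is only that the call looks the same whether the input holds one value or many.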

Facts do not speak (Henri Poincaré). Mr. Penney is my best friend. He is a maths teacher and loves playing games. Yesterday we were in his office at the university when he suggested a game to me: When you toss a coin three times, you can obtain eight different sequences of tails and heads: TTT, TTH, THT, HTT, THH,
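The eight sequences mentioned in the excerpt can be listed mechanically. The following is a small sketch in base R; the object names (`faces`, `tosses`, `sequences`) are chosen here for illustration and do not come from the original post.

```r
# Each toss is either tails (T) or heads (H)
faces <- c("T", "H")

# expand.grid() builds every combination of the three tosses
tosses <- expand.grid(first = faces, second = faces, third = faces,
                      stringsAsFactors = FALSE)

# Paste the three columns together to get sequences such as "TTH"
sequences <- do.call(paste0, tosses)
sequences

# Two outcomes per toss and three tosses give 2^3 sequences
length(sequences)
```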

In econometrics, generalized method of moments (GMM) is one estimation methodology that can be used to calculate instrumental variable (IV) estimates. Performing this calculation in R, for a linear IV model, is trivial. One simply uses the gmm() function in the excellent gmm package like an lm() or ivreg() function. The gmm() function will estimate
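To make the connection between GMM and IV a little more concrete, here is a hedged sketch that does not call gmm() at all. It relies on the standard result that, in a just-identified linear model, the GMM estimate coincides with the classical IV estimator (Z'X)^{-1} Z'y. The data are simulated, and every variable name and coefficient below is an assumption made for this example.

```r
set.seed(1)
n <- 1000

# Simulated data (assumed setup): z is the instrument, u an unobserved
# confounder that makes x endogenous
z <- rnorm(n)
u <- rnorm(n)
x <- 0.8 * z + u + rnorm(n)
y <- 1 + 2 * x + u + rnorm(n)   # true slope on x is 2

# Design matrices with an intercept column
X <- cbind(1, x)
Z <- cbind(1, z)

# Just-identified IV estimator: solve (Z'X) beta = Z'y
beta_iv <- solve(crossprod(Z, X), crossprod(Z, y))

# OLS for comparison; it is biased here because x is correlated with u
beta_ols <- coef(lm(y ~ x))

beta_iv
beta_ols
```

With more instruments than regressors the two estimators no longer coincide in general, and that is the situation where a weighting matrix, and hence a dedicated GMM routine, becomes relevant.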

Does the transition to and from Daylight Saving Time (DST) have a (significant) effect on the stock market? In a recent blog post on The UK Stock Market Almanac, the author found that the average return of the FTSE100 index for the days following the start of British Summer Time (BST) was -0.07% during the

Here’s a post that appears on my new website, ragscripts.com. On-line resources for analysts are often either too general to be of practical use or too specialised to be accessible. The aim of ragscripts.com is to remedy this by providing start to finish directions for complex analytical tasks. The site is under construction at the …

with more than a decade of microdata aimed at gauging the political mood across european nations, the european social survey (ess) allows scientists like you to examine socio-demographic shifts among broad groups all the way down to pirate party (pirat...

An interesting question was posted on http://math.stackexchange.com/726205/…: if one knows the covariances and , is it possible to infer ? I asked myself a question close to this one a few weeks ago (that I might also relate to a question I asked a long time ago, about possible correlations between three exchange rates, on financial markets). More precisely, if one knows the...

Welcome to the blog post! We all know that predictive analysis is a very hot topic nowadays. Everyone is looking at how the power of predictive analysis can be used in their business to get their business questions answered. Recently, I was studying predictive analysis in ecommerce. I found many interesting things.

To give this year’s April Fools’ day a more analytical touch, we decided last week to do a little poll on internet cartoons. We asked our friends and colleagues to select their favourite data-related cartoon on the web, and organized a voting session to construct a top 5 list. (You can always share your own

The price-earnings ratio (P/E) is one of the most popular ratios reported for stocks. Put simply, it is defined as Current Market Price / Earnings per Share. An operational definition of Earnings per Share would be total profit divided by the number of shares. I will direct interested readers to www.investopedia.com/terms/p/price-earningsratio.asp for further reading. In this post,...
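The definition above translates directly into a few lines of R. This is only a sketch: the function name `price_earnings` and the numbers passed to it are hypothetical and are not taken from the original post.

```r
# P/E = current market price / earnings per share,
# where earnings per share = total profit / number of shares
price_earnings <- function(price, total_profit, n_shares) {
  eps <- total_profit / n_shares
  price / eps
}

# Hypothetical company: price 50, profit 1,000,000, 200,000 shares
# Earnings per share is 5, so the ratio is 50 / 5
pe <- price_earnings(price = 50, total_profit = 1e6, n_shares = 2e5)
pe
```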

Today I am going to introduce the moustache target distribution (moustarget distribution for brevity). Load some packages first. Let’s invoke the moustarget distribution. This defines a target distribution represented by an SVG file using RShapeTarget. The target probability density function is defined on and is proportional to on the segments described in the SVG files,

Francis Smart offers five excellent reasons to use R, in a well-researched post ideal for sharing with anyone thinking about making the switch to R. (You might also share this YouTube video for a quick 90-second introduction to R.) The post also includes a novel analysis of interest in R, as tracked by Google Trends. Given its single-letter name,...

I want to follow up the Intraday data post with testing the Probabilistic Momentum strategy on Intraday data. I will use Intraday data for SPY and GLD from the Bonnot Gang to test the strategy. Next, let’s examine the hourly performance of the strategy. There are lots of abnormal returns in the 9:30-10:00am box due

Here is the second part of my review of Gelman et al.’s Bayesian Data Analysis (third edition): “When an iterative simulation algorithm is “tuned” (…) the iterations will not in general converge to the target distribution.” (p.297) Part III covers advanced computation, obviously including MCMC but also model approximations like variational Bayes and expectation propagation

Recently my student Yingkang Xie and I have developed freqparcoord, a novel approach to the parallel coordinates method for multivariate data visualization. Our approach: Addresses the screen-clutter problem in parallel coordinates, by only plotting the “most typical” cases, meaning those with the highest estimated multivariate density values. This makes it easier to discern relations between variables.

Hi, Norm Matloff here. I’m a professor of computer science at UC Davis, and was a founding member of the UCD Dept. of Statistics. You may know my book, The Art of R Programming (NSP, 2011). I have some strong views on statistics–which you are free to call analytics, data science, machine learning or whatever your favorite term is–so

I have been watching the awesome Netflix show “House of Cards” and have been fascinated by the devious schemes that Underwood is constantly plotting. The show often mentions approval ratings and it got me wondering what Obama’s ratings currently were, and those of all past US presidents for that matter. However, I didn’t have much chance

The announcement below just went to the R-SIG-Finance list. More information is as usual at the R / Finance page: Now open for registrations: R / Finance 2014: Applied Finance with R May 16 and 17, 2014 Chicago, IL, USA The reg...

As announced on the R-SIG-Finance mailing list, registration for R/Finance 2014 is now open! The conference will take place May 16 and 17 in Chicago. Building on the success of the previous conferences in 2009-2013, we expect more than 250 attendees fro...