Here is a little example of what I do. While learning R isn't easy, it can be very powerful and efficient once you get your feet wet. I intend for this example to whet your appetite. This should take you less than 20 minutes. By the end, you will...

So where did we mess up? In the calculation of returns for the market-cap-weighted portfolio and the portfolio optimization portfolio, we simply took the starting weights (W0) and multiplied them by the relevant series of returns: resEqual = as.matrix(returns) %*% t(ret) and subRes = as.matrix(subRes) %*% t(ret). To correct this, we have two options. Recalculate the weight at each time point assuming a starting weight. ...
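The difference between fixed starting weights and weights that drift can be sketched in a few lines of base R. This is only an illustration of the first option mentioned in the excerpt, not the original post's code: the toy `returns` matrix, the weight vector `w0`, and the helper `drift_returns()` are all hypothetical names and data made up for this example.

```r
# Hypothetical toy data (not from the original post): 3 periods, 2 assets.
returns <- matrix(c( 0.10, -0.05,
                     0.02,  0.03,
                    -0.04,  0.08), ncol = 2, byrow = TRUE)
w0 <- c(0.6, 0.4)  # starting weights

# Naive version: the starting weights are applied in every period,
# which silently assumes the portfolio is rebalanced each period.
naive <- as.vector(returns %*% w0)

# Drifting version: after each period, recalculate the weights from
# how much each asset grew relative to the whole portfolio.
drift_returns <- function(returns, w0) {
  port <- numeric(nrow(returns))
  w <- w0
  for (i in seq_len(nrow(returns))) {
    port[i] <- sum(w * returns[i, ])               # portfolio return this period
    w <- w * (1 + returns[i, ]) / (1 + port[i])    # weights at the start of the next period
  }
  port
}

drifted <- drift_returns(returns, w0)
```

The two series agree in the first period and diverge afterwards, because only the drifting version lets the winning asset take up a larger share of the portfolio.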

A review of market predictions and results for 2011, and a calibration for 2012 predictions (of 19 equity indices plus oil). Previously: One year ago the post “Revised market prediction distributions” presented plots showing the variability of various markets assuming no market-moving forces. The follow-up post “Some market predictions” enhanced some of those plots with …

CloudStat School is a not-yet-released open source project. The objective is to create an interactive R learning platform. The best way to learn R programming is learning by doing. In CloudStat School, you will see a console box at your top left han...

My last post discussed a method to decode a substitution cipher using a Metropolis-Hastings algorithm. It was brought to my attention that this code could be improved by using Simulated Annealing methods to jump around the sample space and avoid some of the local maxima. Here is a basic description of the difference: In a
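As a rough illustration of the idea, the sketch below shows an annealing-style acceptance rule in base R. It is not the code from the post: `accept_prob()` and `anneal()` are hypothetical helpers, the geometric cooling schedule is an assumption, and the toy objective stands in for the cipher-scoring function. With the temperature held at 1 and a symmetric proposal, the rule reduces to the usual Metropolis acceptance step on log-scores; lowering the temperature over time makes downhill moves increasingly unlikely.

```r
# Probability of accepting a move from score_old to score_new (higher is better).
accept_prob <- function(score_old, score_new, temp) {
  min(1, exp((score_new - score_old) / temp))
}

# Minimal annealing loop (assumed geometric cooling: temp shrinks by `cooling` each step).
anneal <- function(score, propose, init, n_iter = 2000, temp0 = 1, cooling = 0.995) {
  current <- init
  best <- init
  temp <- temp0
  for (i in seq_len(n_iter)) {
    candidate <- propose(current)
    if (runif(1) < accept_prob(score(current), score(candidate), temp)) {
      current <- candidate
    }
    if (score(current) > score(best)) best <- current  # remember the best state seen
    temp <- temp * cooling
  }
  best
}

# Toy usage: maximise a one-dimensional function with a random-walk proposal.
set.seed(1)
toy_score <- function(x) -(x - 3)^2
best <- anneal(toy_score, function(x) x + rnorm(1, sd = 0.5), init = 0)
```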

R-bloggers.com is now two years young. The site is an (unofficial) online R journal written by bloggers who agreed to contribute their R articles to the site. In this post I wish to celebrate R-bloggers’ second birthday by sharing with you: links to the top 20 posts of 2011; statistics on “how well” R-bloggers did ...

In post 6 we introduced some econometrics code that will help those working with time series to gain asymptotically efficient results. In this post we look at the different commands and libraries necessary for testing our assumptions. Testing our Assumptions and Meeting the Gauss-Markov Theorem: In this section we will seek to test and verify the assumptions of the simple linear...

RTextTools v1.3.5 addresses some key concerns that have been raised in recent months. Many of the algorithms used in RTextTools require that any new data presented to a trained classifier contain the same features as the original document-term matrix. Since this rarely (if ever) happens in the real world, I have added an originalMatrix parameter to the create_matrix() function...

Well, here we go again! It's the time of year when we make all of those resolutions - the ones that usually get broken before the holiday decorations have been packed away. Not this year, though! In 2012, and in no particular order, I firmly resolve to: Increase my use of the R statistical environment in my research and teaching, and foster "Reconometrics". Become more...

Best One-Sentence Pitch for CloudStat (TechCrunch): Techcrunch.com is running a one-sentence pitch competition for startups. So, I made one for CloudStat (using their format): My company, CloudStat, is developing a cloud-based statistical platform to h...

I love board games. Over the holidays, I came across this interesting post over at Arthur Charpentier’s Freakonometrics blog about the classic game of snakes and ladders. The post is a nice little demonstration of how the game can be formulated completely as a Markov chain, and can be analysed simply using the mathematics of
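To make the Markov chain formulation concrete, here is a small base R sketch. It uses a made-up 10-square toy board, not the real snakes and ladders layout or the code from either blog; `build_transition()`, `expected_turns()`, the `jumps` table, and the rule that an overshooting roll simply finishes the game are all assumptions chosen for illustration.

```r
# Transition matrix for squares 0..n with a fair die.
# `jumps` maps the foot of a ladder or the head of a snake (`from`) to its other end (`to`).
build_transition <- function(n, sides, jumps) {
  P <- matrix(0, n + 1, n + 1)
  for (s in 0:(n - 1)) {
    for (d in seq_len(sides)) {
      target <- min(s + d, n)                  # assumed rule: overshooting still finishes
      j <- match(target, jumps$from)
      if (!is.na(j)) target <- jumps$to[j]     # follow a ladder or snake
      P[s + 1, target + 1] <- P[s + 1, target + 1] + 1 / sides
    }
  }
  P[n + 1, n + 1] <- 1                         # the final square is absorbing
  P
}

# Expected number of turns from square 0, via the fundamental matrix (I - Q)^-1.
expected_turns <- function(P) {
  n <- nrow(P) - 1
  Q <- P[1:n, 1:n, drop = FALSE]               # transitions among non-final squares
  N <- solve(diag(n) - Q)
  sum(N[1, ])
}

# Hypothetical toy board: a ladder from 3 to 6 and a snake from 8 to 2.
jumps <- data.frame(from = c(3, 8), to = c(6, 2))
P <- build_transition(n = 10, sides = 6, jumps = jumps)
turns <- expected_turns(P)
```

Each row of `P` sums to one, and the first row of the fundamental matrix counts the expected visits to every square before the game ends, so its sum is the expected game length.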

Here is an interview with Kai Chew, Founder of CloudStat. CloudStat is developing a cloud-based statistical platform to help researchers who want to make sense of data to do statistical analysis collaboratively with its high performance computing infra...

"Big Data" = data that come in amounts that are too large for current computer hardware and software to deal with. That sounds like fun! Norman Nie developed the well-known SPSS statistical package in the 1960s, and is currently President and CEO of Revolution Analytics, a California company that promotes the use of the R computing...

Greg Campbell writes: I am a Canadian archaeologist (BSc in Chemistry) researching the past human use of European Atlantic shellfish. After two decades of practice I am finally getting an MA in archaeology at Reading. I am seeing if the habitat or size of harvested mussels (Mytilus edulis) can be reconstructed from measurements of the ...

Hello and welcome to the CloudStat official blog! We’ll be using this space to talk about product updates, getting the most out of CloudStat, and random thoughts on learning data analysis, especially in the R language. More about CloudStat can be vie...

I was rummaging around in the source of R looking for trouble, as one does, when I came across what I believed to be a less than optimally accurate floating-point algorithm (function R_pos_di in src/main/arithmetic.c). Analyzing the accuracy of floating-point code is notoriously difficult and those having the required skills tend to concentrate their efforts

It is now appropriate to lay out our two regression models in full for empirical estimation over our two separate time periods. The first estimation is from 4/1/71 to 7/1/97 and the second is from 4/1/01 to 4/1/11. The methodology employed in the estimation of these two models is a procedure using Generalized Least Squares with a Cochrane-Orcutt-style iterated...
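For readers unfamiliar with the procedure, the textbook Cochrane-Orcutt iteration for a single regressor with AR(1) errors can be sketched in base R as follows. This is not the post's models or code: `cochrane_orcutt()` is a hypothetical helper, the single-regressor restriction and convergence tolerance are assumptions, and the data are simulated purely so the example runs.

```r
# Iterated Cochrane-Orcutt for y = a + b * x + u, with u[t] = rho * u[t-1] + e[t].
cochrane_orcutt <- function(y, x, tol = 1e-6, max_iter = 50) {
  n <- length(y)
  beta <- coef(lm(y ~ x))                    # start from plain OLS
  rho <- 0
  for (i in seq_len(max_iter)) {
    u <- y - beta[1] - beta[2] * x           # residuals on the original scale
    rho_new <- sum(u[-1] * u[-n]) / sum(u[-n]^2)  # regress u[t] on u[t-1]
    y_star <- y[-1] - rho_new * y[-n]        # quasi-differenced data
    x_star <- x[-1] - rho_new * x[-n]
    b <- coef(lm(y_star ~ x_star))
    beta <- c(b[1] / (1 - rho_new), b[2])    # recover the original intercept
    converged <- abs(rho_new - rho) < tol
    rho <- rho_new
    if (converged) break
  }
  list(rho = rho, intercept = unname(beta[1]), slope = unname(beta[2]), iterations = i)
}

# Simulated example data (hypothetical): true slope 2, true rho 0.6.
set.seed(42)
n_obs <- 500
x_sim <- rnorm(n_obs)
u_sim <- as.numeric(stats::filter(rnorm(n_obs), 0.6, method = "recursive"))
y_sim <- 1 + 2 * x_sim + u_sim
fit <- cochrane_orcutt(y_sim, x_sim)
```

The loop alternates between estimating the autocorrelation from the residuals and re-estimating the regression on quasi-differenced data until the estimate of `rho` stops changing.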

A minor new release of the RcppExamples package is now on CRAN. RcppExamples contains a few illustrations of how to use Rcpp. It grew out of documentation for the classic API (now in its own package RcppClassic), and while we added a few more funct...