<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-bloggers &#187; statistics</title>
	<atom:link href="http://www.r-bloggers.com/tag/statistics/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-bloggers.com</link>
	<description>R news and tutorials contributed by (300) R bloggers</description>
	<lastBuildDate>Tue, 07 Feb 2012 10:27:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>The US market will absolutely positively definitely go up in 2012</title>
		<link>http://www.r-bloggers.com/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/</link>
		<comments>http://www.r-bloggers.com/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/#comments</comments>
		<pubDate>Mon, 06 Feb 2012 09:37:58 +0000</pubDate>
		<dc:creator>Pat</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[P-value]]></category>
		<category><![CDATA[R Language]]></category>
		<category><![CDATA[random permutation test]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Super Bowl Indicator]]></category>

		<guid isPermaLink="false">http://www.portfolioprobe.com/?p=5987</guid>
		<description><![CDATA[The Super Bowl tells us so. The Super Bowl Indicator The championship of American football decides the direction of the US stock market for  the year.  If a &#8220;National&#8221; team wins, the market goes up; if an &#8220;American&#8221; team wins, the market goes down. Yesterday the Giants, a National team, beat the Patriots. The birth &#8230; <a href="http://www.portfolioprobe.com/2012/02/06/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/">Continue reading <span>&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://feedproxy.google.com/~r/PortfolioProbeRLanguage/~3/guZdD0qVLMQ/"> Portfolio Probe » R language</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p>The Super Bowl tells us so.</p>
<h2>The Super Bowl Indicator</h2>
<p>The championship of American football decides the direction of the US <a href="http://www.portfolioprobe.com/2011/08/01/the-benchmark-gambit/" ref="nofollow" target="_blank">stock market</a> for  the year.  If a &#8220;National&#8221; team wins, the market goes up; if an &#8220;American&#8221; team wins, the market goes down. Yesterday the Giants, a National team, beat the Patriots.</p>
<p>The <a href="http://blogs.wsj.com/marketbeat/2011/01/28/super-bowl-indicator-the-secret-history/" ref="nofollow" target="_blank">birth of the indicator</a> was quite auspicious.</p>
<p>Eight years ago I wrote about <a href="http://www.burns-stat.com/pages/Working/superbowl.pdf" ref="nofollow" target="_blank">using random permutation tests to explore the efficacy</a> of the indicator.  One of the points is that the fraction of correct predictions is not necessarily a good indication of the value of a predictor.</p>
<p>Thanks to <a href="http://online.barrons.com/article/SB50001424052748703964504577193143465934510.html" ref="nofollow" target="_blank">Robin Blumenthal of Barron&#8217;s</a> we now have updated data.  The indicator has been right 28 times out of 45.  I should say at this point that over the years the notion of &#8220;National&#8221; versus &#8220;American&#8221; has become fuzzy.  Also, people have tried to fudge the meaning of &#8220;market&#8221; to make the indicator look better (the Dow Jones Industrials is used here).  So other values of &#8220;number correct&#8221; are possible.</p>
<p>Figure 1 shows the results of a permutation test on the 45 years.  The p-value for this test is about 13%.  Only 3 of the 8 updated years were correct.</p>
<p>Figure 1: Random permutation test of 45 years of the Super Bowl Indicator. <a href="http://www.portfolioprobe.com/2012/02/06/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/superbowl_perm_all-2/" rel="attachment wp-att-6001" ref="nofollow" target="_blank"><img class="aligncenter size-full wp-image-6001" title="superbowl_perm_all" src="http://www.portfolioprobe.com/wp-content/uploads/2012/02/superbowl_perm_all1.png" alt="" width="512" height="480" /></a></p>
<p>This, of course, includes the years that were used to form the &#8220;hypothesis&#8221;.  A stricter test excludes the first 11 years.  Figure 2 shows the permutation test in this case.  The p-value here is about 40%.</p>
<p>Figure 2: Random permutation test of the Super Bowl Indicator excluding the first 11 years.  <a href="http://www.portfolioprobe.com/2012/02/06/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/superbowl_perm_outsamp-2/" rel="attachment wp-att-6000" ref="nofollow" target="_blank"><img class="aligncenter size-full wp-image-6000" title="superbowl_perm_outsamp" src="http://www.portfolioprobe.com/wp-content/uploads/2012/02/superbowl_perm_outsamp1.png" alt="" width="512" height="480" /></a></p>
<h2>The meaning of p-values</h2>
<p>What does a p-value of 13% or 40% mean?</p>
<h3>The setup</h3>
<p>Statistical hypothesis tests start by <strong>assuming</strong> a &#8220;null hypothesis&#8221; is true.  Calculations are performed <strong>as if</strong> this hypothesis were true.  The p-value is the probability of seeing something at least as extreme as the actual result under those calculations.</p>
<p>The permutation tests done for the Super Bowl have the null hypothesis:</p>
<ul>
<li>the Super Bowl does not predict the market</li>
</ul>
<p>If it doesn&#8217;t predict the market, then it doesn&#8217;t matter which team wins.  The test shuffles the winners and counts the number of correct predictions.  That is done a large number of times (10,000 in this case).</p>
<p>Figure 1 shows that there were a few permutations that resulted in 34 correct predictions, and also a few that resulted in 16 correct predictions.  There were no cases more extreme than these.</p>
<p>If the p-value is small enough, we reject the null hypothesis.</p>
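<p>As a rough illustration of the procedure described above, here is a minimal sketch of a permutation test in R.  It is not the code behind the figures: the data are simulated stand-ins, the names <tt>count_correct</tt>, <tt>winner</tt> and <tt>market</tt> are made up for this example, and the one-sided p-value is an assumption about how &#8220;at least as extreme&#8221; is counted.</p>
<pre>
```r
# Simulated stand-in data, NOT the real Super Bowl / Dow Jones records.
set.seed(42)
n_years = 45
winner = sample(c("National", "American"), n_years, replace = TRUE)
market = sample(c("up", "down"), n_years, replace = TRUE, prob = c(0.7, 0.3))

# Hypothetical scoring rule: a prediction counts as correct when a National
# win coincides with an up year, or an American win with a down year.
count_correct = function(winner, market) {
  sum((winner == "National") == (market == "up"))
}

observed = count_correct(winner, market)

# Under the null hypothesis the pairing of winners with years is arbitrary,
# so shuffle the winners many times and rescore each shuffle.
n_perm = 10000
permuted = replicate(n_perm, count_correct(sample(winner), market))

# One-sided p-value (an assumption): the share of shuffles that score
# at least as many correct predictions as the observed pairing.
p_value = mean(permuted >= observed)
print(c(observed = observed, p_value = p_value))
```
</pre>
<p>The shuffle keeps the number of National wins and the number of up years fixed, so a high hit rate that merely reflects the market rising in most years does not look surprising.  With 10,000 shuffles, the smallest non-zero p-value this sketch can report is 1 in 10,000.</p>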
<h3>The common misinterpretation</h3>
<p>Undoubtedly the most common error in interpreting p-values is to think:</p>
<p><span style="color: #ff0000;">The p-value is the probability that the null hypothesis is true.</span></p>
<p>That is written in red because it is absolutely positively definitely <span style="color: #ff0000;">WRONG</span>.</p>
<p>By my reckoning the probability that the Super Bowl does not predict the market is 100% (to rounding error) &#8212; not 13% or 40%.</p>
<h3>Surprise</h3>
<p>P-values are about surprise, not believability.</p>
<p>When you do a hypothesis test, you are playing a game &#8212; like a lottery.  But this is sort of a reverse lottery.  You know you have &#8220;won&#8221;; the p-value tells you how surprised you should be that you won.</p>
<p>In the Super Bowl test we got a p-value of 13%.  Not much surprise.</p>
<p>In <a href="http://www.portfolioprobe.com/2012/01/23/the-distribution-of-financial-returns-made-simple/" ref="nofollow" target="_blank">&#8220;The distribution of financial returns made simple&#8221;</a> there is a test of the null hypothesis that the daily log returns of the S&amp;P 500 are normally distributed.  The p-value for that test is 10 to the minus 2762.  If we have a one in ten million chance of winning a certain lottery that we play once a week, then this p-value is equivalent to winning that lottery every week for seven and a half years.  I think we are surprised.</p>
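<p>The lottery arithmetic can be checked in a few lines of R by working in powers of ten.  This is only a sanity check: it assumes independent weekly draws and 52 draws per year, and the variable names are made up for illustration.</p>
<pre>
```r
# p-value quoted above, expressed as a power of ten
log10_p = -2762
# chance of winning one draw: one in ten million is 10^-7
log10_win = -7

# Independent wins multiply, so their log10 probabilities add;
# dividing gives the number of consecutive weekly wins needed.
weeks = log10_p / log10_win   # about 394.6 weeks
years = weeks / 52            # about 7.6 years, assuming 52 draws per year
print(round(c(weeks = weeks, years = years), 1))
```
</pre>
<p>Working with exponents sidesteps the fact that 10 to the minus 2762 is far too small to be stored as an ordinary floating-point number.</p>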
<h3>Hypothesis believability</h3>
<p>How much we should believe a hypothesis depends on more than p-values.</p>
<p>Here are my beliefs about the two null hypotheses:</p>
<ul>
<li>Super Bowl does not predict market: believe</li>
<li>Normally distributed returns: do not believe</li>
</ul>
<p>In both cases my beliefs are at the absolutely-positively-definitely level.  But my beliefs are arrived at differently in the two cases.</p>
<p>The null hypothesis for the Super Bowl is inherently very believable to me.  Plus we don&#8217;t get a very surprising p-value.  It would take an extremely surprising p-value to overcome my inherent belief because I see no plausible mechanism that would make a sports game affect the stock market.</p>
<p>The null hypothesis of normally distributed returns is also inherently believable  &#8211;  it is reasonable to expect that.  However, we get an extremely surprising p-value.  This outweighs the inherent belief.  It takes essentially absolute faith in the inherent believability &#8212; as the persona of the return distribution post has &#8212; in order to ignore the surprise of the p-value and hold to the null hypothesis.</p>
<h3>Financial addendum</h3>
<p>Above I said:</p>
<blockquote><p>I see no plausible mechanism that would make a sports game affect the stock market.</p></blockquote>
<p>I lied. (See the <a href="http://www.portfolioprobe.com/2012/01/30/review-of-models-behaving-badly-by-emanuel-derman/" ref="nofollow" target="_blank">epilogue of this</a>.)</p>
<p>If the statement had been about some physical system, that would have been fine.  But this is finance and there <strong>is</strong> a plausible mechanism &#8212; if enough people believe it, it can become true.  In finance there can be <a href="http://www.portfolioprobe.com/2010/09/07/perception-switching/" ref="nofollow" target="_blank">self-fulfilling models</a>.</p>
<p>Parental logic &#8212; it&#8217;s true because I say it&#8217;s true &#8212; is hard for children to deal with.  It is hard for adults as well.</p>
<h2>Epilogue</h2>
<blockquote><p>And they asked if I believe<br />
And do the angels really grieve</p></blockquote>
<p>from &#8220;I Am the Ride&#8221; by Chris Smither</p>
<p><object width="480" height="360" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/C4Izkay88YI?version=3&amp;hl=en_GB" /><param name="allowfullscreen" value="true" /><embed width="480" height="360" type="application/x-shockwave-flash" src="http://www.youtube.com/v/C4Izkay88YI?version=3&amp;hl=en_GB" allowFullScreen="true" allowscriptaccess="always" allowfullscreen="true" /></object></p>
<h2>Appendix R</h2>
<h4>get data</h4>
<p>The main data can be put into <a href="http://www.portfolioprobe.com/user-area/some-hints-for-the-r-beginner/" ref="nofollow" target="_blank">R</a> with the command:</p>
<p><tt>dowjones.super = read.table("http://www.portfolioprobe.com/R/blog/dowjones_super.csv", header=TRUE, sep=",")</tt></p>
<p>With or without R, the file is at <a href="http://www.portfolioprobe.com/R/blog/dowjones_super.csv" ref="nofollow" target="_blank">http://www.portfolioprobe.com/R/blog/dowjones_super.csv</a></p>
<p>The functions (plus the <tt>superscore</tt> object) are in <a href="http://www.portfolioprobe.com/R/blog/permutationTest.R" ref="nofollow" target="_blank">permutationTest.R</a>.  You can get these into R with:</p>
<p><tt>source("http://www.portfolioprobe.com/R/blog/permutationTest.R")</tt></p>
<p>There are some subtle changes in the plot function compared to the one released with the <a href="http://www.burns-stat.com/pages/Working/superbowl.pdf" ref="nofollow" target="_blank">&#8220;Permuting Super Bowl Theory&#8221;</a> paper.</p>
<h4>compute the tests</h4>
<p><tt>fulltest &lt;- permutation.test.discrete(dowjones.super[,"Winner"], dowjones.super[,"DowJonesUpDown"], superscore)</tt></p>
<p><tt>latetest &lt;- permutation.test.discrete(dowjones.super[-1:-11,"Winner"], dowjones.super[-1:-11,"DowJonesUpDown"], superscore)</tt></p>
<h4>plot the tests</h4>
<p><tt>plot(latetest)<br />
plot(fulltest)</tt></p>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://feedproxy.google.com/~r/PortfolioProbeRLanguage/~3/guZdD0qVLMQ/"> Portfolio Probe » R language</a></strong>.</div>
<hr />
<a href="http://www.r-bloggers.com/">R-bloggers.com</a> offers <strong><a href="http://feedburner.google.com/fb/a/mailverify?uri=RBloggers">daily e-mail updates</a></strong> about <a title="The R Project for Statistical Computing" href="http://www.r-project.org/">R</a> news and <a title="R tutorials" href="http://www.r-bloggers.com/?s=tutorial">tutorials</a> on topics such as: visualization (<a title="ggplot and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=ggplot2">ggplot2</a>, <a title="Boxplots using lattice and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=boxplot">Boxplots</a>, <a title="Maps and gis" href="http://www.r-bloggers.com/?s=map">maps</a>, <a title="Animation in R" href="http://www.r-bloggers.com/?s=animation">animation</a>), programming (<a title="RStudio IDE for R" href="http://www.r-bloggers.com/?s=RStudio">RStudio</a>, <a title="Sweave and literate programming" href="http://www.r-bloggers.com/?s=sweave">Sweave</a>, <a title="LaTeX in R" href="http://www.r-bloggers.com/?s=LaTeX">LaTeX</a>, <a title="SQL and databases" href="http://www.r-bloggers.com/?s=SQL">SQL</a>, <a title="Eclipse IDE for R" href="http://www.r-bloggers.com/?s=eclipse">Eclipse</a>, <a title="git and github, Version Control System" href="http://www.r-bloggers.com/?s=git">git</a>, <a title="Large data in R using Hadoop" href="http://www.r-bloggers.com/?s=hadoop">hadoop</a>, <a title="Web Scraping of google, facebook, yahoo, twitter and more using R" href="http://www.r-bloggers.com/?s=Web+Scraping">Web Scraping</a>) statistics (<a title="Regressions and ANOVA analysis tutorials" href="http://www.r-bloggers.com/?s=regression">regression</a>, <a title="principal component analysis tutorial" href="http://www.r-bloggers.com/?s=PCA">PCA</a>, <a title="Time series" href="http://www.r-bloggers.com/?s=time+series">time series</a>,<a title="ecdf" href="http://www.r-bloggers.com/?s=ecdf">ecdf</a>, <a title="finance trading" 
href="http://www.r-bloggers.com/?s=trading">trading</a>) and more...
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/the-us-market-will-absolutely-positively-definitely-go-up-in-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>speed of R, C, &amp;tc.</title>
		<link>http://www.r-bloggers.com/speed-of-r-c-tc/</link>
		<comments>http://www.r-bloggers.com/speed-of-r-c-tc/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 23:12:45 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[Baum-Welch algorithm]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[EM]]></category>
		<category><![CDATA[HMM]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[Octave]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Running]]></category>
		<category><![CDATA[Scilab]]></category>
		<category><![CDATA[speed]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[University life]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14459</guid>
		<description><![CDATA[My Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that matlab is a lot [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://xianblog.wordpress.com/2012/02/03/speed-of-r-c-tc/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p style="text-align:justify;"><strong>M</strong>y Paris colleague (and <a title="chance meeting" href="http://xianblog.wordpress.com/2011/09/29/chance-meeting/" ref="nofollow" target="_blank">fellow-runner</a>) Aurélien Garivier has produced <a href="http://perso.telecom-paristech.fr/~garivier/code/index.php" ref="nofollow" target="_blank">an interesting comparison</a> of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that</p>
<ul style="text-align:justify;">
<li>matlab is a lot faster than R and python, especially when vectorization is important: this is why the difference is spectacular on filtering/smoothing, not so much on the creation of the sample;</li>
<li>octave is a good matlab emulator, if no special attention is paid to execution speed&#8230;;</li>
<li>scilab appears as a credible, efficient alternative to matlab;</li>
<li>still, C is <strong>a lot</strong> faster; the inefficiency of matlab in loops is well-known, and clearly shown in the creation of the sample.</li>
</ul>
<p style="text-align:justify;">(In this implementation, R is &#8220;only&#8221; three times slower than matlab, so this is not so damning&#8230;) All the code is <a href="http://perso.telecom-paristech.fr/~garivier/code/index.php" ref="nofollow" target="_blank">available</a> and you are free to make suggestions to improve the speed of your favourite language!</p>
<br />Filed under: <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/running/' ref="nofollow" target="_blank">Running</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/baum-welch-algorithm/' ref="nofollow" target="_blank">Baum-Welch algorithm</a>, <a href='http://xianblog.wordpress.com/tag/c/' ref="nofollow" target="_blank">C</a>, <a href='http://xianblog.wordpress.com/tag/em/' ref="nofollow" target="_blank">EM</a>, <a href='http://xianblog.wordpress.com/tag/hmm/' ref="nofollow" target="_blank">HMM</a>, <a href='http://xianblog.wordpress.com/tag/matlab/' ref="nofollow" target="_blank">Matlab</a>, <a href='http://xianblog.wordpress.com/tag/octave/' ref="nofollow" target="_blank">Octave</a>, <a href='http://xianblog.wordpress.com/tag/python/' ref="nofollow" target="_blank">Python</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/scilab/' ref="nofollow" target="_blank">Scilab</a>, <a href='http://xianblog.wordpress.com/tag/speed/' ref="nofollow" target="_blank">speed</a>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://xianblog.wordpress.com/2012/02/03/speed-of-r-c-tc/"> Xi'an's Og » R</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/speed-of-r-c-tc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>tenured research position with ABC skills!</title>
		<link>http://www.r-bloggers.com/tenured-research-position-with-abc-skills/</link>
		<comments>http://www.r-bloggers.com/tenured-research-position-with-abc-skills/#comments</comments>
		<pubDate>Thu, 02 Feb 2012 11:12:35 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[ABC]]></category>
		<category><![CDATA[INRA]]></category>
		<category><![CDATA[Job offer]]></category>
		<category><![CDATA[Paris-Grignon]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Travel]]></category>
		<category><![CDATA[University life]]></category>
		<category><![CDATA[variational Bayes methods]]></category>
		<category><![CDATA[Versailles]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14542</guid>
		<description><![CDATA[I just received this announcement for the opening of a (tenured/civil servant) position in the national research institute in biostatistics, genetics, and agronomy, INRA: Position opening with profile Approximate inference techniques in complex systems Key activities and required skills: You will develop methodological research in the field of statistical inference for models used in environmental [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://xianblog.wordpress.com/2012/02/02/tenured-research-position-with-abc-skills/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p style="text-align:justify;"><strong>I</strong> just received this announcement for the opening of a (tenured/civil servant) position in the national research institute in biostatistics, genetics, and agronomy, <strong><em><a href="http://www.inra.fr/drh/cr2012/index.php?langue=EN" ref="nofollow" target="_blank">INRA</a></em></strong>:</p>
<blockquote>
<p style="text-align:justify;"><strong>Position opening with profile</strong> <em>Approximate inference techniques in complex systems</em></p>
<p style="text-align:justify;"><strong>Key activities and required skills:</strong> You will develop methodological research in the field of statistical inference for models used in environmental sciences. These inference techniques will account for the complex dependency structure due to the temporal, spatial and evolutionary organisation of the observations, for the heterogeneity of the data and for the existence of unobserved variables or incomplete data. Solid experience in statistical modelling of complex data (graphic models, multi-scale spatio-temporal data) and a strong orientation towards the applications in environment and biology would be appreciated. Skills in approximation techniques (variational inference, ABC techniques) will be welcome.</p>
<p style="text-align:justify;"><strong>Contact person:</strong> Stéphane Robin (<strong><em>robin [chez] agroparistech [lepoint] fr</em></strong>)</p>
<p style="text-align:justify;"><strong>Location:  </strong>Versailles-Grignon (Paris)</p>
<p style="text-align:justify;"><strong>Deadline:</strong> February 25, 2012</p>
<p style="text-align:justify;"><strong>Website:</strong> <a href="http://www.inra.fr/drh/cr2012/profil-cr2.php?NumProfil=CR2-2012-11-MIA-1&amp;langue=EN" ref="nofollow" target="_blank">INRA offer</a></p>
</blockquote>
<p style="text-align:justify;"><strong>T</strong>his should appeal to (some) readers of the blog, especially since the offer has no nationality constraint.</p>
<br />Filed under: <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/travel/' ref="nofollow" target="_blank">Travel</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/abc/' ref="nofollow" target="_blank">ABC</a>, <a href='http://xianblog.wordpress.com/tag/inra/' ref="nofollow" target="_blank">INRA</a>, <a href='http://xianblog.wordpress.com/tag/job-offer/' ref="nofollow" target="_blank">job offer</a>, <a href='http://xianblog.wordpress.com/tag/paris-grignon/' ref="nofollow" target="_blank">Paris-Grignon</a>, <a href='http://xianblog.wordpress.com/tag/variational-bayes-methods/' ref="nofollow" target="_blank">variational Bayes methods</a>, <a href='http://xianblog.wordpress.com/tag/versailles/' ref="nofollow" target="_blank">Versailles</a>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://xianblog.wordpress.com/2012/02/02/tenured-research-position-with-abc-skills/"> Xi'an's Og » R</a></strong>.</div>
<hr />
<a href="http://www.r-bloggers.com/">R-bloggers.com</a> offers <strong><a href="http://feedburner.google.com/fb/a/mailverify?uri=RBloggers">daily e-mail updates</a></strong> about <a title="The R Project for Statistical Computing" href="http://www.r-project.org/">R</a> news and <a title="R tutorials" href="http://www.r-bloggers.com/?s=tutorial">tutorials</a> on topics such as: visualization (<a title="ggplot and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=ggplot2">ggplot2</a>, <a title="Boxplots using lattice and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=boxplot">Boxplots</a>, <a title="Maps and gis" href="http://www.r-bloggers.com/?s=map">maps</a>, <a title="Animation in R" href="http://www.r-bloggers.com/?s=animation">animation</a>), programming (<a title="RStudio IDE for R" href="http://www.r-bloggers.com/?s=RStudio">RStudio</a>, <a title="Sweave and literate programming" href="http://www.r-bloggers.com/?s=sweave">Sweave</a>, <a title="LaTeX in R" href="http://www.r-bloggers.com/?s=LaTeX">LaTeX</a>, <a title="SQL and databases" href="http://www.r-bloggers.com/?s=SQL">SQL</a>, <a title="Eclipse IDE for R" href="http://www.r-bloggers.com/?s=eclipse">Eclipse</a>, <a title="git and github, Version Control System" href="http://www.r-bloggers.com/?s=git">git</a>, <a title="Large data in R using Hadoop" href="http://www.r-bloggers.com/?s=hadoop">hadoop</a>, <a title="Web Scraping of google, facebook, yahoo, twitter and more using R" href="http://www.r-bloggers.com/?s=Web+Scraping">Web Scraping</a>) statistics (<a title="Regressions and ANOVA analysis tutorials" href="http://www.r-bloggers.com/?s=regression">regression</a>, <a title="principal component analysis tutorial" href="http://www.r-bloggers.com/?s=PCA">PCA</a>, <a title="Time series" href="http://www.r-bloggers.com/?s=time+series">time series</a>,<a title="ecdf" href="http://www.r-bloggers.com/?s=ecdf">ecdf</a>, <a title="finance trading" 
href="http://www.r-bloggers.com/?s=trading">trading</a>) and more...
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/tenured-research-position-with-abc-skills/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;amp;d=identicon" length="" type="" />
		</item>
		<item>
		<title>the birthday problem [X&#039;idated]</title>
		<link>http://www.r-bloggers.com/the-birthday-problem-xidated/</link>
		<comments>http://www.r-bloggers.com/the-birthday-problem-xidated/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 12:13:20 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[birthday]]></category>
		<category><![CDATA[coincidence]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[University life]]></category>
		<category><![CDATA[William Feller]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14500</guid>
		<description><![CDATA[The birthday problem (i.e. looking at the distribution of the birthdates in a group of n persons, assuming [wrongly] a uniform distribution of the calendar dates of those birthdates) is always a source of puzzlement [for me]! For instance, here is a recent post on Cross Validated: I have 360 friends on facebook, and, as [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://xianblog.wordpress.com/2012/02/01/the-birthday-problem/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers)</a>      
</div></p>
<p style="text-align:justify;"><strong><a href="http://xianblog.files.wordpress.com/2012/02/birthday.jpg" ref="nofollow" target="_blank"><img class="aligncenter size-full wp-image-14529" title="birthday" src="http://xianblog.files.wordpress.com/2012/02/birthday.jpg?w=450&#038;h=450" alt="" width="450" height="450" /></a>T</strong>he <a title="Common ancestors" href="http://xianblog.wordpress.com/2011/06/08/common-ancestors/" ref="nofollow" target="_blank">birthday problem</a> (i.e. looking at the distribution of the birthdates in a group of n persons, assuming [wrongly] a uniform distribution of the calendar dates of those birthdates) is always a source of puzzlement [for me]! For instance, here is a <a href="http://stats.stackexchange.com/q/22009/7224" ref="nofollow" target="_blank">recent post</a> on Cross Validated:</p>
<blockquote>
<p style="text-align:justify;"><span style="color:#ff9900;"><em>I have 360 friends on facebook, and, as expected, the distribution of their birthdays is not uniform at all. I have one day with that has 9 friends with the same birthday. So, given that some days are more likely for a birthday, I&#8217;m assuming the number of 23 is an upperbound.</em></span></p>
</blockquote>
<p style="text-align:justify;">The figure 9 sounded unlikely, so I ran the following computation:</p>
<p><pre class="brush: r; gutter: false;">
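extreme=rep(0,360)
for (t in 1:10^5){
  # sort the 360 simulated birthdays so that equal dates are adjacent
  days=sort(sample(1:365,360,rep=TRUE))
  # start index of each run of equal dates; 361 closes the last run
  starts=c((1:360)[!duplicated(days)],361)
  i=max(diff(starts)) # largest number of people sharing one birthday
  extreme[i]=extreme[i]+1
  }
extreme=extreme/10^5
barplot(extreme,xlim=c(0,30),names=1:360)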
</pre></p>
<p style="text-align:justify;">whose output is shown in the above graph.<em> (Actually, I must confess I first forgot the </em>sort<em> in the code, which led me to believe that 9 was one of the most likely values and to post it on <a href="http://stats.stackexchange.com/q/22009/7224" ref="nofollow" target="_blank">Cross Validated</a>! The error was eventually picked up by one administrator. I should know better than to trust my own R code!)</em> According to this simulation, observing 9 or more people having the same birthdate has an approximate probability of 0.00032&#8230; Indeed, fairly unlikely!</p>
<p style="text-align:justify;"><strong>I</strong>ncidentally, this question led me to uncover how to print the above <a href="http://www.statmethods.net/advgraphs/parameters.html" ref="nofollow" target="_blank">on this webpage</a>. And to learn from <a href="http://stats.stackexchange.com/users/919/whuber" ref="nofollow" target="_blank">the X&#8217;idated moderator whuber</a> the use of <a href="http://stats.stackexchange.com/a/22012/7224" ref="nofollow" target="_blank">tabulate</a>. Which avoids the above loop:</p>
<p><pre class="brush: r; gutter: false;">
&gt; system.time(test(10^5)) #my code above
user  system elapsed
26.230   0.028  26.411
&gt; system.time(table(replicate(10^5, max(tabulate(sample(1:365,360,rep=TRUE))))))
user  system elapsed
5.708   0.044   5.762
</pre></p>
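<p style="text-align:justify;">As a minimal sketch of how the tabulate approach gives the tail probability directly, the following stores the simulated maxima and counts how often 9 or more people share a day. The names <em>nsim</em> and <em>maxshare</em> are illustrative and not part of the original code, and the uniform distribution over 365 days is the same assumption as above.</p>
<p><pre class="brush: r; gutter: false;">
set.seed(1)
nsim=10^5
# largest number of people sharing a birthday in each simulated group of 360
maxshare=replicate(nsim,max(tabulate(sample(1:365,360,rep=TRUE))))
table(maxshare)/nsim   # estimated distribution of the maximum
mean(maxshare>=9)      # estimated probability of 9 or more on the same day
</pre></p>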
<br />Filed under: <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/birthday/' ref="nofollow" target="_blank">birthday</a>, <a href='http://xianblog.wordpress.com/tag/coincidence/' ref="nofollow" target="_blank">coincidence</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/william-feller/' ref="nofollow" target="_blank">William Feller</a>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://xianblog.wordpress.com/2012/02/01/the-birthday-problem/"> Xi'an's Og » R</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/the-birthday-problem-xidated/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;amp;d=identicon" length="" type="" />
<enclosure url="http://xianblog.files.wordpress.com/2012/02/birthday.jpg" length="" type="" />
		</item>
		<item>
		<title>Weak Law of Large Numbers</title>
		<link>http://www.r-bloggers.com/weak-law-of-large-numbers/</link>
		<comments>http://www.r-bloggers.com/weak-law-of-large-numbers/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 23:49:05 +0000</pubDate>
		<dc:creator>H.Ishimaru</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://www.knowledgediscovery.jp/?p=221</guid>
		<description><![CDATA[1 Description The weak law of large numbers is a result in probability theory also known as Bernoulli&#8217;s  [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/"> Knowledge Discovery » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers)</a>      
</div></p>
<p><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/%E3%82%AD%E3%83%A3%E3%83%97%E3%83%81%E3%83%A34/" rel="attachment wp-att-289" ref="nofollow" target="_blank"><img src="http://www.knowledgediscovery.jp/wp-content/uploads/dda9bb331e5adc2b5a3ca1088fa71280.png" alt="" title="image0" width="663" height="500" class="alignleft size-full wp-image-289" /></a></p>
<h2>1 Description</h2>
<p>The weak law of large numbers is a result in probability theory, also known as Bernoulli&#8217;s theorem. According to the law, the mean of the results obtained from a large number of independent trials is, with high probability, close to the population mean.</p>
<p>Let <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=X_%7B1%7D,%5C%20%5Cldots%5C%20,%20X_%7Bn%7D" /> be a sequence of independent and identically distributed random variables, each having a mean  <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=E(X_%7Bi%7D)=%5Cmu" /> and variance <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=V(X_%7Bi%7D)=%5Csigma%5E%7B2%7D" />. </p>
<p>Define a new variable,<br />
<img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=%5Coverline%7BX%7D%5Cequiv%5Cfrac%7BX_%7B1%7D%20+%20%5C%20%20%5Cldots%5C%20%20%20+%20X_%7Bn%7D%7D%7Bn%7D" /></p>
<p>Then,<br />
<img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=E(%5Coverline%7BX%7D)=E(%5Cfrac%7BX_%7B1%7D%20+%20%5Cldots%20+%20X_%7Bn%7D%7D%7Bn%7D)=%5Cfrac%7B1%7D%7Bn%7D(E(%7BX_%7B1%7D)%20+%20%5Cldots%5C%20%20%20+%20E(%7BX_%7Bn%7D))=%5Cfrac%7Bn%5Cmu%7D%7Bn%7D=%5Cmu"/></p>
<p><img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=V(%5Coverline%7BX%7D)=V(%5Cfrac%7BX_%7B1%7D%20+%20%5Cldots%20+%20X_%7Bn%7D%7D%7Bn%7D)=%5Cfrac%7B1%7D%7Bn%5E%7B2%7D%7D(V(X_%7B1%7D)%20+%20%5Cldots%20+%20V(%7BX_%7Bn%7D%7D))=%5Cfrac%7B1%7D%7Bn%5E%7B2%7D%7D(%5Csigma%5E%7B2%7D%20+%20%5Cldots%20+%20%5Csigma%5E%7B2%7D)=%5Cfrac%7B%5Csigma%5E%7B2%7D%7D%7Bn%7D"/></p>
<p>By the Chebyshev inequality, </p>
<p><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/codecogseqn-2/" rel="attachment wp-att-244" ref="nofollow" target="_blank"><img src="http://www.knowledgediscovery.jp/wp-content/uploads/CodeCogsEqn-2.gif" alt="" title="CodeCogsEqn (2)" width="304" height="110" class="alignleft size-full wp-image-244" /></a><br />
</p>
<p>In brief,<br />
as <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=n%20%5Cto%20%5Cinfty"/>, the sample mean <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=%5Coverline%7BX%7D"/> converges in probability to the population mean <img src="http://chart.apis.google.com/chart?cht=tx&#038;chs=1x0&%23038;chco=000000&%23038;chl=%5Cmu"/>: for any fixed tolerance, the probability that the sample mean deviates from the population mean by more than that tolerance tends to 0.</p>
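<p>The bound can also be checked numerically. The sketch below assumes the inequality takes its usual form, namely that the probability of the sample mean deviating from the population mean by at least eps is at most sigma^2 / (n eps^2), and uses Bernoulli trials with p = 0.4, so that sigma^2 = p(1 - p). The helper name chebyshev_check and its default arguments are illustrative only.</p>
<pre class="brush: r; title: ; notranslate">
# hypothetical helper: empirical deviation frequency versus the Chebyshev bound
chebyshev_check &lt;- function(n, p = 0.4, eps = 0.05, reps = 10^4) {
  xbar &lt;- rbinom(reps, n, p) / n              # reps independent sample means
  c(n = n,
    freq = mean(abs(xbar - p) >= eps),           # estimated deviation probability
    bound = min(1, p * (1 - p) / (n * eps^2)))   # sigma^2 / (n eps^2), capped at 1
}
set.seed(1)
res &lt;- t(sapply(c(100, 1000, 10000), chebyshev_check))
res
</pre>
<p>Each row of res should show a deviation frequency below the bound, and both columns shrink as n grows. The bound is usually far from tight, which is why it only yields the weak form of the law.</p>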
<h2>2 Simulation in R</h2>
<p>The following plots show the results of simulations of Bernoulli trials, Bi(1, p).<br />
The population mean is p = 0.4 and the number of trials is 1,000.</p>
<p><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/%E3%82%AD%E3%83%A3%E3%83%97%E3%83%81%E3%83%A31/" rel="attachment wp-att-293" ref="nofollow" target="_blank"><img src="http://www.knowledgediscovery.jp/wp-content/uploads/eb22bf7a906b7f73af3d68cd620dade7-315x315.png" alt="" title="i1" width="315" height="315" class="alignleft size-medium wp-image-293" /></a></p>
<p><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/%E3%82%AD%E3%83%A3%E3%83%97%E3%83%81%E3%83%A32-2/" rel="attachment wp-att-292" ref="nofollow" target="_blank"><img src="http://www.knowledgediscovery.jp/wp-content/uploads/44d29d2792be3bad9a9ed98c7bbb20d61-314x315.png" alt="" title="i2" width="314" height="315" class="alignleft size-medium wp-image-292" /></a></p>
<p><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/%E3%82%AD%E3%83%A3%E3%83%97%E3%83%81%E3%83%A33-4/" rel="attachment wp-att-294" ref="nofollow" target="_blank"><img src="http://www.knowledgediscovery.jp/wp-content/uploads/0e4aae38a1f5494592124a884058f7cb3-313x315.png" alt="" title="i3" width="313" height="315" class="alignleft size-medium wp-image-294" /></a></p>
<h2>3 Appendix</h2>
<p>This is the sample R script.<br />
Try running the simulation with different parameters.</p>
<pre class="brush: xml; title: ; notranslate">
# parameters of the Bernoulli trials Bi(1, p)
n &lt;- 1000
p &lt;- 0.4

# data frame: outcome of each trial, running count of successes, running mean
df &lt;- data.frame(bi = rbinom(n, 1, p), count = 0, mean = 0)
df$count[1] &lt;- df$bi[1]
df$mean[1] &lt;- df$bi[1]
for (i in 2 : n){
  df$count[i] &lt;- df$count[i - 1] + df$bi[i]  # add 1 on a success, 0 otherwise
  df$mean[i] &lt;- df$count[i] / i
}

# graph: running sample mean against the number of trials
plot(df$mean, type='l',
      main = &quot;Simulation of the Law of Large Numbers&quot;,
      xlab=&quot;Number of trials&quot;, ylab=&quot;Sample mean&quot;)
abline(h = p, col=&quot;red&quot;)  # population mean
</pre>
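<p>For trying other parameters, a vectorised variant may be more convenient. This is only a sketch: the helper name running_mean and the second value p = 0.8 are illustrative choices, and cumsum replaces the explicit loop.</p>
<pre class="brush: r; title: ; notranslate">
# hypothetical helper: running sample mean of n Bernoulli(p) trials
running_mean &lt;- function(n, p) cumsum(rbinom(n, 1, p)) / seq_len(n)

set.seed(1)
n &lt;- 1000
plot(running_mean(n, 0.4), type='l', ylim = c(0, 1),
      xlab='Number of trials', ylab='Sample mean')
lines(running_mean(n, 0.8), col='blue')       # a second path with p = 0.8
abline(h = c(0.4, 0.8), col='red', lty = 2)   # the two population means
</pre>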

<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://www.knowledgediscovery.jp/weak-law-of-large-numbers/"> Knowledge Discovery » R</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/weak-law-of-large-numbers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ultimate R recursion</title>
		<link>http://www.r-bloggers.com/ultimate-r-recursion/</link>
		<comments>http://www.r-bloggers.com/ultimate-r-recursion/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 15:16:19 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[accept-reject algorithm]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[computer language]]></category>
		<category><![CDATA[exam]]></category>
		<category><![CDATA[Monte Carlo methods]]></category>
		<category><![CDATA[normalising constant]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[recursion]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[University life]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14492</guid>
		<description><![CDATA[One of my students wrote the following code for his R exam, trying to do accept-reject simulation (of a Rayleigh distribution) and constant approximation at the same time: which I find remarkable if alas doomed to fail! I wonder if there exists a (real as opposed to fantasy) computer language where you could introduce constants [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://xianblog.wordpress.com/2012/01/31/ultimate-r-recursion/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers)</a>      
</div></p>
<p style="text-align:justify;"><strong>O</strong>ne of my students wrote the following code for <a title="R exam" href="http://xianblog.wordpress.com/2011/11/28/r-exam-2/" ref="nofollow" target="_blank">his R exam</a>, trying to do accept-reject simulation (of a <a href="http://en.wikipedia.org/wiki/Rayleigh_distribution" ref="nofollow" target="_blank">Rayleigh distribution</a>) and constant approximation at the same time:</p>
<p><pre class="brush: r; gutter: false;">
fAR1=function(n){
 u=runif(n)
 x=rexp(n)
 f=(C*(x)*exp(-2*x^2/3))
 g=dexp(n,1)
 test=(u&lt;f/(3*g))
 y=x[test]
 p=length(y)/n #acceptance probability
 M=1/p
 C=M/3
 hist(y,20,freq=FALSE)
 return(x)
 }
</pre></p>
<p style="text-align:justify;">which I find remarkable if alas doomed to fail! I wonder if there exists a (real as opposed to fantasy) computer language where you could introduce constants C and only define them later&#8230; (What&#8217;s rather sad is that I keep insisting on the fact that <a href="http://www.amazon.com/gp/product/1441915753?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1441915753" ref="nofollow" target="_blank">accept-reject does not need the constant C to operate</a>. And that I found the same mistake in several of the students&#8217; code. There is a further mistake in the above code when defining <em>g</em>. I also wonder where the <em>3</em> came from&#8230;)</p>
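<p style="text-align:justify;">For comparison, here is a minimal sketch of an accept-reject sampler that never uses <em>C</em>. It assumes the target is proportional to x exp(-2x^2/3), as in the student&#8217;s code, with an Exp(1) proposal; the names <em>ftilde</em>, <em>xstar</em>, <em>arRayleigh</em> and <em>Chat</em> are illustrative. The bound <em>M</em> on the ratio of the unnormalised target to the proposal density is obtained by setting the derivative of the log-ratio to zero, which gives 4x^2-3x-3=0. The constant can then be estimated afterwards from the acceptance rate, which is 1/(C M).</p>
<p><pre class="brush: r; gutter: false;">
# target known only up to the constant C: f(x) = C*ftilde(x)
ftilde=function(x) x*exp(-2*x^2/3)
# maximiser of ftilde(x)/exp(-x), positive root of 4x^2-3x-3=0
xstar=(3+sqrt(57))/8
M=ftilde(xstar)/dexp(xstar,1)   # bound on ftilde/g for the Exp(1) proposal g
arRayleigh=function(n){
  x=rexp(n)                     # n proposals from g
  u=runif(n)
  x[ftilde(x)>=u*M*dexp(x,1)]   # keep the accepted proposals only
  }
set.seed(1)
y=arRayleigh(10^5)
p=length(y)/10^5                # acceptance rate, estimates 1/(C*M)
Chat=1/(p*M)                    # estimate of C, exact value 4/3
hist(y,50,freq=FALSE)
curve(4/3*ftilde(x),add=TRUE)   # normalised density for comparison
</pre></p>
<p style="text-align:justify;">The exact value 4/3 follows from the integral of x exp(-2x^2/3) over the positive half-line being 3/4, so <em>Chat</em> should land close to it.</p>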
<br />Filed under: <a href='http://xianblog.wordpress.com/category/books/' ref="nofollow" target="_blank">Books</a>, <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/accept-reject-algorithm/' ref="nofollow" target="_blank">accept-reject algorithm</a>, <a href='http://xianblog.wordpress.com/tag/computer-language/' ref="nofollow" target="_blank">computer language</a>, <a href='http://xianblog.wordpress.com/tag/exam/' ref="nofollow" target="_blank">exam</a>, <a href='http://xianblog.wordpress.com/tag/monte-carlo-methods/' ref="nofollow" target="_blank">Monte Carlo methods</a>, <a href='http://xianblog.wordpress.com/tag/normalising-constant/' ref="nofollow" target="_blank">normalising constant</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/recursion/' ref="nofollow" target="_blank">recursion</a>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://xianblog.wordpress.com/2012/01/31/ultimate-r-recursion/"> Xi'an's Og » R</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/ultimate-r-recursion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;amp;d=identicon" length="" type="" />
		</item>
		<item>
		<title>the Art of R Programming [guest post]</title>
		<link>http://www.r-bloggers.com/the-art-of-r-programming-guest-post/</link>
		<comments>http://www.r-bloggers.com/the-art-of-r-programming-guest-post/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 23:12:51 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[Norman Matloff]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[University life]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14442</guid>
		<description><![CDATA[(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!) As Rob J. Hyndman enthusiastically declares in his blog, &#8220;this is a gem of a book&#8221;. I would go even further and argue that The Art of R programming is a [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on  <strong><a href="http://xianblog.wordpress.com/2012/01/31/the-art-of-r-programming-guest-post/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers)</a>      
</div></p>
<p style="text-align:justify;"><em>(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!)</em></p>
<p style="text-align:justify;"><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank"><img class="aligncenter" src="http://akamaicovers.oreilly.com/images/9781593273842/lrg.jpg" alt="" width="300" height="397" /></a></p>
<p style="text-align:justify;"><strong>A</strong>s Rob J. Hyndman enthusiastically declares <a href="http://robjhyndman.com/researchtips/matloff/" ref="nofollow" target="_blank">in his blog</a>, &#8220;this is a gem of a book&#8221;. I would go even further and argue that <em><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank">The Art of R programming</a></em> is a whole mine of gems. The book is well constructed, and has a very coherent structure.</p>
<p style="text-align:justify;"><strong>A</strong>fter an introductory chapter, where the reader gets a quick overview of R basics that allows her to work through the examples in the following chapters, the rest of the book can be divided into three main parts. In the first part (Chapters 2 to 6) the reader is introduced to the main R objects and to the functions built to handle and operate on each of them. The second part (Chapters 7 to 13) focuses on general programming issues: R structures and R&#8217;s object-oriented nature, I/O, string handling and manipulation, and graphics. Chapter 13 is devoted entirely to debugging. The third part deals with more advanced topics, such as speed of execution and performance issues (Chapter 14), mixing functions written in R and C (or Python), and parallel processing with R. Even though this last part is intended for more experienced programmers, the overall programming skills of the intended reader &#8220;may range anywhere from those of a professional software developer to &#8216;I took a programming course in college&#8217;.&#8221; (p.xxii).</p>
<p style="text-align:justify;"><strong>W</strong>ith a fluent style, Matloff is able to deal with a large number of topics in a relatively limited number of pages, resulting in an astonishingly complete yet handy guide. On almost every page we discover a new command, most likely <em>the</em> command we had always looked for and had made do without through more or less cumbersome workarounds. As a matter of fact, there may well exist a ready-made and perfectly suited R function for nearly anything that comes to mind. Users coming from compiled programming languages may find it difficult to get used to this wealth of functions, just as they may feel uncomfortable not declaring variable types, not initializing vectors and arrays, or getting rid of loops. Nevertheless, through numerous examples and a precise knowledge of its strengths and limitations, Matloff masterfully introduces the reader to the flexibility of R. He repeatedly underlines the functional nature of R in every part of the book and stresses from the outset how this feature has to be exploited for effective programming.<span id="more-14442"></span></p>
<blockquote>
<p style="text-align:justify;"><em>&#8220;One of the most effective ways to achieve speed in R code is to use operations that are <strong>vectorized</strong>, meaning that a function applied to a vector is actually applied individually to each element.&#8221; (p.40).</em></p>
</blockquote>
<p style="text-align:justify;"><strong>T</strong>he result is so convincing that it pushes even the strictest code purist to free herself from prejudices and surrender to the  pleasures of an interpreted language. This probably was the hardest challenge in writing <em><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank">The Art of R programming</a></em>, and the author brilliantly met it.</p>
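<p style="text-align:justify;">To make the quoted point about vectorization concrete, here is a minimal sketch. It is not taken from the book, and the vector <strong>x</strong> and the variable names are chosen purely for illustration. It applies <strong>sqrt()</strong> to a whole vector in one call and checks that an explicit element-by-element loop gives the same values.</p>

```r
# Hypothetical data, chosen only for illustration.
x <- c(1, 4, 9, 16)

# Vectorized call: sqrt() is applied to every element of x at once.
roots_vec <- sqrt(x)

# Equivalent explicit loop, the style a C programmer might reach for first.
roots_loop <- numeric(length(x))
for (i in seq_along(x)) {
  roots_loop[i] <- sqrt(x[i])
}

# Both approaches give the same values.
stopifnot(isTRUE(all.equal(roots_vec, roots_loop)))
print(roots_vec)
```

<p style="text-align:justify;">The vectorized form is shorter and leaves the looping to R's internal code, which is where the speed gain mentioned in the quote would be expected to come from.</p>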
<p style="text-align:justify;"><strong>T</strong>he climax is unquestionably attained in the final chapters, where Matloff introduces some advanced and unusual topics with remarkable clarity and briskness. Within a few pages, he manages to tackle the object-oriented side of R, to advise and instruct the reader on debugging and performance issues, to show how to deal with mixed R and C (or Python) code, and finally to open new perspectives by presenting the different approaches to parallel R. There is even a mention of GPU programming, a short paragraph that is certainly not exhaustive, but still instructive. To my knowledge, this is the only R handbook in which parallel programming with R is tackled with some degree of detail (I only found a hint of it in <em><a href="http://www.amazon.com/gp/product/059680170X/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=059680170X" ref="nofollow" target="_blank">R in a nutshell</a></em>, yet no programming details are given therein). Also, the importance and prominence given to debugging are commendable, since this topic is often, and mistakenly, disregarded in most programming handbooks, except those explicitly written on the subject. Among the sharpest passages of the book, I definitely include the ones on scope and environment issues, to which are devoted both a long section in Chapter 7 and a tiny, simple yet enlightening example as early as page 9.</p>
<blockquote>
<p style="text-align:justify;"><em>&#8220;Note carefully the role of <strong>w</strong>. The R interpreter found that there was no local variable of that name, so it ascended to the next level [...] where it found a variable <strong>w</strong> with value 12. [...]. It is possible (though not desirable) to deliberately allow name conflicts in this hierarchy. [...] In such a situation the innermost environment is used first.&#8221; (p.153).</em></p>
</blockquote>
<p style="text-align:justify;"><strong>T</strong>he message is clear: know exactly what you want to implement, keep track of all your objects, and scoping will not be an issue but another tool.</p>
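<p style="text-align:justify;">The lookup rule described in the quote can be sketched as follows. This is an illustration in the spirit of the passage, not the book's own listing. The functions <strong>f</strong>, <strong>h</strong> and <strong>g</strong> and the values used are assumptions made for the example.</p>

```r
w <- 12                      # defined at the top level

f <- function(y) {
  d <- 8                     # local to f
  h <- function() {
    d * (w + y)              # d and y are found in f, w at the top level
  }
  h()
}

g <- function(y) {
  w <- 1                     # deliberately shadows the top-level w
  w + y                      # the innermost w is used first
}

res_f <- f(2)                # 8 * (12 + 2)
res_g <- g(2)                # 1 + 2; the top-level w is untouched
print(c(res_f, res_g, w))
```

<p style="text-align:justify;">Inside <strong>h</strong> there is no local <strong>w</strong>, so R ascends until it finds one. Inside <strong>g</strong> the local <strong>w</strong> wins, and the top-level value is left unchanged.</p>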
<blockquote>
<p style="text-align:justify;"><em>&#8220;In C, we would not have functions defined within functions [...]. Yet, since functions are objects, it is possible&#8211;and sometimes desirable from the point of view of the encapsulation goal of object-oriented programming—to define a function within a function; we are simply creating an object, which we can do anywhere.&#8221; (p.152-3).</em></p>
</blockquote>
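<p style="text-align:justify;">As a small sketch of the encapsulation the quote alludes to (an illustration only, with the hypothetical names <strong>make_counter</strong> and <strong>count</strong>), a function defined inside another function can keep private state that nothing else touches directly:</p>

```r
make_counter <- function() {
  count <- 0                 # visible only to the inner function
  function() {
    count <<- count + 1      # updates count in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()          # first call increments count to 1
second <- counter()          # second call sees the retained state
print(c(first, second))
```

<p style="text-align:justify;">The inner function is just an object returned by <strong>make_counter</strong>, and it carries its defining environment along with it.</p>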
<p style="text-align:justify;"><strong>A</strong>nother little gem is Section 7.9 on recursion, a concept that Matloff presents in a very clear and intuitive way. This section ends with one of the most inspired extended examples proposed in the book, where recursion is used to implement a binary search tree. Other interesting extended examples are those about discrete-event simulation (Section 7.8.3), Markov chains (Section 8.4.2) and polynomial regression (Section 9.1.7), though these applications may be a little too challenging for readers lacking a solid background in statistics.</p>
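<p style="text-align:justify;">For readers who have not met the idea, a recursive binary search tree can be sketched in a few lines. This is not Matloff's implementation. It is a simplified version that assumes a nested-list representation, and the helper names <strong>bst_insert</strong> and <strong>bst_inorder</strong> are hypothetical.</p>

```r
# Insert a value, returning the (possibly new) subtree rooted at node.
bst_insert <- function(node, value) {
  if (is.null(node)) {
    return(list(value = value, left = NULL, right = NULL))
  }
  if (value <= node$value) {
    node$left <- bst_insert(node$left, value)    # recurse into left subtree
  } else {
    node$right <- bst_insert(node$right, value)  # recurse into right subtree
  }
  node
}

# In-order traversal: left subtree, node, right subtree (sorted order).
bst_inorder <- function(node) {
  if (is.null(node)) {
    return(numeric(0))
  }
  c(bst_inorder(node$left), node$value, bst_inorder(node$right))
}

tree <- NULL
for (v in c(8, 5, 12, 2, 6)) {
  tree <- bst_insert(tree, v)
}
sorted <- bst_inorder(tree)
print(sorted)
```

<p style="text-align:justify;">Both functions call themselves on a smaller subtree and stop at an empty one, which is the whole pattern of recursion in miniature.</p>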
<p style="text-align:justify;"><strong>A</strong>lthough <em><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank">The Art of R programming</a></em> is a book of many virtues, there are in my opinion some flaws:</p>
<p style="text-align:justify;"><strong>T</strong>he presence of lines of R code starting from the first few pages encourages the user to test her understanding straight away while reading, making <em><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank">The Art of R programming</a></em> a sort of plug-and-play guide through R. Unfortunately, the pleasure of real-time testing is spoiled by two things. First, the reader has to copy the code line by line. This is unquestionably useful for the many simple examples scattered throughout the book. However, it may become an inexhaustible source of typos, both pointless and annoying, not to mention time-consuming, when it comes to more complicated programs like those expounded in the many Extended Example sections. Second, the data sets are unavailable, so some applications are simply unusable (I managed to find <a href="http://archive.ics.uci.edu/ml/datasets/Abalone" ref="nofollow" target="_blank">the abalone data</a> set for the extended examples of Sections 2.9.2 and 4.4.3, thus discovering <a href="http://archive.ics.uci.edu/ml/datasets" ref="nofollow" target="_blank">this interesting repository</a>, but for the rest my search was rather inconclusive). I am referring here to virtually all the extended examples in Chapters 5 and 6 on data frames, factors and tables. In particular, I find the application on the aids for learning Chinese dialects (Section 5.4.3) so over-elaborate as to be nearly worthless.
I would certainly suggest designing a dedicated package assembling all the necessary material for a fully profitable training with the book, like the package <a href="http://cran.r-project.org/web/packages/mcsm/" ref="nofollow" target="_blank">mcsm</a> conceived by Robert and Casella for reproducing the results contained in <a href="http://www.amazon.com/gp/product/1441915753/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1441915753" ref="nofollow" target="_blank">their book on Monte Carlo methods with R</a>.</p>
<p style="text-align:justify;"><strong>I</strong>n addition, surely R can handle huge databases with great ease, and maybe I am giving way to my personal preferences here, but I find that two whole chapters on data frames and factors (adding up to almost 40 pages!) are perhaps too much. Conversely, I believe that the &#8220;traditional&#8221; graphics package would have deserved more space and consideration, not only in the dedicated chapter (Chapter 12) but generally throughout the book. Indeed, the author suggests some good handbooks on the subject by <a href="http://www.amazon.com/gp/product/1439831769/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1439831769" ref="nofollow" target="_blank">Murrell</a> and <a href="http://www.amazon.com/gp/product/0387981403/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0387981403" ref="nofollow" target="_blank">Wickham</a>, but these are too detailed and advanced to be used for general purposes.</p>
<p style="text-align:justify;"><strong>D</strong>espite an overall concise style, there are some long-winded passages and repetitions, especially in the applications, where certain lines of code are definitely redundant. I was likewise puzzled by the total absence in the book of the command separator <em><strong>;</strong></em>, which would have considerably shortened and lightened some unnecessarily long examples. Also, a separate and more detailed index of R commands and functions would be helpful.</p>
<p style="text-align:justify;"><strong>F</strong>inally, a minor but curious point about the assignment operator. I find the issue of <em><strong>&lt;-</strong></em> vs. <strong><em>=</em></strong> particularly fascinating and a bit unsettling, since it in fact leaves an ambiguity in the definition of such a fundamental operator. Still, there seem to be two main camps and no general agreement. Reading various blogs and discussion forums, I found no decisive or robust argument in favor of either. Matloff addresses the issue of <em><strong>&lt;-</strong></em> vs. <strong><em>=</em></strong> in assignments as early as page 4. As he says, &#8220;The standard assignment operator in R is <em><strong>&lt;-</strong></em>. You can also use <em><strong>=</strong></em>, but this is discouraged, as it does not work in some special situations.&#8221; I was really eager to see these &#8220;special situations&#8221; shown in concrete examples. Unfortunately, they are nowhere to be found in the book.</p>
<p style="text-align:justify;"><strong>N</strong>otwithstanding these minor flaws, <em><a href="http://www.amazon.com/gp/product/1593273843/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1593273843" ref="nofollow" target="_blank">The Art of R programming</a></em> is enriching, enjoyable and definitely worth keeping as a reference while working with R. I highly recommend it to programmers, academic researchers and students in computational statistics who want to become quickly operational in writing R software. And it is undoubtedly a really useful read for any R user.</p>
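<p style="text-align:justify;">One commonly cited case is assignment inside a function call. It is offered here only as an assumption about what the quoted sentence may have in mind, not as Matloff's own example. In a call, <strong>=</strong> names an argument rather than assigning, whereas <strong>&lt;-</strong> assigns. The function <strong>demo</strong> below is a hypothetical wrapper used to keep the check self-contained.</p>

```r
demo <- function() {
  # `x =` only names mean()'s argument; no local variable x is created.
  m1 <- mean(x = 1:10)
  has_x <- exists("x", envir = environment(), inherits = FALSE)

  # `<-` assigns y in this function's environment, then passes the value on.
  m2 <- mean(y <- 1:10)
  has_y <- exists("y", envir = environment(), inherits = FALSE)

  c(has_x = has_x, has_y = has_y)
}

res <- demo()
print(res)

# A related case, left as a comment because it stops with an error:
# system.time(z = rnorm(10)) fails with an "unused argument" message,
# while system.time(z <- rnorm(10)) times the call and creates z.
```

<p style="text-align:justify;">Outside function calls the two operators behave alike for ordinary top-level assignments, which may explain why the debate rarely produces a decisive argument.</p>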
<br />Filed under: <a href='http://xianblog.wordpress.com/category/books/' ref="nofollow" target="_blank">Books</a>, <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/c/' ref="nofollow" target="_blank">C</a>, <a href='http://xianblog.wordpress.com/tag/norman-matloff/' ref="nofollow" target="_blank">Norman Matloff</a>, <a href='http://xianblog.wordpress.com/tag/programming/' ref="nofollow" target="_blank">programming</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/software/' ref="nofollow" target="_blank">software</a>
]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/the-art-of-r-programming-guest-post/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;d=identicon" length="" type="" />
<enclosure url="http://akamaicovers.oreilly.com/images/9781593273842/lrg.jpg" length="" type="" />
		</item>
		<item>
		<title>ABC [PhD] course</title>
		<link>http://www.r-bloggers.com/abc-phd-course/</link>
		<comments>http://www.r-bloggers.com/abc-phd-course/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 23:12:02 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[ABC]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[CREST]]></category>
		<category><![CDATA[graduate course]]></category>
		<category><![CDATA[indirect inference]]></category>
		<category><![CDATA[Malakoff]]></category>
		<category><![CDATA[model choice]]></category>
		<category><![CDATA[Paris]]></category>
		<category><![CDATA[phd]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Roma]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[summary statistics]]></category>
		<category><![CDATA[Travel]]></category>
		<category><![CDATA[University life]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14394</guid>
		<description><![CDATA[As mentioned in the latest post on ABC, I am giving a short doctoral course on ABC methods and convergence at CREST next week. I have now made a preliminary collection of my slides (plus a few from Jean-Michel Marin&#8217;s), available on slideshare (as ABC in Roma, because I am also giving the course in [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on <strong><a href="http://xianblog.wordpress.com/2012/01/26/abc-phd-course/">Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p style="text-align:justify;"><strong>A</strong>s mentioned in the <a title="ABC and SMC²" href="http://xianblog.wordpress.com/2012/01/17/abc-and-smc%C2%B2/" ref="nofollow" target="_blank">latest post on ABC</a>, I am giving a short doctoral course on ABC methods and convergence <a href="http://www.crest.fr/content/view/114/136/" ref="nofollow" target="_blank">at CREST next week</a>. I have now made a preliminary collection of my slides (plus a few from Jean-Michel Marin&#8217;s), available on <a href="http://www.slideshare.net/xianblog/abc-in-roma" ref="nofollow" target="_blank">slideshare</a> (as <em>ABC in Roma</em>, because I am also giving the course in Roma, next month, with an R lab on top of it!):</p>
<p style="text-align:justify;"><iframe src='http://www.slideshare.net/slideshow/embed_code/11170985' width='450' height='369'></iframe></p>
<p style="text-align:justify;">and I did manage to go over the book by <a href="http://www.amazon.com/gp/product/0198774753/ref=as_li_ss_tl?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0198774753" ref="nofollow" target="_blank">Gouriéroux and Monfort</a> on indirect inference over the weekend. I still need to beef up the slides before the course starts next Thursday! <em>(The core version of the slides is actually from the course I gave in <a title="ABC lectures [finale]" href="http://xianblog.wordpress.com/2010/11/01/abc-lectures-finale/" ref="nofollow" target="_blank">Wharton</a> more than a year ago.)</em></p>
<br />Filed under: <a href='http://xianblog.wordpress.com/category/books/' ref="nofollow" target="_blank">Books</a>, <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/travel/' ref="nofollow" target="_blank">Travel</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/abc/' ref="nofollow" target="_blank">ABC</a>, <a href='http://xianblog.wordpress.com/tag/crest/' ref="nofollow" target="_blank">CREST</a>, <a href='http://xianblog.wordpress.com/tag/graduate-course/' ref="nofollow" target="_blank">graduate course</a>, <a href='http://xianblog.wordpress.com/tag/indirect-inference/' ref="nofollow" target="_blank">indirect inference</a>, <a href='http://xianblog.wordpress.com/tag/malakoff/' ref="nofollow" target="_blank">Malakoff</a>, <a href='http://xianblog.wordpress.com/tag/model-choice/' ref="nofollow" target="_blank">model choice</a>, <a href='http://xianblog.wordpress.com/tag/paris/' ref="nofollow" target="_blank">Paris</a>, <a href='http://xianblog.wordpress.com/tag/phd/' ref="nofollow" target="_blank">PhD</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/roma/' ref="nofollow" target="_blank">Roma</a>, <a href='http://xianblog.wordpress.com/tag/summary-statistics/' ref="nofollow" target="_blank">summary statistics</a>
]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/abc-phd-course/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;d=identicon" length="" type="" />
		</item>
		<item>
		<title>Internet surveys</title>
		<link>http://www.r-bloggers.com/internet-surveys/</link>
		<comments>http://www.r-bloggers.com/internet-surveys/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 23:30:12 +0000</pubDate>
		<dc:creator>Rob J Hyndman</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Research tips]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://robjhyndman.com/researchtips/?p=1701</guid>
		<description><![CDATA[I received the following email today: I am preparing a thesis … I need to conduct the widest possible poll, and it occurred to me that perhaps you could guide me toward an internet-based way in which this can be done easily. I have a ten-question questionnaire prepared, that I wish to have an random sample of the population respond to. I have no budget for this, so I hope you can suggest a way in which a good number of responses can be harvested using blogs or sites you may be aware of. Here is my response. There are two issues here. The first is to find a convenient web-based data-collection tool. One popular approach is to use a survey form on Google Docs. The results are automatically saved to a Google spreadsheet. There are many online explanations of how to set up your survey form including this from Google help or this from digital inspirations. A more sophisticated tool for more complex surveys is SurveyMonkey. This allows skipping questions based on previous responses, response validation, and other useful features. For researchers collecting data, I generally recommend that they use SurveyMonkey. But for a quick poll of a small group,<a href="http://robjhyndman.com/researchtips/surveys/"> <br /><br /> (More)…</a>]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on <strong><a href="http://robjhyndman.com/researchtips/surveys/">Research tips » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p>I received the following email today:</p>
<blockquote><p>I am preparing a thesis … I need to conduct the widest possible poll, and it occurred to me that perhaps you could guide me toward an internet-based way in which this can be done easily. I have a ten-question questionnaire prepared, that I wish to have an random sample of the population respond to. I have no budget for this, so I hope you can suggest a way in which a good number of responses can be harvested using blogs or sites you may be aware of.</p></blockquote>
<p>Here is my response.<span id="more-1701"></span></p>
<p>There are two issues here. The first is to find a convenient web-based data-collection tool. One popular approach is to use a <strong>survey form on <a href="http://docs.google.com" ref="nofollow" target="_blank">Google Docs</a></strong>. The results are automatically saved to a Google spreadsheet. There are many online explanations of how to set up your survey form including <a href="http://support.google.com/docs/bin/answer.py?hl=en&amp;answer=87809" ref="nofollow" target="_blank">this from Google help</a> or <a href="http://www.labnol.org/software/google-docs-forms-for-surveys/10056/" ref="nofollow" target="_blank">this from digital inspirations</a>. A more sophisticated tool for more complex surveys is <strong><a href="http://www.surveymonkey.com" ref="nofollow" target="_blank">SurveyMonkey</a></strong>. This allows skipping questions based on previous responses, response validation, and other useful features. For researchers collecting data, I generally recommend that they use <a href="http://www.surveymonkey.com" ref="nofollow" target="_blank">SurveyMonkey</a>. But for a quick poll of a small group, Google Docs is adequate. Using either tool, the responses can be downloaded and imported into R or some other statistical analysis package. Web-based data collection avoids all the problems associated with entering and encoding data, although one drawback is the tech barrier for some audiences. You won’t be able to use web-based data collection for a survey of the elderly, or of remote Amazonian tribes, or of many other populations where not everyone uses the internet. But if it is reasonable to assume that all members of the population use the internet, then web-based collection is much better than paper-based forms.</p>
<p>The second issue is more difficult: how to get a random sample of the population. Here, there are no magic tech solutions. Advertising on blogs or other sites will simply give you a biased sample favouring those who read the blogs and have the time and interest to respond. Then you have to make the courageous assumption that the responders are representative of the population of interest. It is better to identify the population of interest first, and then find some way of randomly sampling it so that each member of the population has equal probability of being selected in the sample. How this can be done depends on the particular population being studied. I suggest you discuss a sampling strategy with the statisticians at your university. There are also some good online references including <a href="http://www.aapor.org/Best_Practices1.htm" ref="nofollow" target="_blank">“Best practices”</a> from the AAPOR, and <a href="http://whatisasurvey.info/" ref="nofollow" target="_blank">“What is a survey?”</a> by Fritz Scheuren. A useful textbook is <em><a href="http://www.amazon.com/gp/product/0495105279/ref=as_li_ss_tl?ie=UTF8&amp;tag=prorobjhyn-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0495105279" ref="nofollow" target="_blank">Sampling: Design and Analysis</a></em> by Sharon Lohr (Duxbury Press, 2009, 2nd ed.).</p>
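<p>As a minimal sketch of equal-probability selection, assuming a complete list of the population (a sampling frame) is available, simple random sampling takes one line of R. The frame, its size N and the sample size n below are hypothetical values chosen only for illustration.</p>
<blockquote><p><pre class="brush: r; gutter: false;">
set.seed(1)                    # reproducible illustration only
N = 5000                       # hypothetical size of the sampling frame
frame = data.frame(id = 1:N)   # hypothetical list of all population members
n = 200                        # hypothetical sample size
chosen = sample(frame$id, n)   # without replacement: each member has probability n/N
</pre></p></blockquote>
<p>The questionnaire would then go only to the members listed in chosen. In practice the hard part is building the frame and following up non-responders, not drawing the sample.</p>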

<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://robjhyndman.com/researchtips/surveys/"> Research tips » R</a></strong>.</div>
<hr />
<a href="http://www.r-bloggers.com/">R-bloggers.com</a> offers <strong><a href="http://feedburner.google.com/fb/a/mailverify?uri=RBloggers">daily e-mail updates</a></strong> about <a title="The R Project for Statistical Computing" href="http://www.r-project.org/">R</a> news and <a title="R tutorials" href="http://www.r-bloggers.com/?s=tutorial">tutorials</a> on topics such as: visualization (<a title="ggplot and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=ggplot2">ggplot2</a>, <a title="Boxplots using lattice and ggplot2 tutorials" href="http://www.r-bloggers.com/?s=boxplot">Boxplots</a>, <a title="Maps and gis" href="http://www.r-bloggers.com/?s=map">maps</a>, <a title="Animation in R" href="http://www.r-bloggers.com/?s=animation">animation</a>), programming (<a title="RStudio IDE for R" href="http://www.r-bloggers.com/?s=RStudio">RStudio</a>, <a title="Sweave and literate programming" href="http://www.r-bloggers.com/?s=sweave">Sweave</a>, <a title="LaTeX in R" href="http://www.r-bloggers.com/?s=LaTeX">LaTeX</a>, <a title="SQL and databases" href="http://www.r-bloggers.com/?s=SQL">SQL</a>, <a title="Eclipse IDE for R" href="http://www.r-bloggers.com/?s=eclipse">Eclipse</a>, <a title="git and github, Version Control System" href="http://www.r-bloggers.com/?s=git">git</a>, <a title="Large data in R using Hadoop" href="http://www.r-bloggers.com/?s=hadoop">hadoop</a>, <a title="Web Scraping of google, facebook, yahoo, twitter and more using R" href="http://www.r-bloggers.com/?s=Web+Scraping">Web Scraping</a>) statistics (<a title="Regressions and ANOVA analysis tutorials" href="http://www.r-bloggers.com/?s=regression">regression</a>, <a title="principal component analysis tutorial" href="http://www.r-bloggers.com/?s=PCA">PCA</a>, <a title="Time series" href="http://www.r-bloggers.com/?s=time+series">time series</a>,<a title="ecdf" href="http://www.r-bloggers.com/?s=ecdf">ecdf</a>, <a title="finance trading" 
href="http://www.r-bloggers.com/?s=trading">trading</a>) and more...
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/internet-surveys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>non-stationary AR(10)</title>
		<link>http://www.r-bloggers.com/non-stationary-ar10/</link>
		<comments>http://www.r-bloggers.com/non-stationary-ar10/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 23:12:02 +0000</pubDate>
		<dc:creator>xi'an</dc:creator>
				<category><![CDATA[R bloggers]]></category>
		<category><![CDATA[AR(p) model]]></category>
		<category><![CDATA[Bayesian Core]]></category>
		<category><![CDATA[Books]]></category>
		<category><![CDATA[polynomials]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Stationarity]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Time series]]></category>
		<category><![CDATA[University life]]></category>

		<guid isPermaLink="false">http://xianblog.wordpress.com/?p=14252</guid>
		<description><![CDATA[In the revision of Bayesian Core on which Jean-Michel Marin and I worked together most of last week, having missed our CIRM break last summer (!), we have now included an illustration of what happens to an AR(p) time series when the customary stationarity+causality condition on the roots of the associated polynomial is not satisfied.  [...]]]></description>
			<content:encoded><![CDATA[<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 12px;">
(This article was first published on <strong><a href="http://xianblog.wordpress.com/2012/01/19/non-stationary-ar10/"> Xi'an's Og » R</a></strong>, and kindly contributed to <a href="http://www.r-bloggers.com/">R-bloggers</a>)
</div></p>
<p style="text-align:justify;"><strong>I</strong>n the <a href="http://www.amazon.com/dp/0387389792?tag=chrprobboo-20&amp;camp=14573&amp;creative=327641&amp;linkCode=as1&amp;creativeASIN=0387389792&amp;adid=0H3E228DC5VJ6PJFRP6W&amp;" ref="nofollow" target="_blank"><img class="alignleft" style="margin-left:5px;margin-right:5px;" src="http://www.springer.com/cda/content/image/cda_displayimage.jpg?SGWID=0-0-16-301393-0" alt="" width="95" height="148" /></a>revision of <em><strong><a href="http://www.amazon.com/gp/product/0387389792?ie=UTF8&amp;tag=chrprobboo-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0387389792" ref="nofollow" target="_blank">Bayesian Core</a></strong></em> on which Jean-Michel Marin and I worked together most of last week, having missed our <a title="Core not in CiRM" href="http://xianblog.wordpress.com/2011/07/28/core-not-in-cirm/" ref="nofollow" target="_blank">CIRM break</a> last summer (!), we have now included an illustration of what happens to an AR(p) time series when the customary stationarity+causality condition on the roots of the associated polynomial is not satisfied. More specifically, we generated several time series, all driven by the same underlying white noise but with random coefficients that have a fair chance of producing a non-stationary series, and then plotted 260 steps of each series with the following R code:</p>
<blockquote><p><pre class="brush: r; gutter: false;">
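p=10   # order of the AR(p) model
n=260  # number of time steps (named n rather than T, which is an alias of TRUE)
dat=seri=rnorm(n) # white noise; its first p values also start the series

par(mfrow=c(2,2),mar=c(2,2,1,1)) # 2x2 grid of plots
for (i in 1:4){
  coef=runif(p,min=-.5,max=.5) # random AR coefficients
  # coef[1] multiplies the oldest lag, seri[t-p]
  for (t in ((p+1):n))
    seri[t]=sum(coef*seri[(t-p):(t-1)])+dat[t]
  plot(seri,type=&quot;l&quot;,lwd=2,ylab=&quot;&quot;)
  }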
</pre></p></blockquote>
<p>leading to outputs like the following one</p>
<p><img class="aligncenter size-full wp-image-14254" title="AR(10) evolution under lack of stationarity, figure to appear in the next edition of Bayesian Core" src="http://xianblog.files.wordpress.com/2012/01/ardvrg.jpg?w=450&#038;h=450" alt="" width="450" height="450" /></p>
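<p>Whether a given draw of coefficients satisfies the stationarity+causality condition can be checked directly from the roots of the associated polynomial. The following is only a sketch: the helper name is_causal is hypothetical and not part of the book's code. It assumes the ordering used in the loop above, where coef[1] multiplies the oldest lag, so the vector is reversed before building the polynomial 1 - phi_1 z - ... - phi_p z^p, whose roots must all lie outside the unit circle.</p>
<blockquote><p><pre class="brush: r; gutter: false;">
# Hypothetical helper: TRUE when all roots lie outside the unit circle
is_causal=function(coef){
  phi=rev(coef)              # phi[j] is the coefficient of lag j
  roots=polyroot(c(1,-phi))  # polyroot expects increasing powers of z
  all(Mod(roots)&gt;1)
  }

is_causal(runif(10,min=-.5,max=.5)) # one random draw, as in the simulation
</pre></p></blockquote>
<p>Calling it on each coef drawn in the loop would show which of the four panels correspond to non-stationary series.</p>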
<br />Filed under: <a href='http://xianblog.wordpress.com/category/books/' ref="nofollow" target="_blank">Books</a>, <a href='http://xianblog.wordpress.com/category/statistics/r-statistics/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/category/statistics/' ref="nofollow" target="_blank">Statistics</a>, <a href='http://xianblog.wordpress.com/category/university-life/' ref="nofollow" target="_blank">University life</a> Tagged: <a href='http://xianblog.wordpress.com/tag/arp-model/' ref="nofollow" target="_blank">AR(p) model</a>, <a href='http://xianblog.wordpress.com/tag/bayesian-core/' ref="nofollow" target="_blank">Bayesian Core</a>, <a href='http://xianblog.wordpress.com/tag/polynomials/' ref="nofollow" target="_blank">polynomials</a>, <a href='http://xianblog.wordpress.com/tag/r/' ref="nofollow" target="_blank">R</a>, <a href='http://xianblog.wordpress.com/tag/stationarity/' ref="nofollow" target="_blank">stationarity</a>, <a href='http://xianblog.wordpress.com/tag/time-series/' ref="nofollow" target="_blank">time series</a>
<p class="syndicated-attribution"><div style="border: 1px solid; background: none repeat scroll 0 0 #EDEDED; margin: 1px; font-size: 13px;">
<div style="text-align: center;">To <strong>leave a comment</strong> for the author, please follow the link and comment on his blog: <strong><a href="http://xianblog.wordpress.com/2012/01/19/non-stationary-ar10/"> Xi'an's Og » R</a></strong>.</div>
</div></p>]]></content:encoded>
			<wfw:commentRss>http://www.r-bloggers.com/non-stationary-ar10/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
<enclosure url="http://1.gravatar.com/avatar/ba847ef5873101769043f6260d57282a?s=96&amp;d=identicon" length="" type="" />
<enclosure url="http://www.springer.com/cda/content/image/cda_displayimage.jpg?SGWID=0-0-16-301393-0" length="" type="" />
<enclosure url="http://xianblog.files.wordpress.com/2012/01/ardvrg.jpg" length="" type="" />
		</item>
	</channel>
</rss>

