July 2009

One sample Student’s t-test

July 23, 2009 | 0 Comments

Comparison of the sample mean with a known value, when the variance of the population is not known. Consider the exercise from the previous post. An intelligence test was given to 10 subjects, and here are the results obtained. The average result of ...
[Read more...]
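
A minimal sketch of a one-sample t-test in R, using the built-in `t.test()`. The ten scores and the reference value `mu0` are hypothetical placeholders, since the excerpt does not show the post's data.

```r
# Hypothetical intelligence-test scores for 10 subjects (not the post's data)
scores <- c(65, 78, 88, 55, 48, 95, 66, 57, 79, 81)
mu0 <- 75  # hypothetical known value to compare the sample mean against

# Two-sided one-sample t-test; the population variance is estimated from the sample
result <- t.test(scores, mu = mu0)

# The statistic can be checked by hand: t = (mean - mu0) / (s / sqrt(n))
t_manual <- (mean(scores) - mu0) / (sd(scores) / sqrt(length(scores)))

print(result)
print(t_manual)
```

The hand-computed `t_manual` should agree with the statistic reported by `t.test()`, which uses n - 1 degrees of freedom.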

Two sample Z-test

July 22, 2009 | 0 Comments

Comparison of the means of two independent samples, taken from two populations with known variance. We are asked to compare the average heights of two groups. The first group (A) consists of individuals of Italian nationality (the variance of the ...
[Read more...]

Massively parallel database for analytics

July 22, 2009 | 0 Comments

This is by far the best description of why traditional parallel databases (like Teradata, Greenplum et al.) are an evolutionary dead end. But much more than a theoretical discussion, they have built a solution which they call HadoopDB. It is based on Hadoop, PostgreSQL, and Hive and is completely Open ...
[Read more...]

One sample Z-test

July 21, 2009 | 0 Comments

Comparison of the sample mean with known population mean and standard deviation. Suppose that 10 volunteers have done an intelligence test; here are the results obtained. The mean obtained on the same test by the entire population is 75. You want to c...
[Read more...]
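
A minimal sketch of a one-sample Z-test computed by hand in R. The population mean of 75 comes from the excerpt; the scores, the population standard deviation `sigma`, and the helper name `z_test` are hypothetical, since the excerpt does not show them.

```r
# One-sample Z-test computed by hand (base R has no built-in z.test function)
z_test <- function(x, mu, sigma) {
  z <- (mean(x) - mu) / (sigma / sqrt(length(x)))  # standardized sample mean
  p_value <- 2 * pnorm(-abs(z))                    # two-sided p-value
  list(z = z, p_value = p_value)
}

# Hypothetical scores for 10 volunteers (not the post's data)
scores <- c(65, 78, 88, 55, 48, 95, 66, 57, 79, 81)
sigma <- 18  # hypothetical known population standard deviation

res <- z_test(scores, mu = 75, sigma = sigma)
print(res)
```

Unlike the t-test, the standard deviation here is treated as known rather than estimated from the sample, so the statistic is compared against the standard normal distribution.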

RGG#155, 156 and 157

July 21, 2009 | 0 Comments

I pushed 3 more graphics from Biecek Przemyslaw to the graphics gallery: a list of popular names for colors from packages RColorBrewer, colorRamps, grDevices; a set of examples of a few graphical low-level parameters lend, ljoin, xpd, adj, lege...
[Read more...]

Score with scoring rules

July 21, 2009 | 0 Comments

INCENTIVES TO STATE PROBABILITIES OF BELIEF TRUTHFULLY

We have all been there. You are running an experiment in which you would like participants to tell you what they believe. In particular, you’d like them to tell you what they believe to be the probability that an event will occur. ...
[Read more...]
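
To illustrate how a scoring rule can reward truthful probability reports, here is a minimal sketch in R using the quadratic scoring rule. The excerpt does not say which rule the post uses, so the choice of rule, the function names, and the belief `p_true` are all assumptions made for illustration.

```r
# Quadratic scoring rule: payoff for reporting probability r when the event
# does (outcome = 1) or does not (outcome = 0) occur
quadratic_score <- function(r, outcome) 1 - (outcome - r)^2

# Expected payoff of report r for a participant whose true belief is p
expected_score <- function(r, p) {
  p * quadratic_score(r, 1) + (1 - p) * quadratic_score(r, 0)
}

p_true  <- 0.7                    # hypothetical true belief
reports <- seq(0, 1, by = 0.01)   # candidate reports on a grid

# The report that maximizes the expected payoff on the grid
best_report <- reports[which.max(expected_score(reports, p_true))]
print(best_report)
```

Under this rule the expected payoff is highest when the report equals the true belief, which is what makes the rule "proper".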

Geometric and harmonic means in R

July 20, 2009 | 0 Comments

Compute the geometric mean and harmonic mean in R of this sequence: 10, 2, 19, 24, 6, 23, 47, 24, 54, 77. These functions are not present in the base packages of R, although they are available in some add-on packages. However, it is easy to calculate the...
[Read more...]
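
One way to compute both means in base R is sketched below, using the sequence from the excerpt. The helper names `geometric_mean` and `harmonic_mean` are hypothetical, and this is not necessarily the approach the full post takes.

```r
# Sequence from the post
x <- c(10, 2, 19, 24, 6, 23, 47, 24, 54, 77)

# Geometric mean: exponential of the mean of the logs (requires positive values)
geometric_mean <- function(x) exp(mean(log(x)))

# Harmonic mean: reciprocal of the mean of the reciprocals (requires non-zero values)
harmonic_mean <- function(x) 1 / mean(1 / x)

print(geometric_mean(x))
print(harmonic_mean(x))
```

For positive data the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean, so comparing the two results with `mean(x)` is a quick sanity check.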

Adding a legend to a plot

July 20, 2009 | 0 Comments

It's pretty easy!
plot(c(1968,2010),c(0,10),type="n", # sets the x and y axes scales
xlab="Year",ylab="Expenditures/GDP (%)") # adds titles to the axes
lines(year,defense,col="red",lwd=2.5) # adds a line for defense expenditures
lines(year,health,col="...
[Read more...]
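
A self-contained sketch of the same idea, ending with a `legend()` call. The data values, the blue color for the health line, and the legend position are assumptions, since the excerpt is cut off before those details.

```r
# Hypothetical data; the post's actual series are not shown in the excerpt
year    <- c(1970, 1980, 1990, 2000, 2010)
defense <- c(8.0, 5.0, 5.5, 3.0, 4.5)
health  <- c(1.0, 2.0, 3.0, 3.5, 5.0)

# Empty plot that only sets the axis scales and titles
plot(c(1968, 2010), c(0, 10), type = "n",
     xlab = "Year", ylab = "Expenditures/GDP (%)")

# One line per series
lines(year, defense, col = "red",  lwd = 2.5)
lines(year, health,  col = "blue", lwd = 2.5)

# Legend entries must be listed in the same order as their colors
legend("topright", legend = c("Defense", "Health"),
       col = c("red", "blue"), lwd = 2.5)
```

Passing the same `col` and `lwd` values to `legend()` as to `lines()` keeps the legend keys visually matched to the plotted lines.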

Example 7.6: Find Amazon sales rank for a book

July 20, 2009 | 0 Comments

In honor of Amazon's official release date for the book, we offer this blog entry. Both SAS and R can be used to find the Amazon Sales Rank for a book by downloading the desired web page and ferreting out the appropriate line. This code is likely to br...
[Read more...]

ggplot2: more wicked-cool plots in R

July 20, 2009 | 0 Comments

As far as I know there are 3 different systems for producing figures in R: (1) base graphics, included with R, (2) the lattice package, and (3) ggplot2, one of the newer plotting systems which is, according to the creator Hadley Wickham, "based on the grammar of graphics, which tries to take the good ... [Read more...]

Probability exercise: negative binomial distribution

July 19, 2009 | 0 Comments

What is the probability of getting the 4th tail ("cross") before the 3rd head when flipping a coin? The mathematical formula for solving this exercise, which follows a negative binomial distribution, is: $$f(x)=P(X=x)=\begin{pmatrix} x+y-1\\ y-1 \end{pmatrix} \cdot p^x ...
[Read more...]
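
The exercise can also be checked numerically with R's built-in negative binomial functions. This is a minimal sketch assuming a fair coin; it treats each tail as a "success", so the 4th tail comes before the 3rd head exactly when at most 2 heads occur before the 4th tail.

```r
p_tail <- 0.5  # assumption: fair coin

# pnbinom(q, size, prob) is P(number of failures before the size-th success <= q);
# here failures are heads and successes are tails
prob <- pnbinom(2, size = 4, prob = p_tail)

# Equivalent check: add up the probabilities of 0, 1 or 2 heads before the 4th tail
prob_check <- sum(dnbinom(0:2, size = 4, prob = p_tail))

print(prob)        # 0.34375 for a fair coin
print(prob_check)
```

The value 0.34375 equals 22/64, which matches the direct count of outcomes with at least 4 tails in the first 6 flips.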

New RInside release

July 19, 2009 | 0 Comments

I just rolled up a new release of RInside, my C++ wrapper classes which facilitate embedding R into your own C++ application. This release owes a big Thank you! to Miguel Lechón who not only noticed errant behaviour and occasional segfaults with overly long commands sent to the embedded ... [Read more...]

David Varadi’s RSI(2) alternative

July 19, 2009 | 0 Comments

Here's a quick R implementation of David Varadi's alternative to the RSI(2). Michael Stokes over at the MarketSci blog has three great posts exploring this indicator: Varadi’s RSI(2) Alternative: The DV(2); RSI(2) vs. DV(2); Last Couple...
[Read more...]

A probability exercise on the Bernoulli distribution

July 18, 2009 | 0 Comments

What is the probability of obtaining the sequence HHTTTHTT when flipping a coin 8 times? (H = head; T = tail) Theory tells us that to solve this question, we can simply use the following formula: $$f(x)=P(X=x)=B(n,p)=\begin{pmatrix}n\\ x \end{pmatrix} \cd...
[Read more...]
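
A minimal sketch in R, assuming a fair coin. It computes the probability of this one ordered sequence as the product of the per-flip probabilities and, for comparison, the binomial probability of getting the same number of heads in any order. The variable names are hypothetical.

```r
p_head <- 0.5  # assumption: fair coin

# Split the sequence into individual flips
flips <- strsplit("HHTTTHTT", "")[[1]]

# Probability of exactly this ordered sequence: product of per-flip probabilities
prob_sequence <- prod(ifelse(flips == "H", p_head, 1 - p_head))

# For comparison: probability of 3 heads in 8 flips, in any order
n_heads <- sum(flips == "H")
prob_three_heads <- dbinom(n_heads, size = length(flips), prob = p_head)

print(prob_sequence)     # 0.00390625, i.e. 1/256
print(prob_three_heads)  # 0.21875, i.e. 56/256
```

The two results differ by the binomial coefficient choose(8, 3) = 56, the number of distinct orderings of 3 heads among 8 flips.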

Let us practice with some functions of R

July 18, 2009 | 0 Comments

Given the following data set, compute the arithmetic mean, median, variance, standard deviation; find the largest and the smallest value, the sum of all values, the square of the sum of all values, the sum of the squares of all values; assign the ranks...
[Read more...]
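
The quantities listed above map directly onto base R functions. The sketch below uses a hypothetical vector `x`, because the post's data set is not shown in the excerpt.

```r
x <- c(12, 7, 3, 18, 9, 7, 15)  # hypothetical data, not the post's data set

stats <- list(
  mean          = mean(x),     # arithmetic mean
  median        = median(x),
  variance      = var(x),      # sample variance (divides by n - 1)
  sd            = sd(x),       # sample standard deviation
  max           = max(x),
  min           = min(x),
  sum           = sum(x),
  square_of_sum = sum(x)^2,    # square of the sum of all values
  sum_of_sq     = sum(x^2),    # sum of the squares of all values
  ranks         = rank(x)      # tied values receive the average rank by default
)

print(stats)
```

Note the difference between `sum(x)^2` and `sum(x^2)`: the first squares the total, the second totals the squares.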

Book excerpts now posted

July 18, 2009 | 0 Comments

We've posted excerpts from the book on the book website. The excerpts include Chapter 3 (regression and ANOVA) in its entirety. This demonstrates how the entries (the generic descriptions of software functions) and the worked examples reinforce each ...
[Read more...]

Parsing GEO SOFT files with Python and Sqlite

July 17, 2009 | 0 Comments

NCBI's GEO database of gene expression data is a great resource, but its records are very open ended. This lack of rigidity was perhaps necessary to accommodate the variety of measurement technologies, but makes getting data out a little tricky. But, a...
[Read more...]

Simple Data Visualization

July 16, 2009 | 0 Comments

OK, so, I know I already raved about one Hadley Wickham project and how it has changed my life last week. But what can I say, the man is a genius. And if you are using R (and let’s face it, you should be) and you want simple sexy ...
[Read more...]

Influence.ME: Simple Analysis

July 16, 2009 | 0 Comments

With the introduction of our new package for influential data, influence.ME, I’m currently writing a manual for the package. This manual will address topics for both experienced and inexperienced users. I will also present much of the content ... [Read more...]

Missing data, logistic regression, and a predicted values plot (or two)

July 15, 2009 | 0 Comments

miss attach miss result1 summary(result1)

Call:
glm(formula = a ~ b, family = binomial(logit))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.8864  -1.2036   0.7397   0.9425   1.4385

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.96130    1.40609  -4.240 2.24e-05 ***
b            0.10950    0.02404   4.555 5.24e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 279.97  on 203 ...
[Read more...]