statistics

Matrix vs Data Frame in R

April 19, 2012 | civilstat

Today I ran into a double question that might be relevant to other R users: Why can’t I assign a dataframe row into a matrix row? And why won’t my function accept this dataframe row as an input argument? A … Continue reading → [Read more...]

When do you need all the data for Big Analytics?

April 18, 2012 | David Smith

In the 2012 edition of the SAP Sybase Capital Markets Guide, Revolution Analytics' Senior Advisor for Products and Strategy (and former CEO) Norman Nie writes about the "Five Benefits of Big Analytics". (You can also read his article at Enterprise Innovation.) Norman makes the argument that while sampling and aggregation are ... [Read more...]

Information flows like water

April 16, 2012 | Pat

Guiding a ship, it takes more than your skill. Spark: David Rowe’s Risk column this month is about data leverage. The idea is that you are leveraging your data if you are using it to answer questions that are too demanding of information. The piece reminded me of a ... [Read more...]

Implementing the Exact Binomial Test in Julia

April 14, 2012 | John Myles White

One major benefit of spending my time recently adding statistical functionality to Julia is that I’ve learned a lot about the inner guts of algorithmic null hypothesis significance testing. Implementing Welch’s two-sample t-test last week was a trivial task because of the symmetry of the null hypothesis, but ... [Read more...]

[not] Le Monde puzzle (solution)

April 13, 2012 | xi'an

Following the question on dinner table permutations on StackExchange (mathematics) and the reply that the right number was six, provided by hardmath, I was looking for a constructive solution for how to build the resolvable 2-(20,5,1) covering. A few hours later, hardmath again came up with an answer, found in the ... [Read more...]

Comparing all quantiles of two distributions simultaneously

April 13, 2012 | FelixS

Summary: A new function in the WRS package compares many quantiles of two distributions simultaneously while controlling the overall alpha error. When comparing data from two groups, approximately 99.6% of all psychological research compares the central tendency (that is a … Continue reading → [Read more...]

Weighted t-Test in R

April 12, 2012 | FelixS

Although there is a weighted.mean function in R, so far I couldn’t find an implementation of weighted.var and weighted.t.test – here they are (the weighted variance is from Gavin Simpson, found on the R mailing list): # weighted … Continue reading → [Read more...]

Comparing Julia and R’s Vocabularies

April 9, 2012 | John Myles White

While exploring the Julia manual recently, I realized that it might be helpful to put the basic vocabularies of Julia and R side-by-side for easy comparison. So I took Hadley Wickham’s R Vocabulary section from the book he’s putting together on the devtools wiki, put all of the ... [Read more...]

Simulated Annealing in Julia

April 4, 2012 | John Myles White

Building Optimization Functions for Julia: In hopes of adding enough statistical functionality to Julia to make it usable for my day-to-day modeling projects, I’ve written a very basic implementation of the simulated annealing (SA) algorithm, which I’ve placed in the same JuliaVsR GitHub repository that I used for ... [Read more...]

Resampling Hierarchically Structured Data Recursively

April 4, 2012 | BioStatMatt

That's a mouthful! I presented this topic to a group of Vandy statisticians a few days ago. My notes (essentially reproduced in this post) are recorded at the Dept. of Biostatistics wiki: HowToBootstrapCorrelatedData. The presentation covers some bootstrap strategies for hierarchically structured (correlated) data, but focuses on the multi-stage bootstrap; ... [Read more...]

Julia, I Love You

March 31, 2012 | John Myles White

Julia is a new language for scientific computing that is winning praise from a slew of very smart people, including Harlan Harris, Chris Fonnesbeck, Douglas Bates, Vince Buffalo and Shane Conway. As a language, it has lofty design goals, which, if attained, will make it noticeably superior to Matlab, R ... [Read more...]

Back to Blogging

March 31, 2012 | John Myles White

If you’re subscribed to this blog, you’ve surely noticed the very long hiatus I’ve taken from writing over the last six months. I wish I’d kept up with blogging more faithfully this year, but, in my defense, I’ve been busy doing a few big things: ... [Read more...]

R, Twitter and McDonald’s

March 23, 2012 | David Smith

Ed Chen is a data scientist at Twitter, so he's accustomed to working with big data and complex models. In an interview with MIT Technology Review, he describes his data science toolbox: A common pattern for me is that I'll code a MapReduce job in Scala, do some simple command-line ... [Read more...]
