3539 search results for "git"

The “best” proxies for temperature reconstruction

April 29, 2012

In the last post I presented the distribution of correlation coefficients of temperature proxies with the actual temperature observations during the past 150 years. One of the conclusions was that most proxies correlate weakly with temperature observations. However, there seemed to be some proxies that do have some significant positive correlation with the observations. These

Read more »

Marriage is good for your income

April 29, 2012

For those of you who are into machine learning, here you can find a cool collection of databases to play around with your favorite algorithm. I chose one of the 200 available and fit a logistic regression model. The idea …

Read more »

Guess who wins: apply() versus for loops in R

April 28, 2012

Yesterday I tried to do some data processing on my really big data set in MS Excel. Wow, did it not like handling all those data!! Every time I tried to click on a different ribbon, the screen didn’t even …

Read more »

microbenchmarking with R

April 28, 2012

I love to benchmark. Maybe I’m a bit weird but I love to bench everything in R. Recently I’ve had people raise accuracy challenges to the typical system.time and rbenchmark package approaches to benchmarking. I saw Hadley Wickham promoting the …

Read more »

Sage Bionetworks Synapse

April 27, 2012

Michael Kellen, Director of Technology at Sage Bionetworks, is trying to build a GitHub for science. It's called Synapse and Kellen described it in a talk at the Sage Bionetworks Commons Congress 2012, this past weekend: 'Synapse' Pilot for Building an...

Read more »

Real Time Structural Break

April 27, 2012

Yesterday as I played with bfast I kept thinking “Yes, but this is all in hindsight. How can I potentially use this in a system?” Fortunately, one of the fine authors very generously commented on my post Structural Breaks (Bull or Bear?...

Read more »

Measuring user retention using cohort analysis with R

April 27, 2012

Cohort analysis is super important if you want to know whether your service is in fact a leaky bucket despite nice growth in absolute numbers. There’s a good write-up on the subject, “Cohorts, Retention, Churn, ARPU” by Matt Johnson. So how do you do it in R, and how do you visualize it? Inspired by examples

Read more »

Randomization thoughts

April 27, 2012

On Monday I’m going to be leading a little stats workshop on randomization tests and null models. In preparation for this I wrote up code for null model examples. I wanted to write a post that introduced the basics of these models (Null models, bootstrapping,...

Read more »

phyloseq: Reproducible interactive analysis of microbiome census data using R

April 26, 2012

Collaborative development of phyloseq on GitHub. Official stable release of phyloseq on Bioconductor. Advances in DNA sequencing technology have dramatically improved the scope and scale of culture-independent investigations into microbial communities. There are effective software tools available to process raw DNA …

Read more »

Structural Breaks (Bull or Bear?)

April 26, 2012

When I spotted the bfast R package, I could not resist attempting to apply it to identify bull and bear markets. For all the details that I do not understand, please see the references: Jan Verbesselt, Rob Hyndman, Glenn Newnham, Darius Culvenor...

Read more »