Blog Archives

Restaurant Performance Sunk by Selfies

July 17, 2014

An interesting story appeared over the weekend about a popular NYC restaurant realizing that, although the number of customers they serve on a daily basis is about the same today as it was ten years ago, the overall service has significantly slowed. Naturally, this situation has led to poor online reviews, so the restaurant hired a firm to...

Read more »

How to Remember the Poisson Distribution

July 3, 2014

The Poisson cumulative distribution function (CDF) \begin{equation} F(α,n) = \sum_{k=0}^n \dfrac{α^k}{k!} \; e^{-α} \label{eqn:pcdf} \end{equation} is the probability of at most $n$ events occurring when the average number of events is α, i.e., $\Pr(X \le n)$. Since \eqref{eqn:pcdf} is a probability function, it cannot have a value greater than 1. In R, the CDF is given by the...
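
As a quick numerical check of the formula above, here is a minimal sketch in Python (not the R the post itself goes on to use). The function name `poisson_cdf` is hypothetical, chosen only for this illustration; it evaluates the sum term by term and is intended for small $n$ only.

```python
import math

def poisson_cdf(alpha, n):
    """Pr(X <= n) for a Poisson variable with mean alpha, by direct summation."""
    if alpha < 0 or n < 0:
        raise ValueError("alpha and n must be non-negative")
    # Sum alpha^k / k! for k = 0..n, then scale by exp(-alpha).
    # Direct summation is adequate for small n; large n would need a
    # numerically safer method than explicit factorials.
    total = sum(alpha**k / math.factorial(k) for k in range(n + 1))
    return math.exp(-alpha) * total

# The values rise toward 1 as n grows but never exceed it.
for n in (0, 2, 5, 20):
    print(n, poisson_cdf(2.0, n))
```

With $n = 0$ the sum collapses to the single term $e^{-α}$, which is a convenient sanity check on any implementation.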

Read more »

Importing an Excel Workbook into R

June 5, 2014

The usual route for importing data from spreadsheet applications like Excel or OpenOffice into R involves first exporting the data in CSV format. A newer (c. 2011) and more efficient CRAN package, called XLConnect, facilitates reading an entire Excel workbook and manipulating worksheets and cells programmatically from within R. XLConnect doesn't require a running installation...

Read more »

Importing an Excel Workbook into R

June 4, 2014

Excel data is usually imported into R in CSV format. A new package on CRAN facilitates reading in an entire Excel workbook and selecting worksheets and cells from there. Example: require(XLConnect) # Load Excel workbook into memory wb <- loadWorkbook(...) # Convert a sheet to a data frame df <- readWorksheet(wb, sheet = "SGI-NUMA", startRow = 3, endRow =...

Read more »

Melbourne’s Weather and Cross Correlations

April 1, 2014

During a lunchtime discussion among recent GCaP class attendees, the topic of weather came up and I casually mentioned that the weather in Melbourne, Australia, can be very changeable because the continent is so old that there is very little geographical relief to moderate the prevailing winds coming from the west. In general, Melbourne...

Read more »

Facebook Meets Florence Nightingale and Enrico Fermi

February 18, 2014

Highlighting Facebook's mistakes and weaknesses is a popular sport. When you're the 800 lb gorilla of social networking, it's inevitable. The most recent rendition of FB bashing appeared in a serious study authored by a couple of academics in the Depar...

Read more »

Response Time Percentiles for Multi-server Applications

December 25, 2013

In a previous post, I applied my rules-of-thumb for response time (RT) percentiles (or more accurately, residence time in queueing theory parlance), viz., 80th percentile: $R_{80}$, 90th percentile: $R_{90}$ and 95th percentile: $R_{95}$ to a cellphone application and found that the performance measurements were not completely consistent. Since the data appeared in a journal blog, I...

Read more »

Laplace the Bayesianista and the Mass of Saturn

September 15, 2013

I'm reviewing Bayes' theorem and related topics for the upcoming GDAT class. In its simplest form, Bayes' theorem is a statement about conditional probabilities. The probability of A, given that B has occurred, is expressed as: \begin{equation} \Pr(A|B) = \dfrac{\Pr(B|A)\times\Pr(A)}{\Pr(B)} \label{eqn:bayes} \end{equation} In Bayesian language, $\Pr(A|B)$ is called the posterior probability, $\Pr(A)$ the prior probability, and $\Pr(B|A)$ the...
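
To make the formula concrete, here is a minimal sketch in Python. The function name `posterior` and all of the numbers are hypothetical, picked purely for illustration and not taken from the post; $\Pr(B)$ is obtained from the law of total probability.

```python
def posterior(prior_a, likelihood_b_given_a, prob_b):
    """Bayes' theorem: Pr(A|B) = Pr(B|A) * Pr(A) / Pr(B)."""
    if not 0 < prob_b <= 1:
        raise ValueError("Pr(B) must lie in (0, 1]")
    return likelihood_b_given_a * prior_a / prob_b

# Hypothetical inputs, chosen only for illustration.
prior_a = 0.01      # Pr(A): prior probability
likelihood = 0.95   # Pr(B|A)
false_alarm = 0.05  # Pr(B|not A)

# Law of total probability: Pr(B) = Pr(B|A) Pr(A) + Pr(B|not A) Pr(not A)
prob_b = likelihood * prior_a + false_alarm * (1 - prior_a)

print(posterior(prior_a, likelihood, prob_b))
```

With these made-up inputs the posterior comes out at roughly 0.16, far below $\Pr(B|A) = 0.95$, because the small prior dominates.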

Read more »

GDAT Class October 14-18, 2013

August 25, 2013

This is your fast track to enterprise performance analysis and capacity planning with an emphasis on applying R statistical tools to your performance data. Early-bird discounts are available for the Level III Guerrilla Data Analysis Techniques class O...

Read more »

Exponential Cache Behavior

May 15, 2013

Guerrilla alumnus Gary Little observed certain fixed-point behavior in simulations where disk I/O blocks are updated randomly in a fixed-size cache. For his Python simulation with 10 million entries (corresponding to an allocation of about 400 MB of memory), the following results were obtained: Hit ratio (i.e., occupied) = 0.3676748 Miss ratio...

Read more »