Blog Archives

Importing an Excel Workbook into R

June 5, 2014

The usual route for importing data from spreadsheet applications like Excel or OpenOffice into R involves first exporting the data in CSV format. A newer (c. 2011) and more efficient CRAN package, called XLConnect, facilitates reading an entire Excel workbook and manipulating worksheets and cells programmatically from within R. XLConnect doesn't require a running installation...

Importing an Excel Workbook into R

June 4, 2014

Excel data is usually imported in CSV format. A new package on CRAN facilitates reading in an entire Excel workbook and selecting worksheets and cells from there. Example: require(XLConnect) # Load Excel workbook into memory wb # Convert a sheet to a data frame df sheet = "SGI-NUMA", startRow = 3, endRow =...

Melbourne’s Weather and Cross Correlations

April 1, 2014

During a lunchtime discussion among recent GCaP class attendees, the topic of weather came up and I casually mentioned that the weather in Melbourne, Australia, can be very changeable because the continent is so old that there is very little geographical relief to moderate the prevailing winds coming from the west. In general, Melbourne...

Facebook Meets Florence Nightingale and Enrico Fermi

February 18, 2014

Highlighting Facebook's mistakes and weaknesses is a popular sport. When you're the 800 lb gorilla of social networking, it's inevitable. The most recent rendition of FB bashing appeared in a serious study authored by a couple of academics in the Depar...

Response Time Percentiles for Multi-server Applications

December 25, 2013

In a previous post, I applied my rules-of-thumb for response time (RT) percentiles (or more accurately, residence time in queueing theory parlance), viz., 80th percentile: $R_{80}$, 90th percentile: $R_{90}$ and 95th percentile: $R_{95}$ to a cellphone application and found that the performance measurements were not completely consistent. Since the data appeared in a journal blog, I...
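As a rough illustration of how such percentile rules of thumb can arise, here is a minimal Python sketch. It assumes the residence time is exponentially distributed with mean $R$, as in a single-server M/M/1 queue. That assumption is mine and is not stated in the excerpt, and the function name `residence_time_percentile` is hypothetical.

```python
import math


def residence_time_percentile(mean_r, p):
    """Return the p-th quantile of an exponential distribution with mean mean_r.

    Assumes exponentially distributed residence time (e.g., M/M/1),
    which is an illustrative assumption, not a claim about the post's data.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Invert the exponential CDF: p = 1 - exp(-t / mean_r)
    return mean_r * math.log(1.0 / (1.0 - p))


# Express each percentile as a multiple of the mean residence time.
for p in (0.80, 0.90, 0.95):
    multiple = residence_time_percentile(1.0, p)
    print(f"R_{int(p * 100)} is about {multiple:.2f} x R")
```

Under this assumption the percentiles depend only on the mean, which is what makes a rule of thumb possible. A multi-server application need not follow the same distribution.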

Laplace the Bayesianista and the Mass of Saturn

September 15, 2013

I'm reviewing Bayes' theorem and related topics for the upcoming GDAT class. In its simplest form, Bayes' theorem is a statement about conditional probabilities. The probability of A, given that B has occurred, is expressed as: \begin{equation} \Pr(A|B) = \dfrac{\Pr(B|A)\times\Pr(A)}{\Pr(B)} \label{eqn:bayes} \end{equation} In Bayesian language, $\Pr(A|B)$ is called the posterior probability, $\Pr(A)$ the prior probability, and $\Pr(B|A)$ the...
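To make the formula concrete, here is a small Python sketch that evaluates it. The function name `posterior` and the numerical inputs are made up for illustration. $\Pr(B)$ is expanded with the law of total probability, which is a standard step but not one shown in the excerpt.

```python
def posterior(prior_a, b_given_a, b_given_not_a):
    """Return Pr(A|B) from Bayes' theorem.

    Pr(B) is expanded by total probability:
    Pr(B) = Pr(B|A) Pr(A) + Pr(B|not A) Pr(not A).
    """
    evidence = b_given_a * prior_a + b_given_not_a * (1.0 - prior_a)
    return b_given_a * prior_a / evidence


# Hypothetical inputs: a rare event A and a fairly reliable indicator B.
p = posterior(prior_a=0.01, b_given_a=0.90, b_given_not_a=0.05)
print(f"Pr(A|B) = {p:.4f}")
```

With these made-up numbers the posterior is roughly 0.15. The small prior pulls the result well below the 0.90 likelihood.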

GDAT Class October 14-18, 2013

August 25, 2013

This is your fast track to enterprise performance analysis and capacity planning with an emphasis on applying R statistical tools to your performance data. Early-bird discounts are available for the Level III Guerrilla Data Analysis Techniques class O...

Exponential Cache Behavior

May 15, 2013

Guerrilla alumnus Gary Little observed certain fixed-point behavior in simulations where disk I/O blocks are updated randomly in a fixed-size cache. For his Python simulation with 10 million entries (corresponding to an allocation of about 400 MB of memory), the following results were obtained: Hit ratio (i.e., occupied) = 0.3676748 Miss ratio...
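A toy version of this kind of experiment can be sketched in a few lines of Python. This is not Gary Little's simulation. It assumes a cache of `num_slots` entries receiving the same number of uniformly random updates, and it uses far fewer entries than the 10 million quoted above. The function name and the seed are hypothetical.

```python
import math
import random


def untouched_fraction(num_slots, num_updates, seed=42):
    """Fraction of slots never selected after num_updates uniform random updates."""
    rng = random.Random(seed)
    touched = bytearray(num_slots)  # 0 = never updated, 1 = updated at least once
    for _ in range(num_updates):
        touched[rng.randrange(num_slots)] = 1
    return 1.0 - sum(touched) / num_slots


n = 200_000
frac = untouched_fraction(n, n)
print(f"untouched fraction: {frac:.4f}")
print(f"1/e:                {math.exp(-1):.4f}")
```

Under these assumptions the untouched fraction tends to $(1 - 1/N)^N \to 1/e \approx 0.368$ for large $N$, which is numerically close to the figure quoted above. How that fraction maps onto the post's hit and miss terminology is an assumption on my part.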

Adding Percentiles to PDQ

April 22, 2013

Pretty Damn Quick (PDQ) performs a mean value analysis of queueing network models: mean values in; mean values out. By mean, I mean statistical mean or average. Mean input values include such queueing metrics as service times and arrival rates. These could be sample means. Mean output values include such queueing metrics as waiting time and queue...

Upcoming GDAT Class May 6-10, 2013

April 22, 2013

Enrollments are still open for the Level III Guerrilla Data Analysis Techniques class to be held during the week of May 6-10. Early-bird discounts are still available. Enquire when you register. As usual, all classes are held at our lovely Larkspur...
