51 search results for "ecdf"

Hard drive occupation prediction with R – part 2 – Getting the probability distribution

In the first article, we saw a quick-and-dirty method to predict disk space exhaustion when the usage pattern is strictly linear. We did that by importing our data into R and fitting a linear regression. In this article we will see the problems with that method and deploy a more robust solution. Besides robustness, we will also see how we can generate...

Visualizing US House Results with a Seats-Votes curve

November 16, 2010

A few weeks ago I wrote about ways to compare major-party returns in US House elections. I experimented with several visualizations, none as useful as the seats-votes curve. A traditional seats-votes curve measures average party performance against individual US House results. Our simplified curve uses a density plot to measure major-party (Democratic, in this case)

Cooling stations. A UHI Hint

September 29, 2010

Update: Google Earth files in the box. Personally I like to look at things backwards. Why are cool sites cool? So download the KML or KMZ file and you can tour 62 sites: all with 90 years of data or more, all with a cooling trend, and all “supposedly” urban. What do you see at

Monte Carlo testing of classification groups

September 1, 2010

This is another article on the theme of defining groups in a hierarchical classification. A previous article described homogeneity analysis to visualize how well any number of groups, defined at the same level, accounts for the variability in the dataset, as measured by within-group pairwise distances. Here we will look at testing whether splitting a particular group...

R’s Normal Distribution Functions: rnorm and pals

July 14, 2010

The rnorm() function in R is a convenient way to simulate values from the normal distribution, characterized by a given mean and standard deviation. I hadn't previously used the associated commands dnorm() (normal density function), pnorm() (cumulative...

Visualizing Drought

March 6, 2010

The impacts of drought depend on time-scale. On short time-scales, drought means dry soil. On long time-scales, it means dry rivers and empty reservoirs. A region may simultaneously experience dry conditions on one time-scale and wet conditions on another, e.g. wet soil but low streamflow, or vice versa. Standardized Precipitation Index (SPI) is a widely

Package Update Roundup: Dec 2009 – Jan 2010

February 9, 2010

A special double edition of the Package Update Roundup this month! This is a list of new or updated packages that were released for R in December and January, as announced on the r-packages mailing list. To include other updates on this list, please email David Smith. For a complete list of all updates on CRAN, see the CRANberries...

Survive R

September 29, 2009

New PDF slides version (presented at the Bay Area R Users Meetup, October 13, 2009). We at Win-Vector LLC appear to like R a bit more than some of our, perhaps wiser, colleagues (see: Choose your weapon: Matlab, R or something else? and R and data). While we do like R (see: Exciting

Example 7.11: Plot an empirical cumulative distribution function from scratch

August 31, 2009

In example 7.8, we used built-in functions to produce an empirical CDF plot. But the empirical cumulative distribution function (CDF) is simple to calculate directly, and it might be useful to have more control over its appearance than is afforded by...
