**Edwin Chen's Blog**, and kindly contributed to R-bloggers)

Measure theory studies ways of generalizing the notions of length/area/volume. Even in 2 dimensions, it might not be clear how to measure the area of the following fairly tame shape:

much less the “area” of even weirder shapes in higher dimensions or different spaces entirely.

For example, suppose you want to measure the length of a book (so that you can get a good sense of how long it takes to read). What’s a good measure? One possibility is to measure a book’s length in *pages*. Since books provide page counts, this is a fairly easy measure to get. However, different versions of the same book (e.g., hardcover and paperback versions) tend to have different page counts, so this page measure doesn’t satisfy the nice property of version invariance (which we would like to have, since hardcover and paperback versions of the same book take the same time to read). Also, not all books even have page counts (think Kindle books), so this measure doesn’t allow us to measure the length of all books we might want to read.

Another, possibly better measure is to measure a book’s length in terms of the number of *words* it contains. Now we do have version invariance (hardcover and paperback versions contain the same number of words) and we can measure the length of Kindle books as well. We can even do things like add two books together, and the measure/number of words of the concatenated books will pleasantly equal the sum of the measures/number of words of each book alone.

However, what happens when we try to measure a picture book’s length in words? We can’t – picture books are too pathological. Maybe we could say that a picture book has measure zero (since a picture book has no words), but then we get unhappy things like books of measure zero taking a really long time to read (imagine a really long picture book). So maybe a better option is to say that picture books are simply unmeasurable. Whenever someone asks for the length of a picture book, we ignore them, and this way our measure will continue to be a good approximation of reading time and we get to keep our other nice properties as well.

Similarly, measure theory asks questions like:

- How do we define a measure on our space? (Jordan measure and Lebesgue measure are two different options in Euclidean space.)
- What properties does our measure satisfy? (For example, does it satisfy translational invariance, rotational invariance, additivity?)
- Which objects are measurable/which objects can we say it’s okay not to measure in order to preserve nice properties of our measure? (The Banach-Tarski ball can be rigidly reassembled into two copies of the same shape and size as the original, so we don’t want it to be measurable, since then we would lose additivity properties.)

And once we’ve defined a “generalized area” (our measure), we can try to generalize other mathematical concepts as well. For example, recall that the (Riemann) integral that you learn in calculus measures the area under a curve. What happens if we replace the “area” in the Riemann integral with our new, generalized measure (e.g., to get the Lebesgue integral)? Measure theory also helps make certain probability statements mathematically precise (e.g., we can say exactly what it means that a fair coin flipped infinitely often will “almost never” land heads more than 50% of the time).

**leave a comment**for the author, please follow the link and comment on their blog:

**Edwin Chen's Blog**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...