## Modeling an Infant’s Feeding Schedule with Periodic Smoothing Splines

## Twitter Twitter on the Web, Who is the Most Popular of All? Interactively Determining Popularity of Two Entitites on Twitter

## Better Neighborhoods with R: Exploring and Analyzing SeeClickFix Data (part 1)

## How likely is the NSA PRISM program to catch a terrorist?

Recent revelations about PRISM, the NSA’s massive program of surveillance of civilian communications have caused quite a stir. And rightfully so, as it appears that the agency has been granted warrantless direct access to just about any form of digital communication engaged in by American citizens, and that their access to such data has been

## Feature Selection 3 – Swarm Mentality

"Bees don't swarm in a mango grove for nothing. Where can you see a wisp of smoke without a fire?" - Hla Stavhana In the last two posts, genetic algorithms were used as feature wrappers to search for more effective subsets of predictors. Here, I will do the same with another type of search algorithm: particle swarm optimization....

## Major League Baseball run scoring trends with R’s Lahman package

The statistical software R has an ever-expanding array of packages that provide pre-programmed functions and datasets. One such package is named Lahman, bundling the contents of the Lahman database into a quick-and-easy resource for R users. In addition to the data tables, the package resources also contain a variety of analyses and graphics undertaken using...

## Exploratory Data Analysis: Variations of Box Plots in R for Ozone Concentrations in New York City and Ozonopolis

Introduction Last week, I wrote the first post in a series on exploratory data analysis (EDA). I began by calculating summary statistics on a univariate data set of ozone concentration in New York City in the built-in data set “airquality” in R. In particular, I talked about how to calculate those statistics when the data

## What is probabilistic truth? Part 2 – Everything is conditional

Read Part 1 When making a statement of the form “1/2 is the correct probability that this coin will land tails”, there are a few things which are left unsaid, but which are typically implied. The statement is one about the probability of an unknown event occurring, and it would seem reasonable to write this

## An R debugging example

The steps taken to fix an R problem. Task To prepare for the Portfolio Probe blog post called “Implied alpha and minimum variance”, I tried to update a matrix of daily stock prices using a function I had written for the purpose. Error When I tried to do what I wanted, I got: > univclose130518 The post An...