(This article was first published on

The New York Times has created a beautiful visualization of the Census Bureau's 2005-2009 American Community Survey data. The distribution of racial and ethnic groups in New York City is particularly fascinating:**mickeymousemodels**, and kindly contributed to R-bloggers)Chinatown appears in red toward the south-eastern end of Manhattan; Harlem, above Central Park, is solidly blue; nearby, Spanish Harlem is mostly yellow; and the Upper East Side is solidly green. (Read the map key if none of this is making sense.) In general, many neighborhoods are segregated, in the sense of being inhabited predominantly by one ethnic group. How does this happen?

When I saw the New York Times' map I thought immediately of Schelling's neighborhood model, which provides a simple and yet somewhat unexpected explanation for spatial segregation. Here's how it works: everyone would prefer to live in a neighborhood where a fraction >= f of their neighbors are of the same race; and if anyone is unhappy, they move. The striking result is that, even with a fairly tolerant f -- say, 30% -- people's movements will end up creating extremely segregated neighborhoods. That's the surprising bit: despite being perfectly happy when up to (say) 70% of their neighbors are of a different race, the people in Schelling's model end up living in neighborhoods that are almost entirely same-race. Let's see if I can get this into a little R simulation:

# Number of people

n <- 10000

# The people will live in a square with area side^2

side <- ceiling(sqrt(n))

df <- data.frame(x=((0:(n-1)) %% side),

y=floor((0:(n-1)) / side), row.names=0:(n-1))

# Most lots will have a race; some will be empty, ie uninhabited

races <- c("forestgreen", "dodgerblue", "darkred")

# Assign races iid uniformly; leave roughly 10% of lots empty

df$race <- sample(c(races, "empty"), n, replace=TRUE,

prob=c(rep(0.90 / length(races), length(races)),

0.10))

PlotNeighborhood <- function() {

with(subset(df, race != "empty"),

plot(x, y, col=race, pch=20, axes=FALSE,

xlab="", ylab="", xlim=c(0, side), ylim=c(0, side),

main="A Schelling-esque Neighborhood"))

}

dev.new(height=8, width=8)

par(mar=rep(1, 4), oma=rep(1, 4))

PlotNeighborhood()

savePlot("neighborhood_before_movement.png")

# Neighbors are counted within a Chebyshev distance <= depth

depth <- 3

CountNeighbors <- function(i) {

# Count people of each race in person i's immediate neighborhood

curr.x <- i %% side

curr.y <- floor(i / side)

neighbors <- subset(df, x %in% (curr.x - depth):(curr.x + depth) &

y %in% (curr.y - depth):(curr.y + depth) &

!(curr.x == x & curr.y == y))

return(sapply(races, function(x) { sum(neighbors$race == x) }))

}

# Apply CountNeighbors to the entire data frame; append results to df

df <- cbind(df, t(sapply(0:(n - 1), CountNeighbors)))

df$num.neighbors <- rowSums(df[ , races])

# Minimum fraction of own-race neighbors

fraction <- 0.30

GetPeopleWhoWantToMove <- function(curr.race) {

subset.who.want.to.move <-

subset(df, race == curr.race &

get(curr.race) / num.neighbors < fraction)

return(rownames(subset.who.want.to.move))

}

AdjustNeighborCounts <- function(row, race.of.mover, delta) {

curr.x <- as.integer(row) %% side

curr.y <- floor(as.integer(row) / side)

df[df$x %in% (curr.x - depth):(curr.x + depth) &

df$y %in% (curr.y - depth):(curr.y + depth) &

!(curr.x == df$x & curr.y == df$y),

c(race.of.mover, "num.neighbors")] <<-

df[df$x %in% (curr.x - depth):(curr.x + depth) &

df$y %in% (curr.y - depth):(curr.y + depth) &

!(curr.x == df$x & curr.y == df$y),

c(race.of.mover, "num.neighbors")] + delta

}

MoveOnePerson <- function() {

# Returns 1 if a person was successfully moved, and 0 otherwise

people.who.want.to.move <- c(lapply(races, GetPeopleWhoWantToMove),

recursive=TRUE)

if (!length(people.who.want.to.move) >= 1) return(0)

# Of people who want to move, pick one uniformly at random

person.who.will.move <- sample(people.who.want.to.move, size=1)

race.of.mover <- df$race[rownames(df) == person.who.will.move]

possible.new.homes <- rownames(

subset(df, race == "empty" &

get(race.of.mover) / num.neighbors >= fraction))

if (!length(possible.new.homes) >= 1) return(0)

# Of acceptable new homes, choose one uniformly at random

new.home <- sample(possible.new.homes, size=1)

df[rownames(df) == new.home, ]$race <<- race.of.mover

df[rownames(df) == person.who.will.move, ]$race <<- "empty"

AdjustNeighborCounts(person.who.will.move, race.of.mover, -1)

AdjustNeighborCounts(new.home, race.of.mover, +1)

return(1)

}

RunSimulation <- function(max.movements = 5000, plots.in.loop=TRUE) {

par(mar=rep(1, 4), oma=rep(1, 4))

for(i in 1:max.movements) {

if (!MoveOnePerson()) break

if (plots.in.loop & (i %% 50) == 0) PlotNeighborhood()

}

}

library(animation)

saveVideo(RunSimulation(),

video.name="schelling_neighborhood_model.mp4",

interval=0.20, outdir=getwd())

dev.new(height=8, width=8)

par(mar=rep(1, 4), oma=rep(1, 4))

PlotNeighborhood()

savePlot("neighborhood_after_movement.png")

Here's the initial neighborhood, before anyone moves around:

As you can see, people are scattered around randomly, and there are no large same-race areas. But this cannot last long: it is very likely that someone will want to move; by moving, they must increase the homogeneity of their new neighborhood; and in doing so, they may cause someone to become unhappy and move out. The process is self-perpetuating. Here is the same neighborhood, post-movement:

And just for fun, here's the animation:

The purpose of this model is to demonstrate how a slight preference for own-race neighbors can lead to extremely segregated neighborhoods. In Schelling's words, "a moderate urge to avoid small-minority status may cause a nearly integrated pattern to unravel, and highly segregated neighborhoods to form." The group's aggregate behavior is not actually desired by any individual; and, therefore, observing a segregated neighborhood does not allow us to conclude that its inhabitants desire segregation.

To

**leave a comment**for the author, please follow the link and comment on his blog:**mickeymousemodels**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...