Nightlights: cool data, bad geocoding

October 14, 2010

(This article was first published on Things I tend to forget » R, and kindly contributed to R-bloggers)

A global source of population density has been on my low-priority wish list for some time, so I was very excited when I found Steve Mosher’s work with the Nighlights data set. “Nightlights” refers to the artificial lights seen from space at night. Astronomers call it “light pollution” which is pretty accurate since it’s decidedly not the light which we all use to see and avoid peril at night. Rather, it’s the light (and energy) wasted in that pursuit by being emitted or reflected straight up into the sky.

Steve has since spent some quality time with other R packages like Rgooglemap exploring this data set and has noticed some problems with the geocoding of the nightlights data.  I noticed the same thing, though much more naively, just trying to check out the data around my home:


url_radianceCalibrated = ""
calibratedLights = "rad_cal.tar"
hiResTif = "world_avg.tif"

gunzip(paste(hiResTif, '.gz', sep='')

hiResLights = raster("world_avg.tif")

# Eastern Mass., Cape Cod & Islands:
e = extent(-71.5, -69.5, 41, 43)

r = crop(hiResLights,e)

which looks amazing… right up until you overlay the county boundaries from the standard ‘maps’ package:


Alas. To my eye, there’s a clear shift to the northwest (see Provincetown at the tip of Cape Cod), and perhaps a bit of a clockwise rotation as well (see the big bulge of Cape Ann north of Boston).

Newer = better… ?

I have a lot to learn about this data, but in my poking around, I did find more recent observations available on a “Version 4 DMSP-OLS Nighttime Lights Time Series” page. But warning — these files are big. The tar file I download next is over 350MB:

url = ""
dest = "F152007.v4.tar"
tif = "F152007.v4b_web.stable_lights.avg_vis.tif"

download.file(url, dest)
gunzip( paste(tif, '.gz', sep='') )

f15 = raster(tif)
e = extent(-71.5, -69.5, 41, 43)
r = crop(f15, e)

which looks a lot better, though still probably not perfect.

ggplot2 with raster

I am a huge fan of ggplot2, but since this was my first exposure to the raster package, I just copied and pasted Steve’s code to make the plots. But I couldn’t help myself to try to reproduce them in ggplot2.

Getting your data into a data.frame is the key to using ggplot2. Fortunately, the raster package includes a rasterToPoints() function which outputs a list which is easily cast to a data.frame:

p = rasterToPoints(r)
df = data.frame(p)
colnames(df) = c("lon", "lat", "dn")

which makes the actual plotting so easy, even qplot() will do it:

qplot(lon, lat, color=dn, data=df)
  + scale_colour_gradient(low="black", high='white', transform='sqrt')
  + theme_bw() + borders("state", colour="yellow")
  + xlim(c(-71.5, -69.5)) + ylim(c(41, 43))

The only technical glitch is in the overlay, as zooming in truncates the northernmost coastline points but the geom_polygon() layer created by borders() seems compelled to close the shape and connects the northern Mass. coast with Rhode Island.

To leave a comment for the author, please follow the link and comment on their blog: Things I tend to forget » R. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , ,

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)