# Lomb-Scargle periodogram for unevenly sampled time series

January 10, 2013
By

(This article was first published on me nugget, and kindly contributed to R-bloggers)

In the natural sciences, it is common to have incomplete or unevenly sampled time series for a given variable. Determining cycles in such series is not directly possible with methods such as Fast Fourier Transform (FFT) and may require some degree of interpolation to fill in gaps. An alternative is the Lomb-Scargle method (or least-squares spectral analysis, LSSA), which estimates a frequency spectrum based on a least squares fit of sinusoid.

The above figure shows a Lomb-Scargle periodogram of a time series of sunspot activity (1749-1997) with 50% of monthly values missing. As expected (link1, link2), the periodogram displays a a highly significant maximum peak at a frequency of ~11 years.

The function comes from a nice set of functions that I found here: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htm. An accompanying paper focusing on its application to time series of gene expression can be found here.

Below is a comparison to an FFT of the full time series. For another great resource on spectral analysis, and time series-related R methods in general, see the following website: http://zoonek2.free.fr/UNIX/48_R/15.html.

To reproduce the example:

`#Required function # from: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htmsource("LombScargle.R") source("R-Mnemonics.R") #Required datadata(sunspot.month) # Historical sunspots, class(sunspot.month) = "ts"plot(sunspot.month) #Create vector version of datanspot <- as.vector(sunspot.month)time <- seq(1749+0.5/12, 1997+11.5/12, by=1/12)plot(nspot ~ time, t="l") # same as above #make gaps in data to create "observed" data seriesset.seed(1)obs <- sample(seq(nspot), 0.5*length(nspot)) # 50% gapsnspot.obs <- nspot[obs]time.obs <- time[obs]plot(nspot.obs ~ time.obs, pch=".", cex=2) # Create test frequencies corresponding to# periods from 1 to 50 yearsM <- 4*length(time.obs) # often 2 or 4TestFrequencies <- (1/50) + (1 - 1/50) * (1:M / M) # Use Horne & Baliunas' estimate of independent frequenciesNindependent <- NHorneBaliunas(length(nspot.obs)) #ComputeRES <- ComputeAndPlotLombScargle(time.obs, nspot.obs,  TestFrequencies, Nindependent,  "Sunspots") #Output consists of a single figure with four plots showing:   #1. time series.  The peak of the loess-smoothed curve (dotted line) is used for phase.   #2. Histogram of time interval variability   #3. Lomb-Scargle Periodogram with peak corresponding to a 11 year period #RES\$PeakPeriod   #4. Peak Significance Curve:  p-value of Lomb-Scargle peak is quite significant:  1 - (1-exp(-RES\$SpectralPowerDensity[which(1/RES\$Frequency == RES\$PeakPeriod)]))^Nindependent  #Alternate plotsp01 <- -log(1-((1-0.01)^(1/Nindependent))) #P=0.01  plot(RES\$Frequency, RES\$SpectralPowerDensity, t="n", xlab="Frequency [1/years]", ylab="Normalized Power Spectral Density")abline(h=seq(par()\$yaxp[1], par()\$yaxp[2],, par()\$yaxp[3]+1), col=8, lty=3)abline(v=seq(par()\$xaxp[1], par()\$xaxp[2],, par()\$xaxp[3]+1), col=8, lty=3)lines(RES\$Frequency, RES\$SpectralPowerDensity, lwd=1)mtext("Lomb-Scargle Periodogram", side=3, line=1)abline(h=p01, col=2, lty=2)text(0.9, p01, label="p = 0.01", pos=3, col=2) plot(1/RES\$Frequency, RES\$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()\$yaxp[1], par()\$yaxp[2],, par()\$yaxp[3]+1), col=8, lty=3)abline(v=c(1,2,5,10,20,50), col=8, lty=3)lines(1/RES\$Frequency, RES\$SpectralPowerDensity, lwd=1)mtext("Lomb-Scargle Periodogram", side=3, line=1)abline(h=p01, col=2, lty=2)text(2, p01, label="p = 0.01", pos=3, col=2) png("GappySunspots_LombScargle.png", width=5, height=7, units="in", res=400)#x11(width=5, height=7)op <- par(mfcol=c(2,1), mar=c(5,5,2,1), ps=10) plot(nspot.obs ~ time.obs, t="n", xlab="", ylab=expression(paste("No. sunspots [", "month"^-1, "]")))grid()points(nspot.obs ~ time.obs, pch=".", cex=2)mtext("50% missing values", side=3, line=0.5) plot(1/RES\$Frequency, RES\$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()\$yaxp[1], par()\$yaxp[2],, par()\$yaxp[3]+1), col=8, lty=3)abline(v=c(1,2,5,10,20,50), col=8, lty=3)lines(1/RES\$Frequency, RES\$SpectralPowerDensity, lwd=1)mtext("Lomb-Scargle Periodogram", side=3, line=0.5)abline(h=p01, col=2, lty=2)text(2, p01, label="p = 0.01", pos=3, col=2) par(op)dev.off()  #################################Compare to other methods#################################plot(sunspots) spectrum(sunspot.month)abline(v=1:10, lty=3) z <- spectrum(sunspots, spans=2) #smoothed spectrum png("LombScargle_vs_DFT.png", width=5, height=4, res=400, units="in")XLIM <- c(2,20)op <- par(mar=c(5,5,2,5), ps=10)plot(1/RES\$Frequency, RES\$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density", xlim=XLIM) abline(h=seq(par()\$yaxp[1], par()\$yaxp[2],, par()\$yaxp[3]+1), col=8, lty=3)abline(v=c(1,2,5,10,20,50), col=8, lty=3)lines(1/RES\$Frequency, RES\$SpectralPowerDensity, lwd=1)par(new=TRUE)plot(1/z\$freq, z\$spec, t="l", col=3, log = "x", xlim=XLIM, xlab="", ylab="", xaxt="n", yaxt="n")axis(4)mtext("Spectral Density", line=3, side=4, col=3)legend("topleft", legend=c("Lomb-Scargle", "Discrete Fourier\nTransform"), bg="white", col=c(1,3), lty=1)par(op)dev.off()`

Created by Pretty R at inside-R.org

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...