# Lomb-Scargle periodogram for unevenly sampled time series

**me nugget**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the natural sciences, it is common to have incomplete or unevenly sampled time series for a given variable. Determining cycles in such series is not directly possible with methods such as Fast Fourier Transform (FFT) and may require some degree of interpolation to fill in gaps. An alternative is the Lomb-Scargle method (or least-squares spectral analysis, LSSA), which estimates a frequency spectrum based on a least squares fit of sinusoid.

The above figure shows a Lomb-Scargle periodogram of a time series of sunspot activity (1749-1997) with 50% of monthly values missing. As expected (link1, link2), the periodogram displays a a highly significant maximum peak at a frequency of ~11 years.

The function comes from a nice set of functions that I found here: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htm. An accompanying paper focusing on its application to time series of gene expression can be found here.

Below is a comparison to an FFT of the full time series. For another great resource on spectral analysis, and time series-related R methods in general, see the following website: http://zoonek2.free.fr/UNIX/48_R/15.html.

#Required function # from: http://research.stowers-institute.org/efg/2005/LombScargle/R/index.htm source("LombScargle.R") source("R-Mnemonics.R") #Required data data(sunspot.month) # Historical sunspots, class(sunspot.month) = "ts" plot(sunspot.month) #Create vector version of data nspot <- as.vector(sunspot.month) time <- seq(1749+0.5/12, 1997+11.5/12, by=1/12) plot(nspot ~ time, t="l") # same as above #make gaps in data to create "observed" data series set.seed(1) obs <- sample(seq(nspot), 0.5*length(nspot)) # 50% gaps nspot.obs <- nspot[obs] time.obs <- time[obs] plot(nspot.obs ~ time.obs, pch=".", cex=2) # Create test frequencies corresponding to # periods from 1 to 50 years M <- 4*length(time.obs) # often 2 or 4 TestFrequencies <- (1/50) + (1 - 1/50) * (1:M / M) # Use Horne & Baliunas' estimate of independent frequencies Nindependent <- NHorneBaliunas(length(nspot.obs)) #Compute RES <- ComputeAndPlotLombScargle(time.obs, nspot.obs, TestFrequencies, Nindependent, "Sunspots") #Output consists of a single figure with four plots showing: #1. time series. The peak of the loess-smoothed curve (dotted line) is used for phase. #2. Histogram of time interval variability #3. Lomb-Scargle Periodogram with peak corresponding to a 11 year period #RES$PeakPeriod #4. Peak Significance Curve: p-value of Lomb-Scargle peak is quite significant: 1 - (1-exp(-RES$SpectralPowerDensity[which(1/RES$Frequency == RES$PeakPeriod)]))^Nindependent #Alternate plots p01 <- -log(1-((1-0.01)^(1/Nindependent))) #P=0.01 plot(RES$Frequency, RES$SpectralPowerDensity, t="n", xlab="Frequency [1/years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=seq(par()$xaxp[1], par()$xaxp[2],, par()$xaxp[3]+1), col=8, lty=3) lines(RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=1) abline(h=p01, col=2, lty=2) text(0.9, p01, label="p = 0.01", pos=3, col=2) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=1) abline(h=p01, col=2, lty=2) text(2, p01, label="p = 0.01", pos=3, col=2) png("GappySunspots_LombScargle.png", width=5, height=7, units="in", res=400) #x11(width=5, height=7) op <- par(mfcol=c(2,1), mar=c(5,5,2,1), ps=10) plot(nspot.obs ~ time.obs, t="n", xlab="", ylab=expression(paste("No. sunspots [", "month"^-1, "]"))) grid() points(nspot.obs ~ time.obs, pch=".", cex=2) mtext("50% missing values", side=3, line=0.5) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density") abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) mtext("Lomb-Scargle Periodogram", side=3, line=0.5) abline(h=p01, col=2, lty=2) text(2, p01, label="p = 0.01", pos=3, col=2) par(op) dev.off() ############################## ###Compare to other methods### ############################## plot(sunspots) spectrum(sunspot.month) abline(v=1:10, lty=3) z <- spectrum(sunspots, spans=2) #smoothed spectrum png("LombScargle_vs_DFT.png", width=5, height=4, res=400, units="in") XLIM <- c(2,20) op <- par(mar=c(5,5,2,5), ps=10) plot(1/RES$Frequency, RES$SpectralPowerDensity, type="n", log = "x", xlab="Frequency [years]", ylab="Normalized Power Spectral Density", xlim=XLIM) abline(h=seq(par()$yaxp[1], par()$yaxp[2],, par()$yaxp[3]+1), col=8, lty=3) abline(v=c(1,2,5,10,20,50), col=8, lty=3) lines(1/RES$Frequency, RES$SpectralPowerDensity, lwd=1) par(new=TRUE) plot(1/z$freq, z$spec, t="l", col=3, log = "x", xlim=XLIM, xlab="", ylab="", xaxt="n", yaxt="n") axis(4) mtext("Spectral Density", line=3, side=4, col=3) legend("topleft", legend=c("Lomb-Scargle", "Discrete Fourier\nTransform"), bg="white", col=c(1,3), lty=1) par(op) dev.off()

**leave a comment**for the author, please follow the link and comment on their blog:

**me nugget**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.