Shrinking R’s PDF output

June 17, 2010

(This article was first published on PlanetFlux, and kindly contributed to R-bloggers)

R is great for graphics, but I've found that the PDFs it produces for large plots can be extremely large. This is especially common when using spplot() to plot a large raster: I once made a 15-page PDF full of rasters that was hundreds of MB in size. I don't need all that detail (every pixel of the raster) represented in the PDF and would rather have the file reduced in size. So I wrote an R function to automate the following:

  1. Take an existing PDF and run ps2pdf on it as an initial compression step. Often this step is all that's needed.
  2. Split it into one file per page using pdftk.
  3. Check whether each page is larger than a threshold you specify (I set 5 MB as the default).
  4. If a page is larger than the threshold, rasterize that page to a PNG file using Ghostscript and convert the PNG back to a PDF page. I used the multicore package to parallelize this step, but this isn't necessary, and that call could be replaced by lapply() to run the pages sequentially.
  5. Put the separate pages (perhaps a mix of the originals and the compressed rasters) back together.
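
The steps above shell out to several external programs, so it can help to confirm they are installed before starting. The following is a minimal sketch, assuming the tools are on the PATH under the names used in this post (ps2pdf, pdftk, gs, and ImageMagick's convert); the helper name missing_tools is hypothetical and not part of the original function.

```r
# Hypothetical helper: report which of the external tools cannot be found.
missing_tools <- function(tools = c("ps2pdf", "pdftk", "gs", "convert")) {
  paths <- Sys.which(tools)      # returns "" for any tool not found on the PATH
  names(paths)[paths == ""]
}

missing <- missing_tools()
if (length(missing) > 0) {
  message("Not found on PATH: ", paste(missing, collapse = ", "))
} else {
  message("All external tools found")
}
```

Checking up front gives a readable message instead of a string of failed system() calls partway through the run.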

Here's the function:

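   # The function header and several lines were lost from the published listing. Lines marked
   # "(reconstructed)" are assumptions inferred from the steps above, including the function name.
 shrinkpdf=function(pdf,maxsize=5,suffix="_small",verbose=TRUE){  # (reconstructed)
   td=tempfile("shrinkpdf")  # (reconstructed) fresh scratch directory for the single pages
   if(!file.exists(td)) dir.create(td)
   if(verbose) print("Performing initial compression")
   system(paste("ps2pdf ",pdf," ",td,"/test.pdf",sep=""))
   wd=setwd(td)  # (reconstructed) pdftk burst writes pg_0001.pdf, ... into the working directory
   system(paste("pdftk ",td,"/test.pdf burst",sep=""))
   file.remove("test.pdf")  # (reconstructed) keep only the single-page files
   files=list.files(pattern="\\.pdf$")  # (reconstructed)
   sizes=sapply(files,function(x) file.info(x)$size)*0.000001 #get sizes of individual pages in MB
   toobig=sizes>maxsize  # (reconstructed)
   if(verbose) print(paste("Resizing ",sum(toobig)," pages:  (",paste(files[toobig],collapse=","),")",sep=""))
   parallel::mclapply(files[toobig],function(i){  # (reconstructed) multicore's mclapply() now ships in parallel
    system(paste("gs -dBATCH -dTextAlphaBits=4 -dNOPAUSE -r300 -q -sDEVICE=png16m -sOutputFile=",i,".png ",i,sep=""))
    system(paste("convert -quality 100 -density 300 ",i,".png ",strsplit(i,".",fixed=TRUE)[[1]][1],".pdf ",sep=""))
    if(verbose) print(paste("Finished page",i))
   })  # (reconstructed)
   setwd(wd)  # (reconstructed) so the output is written relative to the original directory
   if(verbose) print("Compiling the final pdf")
   system(paste("gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=",tools::file_path_sans_ext(pdf),suffix,".pdf ",td,"/*.pdf",sep=""))
   if(verbose) print("Finished!!")
 }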
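
The size check in step 3 can be tried on its own without any of the external tools. This is a minimal sketch, assuming a hypothetical helper named pages_over_threshold and using throwaway files in place of real PDF pages; it is an illustration of the idea, not the original code.

```r
# Hypothetical helper: return the files whose size exceeds maxsize megabytes.
pages_over_threshold <- function(files, maxsize = 5) {
  sizes <- file.info(files)$size * 0.000001   # bytes to MB
  files[!is.na(sizes) & sizes > maxsize]
}

# Demonstration with two throwaway files instead of real PDF pages.
small <- tempfile(fileext = ".pdf")
big   <- tempfile(fileext = ".pdf")
writeBin(raw(1000), small)   # 1,000 bytes, about 0.001 MB
writeBin(raw(2e6), big)      # 2,000,000 bytes, 2 MB
flagged <- pages_over_threshold(c(small, big), maxsize = 1)
print(basename(flagged) == basename(big))

# Each flagged page would then be processed. lapply() does the work
# sequentially; parallel::mclapply() takes the same arguments on Unix-alikes.
results <- lapply(flagged, function(f) file.info(f)$size)
```

Keeping the selection separate from the rasterizing makes it easy to see which pages would be affected before spending time on Ghostscript.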


