RghcnV3 2.0

August 7, 2011
(This article was first published on Steven Mosher's Blog, and kindly contributed to R-bloggers)

Well, version 2.0 is in the can and I’ll be uploading it to CRAN over the next couple of days. Let’s go over the highlights. Prior to version 2.0 we had basically three kinds of data flowing around the package: the V3 14-column format, zoo objects and mts objects. The 14-column format has always been a PITA, and much of the code was designed to provide ways to transform it into 2D zoo or mts objects with station data organized into columns. After reviewing some of Nick’s code it became clear that there was a way to get rid of the 14-column data and streamline a bunch of the code. Going forward there are three types of objects: Zoo and Mts, which are 2D representations of station data, and Nick’s 3D version, which is an array. On input you select which style you like, and the readV3Data() function has been restructured for both speed and configurability:

readV3Data(filename = "foo", output = c("Array", "Zoo", "Mts"))

On ingest you decide what format you want to work in. If you change your mind, there is a set of functions to handle the transformation: asZoo(), asMts() and asArray(), and of course a set of logical functions to determine types. The core analysis functions have also been rewritten to accept ANY of these three object types. So passesCamZoo(), passesCamMts(), etc. have all been replaced with one function, passesCam(), which accepts all three types of objects and just works. Some functions, such as Roman’s function, Tamino’s function and rasterizeZoo(), still have limited input: they require, for example, an Mts input or a Zoo input. I’ll probably enhance those functions in another release and then it’s done.
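To make the difference between the 2D and 3D layouts concrete, here is a minimal base R sketch. It is not the package's code: arrayToMts(), mtsToArray() and dataLayout() are hypothetical names made up for illustration, the toy data is invented, and the station x month x year ordering of the array is an assumption about how such an array might be arranged.

```r
# Hypothetical toy data: 2 stations, 12 months, 3 years (1901-1903).
stations <- c("ST001", "ST002")
years <- 1901:1903
temps <- array(
  seq_len(2 * 12 * 3) / 10,
  dim = c(2, 12, 3),
  dimnames = list(station = stations, month = month.abb, year = years)
)

# 3D -> 2D: one row per month in time order, one column per station.
# aperm() moves the station dimension last, so that flattening the result
# gives month (fastest), then year, then station.
arrayToMts <- function(a, startYear) {
  m <- matrix(as.vector(aperm(a, c(2, 3, 1))), ncol = dim(a)[1])
  colnames(m) <- dimnames(a)[[1]]
  ts(m, start = c(startYear, 1), frequency = 12)
}

# 2D -> 3D: the inverse reshaping (assumes whole years of monthly data).
mtsToArray <- function(x) {
  nYears <- nrow(x) / 12
  a <- array(as.vector(unclass(x)), dim = c(12, nYears, ncol(x)))
  aperm(a, c(3, 1, 2))
}

# Report which of the three layouts an object uses. inherits() only inspects
# the class attribute, so the zoo package need not be loaded for the check.
dataLayout <- function(x) {
  if (inherits(x, "zoo")) {
    "Zoo"
  } else if (is.mts(x)) {
    "Mts"
  } else if (is.array(x) && length(dim(x)) == 3) {
    "Array"
  } else {
    stop("unrecognized station data layout")
  }
}

x <- arrayToMts(temps, 1901)
dataLayout(x)      # the 2D layout
dataLayout(temps)  # the 3D layout
```

A function that wants to accept any of the three types can start with a check like dataLayout() and convert to whichever layout it prefers to work in; that is one plausible way to get a single passesCam()-style entry point, though the package may well do it differently.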

With 2.0 you thus have these kinds of paths:

readV3Data(); passesCam(); anomalize(); rasterizeZoo() and you have stations selected by CAM and area averaged

readV3Data(); averageStations(); rasterizeCells() and you have stations estimated by Roman’s regression by grid cell

readV3Data(); inverseDensity(); solveTemperature() and you have Nick Stokes’ solution

and then you have Tamino’s approach as well. What we know after all this is that the methods of computing averages for temperature stations yield the same global answers. There may, however, be slightly different answers if you look at smaller regions or have data that is too sparse in the temporal domain for CAM.
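For readers who have not met CAM (the common anomaly method), the idea is to subtract from each station its own mean for each calendar month over a fixed base period, which is why a station with too little data in that period is a problem. The base R sketch below illustrates the idea on a station x month x year array. It is not the package's passesCam() or anomalize(): the function names, the 1961-1990 base period, the 15-year threshold and the toy data are all assumptions chosen for illustration.

```r
# Hypothetical toy data: 3 stations, 12 months, 1951-2000.
set.seed(1)
years <- 1951:2000
temps <- array(
  rnorm(3 * 12 * length(years), mean = 10),
  dim = c(3, 12, length(years)),
  dimnames = list(paste0("ST", 1:3), month.abb, years)
)

# TRUE for stations with at least minYears non-missing values in every
# calendar month of the base period (the threshold is an assumption).
hasBaseCoverage <- function(a, years, basePeriod = c(1961, 1990),
                            minYears = 15) {
  inBase <- years >= basePeriod[1] & years <= basePeriod[2]
  counts <- apply(!is.na(a[, , inBase, drop = FALSE]), c(1, 2), sum)
  apply(counts >= minYears, 1, all)
}

# Subtract each station's base-period mean for each calendar month.
anomaliesCam <- function(a, years, basePeriod = c(1961, 1990)) {
  inBase <- years >= basePeriod[1] & years <= basePeriod[2]
  # station x month matrix of base-period means
  baseline <- apply(a[, , inBase, drop = FALSE], c(1, 2), mean, na.rm = TRUE)
  # sweep() subtracts the baseline from every year of the array
  sweep(a, c(1, 2), baseline)
}

keep <- hasBaseCoverage(temps, years)
anoms <- anomaliesCam(temps[keep, , , drop = FALSE], years)
```

By construction, each station's anomalies average to zero over the base period for every calendar month, so stations with different absolute temperatures can be combined on a common footing.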

Next Steps:

There are several things I want to do and a few things I have to do.

1. Start work on GHCN Daily

2. Incorporate Zeke’s paired station approach

3. Do some more work on CHCN

4. OOP: S4 classes and methods

5. Incorporate more of Nick’s work

6. Metadata package

7. Demos and studies.

 

