Formatting Decimals in Texts with R

August 30, 2009

(This article was first published on Keep on Fighting! » R Language, and kindly contributed to R-bloggers)

Yanping Chen raised a question in the Chinese COS forum about EViews output: how can we (re)format the decimal coefficients of equations that are output as text? For example, we want to round the numbers in CC = 16.5547557654 + 0.0173022117998 * PP + 0.216234040485 * PP(-1) + 0.810182697599 * (WP + WG) to three decimal places. This is easy to do with regular expressions, since the decimal part of a number always begins with a “.”. The basic steps are:

  1. find where the decimals are in the character string;
  2. format them;
  3. replace the original decimals with the formatted values.
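
The three steps can be sketched on a single string before wrapping them in a function. This is only an illustration, not the code from the post: it assumes base R's regmatches() and its replacement form are available, and the names s, m, dec and fmt are made up for the example.

```r
# One equation as a plain character string
s <- "CC = 16.5547557654 + 0.0173022117998 * PP"

# Step 1: locate the decimal parts (a "." followed by digits)
m <- gregexpr("\\.[0-9]+", s)

# Step 2: extract them and format each one to 3 decimal places
dec <- regmatches(s, m)[[1]]      # ".5547557654" ".0173022117998"
fmt <- formatC(as.numeric(dec), digits = 3, format = "f")
fmt <- substring(fmt, 2)          # drop the leading "0": ".555" ".017"

# Step 3: write the formatted values back over the original matches
regmatches(s, m) <- list(fmt)
s
# "CC = 16.555 + 0.017 * PP"
```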

Given a character vector, we can format the decimals with the code below:

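# x: equations; FUN: formatting function; ...: passed to FUN
coefFormat = function(x, FUN, ...) {
    sapply(x, function(s) {
        # extract every decimal part (a "." followed by digits) from s
        dig = sapply(gregexpr("\\.[0-9]+", s), function(m) {
            sapply(seq(along = m), function(i) {
                substr(s, m[i], m[i] + attr(m, "match.length")[i] - 1)
            })
        })
        # dig is a matrix only when s holds two or more decimals;
        # otherwise the loop is skipped and s is returned unchanged
        for (j in {
            if (is.null(dim(dig)))
                NULL
            else 1:dim(dig)[1]
        }) {
            # format the decimal, drop the leading "0", and put it back
            s = sub(dig[j, 1], substring(FUN(as.numeric(dig[j, 1]), ...),
                2), s, fixed = TRUE)
        }
        s
    })
}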

I used sapply() three times to avoid explicit loops, but as a result the code may be hard to read. The critical part is the regular expression “\.[0-9]+” (written "\\.[0-9]+" in R code), which means one or more (controlled by the “+” after “[0-9]”) digits (“[0-9]” or “[[:digit:]]”) after a decimal point “.”. As “.” is a metacharacter in regular expressions, we need to put a backslash before it, and since “\” is itself a special character in R strings, we need another backslash to denote a backslash. o:-)
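
A small check of the escaping rules may help here. It is a sketch using only base R functions; the test string and the variable names pat, txt, esc and unesc are invented for the illustration.

```r
# "\\." in R source is the two-character string \. once parsed
pat <- "\\.[0-9]+"
nchar("\\.")                      # 2
cat(pat, "\n")                    # prints \.[0-9]+

txt <- "PP(-1) + 0.216"

# Escaped dot: only a literal "." followed by digits matches
esc <- regmatches(txt, regexpr(pat, txt))          # ".216"

# Unescaped dot: "." matches any character, so "-1" is found first
unesc <- regmatches(txt, regexpr(".[0-9]+", txt))  # "-1"

# "[[:digit:]]" is the POSIX character class equivalent of "[0-9]"
same <- grepl("\\.[[:digit:]]+", "0.216")          # TRUE
```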

x = readLines(zz <- textConnection(
"CC = 16.5547557654 + 0.0173022117998 * PP + 0.216234040485 * PP(-1) + 0.810182697599 * (WP + WG)

II = 20.2782089394 + 0.150221823899 * PP + 0.61594357734 * PP(-1) - 0.157787636546 * KK

WP = 1.50029688603 + C(10) * XX + 0.146673821502 * XX(-1) + 0.130395687204 * AA
"))
close(zz)

writeLines(coefFormat(x, round, digits = 3))
#  CC = 16.555 + 0.017 * PP + 0.216 * PP(-1) + 0.81 * (WP + WG)
#
#  II = 20.278 + 0.15 * PP + 0.616 * PP(-1) - 0.158 * KK
#
#  WP = 1.5 + C(10) * XX + 0.147 * XX(-1) + 0.13 * AA
#
writeLines(coefFormat(x, formatC, digits = 3, format = "f"))
#  CC = 16.555 + 0.017 * PP + 0.216 * PP(-1) + 0.810 * (WP + WG)
#  
#  II = 20.278 + 0.150 * PP + 0.616 * PP(-1) - 0.158 * KK
#  
#  WP = 1.500 + C(10) * XX + 0.147 * XX(-1) + 0.130 * AA
#
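
As a possible variation, the whole number (integer part plus decimals) can be matched and formatted in one go, which also copes with rounding that carries into the integer part (0.9996 becoming 1.000). The helper below is hypothetical and not part of the original post: the name coefFormat2 and the example vector eq are assumptions, and it relies on base R's regmatches() replacement form.

```r
# Hypothetical helper (not from the original post): format whole numbers
# such as "16.5547557654" instead of only their decimal parts.
coefFormat2 <- function(x, FUN, ...) {
    m <- gregexpr("[0-9]+\\.[0-9]+", x)   # one match object per element of x
    regmatches(x, m) <- lapply(regmatches(x, m), function(num) {
        # elements without any decimal number are left untouched
        if (length(num) == 0) return(character(0))
        as.character(FUN(as.numeric(num), ...))
    })
    x
}

eq <- c("II = 20.2782089394 + 0.150221823899 * PP",
        "WP = C(10) * XX",
        "Y = 0.9996 * Z")
out <- coefFormat2(eq, formatC, digits = 3, format = "f")
writeLines(out)
#  II = 20.278 + 0.150 * PP
#  WP = C(10) * XX
#  Y = 1.000 * Z
```

Because the replacement is positional rather than a search for the matched text, repeated or overlapping decimals in one equation should not interfere with each other.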
