Preventing escaping in HTML

April 24, 2014

(This article was first published on MATHEMATICS IN MEDICINE, and kindly contributed to R-bloggers)

library(xtable)
## 
## Attaching package: 'xtable'
##
## The following objects are masked from 'package:Hmisc':
##
## label, label<-
library(stringr)
library(whisker)

Problem statement

I am a novice in R, so the problem I faced may be a novice one, but I spent hours working on it.
I was building an HTML report from a PostgreSQL database: the report gathers text from the database and places it in an HTML document. I was using Rhtml in RStudio and inserting text at specific positions in the Rhtml document with code chunks, because Rhtml gives more control over text layout than Rmd.
One of the database fields contained text with multiple newline characters (\n), marking the places where I had pressed ENTER. An example is shown below:
  string 1\nstring 2\nstring 3\nstring 4
I ran into trouble when I tried to render this text as follows in the resulting HTML document:
  1. string 1
  2. string 2
  3. string 3
  4. string 4

The method I tried first, which failed

I assigned the whole string to a variable.
s <- "string 1\nstring 2\nstring 3\nstring 4"
I replaced each \n with <br>, the HTML tag for a line break, using str_replace_all.
s1 <- str_replace_all(string = s, pattern = "\\n", replacement = "<br>")
s1
## [1] "string 1<br>string 2<br>string 3<br>string 4"
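The same substitution can be sketched in base R, without stringr. This is only an alternative, assuming the newline is stored as a literal newline character in the string; fixed = TRUE makes gsub treat the pattern as plain text rather than as a regular expression. The name s1_base is introduced here for illustration only.

```r
# Original string with embedded newline characters
s <- "string 1\nstring 2\nstring 3\nstring 4"

# fixed = TRUE: match the newline literally, with no regex interpretation
s1_base <- gsub("\n", "<br>", s, fixed = TRUE)
s1_base
```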
Then I tried to put the string s1 into the HTML document as follows. In practice I was working with a data frame with multiple rows and wanted to present the data as a table.
print(xtable(data.frame(s1)), type = "html")
s1
1 string 1&lt br&gt string 2&lt br&gt string 3&lt br&gt string 4
The <br> tags were not converted into line breaks; they appeared verbatim in the output. I checked the underlying HTML code and found the following:
  <TABLE border=1>
<TR> <TH> </TH> <TH> s1 </TH> </TR>
<TR>
<TD align="right"> 1 </TD>
<TD> string 1&lt br&gt string 2&lt br&gt string 3&lt br&gt string 4</TD>
</TR>
</TABLE>
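What happened internally was that, when the table was printed, the < and > characters were escaped into &lt and &gt respectively, so the browser displayed them as text instead of treating <br> as a tag. The problem I faced was how to prevent the <br> from being escaped, so that the line breaks would actually be inserted.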
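To make the escaping step concrete, here is a minimal base R sketch of what an HTML escaper does. The function escape_html is a hypothetical helper written only for illustration; it is not part of xtable, and it uses the standard entity forms (&lt; and &gt;, with semicolons), which differ slightly from the form shown in the table output above.

```r
# Hypothetical helper, written only to illustrate escaping; not part of xtable
escape_html <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)  # ampersand first, so later entities are not re-escaped
  x <- gsub("<", "&lt;", x, fixed = TRUE)   # opening angle bracket
  x <- gsub(">", "&gt;", x, fixed = TRUE)   # closing angle bracket
  x
}

escaped <- escape_html("string 1<br>string 2")
escaped
```

Once the angle brackets are replaced by entities, the browser no longer sees a tag, which is why the <br> showed up as visible text.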

Method 1

I split the original string s and trimmed the resulting components.
s2 <- str_trim(unlist(str_split(s, "\\n")))
s2
## [1] "string 1" "string 2" "string 3" "string 4"
I made a data frame out of the character vector and printed the required output.
d <- data.frame(str_c(seq(from = 1, by = 1, along.with = s2), "."), s2)
print(xtable(d), type = "html", include.colnames = FALSE, include.rownames = FALSE,
      html.table.attributes = "style='border-width:0;'")
1. string 1
2. string 2
3. string 3
4. string 4
This is the required output.
However, I have not done anything to prevent <br> from being escaped; I have only bypassed the issue.
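Another way to bypass the issue, sketched here in base R only, is to build an HTML ordered list directly instead of a table, so that the browser does the numbering. This assumes the chunk writes its output into the document unmodified (for example with the knitr chunk option results = 'asis'); the names parts and html_list are illustrative. Note that trimws needs R 3.2.0 or later.

```r
s <- "string 1\nstring 2\nstring 3\nstring 4"

# Split on the literal newline and trim surrounding whitespace
parts <- trimws(strsplit(s, "\n", fixed = TRUE)[[1]])

# Wrap each component in <li> and the whole set in <ol>
html_list <- paste0("<ol>", paste0("<li>", parts, "</li>", collapse = ""), "</ol>")
cat(html_list, "\n")
```

Like Method 1, this never passes a <br> through an escaping step, so nothing needs to be protected.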

Method 2

This method uses {{Mustache}} and its R implementation, the whisker package.
I used the string s1 and took the following steps:
l <- list(s1 = s1)
html.templ <- "<table><tr><td>{{{s1}}}</td></tr></table>"
cat(whisker.render(template = html.templ, data = l))
## <table><tr><td>string 1<br>string 2<br>string 3<br>string 4</td></tr></table>
The triple braces {{{}}} prevent the <br> from being escaped.
The rendered output is as follows:
cat(whisker.render(template = html.templ, data = l))
string 1
string 2
string 3
string 4
The code is much shorter and cleaner.
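To see what the triple braces change, the following sketch renders the same value with double and with triple braces. In Mustache, double braces HTML-escape the value and triple braces insert it unchanged; the variable names data, escaped and raw are illustrative.

```r
library(whisker)

data <- list(s1 = "string 1<br>string 2")

# Double braces: the value is HTML-escaped before insertion
escaped <- whisker.render("<td>{{s1}}</td>", data)

# Triple braces: the value is inserted as-is
raw <- whisker.render("<td>{{{s1}}}</td>", data)

cat(escaped, "\n")
cat(raw, "\n")
```

Only the triple-brace version keeps <br> as a real tag, so it should be reserved for text that is trusted to contain safe HTML.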

Concluding remarks

If there are other techniques to prevent < and > from being escaped when rendering HTML, I would be glad to hear about them.
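One candidate, offered as a sketch rather than a tested fix for the original report, is the sanitize.text.function argument of print.xtable. Passing a function that returns its input unchanged should switch off the escaping of cell text. The variable html_out is illustrative, and capture.output is used here only so the generated HTML can be inspected.

```r
library(xtable)

s1 <- "string 1<br>string 2<br>string 3<br>string 4"

# sanitize.text.function replaces xtable's default escaping of cell text;
# function(x) x leaves the text untouched
html_out <- capture.output(
  print(xtable(data.frame(s1)), type = "html",
        sanitize.text.function = function(x) x)
)
cat(html_out, sep = "\n")
```

With escaping switched off, any <, > or & in the data is passed through untouched, so this is only appropriate for text that is trusted to contain safe HTML.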

Session Information

sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-pc-linux-gnu (64-bit)
##
## locale:
## [1] LC_CTYPE=en_IN.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_IN.UTF-8 LC_COLLATE=en_IN.UTF-8
## [5] LC_MONETARY=en_IN.UTF-8 LC_MESSAGES=en_IN.UTF-8
## [7] LC_PAPER=en_IN.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] datasets grid grDevices splines graphics utils stats
## [8] methods base
##
## other attached packages:
## [1] whisker_0.3-2 xtable_1.7-1 knitr_1.5 mypackage_1.0
## [5] devtools_1.4.1 dplyr_0.1.2 ggplot2_0.9.3.1 rms_4.0-0
## [9] SparseM_0.99 Hmisc_3.13-0 Formula_1.1-1 cluster_1.14.4
## [13] car_2.0-19 stringr_0.6.2 lubridate_1.3.3 lattice_0.20-24
## [17] epicalc_2.15.1.0 nnet_7.3-7 MASS_7.3-29 survival_2.37-4
## [21] foreign_0.8-57 deSolve_1.10-8
##
## loaded via a namespace (and not attached):
## [1] assertthat_0.1 colorspace_1.2-4 dichromat_2.0-0
## [4] digest_0.6.4 evaluate_0.5.1 formatR_0.10
## [7] gtable_0.1.2 httr_0.2 labeling_0.2
## [10] memoise_0.1 munsell_0.4.2 parallel_3.0.2
## [13] plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5
## [16] Rcpp_0.11.0 RCurl_1.95-4.1 reshape2_1.2.2
## [19] scales_0.2.3 tools_3.0.2
Bye and regards.
