How to Conditionally Remove a Character from a Vector Element in R

Posted on October 30, 2015 by jdm

[This article was first published on R – The Hack-R Blog, and kindly contributed to R-bloggers.]

I have (sometimes incomplete) data on addresses that looks like this:

data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

where I need to remove the first and/or last character of each element if that character is a comma.

Avinash Raj was able to help me with this on S.O. and the question turned out to be a popular one, so I’ll show the solution here:
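A minimal sketch of that solution, assembled from the three pattern pieces explained below. The call to gsub() with perl = TRUE is an assumption here: base R's default regex engine does not support lookarounds, so the Perl-compatible engine is needed for this pattern.

```r
# Sample addresses; some elements start and/or end with a stray comma.
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

# Remove a comma at the very start or very end of each element.
# perl = TRUE enables the lookbehind (?<=...) and lookahead (?=...) syntax.
cleaned <- gsub("(?<=^),|,(?=$)", "", data, perl = TRUE)

print(cleaned)
```

Commas inside an address, such as the one in "11 Wall Street, New York, NY", are left alone because they are neither at the start nor at the end of the string.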

(?<=^), In regex, (?<=...) is called a positive lookbehind. Here it asserts that what precedes the comma must be the start of the line, ^. So it matches a comma at the start of the string.

| is the logical OR (alternation) operator, used to combine two regexes so that either one can match.

,(?=$) The lookahead (?=...) asserts that what follows the comma must be the end of the line, $. So it matches a comma at the end of the string.
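Since both lookarounds wrap only the zero-width anchors ^ and $, the plain anchors on their own should give the same result. The comparison below is a small sketch and not part of the original answer; the variable names are illustrative.

```r
# Same sample data as above, repeated so this snippet runs on its own.
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

# Pattern with lookarounds; needs the Perl-compatible engine.
with_lookarounds <- gsub("(?<=^),|,(?=$)", "", data, perl = TRUE)

# Plain anchors: a comma right after the start, or right before the end.
# This form also works with R's default regex engine.
plain_anchors <- gsub("^,|,$", "", data)

# Check that both patterns produce the same cleaned vector.
print(identical(with_lookarounds, plain_anchors))
```

Either form strips at most one comma from each end; an element with several leading or trailing commas would need a quantifier such as ^,+|,+$.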
