(This article was first published on **R – The Hack-R Blog**, and kindly contributed to R-bloggers)

I have (sometimes incomplete) data on addresses that looks like this:

```
data <- c("1600 Pennsylvania Avenue, Washington DC",
",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")
```

where I need to remove the first and/or last character of each string if that character is a comma.

Avinash Raj was able to help me with this on S.O. and the question turned out to be a popular one, so I’ll show the solution here:

```
> data <- c("1600 Pennsylvania Avenue, Washington DC",
+ ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")
> gsub("(?<=^),|,(?=$)", "", data, perl=TRUE)
[1] "1600 Pennsylvania Avenue, Washington DC"
[2] "Siem Reap,FC"
[3] "11 Wall Street, New York, NY"
[4] "Addis Ababa,FC"
```

**Pattern explanation:**

`(?<=^),` In regex, `(?<=)` is called a positive look-behind. In our case it asserts that what precedes the comma must be a line start, `^`. So it matches the starting comma.

`|` Logical OR operator, usually used to combine (i.e., OR) two regexes.

`,(?=$)` The lookahead asserts that what follows the comma must be a line end, `$`. So it matches the comma present at the line end.
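Because `^` and `$` are themselves zero-width anchors, the lookarounds are not strictly needed here. The following is a minimal sketch, not part of the original answer, that compares the lookaround pattern with a plain-anchor pattern on the same sample data. The variable names `with_lookarounds` and `with_anchors` are made up for illustration.

```r
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

# Pattern from the answer above; lookarounds require perl = TRUE
with_lookarounds <- gsub("(?<=^),|,(?=$)", "", data, perl = TRUE)

# Plain anchors: "^," matches a leading comma, ",$" matches a trailing comma
with_anchors <- gsub("^,|,$", "", data)

# Check whether both patterns give the same result on the sample data
identical(with_lookarounds, with_anchors)
```

The plain-anchor version also works with R's default regex engine, so `perl=TRUE` can be dropped if the lookarounds are not wanted for other reasons.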
