Creating ‘Tags’ For PPC Keywords

[This article was first published on Mathew Analytics » R, and kindly contributed to R-bloggers].

When performing search engine marketing, it is usually beneficial to construct a system for making sense of keywords and their performance. While one could construct Bayesian belief networks to model the process of consumers clicking on ads, I have found that using ‘tags’ to categorize keywords is just as useful for conducting post-hoc analysis on the effectiveness of marketing campaigns. By ‘tags,’ I mean identifiers which categorize keywords according to their characteristics. For example, the following data frame has six keywords, their average bids, numbers of clicks and conversions, and tag columns for state, model, car, auto, save, and cheap. What we want to do now is set each tag to 1 if and only if one of that tag's terms appears in the keyword.

# CREATE SOME DATA
df = data.frame(keyword=c("best car insurance",
                          "honda auto insurance",
                          "florida car insurance",
                          "cheap insurance online",
                          "free insurance quotes",
                          "iowa drivers save money"),
                average_bid=c(3.12, 2.55, 2.38, 5.99, 4.75, 4.59),
                clicks=c(15, 20, 30, 50, 10, 25),
                conversions=c(5, 2, 10, 15, 3, 5),
                state=0, model=0, car=0, auto=0, save=0, cheap=0)
df

# FUNCTION WHICH SETS EACH TAG TO 1 IF THE SPECIFIED TAG IS PRESENT IN THE KEYWORD
main <- function(df) {
  state <- c("michigan", "missouri", "florida", "iowa", "kansas")
  model <- c("honda", "toyota", "ford", "acura", "audi")
  car <- c("car")
  auto <- c("auto")
  save <- c("save")
  cheap <- c("cheap")
  for (i in 1:nrow(df)) {
    Words = strsplit(as.character(df[i, 'keyword']), " ")[[1]]
    if(any(Words %in% state)) df[i, 'state'] <- 1
    if(any(Words %in% model)) df[i, 'model'] <- 1 
    if(any(Words %in% car)) df[i, 'car'] <- 1
    if(any(Words %in% auto)) df[i, 'auto'] <- 1     
    if(any(Words %in% save)) df[i, 'save'] <- 1
    if(any(Words %in% cheap)) df[i, 'cheap'] <- 1
  }
  return(df)
}

one = main(df)

# The tag columns here are numeric 0/1, so compare against 1
subset(one, state == 1 | model == 1 | auto == 1)
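
The six near-identical `if` lines above could also be driven from a single named list of tag definitions, which makes it easier to add a tag later. The following is only a sketch under that assumption: the list `tags`, the helper `tag_keywords`, and the small data frame `kw` are hypothetical names introduced for illustration, not part of the original code.

```r
# Tag definitions kept in one named list (same terms as in the loop version)
tags <- list(
  state = c("michigan", "missouri", "florida", "iowa", "kansas"),
  model = c("honda", "toyota", "ford", "acura", "audi"),
  car   = "car",
  auto  = "auto",
  save  = "save",
  cheap = "cheap"
)

# Hypothetical helper: add one 0/1 column per entry in `tags`
tag_keywords <- function(df, tags) {
  # Split every keyword into its whole words once
  words <- strsplit(as.character(df$keyword), " ")
  for (tag in names(tags)) {
    # 1 if any word of the keyword is one of the tag's terms, else 0
    df[[tag]] <- vapply(words,
                        function(w) as.numeric(any(w %in% tags[[tag]])),
                        numeric(1))
  }
  df
}

# Small illustrative sample, using three of the keywords from above
kw <- data.frame(keyword = c("best car insurance",
                             "honda auto insurance",
                             "iowa drivers save money"))
tag_keywords(kw, tags)
```

Like the loop version, this matches whole words only, because each keyword is split on spaces before the comparison.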

# AN ALTERNATE METHOD USING THE STRINGR PACKAGE

df

library(stringr)

# CREATE EACH TAG
state <- c("michigan", "missouri", "florida", "iowa", "kansas")
model <- c("honda", "toyota", "ford", "acura", "audi")
car <- c("car")
auto <- c("auto")
save <- c("save")
cheap <- c("cheap")

state_match <- str_c(state, collapse = "|")
model_match <- str_c(model, collapse = "|")
car_match <- str_c(car, collapse = "|")
auto_match <- str_c(auto, collapse = "|")
save_match <- str_c(save, collapse = "|")
cheap_match <- str_c(cheap, collapse = "|")

# FUNCTION TO SET TAG IF PRESENT IN THE KEYWORD
main <- function(df) {
  df$state <- str_detect(df$keyword, state_match)
  df$model <- str_detect(df$keyword, model_match)
  df$car <- str_detect(df$keyword, car_match)
  df$auto <- str_detect(df$keyword, auto_match)
  df$save <- str_detect(df$keyword, save_match)
  df$cheap <- str_detect(df$keyword, cheap_match)
  df
}

two = main(df)

subset(two, state==TRUE | model==TRUE | auto==TRUE)
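
One caveat with the stringr version: `str_detect` matches substrings, so the pattern `car` would also flag a keyword such as "carpet cleaning quotes", whereas the loop version only matches whole words. If whole-word matching is what you want, one possible fix is to wrap each alternation in regex word boundaries. This is a minimal sketch under that assumption; the helper name `word_pattern`, the variable `car_word`, and the two sample keywords are hypothetical.

```r
library(stringr)

# Hypothetical helper: build a whole-word regex from a vector of tag terms
word_pattern <- function(terms) {
  str_c("\\b(", str_c(terms, collapse = "|"), ")\\b")
}

car_word <- word_pattern(c("car"))

# Illustrative keywords; the second one contains "car" only as a substring
keywords <- c("florida car insurance", "carpet cleaning quotes")

# Plain substring pattern flags both keywords
str_detect(keywords, "car")

# Whole-word pattern flags only the first keyword
str_detect(keywords, car_word)
```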

By now, some of you are probably wondering why we don’t just select the keywords directly from the original data frame based on the desired characteristic. That works too, although the marketing professionals I’ve worked with have preferred the ‘tagging’ method.

# AN ALTERNATE APPROACH: SELECT DIRECTLY

df

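main <- function(df) {
  model <- c("honda", "toyota", "ford", "acura", "audi")
  # Flag every matching row instead of returning at the first match
  keep <- logical(nrow(df))
  for (i in seq_len(nrow(df))) {
    Words = strsplit(as.character(df[i, 'keyword']), " ")[[1]]
    keep[i] <- any(Words %in% model)
  }
  # Return keyword, average_bid, clicks, and conversions for the matches
  df[keep, 1:4]
}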

three = main(df)

So there you have it, a method of ‘tagging’ strings according to a certain set of specified characteristics. The benefit of using ‘tags’ is that it provides you with a systematic way to document how the presence of certain words or phrases impacts performance.
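
To make that last point concrete, here is a minimal sketch of how the tagged data could be summarized per tag, assuming conversion rate (conversions divided by clicks) is the metric of interest. The data frame `tagged` re-creates the tagged sample data by hand so that the block runs on its own, and the helper name `tag_summary` is hypothetical.

```r
# Re-create the tagged sample data (tag columns as the loop version sets them)
tagged <- data.frame(
  keyword = c("best car insurance", "honda auto insurance",
              "florida car insurance", "cheap insurance online",
              "free insurance quotes", "iowa drivers save money"),
  clicks = c(15, 20, 30, 50, 10, 25),
  conversions = c(5, 2, 10, 15, 3, 5),
  state = c(0, 0, 1, 0, 0, 1),
  model = c(0, 1, 0, 0, 0, 0),
  car   = c(1, 0, 1, 0, 0, 0),
  auto  = c(0, 1, 0, 0, 0, 0),
  save  = c(0, 0, 0, 0, 0, 1),
  cheap = c(0, 0, 0, 1, 0, 0)
)

# Hypothetical helper: total clicks, conversions, and conversion rate per tag
tag_summary <- function(df, tags) {
  rows <- lapply(tags, function(tag) {
    hit <- df[[tag]] == 1   # also works when the tag column is TRUE/FALSE
    data.frame(tag = tag,
               keywords = sum(hit),
               clicks = sum(df$clicks[hit]),
               conversions = sum(df$conversions[hit]))
  })
  out <- do.call(rbind, rows)
  out$conversion_rate <- out$conversions / out$clicks
  out
}

tag_summary(tagged, c("state", "model", "car", "auto", "save", "cheap"))
```

A keyword can carry more than one tag, so the per-tag totals overlap and should not be added together.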

