Introducing the Redwall ‘Red Flag’ Explorer with New Constructs Data

[This article was first published on R on Redwall Analytics, and kindly contributed to R-bloggers.]

Red Flag Explorer

Click to see R set-up code
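# Libraries
if(!require("pacman")) {
  install.packages("pacman")
}
pacman::p_load(data.table, ggplot2, plotly, DT, re2, scales, magrittr)

# Set knitr params
knitr::opts_chunk$set(
  comment = NA,
  fig.width = 12,
  fig.height = 8,
  out.width = '100%'
)

# Load annual data only
path <- "..."  # set to the local data directory
red_flags <- 
  readRDS(paste0(path, "nc_annual_red_flags"))
annual_data <- 
  readRDS(paste0(path, "nc_annual_final"))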

Key Findings

  • 1999-2000 was an exceptional period for both “Red Flag” prevalence and return differentiation, though apparent benefits of the strategy appear in most periods.

  • Approximately 2.0% of filings we checked had 5 or more “Red Flags” among annual and quarterly filings, so sparsity is a challenge in estimating true relative returns.

  • Finding price histories for de-listed companies in open source channels is a challenge, which likely creates “survivor bias” in relative return estimates.

  • The prevalence of accounting distortion has declined considerably since 1999-2000.


A few months ago in A Blueprint of “Red Flag” alerts Using Adjusted Earnings Data, we mused about using New Constructs historical data to back-test ideas from a 2003 CFA Conference Proceedings article entitled Revelations from Financial Reporting by Bruce Gulliver. The article succinctly offered the theory that, in the aftermath of the collapse of the “Dot-com Bubble”, impending losses for many stocks might have been avoided by using a simple set of financial statement ratios, collectively known as “red flags”. Mr. Gulliver’s work has always stuck with us, especially now that R can easily be used to test the hypothesis and to take the analysis to a different scale and level of interactivity. The only other missing piece was thousands of companies with consistently and meticulously adjusted financial statement data over a very long period, which, as far as we know, only resides at New Constructs.

After our blog post, New Constructs kindly offered to let us use their data to test the theory on a large sample of companies over the last twenty years, a period that includes two of the great stock market busts and the recent Covid-19 related volatility. In this post, we will describe our analysis, summarize the benefits of the unparalleled New Constructs data, and introduce the interactive Red Flag Explorer Shiny app we have built for anyone who would like to interact with our derived “red flags” and measure their performance.

It is striking that analysis on this scale (i.e., analyzing ~125,000 unique financial statements), which for most of our years following markets was impossible for a regular person with a computer to produce (much less share with others), can now be conducted in a few weeks of coding. We think our “red flags”, built on top of New Constructs’ data, represent a unique historical fingerprint of the market’s reaction to accounting disclosures in the controversial post-2000 period.

Red Flag Calculations

The methodology for generating “red flags” was close to what we laid out in A Blueprint of “Red Flag” alerts Using Adjusted Earnings Data, but there were several differences.

  • We didn’t fully understand the Revelations cash flow adjustments, which primarily had to do with timing differences in the expensing of employee share options. According to Mr. Gulliver’s article, this became very significant at many companies during the Dot-com Bubble, but we are not sure if it is still so, and we didn’t have comparable data from New Constructs to calculate it. If we were to do it again, we would request the data to calculate the difference between New Constructs’ “True Free Cash Flow” and “Traditional Free Cash Flow”, as laid out in The Most Overstated And Understated FCF In The S&P 500 Post 1Q21 Earnings by David Trainer (CEO of New Constructs) in the June 14th Forbes issue. For now, our app won’t have a “red flag” pertaining to cash flow.

  • Asset turnover was calibrated relative to other companies in the same sector, but considering the full 20+ year period when determining unfavorable ratios.

  • Trend-related variables (ie: increasing days of inventory/receivables, declining margins and ROIC and declining reserves/sales ratios) were calculated by taking more than one year of change into account to give an added penalty when the negative trend was persistent. This had a cost of losing the two periods at the very beginning of the series (ie: 1997-1998), because those were needed for the look-back.

  • The High Valuation “red flag” was calculated using New Constructs’ three valuation-based Stock ratings (FCF Yield, Price-to-Economic Book Value and Market Implied Growth Appreciation Period), themselves derived variables, rather than raw valuation metrics. New Constructs does not use the traditional GAAP earnings and book value ratios pointed to in Revelations to determine valuation, for reasons discussed in the Basic Metrics section of the New Constructs website.

  • We don’t know what Mr. Gulliver would have used, but we defined “high” Earnings Distortion using an equal-weighted combination of two measures of the divergence of NC-adjusted from reported net earnings: the divergence divided by market capitalization, and the divergence divided by the absolute value of reported net earnings. Filings with the highest combined divergence were flagged.

  • For liquidity, we first checked whether a company had “excess cash”, a New Constructs derived variable measuring the amount of cash over and above what is needed to conduct operations. If the company did not have “excess cash”, we then used several credit metrics similar to those discussed in Revelations, but based on the comparable New Constructs derived financial statement items.

  • We also added two “red flags” of our own: one for companies with amended filings (by far the least common “red flag”), and one for companies that already had more than two flags and still showed a filing-on-filing increase in total flags.
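
To make the look-back cost of the trend-related flags concrete, here is a minimal base R sketch. The margin series and the scoring rule (one point for a decline, a second point when the prior year also declined) are our own hypothetical stand-ins and not the exact rule used in the app. The point is only that a persistence check needs two prior observations, so the first two periods cannot be scored.

```r
# Hypothetical margin series for one company, one value per fiscal year
fiscal_year <- 1997:2002
margin <- c(0.20, 0.21, 0.19, 0.17, 0.18, 0.16)

# Year-on-year change; the first year has no prior value to compare with
chg_1y <- c(NA, diff(margin))

# The change one year earlier, needed to judge whether a decline persists
chg_prev <- c(NA, head(chg_1y, -1))

# Assumed scoring: 1 point for a decline this year, plus 1 more point if
# the prior year also declined; periods without a full look-back are NA
trend_score <- ifelse(
  is.na(chg_prev),
  NA_integer_,
  (chg_1y < 0) + (chg_1y < 0 & chg_prev < 0)
)

data.frame(fiscal_year, margin, trend_score)
```

In this toy series, 1997 and 1998 come back as NA because the look-back is incomplete, which mirrors the loss of the first two periods described above.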

Thoughts on Red Flags

We don’t know what thresholds Mr. Gulliver would have used in his calculations, but where we had discretion, our “red flags” were calibrated to occur in about 20% of filings over the “whole period” (as shown in Figure 1 below). Because some flags were less frequent, the average rate of occurrence over all 10 flags was 16.3%. If flags were raised independently, this would translate into a 1.4% probability of having 5 or more flags (based on the binomial distribution), but approximately 2.0% of filings had that many flags. This may be a sign that some flags contribute to the likelihood of others and are not independent.
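
The binomial figure can be checked in a couple of lines of base R. This sketch assumes, as a simplification, that every one of the 10 flags fires independently at the same 16.3% average rate; the variable names are ours.

```r
n_flags <- 10     # number of flags, as stated in the text
p_flag <- 0.163   # average rate of occurrence across the flags

# P(X >= 5) for X ~ Binomial(10, 0.163), i.e. 1 - P(X <= 4)
p_five_plus <- pbinom(4, size = n_flags, prob = p_flag, lower.tail = FALSE)

round(100 * p_five_plus, 1)  # percent of filings expected to show 5+ flags
```

Since the individual flags do not all occur at the same rate, this is only an approximation of the independence benchmark, but it reproduces the roughly 1.4% quoted above.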

When we say “over the whole period” above, this is significant because it means that the cut-off for a “red flag” is the same regardless of the reporting period. Another option would have been to calculate cut-offs by period (i.e., flagging a similar share of filings for each red flag in each period), but that would have taken away the ability to compare behavior over time. We did not try to set “informed” threshold levels, where evidence supports likely problems beyond a certain value. A flag simply means that the selected metric deviated negatively relative to the large majority of filings during the 20-year period. Further work in this regard might improve the quality of the signalling.
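
The difference between the two calibration choices can be illustrated with simulated data. In this base R sketch the data frame, the metric and the deliberately weaker early period are all hypothetical; only the 20% target comes from the text.

```r
set.seed(1)

# Hypothetical metric where lower is worse; earlier years are made weaker
filings <- data.frame(
  fiscal_year = rep(c(2000, 2010, 2020), each = 1000),
  metric = c(rnorm(1000, mean = -0.5),
             rnorm(1000, mean = 0),
             rnorm(1000, mean = 0.5))
)

# Whole-period cut-off: one 20th-percentile threshold for all filings
cutoff_all <- quantile(filings$metric, 0.20)
filings$flag_whole <- filings$metric < cutoff_all

# Per-period alternative: a separate 20th-percentile threshold each year
cutoff_by_year <- ave(filings$metric, filings$fiscal_year,
                      FUN = function(x) quantile(x, 0.20))
filings$flag_by_year <- filings$metric < cutoff_by_year

# Share of filings flagged in each year under the two approaches
share_whole <- tapply(filings$flag_whole, filings$fiscal_year, mean)
share_by_year <- tapply(filings$flag_by_year, filings$fiscal_year, mean)
rbind(share_whole, share_by_year)
```

With per-period cut-offs every year shows about 20% of filings flagged by construction, so any change in prevalence over time disappears. The whole-period cut-off keeps the overall rate near 20% while letting weak years show more flags.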

Click to see R plot code
# Melt on logical cols as measure
cols <- names(red_flags)[sapply(red_flags, is.logical)]

# Average each flag by fiscal_year, then melt to long
red_flags_long <-
  red_flags[
    data.table::between(fiscal_year, 1999, 2020),
    lapply(.SD, mean),
    .SDcols = cols,
    by = fiscal_year
    ][, data.table::melt(.SD, measure.vars = cols)]

# Make basic ggplot on x = fiscal_year and y = share of filings flagged
p <- 
  ggplot2::ggplot(
    red_flags_long,
    ggplot2::aes(x = fiscal_year,
                 y = value,
                 color = variable,
                 fill = variable)) +
  ggplot2::geom_line() +
  ggplot2::geom_point(size = 1) +
  ggplot2::labs(
    x = "Fiscal Year",
    y = "Percentage of Filings") +
  ggplot2::scale_y_continuous(
    labels = scales::percent)

# Render as plotly and add customized flag labels to plotly object
p <- plotly::plotly_build(p)

names <- c(
  "Low Return",
  "Earns. Distort.",
  "Reserve Decline",
  "Days Inv or A/R ",
  "Mgn & ROIC decline",
  "Asset Turns",
  "High Val'n",
  "Poor Liquid.",
  "Neg. Trend")

# Flag column names, in the same order as the plotted series
vars <- levels(red_flags_long$variable)

# Add red flag labels and tooltip to Plotly object
for (i in seq_along(names)) {
  p$x$data[[i]]$name <- names[i]
  d <- 
    red_flags_long[variable == vars[i]]
  p$x$data[[i]]$text <- paste(
    "Period: ",
    d$fiscal_year,
    "</br>Red Flag Indicator: ",
    names[i],
    "</br>Percent of Occurrences: ",
    paste0(round(d$value * 100, 0), "%"))
}
p[["x"]][["layout"]][["annotations"]][[1]][["text"]] <- "Red Flag"

# Add "Source: New Constructs" to bottom right
p <- 
  p %>% plotly::layout(
    hoverlabel = list(align = "left"),
    annotations =
      list(
        x = 1.05,
        y = -0.10,
        text = "Source: New Constructs",
        showarrow = FALSE,
        xref = 'paper',
        yref = 'paper',
        xanchor = 'right',
        yanchor = 'auto',
        xshift = 0,
        yshift = 0,
        font = list(
          size = 12,
          color = "darkgray")
      )
  )

Figure 1: Most Red Flag Percentages Peaked in the Early 2000s

Ex-Post Return Data

In order to calculate ex post quarterly returns, we tried to find matching weekly prices for every company in the New Constructs database using the R {BatchGetSymbols} package, a wrapper for {quantmod} that is useful when more than a few hundred tickers are needed. Both packages source prices from Yahoo Finance by default, and provided price histories for almost 4,000 requested companies. About 1,700 tickers, generally defunct since the earlier periods of the series, were not available in Yahoo Finance, but we were able to recover an additional ~800 of the missing price histories using Alpha Vantage (the main pricing alternative to Yahoo Finance offered by {quantmod}), leaving ~900 companies unmatched.

Yahoo maintains “adjusted prices” (i.e., adjusted for splits, dividends and other corporate actions), which we used when available, while Alpha Vantage only offered un-adjusted closing prices. Although we assume that many of the companies missing from Yahoo must have gotten into difficulty and been de-listed, some might have been subsumed into other companies, possibly at a premium. If a stock’s price went to zero or it was otherwise de-listed from trading, we think the fact that the price data was not adjusted might be less relevant, because the company probably wasn’t paying dividends, spinning off subsidiaries or successfully completing a rights offering. Still, the ~800 companies currently using closing prices from Alpha Vantage may cause some inaccuracy in our return estimates.

When we matched companies with returns, we used a “rolling join” on ticker and date, taking the first weekly price on or after the filing date to simulate purchasing after the number of “red flags” was known. For this reason, the return calculated might start at the end of the same day as the report, or 1-4 days after the filing date. As a result, our relative return estimate should tend to be a conservative approximation of the ex-post returns, because it will not include the returns on the day of the report after the release (unless the report was after hours), and often not for several days after dissemination. When we instead calculated using the last weekly price before the report date, the overall shape of the relative returns didn’t look much different (i.e., a higher number of “red flags” was associated with lower relative returns), especially when looking out several quarters.
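
The matching logic can be sketched in base R on toy data. The tickers, dates and prices below are hypothetical, and the helper `match_next_price()` is our own illustrative name. In {data.table} the same match is usually written as a rolling join with `roll = -Inf` (next observation carried backward), but the loop below avoids any package dependency.

```r
# Hypothetical weekly closing prices (Fridays) for two tickers
prices <- data.frame(
  ticker = rep(c("AAA", "BBB"), each = 4),
  date = rep(as.Date(c("2020-02-28", "2020-03-06",
                       "2020-03-13", "2020-03-20")), 2),
  price = c(10, 11, 12, 13, 50, 48, 47, 45)
)

# Hypothetical filing dates
filings <- data.frame(
  ticker = c("AAA", "BBB"),
  filing_date = as.Date(c("2020-03-03", "2020-03-13"))
)

# For one filing, pick the first weekly price on or after the filing date
match_next_price <- function(i) {
  p <- prices[prices$ticker == filings$ticker[i] &
                prices$date >= filings$filing_date[i], ]
  p <- p[order(p$date), ]
  if (nrow(p) == 0) {
    return(data.frame(price_date = as.Date(NA), price = NA_real_))
  }
  data.frame(price_date = p$date[1], price = p$price[1])
}

matched <- cbind(
  filings,
  do.call(rbind, lapply(seq_len(nrow(filings)), match_next_price))
)

# Gap between the filing and the price actually used
matched$days_after <- as.integer(matched$price_date - matched$filing_date)
matched
```

Here the AAA filing on a Tuesday picks up the following Friday's price, while the BBB filing that falls on a price date uses that same day's close. These are the two cases described above.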

Click to see R table code
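# Make datatable object
dt <-
  DT::datatable(
    annual_data[, {
      coverage = .N
      # Assumption: a filing counts as matched when its 6-quarter relative
      # return is present; the original condition is missing here
      matched = .SD[!is.na(rel_ret_q_6), .N]
      percent_matched = matched / coverage
      list(coverage, matched, percent_matched)
    },
    by = fiscal_year],
    rownames = FALSE,
    colnames =
      c("Fiscal Year",
        "New Constructs Coverage",
        "Matched with Returns",
        "Percent Matched"),
    options =
      list(pageLength = 24,
           scrollY = "400px",
           dom = 't')) %>%
  DT::formatPercentage(
    columns = 4,
    digits = 1) %>%
  DT::formatRound(
    columns = c(2:3),
    mark = ",",
    digits = 0)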

Figure 2: New Constructs Covered Companies Matched with Returns Lower in Earlier Period

Figure 2 above shows the percentage of companies matched with returns over time, similar to Figure 1 earlier. We were able to download and match returns for between 75% and 80% of the stocks covered by New Constructs in the earlier periods, and for a much higher share in the later years. In the end, we have return data to go along with filings for almost 5,000 distinct companies, but were unable to match approximately 8,400 of the 67,000 annual reports, belonging to 913 companies. While we do have all the needed return data for many 2020 filings, the absolute number is lower because of the need to look ahead to calculate returns. We had significantly greater success matching quarterly reports, because those only start in a later period (around 2012, when New Constructs began providing them) for which we had more complete pricing data.

Click to see R plot code
# Select cols with relative return data
cols <- 
    names(annual_data)[re2::re2_detect(names(annual_data), "rel_ret")]

# Melt on relative return amount columns
annual_data_long <-
  annual_data[, data.table::melt(
    .SD,
    measure.vars = cols,
    value.name = "rel_ret_amt",
    variable.name = "rel_ret_pd",
    na.rm = TRUE,
    variable.factor = FALSE,
    value.factor = FALSE
  )]

# Using only 6 quarter subsequent returns, calculate median and mean
# by fiscal year and red flag count, capping medians at +/-50%
rel_ret_summary <- annual_data_long[
  data.table::between(fiscal_year, 1999, 2020) &
    rel_ret_pd == "rel_ret_q_6",
  .(
    cases = .N,
    unique_companies = length(unique(ticker)),
    median_rel_return = sapply(.SD, median, na.rm = TRUE),
    mean_rel_return = sapply(.SD, mean, na.rm = TRUE)
  ),
  .SDcols = "rel_ret_amt",
  by = .(fiscal_year, total_flags)
  ][, median_rel_return :=
        data.table::fifelse(
          median_rel_return < -0.5, -0.5, median_rel_return)
  ][, median_rel_return :=
        data.table::fifelse(
          median_rel_return > 0.5, 0.5, median_rel_return)]

# Make ggplot of median relative return by number of red flags
p <- 
  ggplot2::ggplot(
    rel_ret_summary,
    ggplot2::aes(
      x = fiscal_year,
      y = median_rel_return,
      group = factor(total_flags),
      color = factor(total_flags),
      text = paste0(
        "</br>Reporting Period: ",
        fiscal_year,
        "</br>Total Flags: ",
        format(total_flags, big.mark = ","),
        "</br>Unique Companies: ",
        format(unique_companies, big.mark = ","),
        "</br>Cases: ",
        format(cases, big.mark = ","),
        "</br>Median Relative: ",
        scales::percent(median_rel_return, accuracy = 0.1),
        "</br>Mean Relative: ",
        scales::percent(mean_rel_return, accuracy = 0.1)
      )
    )
  ) +
  ggplot2::geom_line() +
  ggplot2::geom_point(size = 1) +
  ggplot2::scale_x_continuous(
    "Fiscal Year", 
    breaks = seq(2000, 2020, 5)) +
  ggplot2::scale_y_continuous(
    "Median Percent Change", 
    labels = scales::percent) +
  # ggplot2::scale_color_manual(labels = as.character(c(1:9)), values = c(1:9)) +
  ggplot2::labs(
    title = "",
    color = 'Num.\nFlags',
    caption = "Source: New Constructs") +
  ggplot2::theme_bw() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(
      size = 10,
      face = "italic",
      color = "darkgray"))

# Render as plotly with tooltips set in ggplot object
p <- 
  plotly::ggplotly(p, tooltip = c("text")) %>% 
  plotly::layout(
    hoverlabel = list(align = "left"),
    annotations =
      list(
        x = 1.05,
        y = -0.10,
        text = "Source: New Constructs",
        showarrow = FALSE,
        xref = 'paper',
        yref = 'paper',
        xanchor = 'right',
        yanchor = 'auto',
        xshift = 0,
        yshift = 0,
        font = list(
          size = 12,
          color = "darkgray")
      )
  )
