Introduction to Redwall Analytics Nutmeg Open Data Project

[This article was first published on R on Redwall Analytics, and kindly contributed to R-bloggers].


Redwall Analytics has spent the last couple of months gathering data from the State of Connecticut’s relatively new “open data initiative”. In recent months, we have made several posts on LinkedIn describing our explorations and looking at some of the financial challenges faced by our home state: Quick Review of State of Connecticut Spending in the R Stats Language and Independent Confirmation of the Yankee Institute Study on Gold Coast Real Estate Assessments. The latter post led to the creation of this Shiny App for anyone who would like to look at patterns of real estate assessment ratios in their own towns: Analysis Comparing Connecticut Real Estate Assessments over Three Revaluation Cycles. In general, the pattern of under-assessing higher relative to lower valued homes is evident across Connecticut and in Fairfield County, though less so than in other towns.

This will be the first in a series of posts further exploring Connecticut’s finances at the federal, state and municipal levels, and the effect on real estate. The causes of our State’s recent travails are complex, and have built up over decades under both parties. With the shrinking of the financial sector, the recovery in the largest cities at the expense of the suburbs and the rise of the digital economy benefiting other places more, it all came to a head after the Great Recession. The ultimate objective will be to create a platform we will call the “Nutmeg Project” for accessing and linking to other sources of data, such as real estate prices, in order to look for patterns which may not have been considered previously. The code to access this data using the R Statistical Programming Language can be found at the GitHub link on our home page. Any contributions of ideas or skills for making the platform more accessible are welcome.

Below is a quick introduction to the data available at CTdata and Transparency CT. Although the State is to be commended for its effort to make public data available since 2014, the current disclosures are still not well organized or annotated. This project is intended to find and help reduce the friction in accessing the data, and to look for insights which might otherwise be outside of the paths of most busy people. As with all research by Redwall Analytics, the orientation will be non-partisan and open-minded with respect to any insights which may be found.

State Employee Compensation Expense for Fiscal 2010-2018 from Transparency CT

Compensation data from Transparency CT began to include fringe benefits in 2017 as shown in Figure 1. This data came from a download a few months ago, but it appears that data about fringe benefits has subsequently been removed without explanation. This is unfortunate given that these items represent one-third of employee compensation and are at the root of the State’s current challenges. It is also not clear why separate disclosures from CTdata (just below) seem to show differing numbers of employees and amounts of compensation.

Figure 1: Selected Salary & Benefits Items from Transparency CT

State Employee Compensation Expense for Fiscal 2015-2019 Conflicts with CTdata

Figure 2 shows a shorter period and has richer variables including the employees’ age, job description, ethnicity, sex, full/part time, hire date, location, union and agency. It also discloses more categories of compensation. The full dataset includes over 102k employees during the full year of 2017 at a slightly lower average total compensation of $75k (though greater than the average amount disclosed by Transparency CT). If it is filtered for only full time employees as shown below, the number of unique employees falls to 60k, but the average total compensation rises close to $120k. As a general rule, the data is often loaded onto the websites without dictionaries or detailed explanation of context, so it can be challenging to understand.

Figure 2: Selected Salary & Benefits Items from CTdata
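The full-time filter described above can be sketched in a few lines. The project's own code is in R; this Python version is only an illustration, and the rows and column names (`employee_id`, `full_time`, `total_comp`) are hypothetical stand-ins rather than the actual CTdata schema.

```python
from statistics import mean

# Hypothetical rows; column names and amounts are assumptions, not CTdata's schema.
rows = [
    {"employee_id": "A1", "year": 2017, "full_time": True, "total_comp": 118_000.0},
    {"employee_id": "A2", "year": 2017, "full_time": False, "total_comp": 21_000.0},
    {"employee_id": "A3", "year": 2017, "full_time": True, "total_comp": 122_000.0},
    {"employee_id": "A2", "year": 2017, "full_time": False, "total_comp": 9_000.0},
]


def summarize(rows, year, full_time_only=False):
    """Return (number of unique employees, average total compensation per employee)."""
    totals = {}
    for r in rows:
        if r["year"] != year:
            continue
        if full_time_only and not r["full_time"]:
            continue
        # An employee may appear on several rows, so sum per employee first.
        totals[r["employee_id"]] = totals.get(r["employee_id"], 0.0) + r["total_comp"]
    if not totals:
        return 0, 0.0
    return len(totals), mean(totals.values())


all_n, all_avg = summarize(rows, 2017)
ft_n, ft_avg = summarize(rows, 2017, full_time_only=True)
print(f"All employees:  n={all_n}, average=${all_avg:,.0f}")
print(f"Full time only: n={ft_n}, average=${ft_avg:,.0f}")
```

Summing per employee before averaging matters when one person appears on several rows; averaging raw rows instead would understate compensation per head.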

Pensions for State Retirees since 2010 from Transparency CT

Figure 3 shows all pension data since 2010. The pension disclosures include employee name, year and total amount, and are believed to cover only State retirees (SERS) and not municipal teachers. As a general rule, our opinion is that the employee name doesn’t need to be disclosed, but the grade, years of service, job function and other employee attributes such as sex, age and ethnicity should be disclosed in any of the compensation and pension payment disclosures.

Figure 3: Pensions for Retirees Other than Municipal Teachers from Transparency CT

State Spending Database Since 2010

The spending database of almost 1 million rows has every payment made by category since 2010. The largest categories are human resources, finance, transportation, and operations. Figure 4 shows the data filtered by the largest category, human resources, which had increased to $13.8 billion from just under $10 billion in 2010. Salaries & wages had barely budged, but 75% of the increase was attributed to pension items (SERS & Pension Payments to Retirees), which together reached almost half of all human resource related expenses.

Figure 4: Connecticut Expense Items Pertaining to Employees (Amounts in $ Millions)
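As a rough check on the scale involved, the arithmetic behind the 75% figure can be worked through directly. This is a Python sketch rather than the project's R code, and it rounds "just under $10 billion" to 10.0, which is an assumption; the result is therefore approximate, on the order of $2.85 billion of a roughly $3.8 billion increase.

```python
# Figures quoted in the text, in $ billions. The 2010 value is rounded up
# from "just under $10 billion", so treat the results as approximate.
hr_2010 = 10.0
hr_latest = 13.8
pension_share_of_increase = 0.75

increase = hr_latest - hr_2010
pension_increase = pension_share_of_increase * increase

print(f"Total increase in human resources spending: ${increase:.2f}B")
print(f"Portion attributed to pension items:        ${pension_increase:.2f}B")
```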

State Payments Data Show the Nuts and Bolts of Government Operations

Figure 5 is a summary of over 3 million payments filtered for items over $100 million. Payments to the Department of Social Services have always been the largest item, and have grown at a rapid rate (although this item appears to have been reclassified from the grants database (below), so the growth here may be overstated). Transportation and Finance have also been large and gained in significance. The consolidation of the state universities into the Board of Regents in 2015 is evident. Grants to the University of Connecticut have more than doubled.

Figure 5: Selected State of Connecticut Total Payments > $100M (Amounts in $ Millions)
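One way to produce a summary like this is to total the individual payments by department and year, then keep only the totals above the threshold. The Python sketch below assumes the $100 million cutoff applies to the department-year total rather than to single payments, and the records are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 100_000_000  # $100 million

# Hypothetical payment records: (department, fiscal year, amount in dollars).
payments = [
    ("Department of Social Services", 2017, 90_000_000),
    ("Department of Social Services", 2017, 60_000_000),
    ("Department of Transportation", 2017, 120_000_000),
    ("Small Agency", 2017, 5_000_000),
]


def totals_over(payments, threshold=THRESHOLD):
    """Sum payments by (department, year); return totals above threshold in $ millions."""
    totals = defaultdict(int)
    for dept, year, amount in payments:
        totals[(dept, year)] += amount
    return {key: value / 1_000_000 for key, value in totals.items() if value > threshold}


large = totals_over(payments)
for (dept, year), millions in sorted(large.items()):
    print(f"{year}  {dept:<32} {millions:>8.1f}")
```

Neither Social Services payment exceeds the cutoff on its own here, which is why the aggregation has to happen before the filter.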

State Grants Show the Rise in Aid to Troubled Cities

State Grants shown in Figure 6 are tax dollars transferred back to municipalities from Hartford. Grants rose rapidly until 2012, and then fell back sharply. Fiscal troubles have led to grants growing rapidly in Hartford and the Capital Region Education Council, New Haven and Bridgeport while declining in most other places. In an upcoming post, this blog will look at the Municipal Fiscal Indicators, and some of the challenges will become clear in Hartford and New Haven, where half the Grand Lists are exempt from taxation. The wealthier towns of Fairfield County used to receive considerable Inter-Governmental transfers, but now receive small direct transfers, although the state does pick up the full tab for teacher pensions in proportion to salaries.

Figure 6: Selected Connecticut Total Grant Items >$100M (Amounts in $ Millions)

Federal Grants Have Grown Considerably Primarily Driven By Medicaid

If the growth in the Medicaid Federal Share, which commenced with the Affordable Care Act, were removed, Federal Grants would have been flat as shown in Figure 7. Most other items have remained relatively stable.

Figure 7: Selected Federal Total Grant Items to Connecticut > $100M (Amounts in $ Millions)
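The "flat without Medicaid" comparison amounts to totaling each year's grants with and without one line item. A minimal Python sketch follows; the years, item names other than the Medicaid Federal Share, and all amounts are hypothetical placeholders, not figures from the grants database.

```python
# Hypothetical totals in $ millions; placeholders only, not actual figures.
federal_grants = {
    2013: {"Medicaid Federal Share": 2_000, "Other": 3_000},
    2017: {"Medicaid Federal Share": 3_500, "Other": 3_050},
}


def total_excluding(grants_by_year, excluded_item):
    """Return each year's grant total with one named item left out."""
    return {
        year: sum(amount for item, amount in items.items() if item != excluded_item)
        for year, items in grants_by_year.items()
    }


with_medicaid = {year: sum(items.values()) for year, items in federal_grants.items()}
ex_medicaid = total_excluding(federal_grants, "Medicaid Federal Share")

for year in sorted(federal_grants):
    print(f"{year}: total {with_medicaid[year]:,}  ex-Medicaid {ex_medicaid[year]:,}")
```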


This has been a less exciting introduction to the project and some of the data which will be used. In the next post, we will look at the Municipal Fiscal Indicators, a statewide compilation of each town’s Comprehensive Annual Financial Report (CAFR), and specifically compare the towns of Fairfield County since 2001. Using this data, it is theoretically possible to compare all 169 towns across 58 variables with a few lines of code, and to look for patterns suggesting success or challenge. After that, we will look at the IRS Statistics of Income (SOI) data, which compiles every line item of tax returns filed by zip code, for Connecticut specifically and in comparison to every other state. Using this data, it is possible to calculate the full tax rate and total tax borne by six income segments. It is also possible to compare the evolution of income in Connecticut to other states.
