In this post, we will analyze government data from the Flemish region in
Belgium on A) official crime statistics and B) self-reported feelings of
safety among residents of Flanders. We will focus our analysis on two cities
in the province of Flemish Brabant: Leuven and Vilvoorde. A key question of this
analysis is: do the residents of the safer city (as measured by official
government statistics) feel safer and have more pride in their city (according
to government polling data)?
Our first set of data consists of official crime figures from the Flemish
government, obtained at this
website. I chose
to download the data adjusted for population size. Specifically, the data
record the number of reported incidents per 1000 residents. The types of crime
reported are: property damage, theft, physical violence, and overall
incidents. As of October 2018, these figures were available for the years
Subjective Feelings About the Cities
The Flemish government regularly conducts a survey (called the
“Stadsmonitor”) to monitor how the residents of Flanders
feel about many different topics, including how people feel about the city or
town where they live. Summary data are available by question on the official
survey website. In this blog post, we will analyze answers from
2017 on the following questions: insecurity, problems, vandalism and pride in
one’s city.
Open Data and Code
I’m increasingly sharing data and code for blog posts in the hopes that this
will be interesting or useful for readers. If you’d like to reproduce the
analyses done here (or explore a different question), you can find the data
and code at this
The code below reads the raw crime data file and first does some basic
cleaning. These files were only available in an Excel format, and were not
meant to be machine readable, so it takes some effort to manipulate the data
to work with them in statistical analysis programs like R.
I then produce an additional set of crime statistics for Leuven. Because
Leuven is a university city, there are a number of students who live here
during the year, but who are not officially registered as residents of the
city. Essentially, these “invisible” residents live in the city, potentially
commit or are victims of crime, but are not counted as residents and so do not
factor into the calculation of crime per 1000 residents. In my analysis below,
I’m assuming that there are an additional 30% “invisible” residents, and
therefore divide all of the crime figures by 1.3. I call these data “Leuven
student assumptions” because they are adjustments to the actual figures, and
not the official statistics themselves. The 30% figure comes from this
article (in Dutch), in
which the former mayor gives an estimate of the number of “invisible” student
residents.
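The adjustment itself is a single division. As a minimal sketch, here it is applied to the official Leuven rates for 2000, with the 30% share treated as an assumption; the object names are chosen for illustration only.

```r
# Official Leuven rates for 2000 (incidents per 1,000 registered residents)
leuven_2000 <- c(
  property_damage   = 14.82,
  theft             = 54.43,
  physical_violence = 6.17,
  overall           = 132.81
)

# Assumption: 30% additional "invisible" residents on top of those registered
invisible_share <- 0.30

# More residents in the denominator means a lower rate per 1,000
leuven_student_assumptions <- leuven_2000 / (1 + invisible_share)

round(leuven_student_assumptions, 2)
```

Because 1 / 1.3 is roughly 0.77, the adjustment lowers every rate by about 23%.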
I then convert the data from its original wide format to a long format, as
ggplot2 requires data in the
long format for plotting. The code to read the data, produce the adjustment
for students, and transform the data is below:
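To illustrate what the wide-to-long step does, here is a minimal sketch using `pivot_longer()` from tidyr (version 1.0.0 or later). The wide layout shown, with one column per crime type, is an assumption made for this example, and the real Excel export may be organised differently. The values are the 2000 rates for Leuven and Vilvoorde.

```r
library(tidyr)

# Hypothetical wide layout: one row per city, one column per crime type.
# Values are the 2000 rates per 1,000 residents for the two cities.
crime_wide <- data.frame(
  city              = c("Leuven", "Vilvoorde"),
  year              = c(2000, 2000),
  property_damage   = c(14.82, 8.47),
  theft             = c(54.43, 44.95),
  physical_violence = c(6.17, 6.08),
  overall           = c(132.81, 88.22)
)

# Stack the four crime columns into name/value pairs, giving one row
# per city, year and crime type
crime_long_sketch <- pivot_longer(
  crime_wide,
  cols      = c(property_damage, theft, physical_violence, overall),
  names_to  = "crime_type",
  values_to = "rate_per_1000"
)

crime_long_sketch
```

In the long version, a single `crime_type` column can be used to filter or facet the plots, instead of referring to four separate columns.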
The code produces a dataset (crime_long) that looks like this (only the first
10 rows shown):

| Row | City | Crime type | Year | Incidents per 1,000 residents |
|---|---|---|---|---|
| 1 | Leuven | Beschadigen van eigendom (aantal per 1.000 inw.) | 2000 | 14.82 |
| 2 | Leuven | Diefstal en afpersing (aantal per 1.000 inw.) | 2000 | 54.43 |
| 3 | Leuven | Misdr. tegen de lichamelijke integriteit (aantal per 1.000 inw.) | 2000 | 6.17 |
| 4 | Leuven | Criminaliteitsgraad (aantal per 1.000 inw.) | 2000 | 132.81 |
| 5 | Vilvoorde | Beschadigen van eigendom (aantal per 1.000 inw.) | 2000 | 8.47 |
| 6 | Vilvoorde | Diefstal en afpersing (aantal per 1.000 inw.) | 2000 | 44.95 |
| 7 | Vilvoorde | Misdr. tegen de lichamelijke integriteit (aantal per 1.000 inw.) | 2000 | 6.08 |
| 8 | Vilvoorde | Criminaliteitsgraad (aantal per 1.000 inw.) | 2000 | 88.22 |
| 9 | Vlaams Gewest (Flemish Region) | Beschadigen van eigendom (aantal per 1.000 inw.) | 2000 | 9.36 |
| 10 | Vlaams Gewest (Flemish Region) | Diefstal en afpersing (aantal per 1.000 inw.) | 2000 | 37.12 |
Note that we also have data on the entire Flemish region (Vlaams Gewest),
which is a nice point of comparison: the average across the whole territory.
The descriptions are all in Dutch, but we’ll use English translations for
the plots.
The code below reads in the datasets (there is 1 dataset per question, and
each dataset contains summary statistics from many different cities – here we
focus on just Leuven and Vilvoorde). Unfortunately, there are slight
differences in the formatting of the files between the questions, so I wrote a
different function to read in every dataset. The functions read the data,
select the observations from Leuven and Vilvoorde in 2017, and make English
language column names (the original names are all in Dutch).
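The general shape of such a reader function can be sketched in base R. Everything named here is hypothetical: the Dutch column names, the `clean_survey()` helper and the extra placeholder row are invented for illustration and need not match the real Stadsmonitor files. The two percentages are the shares of respondents who often or always feel unsafe in their city.

```r
# Hypothetical raw survey extract; the Dutch column names are invented
# for this sketch and need not match the real Stadsmonitor files.
raw_safety <- data.frame(
  gemeente    = c("Leuven", "Vilvoorde", "Andere stad"),
  jaar        = c(2017, 2017, 2017),
  vaak_altijd = c(4, 22, NA),  # NA: placeholder row, no real figure
  stringsAsFactors = FALSE
)

clean_survey <- function(raw,
                         cities = c("Leuven", "Vilvoorde"),
                         survey_year = 2017) {
  # Keep only the cities and survey year of interest
  keep <- raw$gemeente %in% cities & raw$jaar == survey_year
  out <- raw[keep, , drop = FALSE]
  # Replace the Dutch column names with English ones
  names(out) <- c("city", "year", "often_or_always")
  rownames(out) <- NULL
  out
}

safety_sketch <- clean_survey(raw_safety)
safety_sketch
```

Since the file formats differ from question to question, each real reader would need its own column mapping. Only the filter-then-rename pattern carries over.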
Our functions are all executed in a single line, and return a cleaned dataset
for each of the four questions we will examine.
For illustrative purposes, the safety dataset is shown below. The answers are
in response to the following question: How often do you feel unsafe in your
neighborhood and city?
We will produce one graph for each outcome, and assemble the graphs into a
single plot. For each outcome, we will plot four different lines, displaying
the crime figures across the years for Leuven (“Leuven” in the plots below),
Leuven with our assumptions about the student population (assuming 30%
“invisible” student residents as described above, “Leuven Student Assumptions”
in the plots below), Vilvoorde, and the Vlaams Gewest (average for the entire
Flemish region).
The code below stores each single graph in an object, and uses the
gridExtra package to plot
them all in a single plot.
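The store-then-arrange pattern can be sketched as follows. The numbers are made-up placeholders, not the official figures, and the city labels, the `toy_long` data frame and the `plot_outcome()` helper are all invented for this example.

```r
library(ggplot2)
library(gridExtra)

# Made-up placeholder rates, NOT the official figures:
# two cities, two outcomes, three years, in long format
toy_long <- expand.grid(
  year    = 2000:2002,
  city    = c("City A", "City B"),
  outcome = c("theft", "overall"),
  stringsAsFactors = FALSE
)
toy_long$rate <- c(50, 48, 47, 40, 41, 39, 130, 128, 125, 90, 88, 89)

# Build one ggplot object per outcome; the object is returned, not printed
plot_outcome <- function(data, which_outcome, title) {
  ggplot(data[data$outcome == which_outcome, ],
         aes(x = year, y = rate, color = city)) +
    geom_line() +
    labs(title = title, x = "Year", y = "Incidents per 1,000 residents")
}

p_theft   <- plot_outcome(toy_long, "theft", "Theft")
p_overall <- plot_outcome(toy_long, "overall", "Overall incidents")

# grid.arrange() draws the stored plots side by side and
# invisibly returns the combined layout
combined <- grid.arrange(p_theft, p_overall, ncol = 2)
```

Keeping each graph in its own object makes it easy to adjust one panel without touching the others before they are arranged.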
The code looks like this:
Which produces the following plot:
The official crime statistics for Leuven are higher than the other points of
comparison for every outcome after 2003. Even with our assumptions about the
“invisible” residents, Leuven consistently has higher crime rates than
Vilvoorde and the Flemish region average. Vilvoorde sticks close to the
Flemish average for property damage and physical violence, but is higher than
average for theft and overall crime rates.
Questionnaire / Self-Report Data
Subjective Feelings of Safety
We will first take a look at subjective feelings of safety in one’s city and
neighborhood. We will use the
likert package to plot the
responses to the questionnaires. The likert package is great for producing
diverging stacked bar charts, a common
visualization of responses to questionnaire data. The format of the data from
the Stadsmonitor is also suited to this type of visual representation.
We first create a Likert scale object from the safety dataset we created
above. Because the same question was asked in regard to both one’s
neighborhood and one’s city, we will use a grouping in our plot. This will
allow us to display the responses to both areas with a correct sub-heading for
both neighborhood and city.
The code below creates the Likert object and produces the plot.
Leuven respondents rarely report feeling unsafe in their city (only 4% say
they feel unsafe always or often) and neighborhood (only 3%). The figures
are considerably higher for Vilvoorde. Fully 22% of respondents in Vilvoorde
say that they always or often feel unsafe in their city, and 10% say that
they always or often feel unsafe in their neighborhood.
Social Problems and Pride in One’s City
We will next examine questions about two social problems: being hassled on the
street and seeing vandalism (literally “witnessing the destruction of street
furniture” – public benches,
lamp posts, etc.). We will also look at reports of feelings of pride in one’s
city (an overall perception which is no doubt informed by feelings of safety).
We will display these three questions in a single plot, as we did above for
the crime data. We first create a Likert graph object for each question (as
the questions don’t have the same question structure and response options,
it’s better to plot each one by itself). We then use the
gridExtra package to
display all of the graphs in a single plot.
The following code creates the Likert graph objects and displays them in one
plot.
When considering the social problems questions (being hassled on the street
and seeing vandalism), the percentage of respondents in Leuven and Vilvoorde
who report that they often or always experience these things is more or
less the same. The cities distinguish themselves in the middle and lower
response categories. More residents in Vilvoorde (vs. Leuven) say that they
are sometimes hassled on the street and witness vandalism, and fewer
residents in Vilvoorde (vs. Leuven) say that they never or seldom
experience these things. In sum, residents of Leuven report fewer social and
safety-related problems than residents of Vilvoorde.
The results of the pride question are quite striking. In Leuven, 75% of the
respondents are proud of their city versus only 34% in Vilvoorde (less than
half the Leuven percentage). Only 6% of Leuven residents are not proud of
their city, while this figure is 23% in Vilvoorde.
The self-report data are clear: residents of Leuven feel safer, report fewer
social problems and are much more proud of their city when compared with
residents of Vilvoorde.
Summary and Conclusion
In this post, we examined official crime statistics and survey data on
subjective feelings of safety in two Flemish cities: Leuven and Vilvoorde.
Interestingly, the official crime statistics and survey data seem to tell two
different stories.
On the one hand, Leuven (compared to Vilvoorde) has considerably higher rates
of property damage, theft and physical violence, even when assuming that the
official statistics are slightly biased because of “invisible” students who
are not legally registered as city residents.
On the other hand, people in Leuven say that they feel much safer in their
neighborhoods and in their city. Residents of Leuven also report seeing fewer
social problems and are prouder of their city.
It’s interesting to see this disconnect between official crime statistics and
self-reported feelings of safety. I will admit that I’m not entirely sure what
to make of this. Potential explanations for this disconnect include:
- There are more “invisible” student residents than I have assumed. If this is the case, it is possible that the true population-adjusted crime figures are lower in Leuven than in Vilvoorde (although to close the gap between the cities, there would have to be a very large number of invisible residents in Leuven).
- Residents in Vilvoorde under-report crime that happens in their city. If this is the case, the actual crime figures in Vilvoorde are much higher, perhaps equal to or greater than those in Leuven (although the similarity between the Vilvoorde figures and those of the Flemish region suggests this is unlikely).
- The “types” of people who respond to the survey questions are different in Leuven vs. Vilvoorde. If this is the case, the difference in observed feelings of safety and seeing social problems reflects sampling bias, not the true aggregate feelings of city residents. However, the Stadsmonitor survey appears to have been meticulously conducted, and it strikes me that such sampling errors are unlikely.
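To put a rough number on the first explanation, the sketch below works out how large the share of unregistered residents would have to be for Leuven's adjusted overall rate to match Vilvoorde's. It uses only the overall rates for 2000, so other years will give different answers. The variable names are illustrative.

```r
# Overall rates for 2000 (incidents per 1,000 registered residents)
leuven_overall    <- 132.81
vilvoorde_overall <- 88.22

# Share of additional, unregistered residents at which Leuven's adjusted
# rate would equal Vilvoorde's: leuven / (1 + share) == vilvoorde
break_even_share <- leuven_overall / vilvoorde_overall - 1

round(break_even_share, 2)  # about 0.51
```

For 2000, Leuven would need roughly 50% more residents than are officially registered to close the gap, well above the 30% assumed in this post.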
My sense of these data is that, in some contexts, actual safety and
feelings of safety are different things. The overall crime figures in Leuven
are not very high, even though they are much higher than the average for the
Flemish region (which is, after all, a relatively well-off part of the world).
Given the relatively low crime rate overall, perhaps other factors influence
how residents feel in the place where they live.
Leuven is a wealthy city in a wealthy
province. My sense is that people have a
common identity as Leuven residents (Leuvenaars in Dutch) and are in many
ways a homogeneous group (despite a sizeable number of foreign students and
academics). Furthermore, the city is doing well economically and has a vision
for itself (the university, science and technology spinoffs, along with a
large bank and brewery company are all valuable sources of revenue for the
city and its residents and seem likely to ensure a solid future in the modern
economy).
Vilvoorde, by comparison, is much less wealthy. It is socially very
diverse, with its
residents coming from many different parts of the world. As such, it doesn’t
have a single strong source of identity like Leuven, and its diverse residents
do not always mix or know each other very well.
Given the different economic and social situations of these two cities, it is
perhaps understandable that Leuven residents feel safer than Vilvoorde
residents, even when the actual crime statistics suggest that the opposite is
true.
Coming Up Next
In the next post, we will solve a basic programming puzzle that is
(apparently) asked in data-science interviews. We will use this puzzle as a
case study for understanding similarities and differences in programming logic
and implementation of control structures in R vs. Python.
If you’ve got a better one, please let me know in the comments!