Cars in Netherlands
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I am looking for a new car, so when I saw that Statistics Netherlands had updated its vehicle data, I just had to go and look. I learned three things: brown is getting more popular; the number of cars from a given build year is often larger at six years of age than at five; and lighter cars are getting more popular, especially in the recent crisis years.
Color of Cars
The data I downloaded contains the number of cars by color, build year and reference date. Unfortunately it is all in Dutch, but I translated the relevant parts. The data actually contains 14 colors, including 'other', but some of the colors were so infrequent that they made the plots confusing, so I merged those into 'other'. To plot what was sold in a given year, I took the reference date of 1 January in the year after building.
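The two preparation steps described here, folding rare colors into 'other' and keeping only the 1 January snapshot after the build year, can be sketched on a toy table. This is a minimal illustration, not the code used for the plots: the counts, the colors treated as rare, and the column name `Count` are all invented.

```r
# Toy data: counts by colour, build year and reference year (all values invented)
cars <- data.frame(
  Colour    = c("Blue", "Pink", "Orange", "Blue", "Pink", "Orange"),
  BuildYear = c(2010, 2010, 2010, 2010, 2010, 2010),
  RefYear   = c(2011, 2011, 2011, 2012, 2012, 2012),
  Count     = c(500, 3, 2, 490, 3, 2),
  stringsAsFactors = FALSE
)

# Fold infrequent colours into "Other" (which colours count as rare is an assumption)
rare <- c("Pink", "Orange")
cars$Colour <- ifelse(cars$Colour %in% rare, "Other", cars$Colour)

# Sum the counts that now share a colour label
totals <- aggregate(Count ~ Colour + BuildYear + RefYear, data = cars, FUN = sum)

# Keep the snapshot taken on 1 January of the year after building
sold <- totals[totals$RefYear == totals$BuildYear + 1, ]
print(sold)
```

After the merge, the two rare colors appear as a single 'Other' row, and only the 2011 snapshot of the 2010 build year remains.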
The % change in the number of cars is interesting. I was expecting that black cars might be in more accidents, because they are less visible. What I did see is a marked decrease in white cars, especially at the beginning of the previous decade. There is also a marked increase in cars of about 5 or 6 years old. Finally, at about ten years of age the cars start disappearing. The marked increase at 5 or 6 years may be explained by imports of older cars; I sometimes read or hear that they are bought in Germany. The Dutch tax on imported cars is fairly high, so it is attractive to import a cheaper second-hand car, which carries significantly less tax. For example, Athlon Car Lease Germany has a Dutch-language site especially for this purpose.
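The year-on-year % change can be computed by joining each count to the count for the same build year one reference year earlier. Below is a small sketch of that idea on invented numbers; the column names `Count` and `LastYearCount` are illustrative only.

```r
# Toy counts for one build year at three reference dates (values invented)
counts <- data.frame(
  BuildYear = c(2005, 2005, 2005),
  RefYear   = c(2010, 2011, 2012),
  Count     = c(1000, 1050, 1029)
)

# Copy of the table shifted one year forward: last year's count, keyed to this year
previous <- counts
previous$RefYear <- previous$RefYear + 1
names(previous)[names(previous) == "Count"] <- "LastYearCount"

# merge() joins on the shared columns (BuildYear, RefYear);
# the first reference year has no predecessor and drops out
change <- merge(counts, previous)
change$Pchange <- 100 * (change$Count - change$LastYearCount) / change$LastYearCount
change$Age <- change$RefYear - change$BuildYear
print(change)
```

In this toy table the count rises by 5% at age 6 and falls by 2% at age 7. A rise in the number of cars from a fixed build year can only come from cars entering the registry later, which is why imports are a plausible explanation.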
Fuel
Weight
R code.
library(ggplot2)
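# stringsAsFactors=TRUE keeps the pre-4.0 default; indexing by Onderwerpen_2 below needs a factor
col1 <- read.csv2('Motorvoertuigen__per_260513101348.csv',na.strings='-',stringsAsFactors=TRUE)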
col2 <- col1[!is.na(col1$Waarde),]
col2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(col2$Bouwjaren)))
col2$RefYear <- as.numeric(sub(', 1 januari','',as.character(col2$Peildatum)))
col2$Colour <- factor( c("Beige", "Blue", "Brown", "Other", "Yellow",
                         "Grey", "Green", "Other", "Other",
                         "Other", "Red", "Other", "White", "Black")[col2$Onderwerpen_2])
col3 <- aggregate(col2$Waarde,list(Colour=col2$Colour,BuildYear=col2$BuildYear,
RefYear=col2$RefYear),sum)
col4 <- col3[col3$RefYear==col3$BuildYear+1,]
colourcode <- c("#C8AD7F", "Black","Blue", "Brown", "Green" ,
                "Grey", "Purple", "Red", "White", "Yellow")
# Cars by colour in the year after building, on a log scale
png('col1.png')
p <- ggplot(col4, aes(x=BuildYear, y=x, colour=Colour))
p + geom_line() +
    scale_colour_manual(values=colourcode) +
    scale_y_log10("Number of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2))
dev.off()
##############
lastyeardata <- col3[,c('x','BuildYear','Colour','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='x'] <- 'LastYearAmount'
change <- merge(x=col3,y=lastyeardata)
change$Pchange <- with(change,100*(x-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
# % change by age, one panel per build year
png('col2.png')
p <- ggplot(change[change$BuildYear<2010,], aes(x=Age, y=Pchange, colour=Colour))
p + geom_line() +
    scale_colour_manual(values=colourcode) +
    scale_y_continuous("% Change in Number of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()
############
fuel1 <- read.csv2('Motorvoertuigen__per_010613135050.csv',na.strings='-')
fuel2 <- fuel1[!is.na(fuel1$Waarde),]
fuel2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(fuel2$Bouwjaren)))
fuel2$RefYear <- as.numeric(sub(', 1 januari','',as.character(fuel2$Peildatum)))
fuel4 <- fuel2[fuel2$RefYear==fuel2$BuildYear+1,]
# Cars by fuel type in the year after building
png('fuel1.png')
p <- ggplot(fuel4, aes(x=BuildYear, y=Waarde, colour=Onderwerpen_2))
p + geom_line() +
    scale_y_continuous("Number of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2)) +
    labs(colour="Fuel")
dev.off()
##
lastyeardata <- fuel2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=fuel2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
# % change by age and fuel type, one panel per build year
png('fuel2.png')
p <- ggplot(change[change$BuildYear<2010,],
            aes(x=Age, y=Pchange, colour=Onderwerpen_2))
p + geom_line() +
    scale_y_continuous("% Change in Number of vehicles") +
    facet_wrap(~BuildYear,nrow=2) +
    labs(colour="Fuel")
dev.off()
##############
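# stringsAsFactors=TRUE keeps the pre-4.0 default; levels() below needs a factor
weight1 <- read.csv2('Motorvoertuigen__per_010613140907.csv',na.strings='-',stringsAsFactors=TRUE)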
weight2 <- weight1[!is.na(weight1$Waarde),]
weight2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(weight2$Bouwjaren)))
weight2$RefYear <- as.numeric(sub(', 1 januari','',as.character(weight2$Peildatum)))
weightcats <- levels(weight2$Onderwerpen_2)
weightcats <- gsub('en meer','and more',weightcats)
levels(weight2$Onderwerpen_2) <- weightcats
lweightcats <- as.numeric(gsub('( |-).*$','',weightcats))
weight2$lweight <- lweightcats[weight2$Onderwerpen_2]
weightcats <- weightcats[order(lweightcats)]
weight2$WeightCat <- factor(weight2$Onderwerpen_2,levels=weightcats)
weight4 <- weight2[weight2$RefYear==weight2$BuildYear+1,]
# Weight distribution per build year
png('weight1.png')
p <- ggplot(weight4, aes(x=lweight, y=Waarde, colour=factor(BuildYear)))
p + geom_line() +
    scale_y_continuous("Number of vehicles") +
    labs(colour='Build Year')
dev.off()
# Number of cars over build years, one panel per weight class
png('weight2.png')
p <- ggplot(weight4[weight4$lweight>600 & weight4$lweight<1800,], aes(x=BuildYear,y=Waarde))
p + geom_line() +
    scale_y_continuous("Number of vehicles") +
    facet_wrap(~WeightCat)
dev.off()
##
lastyeardata <- weight2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=weight2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
# % change by age and weight class, one panel per build year
png('weight3.png')
p <- ggplot(change[change$lweight>600 & change$lweight<2200 & change$BuildYear<2010,],
            aes(x=Age, y=Pchange, colour=WeightCat))
p + geom_line() +
    scale_y_continuous("% Change in Number of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()
#
png('weight4.png')
p <- ggplot(change[change$lweight>600 & change$lweight<1200,],
            aes(x=RefYear, y=Pchange, colour=factor(Age)))
p + geom_line() +
    scale_y_continuous("% Change in Number of vehicles") +
    facet_wrap(~WeightCat)
dev.off()
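The least obvious step in the weight section is how the weight classes get a numeric value and a sensible order. The regular expression cuts each label at its first space or hyphen, which leaves the lower bound of the class. A small sketch, using hypothetical labels since the exact labels in the download are not shown here:

```r
# Hypothetical weight-class labels; the real ones in the download may differ
weightcats <- c("1151-1250 kg", "951-1050 kg", "2451 kg and more", "0-450 kg")

# Drop everything from the first space or hyphen onward, leaving the lower bound
lower <- as.numeric(gsub("( |-).*$", "", weightcats))

# Sort the labels numerically rather than alphabetically
ordered_cats <- weightcats[order(lower)]
print(ordered_cats)
```

Passing the ordered labels as `levels` to `factor()` then makes the facets appear from light to heavy instead of in alphabetical order.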