**Wiekvoet**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Two weeks ago I described car data, among which weight distribution of cars in Netherlands. At that time it was purely plots. In the mean time I decided I wanted to model trends. As a first step of that, I decided to fit distributions for these data. I soon found out that most of the tools seem to be organized around individual objects as units. So, the 22 categories of weight might be extended to 100 000 records or so, one for each car. That is clearly not convenient. So, I brewed a MCMC algorithm for the Bayesian estimation of the distribution without all that. As I did most of this while on vacation, I had minimum access to books/internet, so it may be a bit clunky.

### Data reading

Nothing new here, except I added manually a formal endpoint for the final weight category; 3500 kg. This number chosen because this is the maximum weight you may drive with a normal drivers license here.

weight1 <- read.csv2(‘Motorvoertuigen__per_010613140907.csv’,na.strings=’-‘)

weight2 <- weight1[!is.na(weight1$Waarde),]

weight2$BuildYear <- as.numeric(sub(‘Bouwjaar ‘,”,as.character(weight2$Bouwjaren)))

weight2$RefYear <- as.numeric(sub(‘, 1 januari’,”,as.character(weight2$Peildatum)))

weightcats <- levels(weight2$Onderwerpen_2)

weightcats <- gsub(‘en meer’,’and more’,weightcats)

levels(weight2$Onderwerpen_2) <- weightcats

lweightcats <- as.numeric(gsub(‘( |-).*$’,”,weightcats))

uweightcats <- as.numeric(gsub(‘(^[0-9]+-)|([ [:alpha:]]+$)’,”,weightcats))

uweightcats[uweightcats==2451] <- 3500

weight3 <- weight2[weight2$RefYear==2001,c(2,6)]

weight3$lower <- lweightcats[weight3$Onderwerpen_2]

weight3$upper <- uweightcats[weight3$Onderwerpen_2]

weight3

Onderwerpen_2 Waarde lower upper

14 0-450 kg 95 0 450

196 451-550 kg 33 451 550

378 551-650 kg 262 551 650

560 651-750 kg 19950 651 750

742 751-850 kg 58752 751 850

924 851-950 kg 81650 851 950

1106 951-1050 kg 57738 951 1050

1288 1051-1150 kg 91793 1051 1150

1470 1151-1250 kg 104933 1151 1250

1652 1251-1350 kg 89506 1251 1350

1834 1351-1450 kg 30814 1351 1450

2016 1451-1550 kg 19517 1451 1550

2198 1551-1650 kg 13828 1551 1650

2380 1651-1750 kg 5512 1651 1750

2562 1751-1850 kg 4131 1751 1850

2744 1851-1950 kg 1558 1851 1950

2926 1951-2050 kg 1769 1951 2050

3108 2051-2150 kg 430 2051 2150

3290 2151-2250 kg 205 2151 2250

3472 2251-2350 kg 89 2251 2350

3654 2351-2450 kg 137 2351 2450

3836 2451 kg and more 631 2451 3500

### Jumping through mixture space

### The model

curmodel$mean[i],curmodel$sd[i])

–pnorm(counts$lower,

### Fitting

### Result

(pnorm(counts$upper,curmodel$mean[i],curmodel$sd[i])

c(lower,3500),data=weight3,

type=’s’,xlim=c(0,3500),ylim=c(0,1.2e5),

xlab=’Weight’,ylab=’Frequency’)

for(i in seq(1,10000,100)) {

**leave a comment**for the author, please follow the link and comment on their blog:

**Wiekvoet**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.