Regression model with auto correlated errors – Part 3, some astrology

Posted on January 17, 2017 by Margot Tollefson in R bloggers

[This article was first published on DataScience+, and kindly contributed to R-bloggers].

The results of the study are interesting from an astrological point of view. Astrological signs are divided into groups by their order in the zodiac. The first grouping alternates, with the first sign (Aries) positive, the next sign negative, the third positive, and so on through the circle. The second grouping is into groups of three, called the elements – fire, earth, air, and water. The first, fifth, and ninth signs are fire signs, and so on through the circle. The third grouping is into groups of four, called the qualities or modes – cardinal, fixed, and mutable. The first, fourth, seventh, and tenth signs are cardinal signs, and so on through the circle. Although I am quite certain that the months were originally coincident with the zodiac signs, the months now overlap the signs. About 2/3 of one sign and 1/3 of the next sign are in each month. The adjusted counts are used in what follows.
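The three groupings and the month overlap can be sketched as a small lookup table. The object names (`signs`, `polarity`, `element`, `quality`, `main.sign`, `months`) are illustrative and not part of the original analysis, and the month-to-sign mapping assumes the usual tropical zodiac, in which January is mostly Capricorn.

```r
# Illustrative lookup for the twelve signs in zodiac order (names are assumptions).
signs = c("Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
          "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces")
polarity = rep(c("positive", "negative"), 6)            # alternates around the circle
element  = rep(c("fire", "earth", "air", "water"), 3)   # three signs per element
quality  = rep(c("cardinal", "fixed", "mutable"), 4)    # four signs per quality

# Sign covering roughly the first 2/3 of each calendar month
# (assumption: January is mostly Capricorn, the tenth sign).
main.sign = ((1:12 + 8) %% 12) + 1
months = data.frame(month = month.name,
                    sign = signs[main.sign],
                    polarity = polarity[main.sign],
                    quality = quality[main.sign],
                    stringsAsFactors = FALSE)
months
```

Under these assumptions the even-numbered months come out mainly positive, and January, April, July, and October come out mainly cardinal, which matches the groupings used in the text.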

Averaging the adjusted counts for February, April, June, August, October, and December, the months that are 2/3 positive and 1/3 negative, gives 616.7 divorces with an estimated standard error of 15.8. Averaging the adjusted counts for January, March, May, July, September, and November, the months that are 2/3 negative and 1/3 positive, gives 661.5 divorces with an estimated standard error of 15.8. The difference between the two groupings is -44.9 with an estimated standard error of 25.8. So, positive months have fewer divorces than negative months in this data set – not what I would expect, since the positive signs are associated with action and the negative signs with reaction.

Averaging the adjusted counts for January, April, July, and October, the months that are 2/3 cardinal and 1/3 fixed, gives 627.3 divorces with an estimated standard error of 22.3. Averaging the adjusted counts for February, May, August, and November, the months that are 2/3 fixed and 1/3 mutable, gives 611.0 divorces with an estimated standard error of 21.8. Averaging the adjusted counts for March, June, September, and December, the months that are 2/3 mutable and 1/3 cardinal, gives 681.2 divorces with an estimated standard error of 22.8.

Looking at the three differences: between the first group and the third group – mainly cardinal versus mainly mutable – the difference is -53.9 with an estimated standard error of 35.8; between the first group and the second group – mainly cardinal versus mainly fixed – the difference is 16.3 with an estimated standard error of 34.4; and between the second group and the third group – mainly fixed versus mainly mutable – the difference is -70.2 with an estimated standard error of 35.3. Two of the three differences are not easily explained by random variation, and the results agree with what one would expect from astrology: cardinal and mutable signs are associated with change and adjustment, while fixed signs are associated with steadfastness.
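As a quick arithmetic check, the three differences and their ratios to the standard errors can be recomputed from the rounded group means quoted above. This is only a sketch on the published, rounded figures; the object names `m`, `d`, and `se` are illustrative.

```r
# Group means and difference standard errors as reported in the text.
m = c(cardinal = 627.3, fixed = 611.0, mutable = 681.2)
d = c(card.vs.mut = m[["cardinal"]] - m[["mutable"]],
      card.vs.fix = m[["cardinal"]] - m[["fixed"]],
      fix.vs.mut  = m[["fixed"]] - m[["mutable"]])
se = c(35.8, 34.4, 35.3)   # reported standard errors of the differences

round(d, 1)                # differences between group means
round(d / se, 2)           # ratio of each difference to its standard error
```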

The estimated covariance matrix for the 40 observations is found for the adjusted counts. We use the model mod.arima.r.1 from part 2.

First, the matrix cov.r.1 is created as a matrix of zeros. One plus the sum of the squares of the estimated coefficients of the innovations from the model is placed on the diagonal.

# Start from a 40 x 40 matrix of zeros
cov.r.1 = matrix(0,40,40)
# Diagonal: 1 plus the sum of the squared coefficients
diag(cov.r.1) = sum(c(1,mod.arima.r.1$coef[1:13]^2))

Next, the off-diagonal terms of the covariance matrix are put in the matrix, starting with the first row and column.

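# First row and first column, lags 1 to 12
for (i in 1:12) {
  cov.r.1[cbind(c(1,(i+1)),c(i+1,1))] =
    sum(c(1,mod.arima.r.1$coef[1:(13-i)])*
        mod.arima.r.1$coef[i:13])
}
# First row and first column, lag 13
cov.r.1[cbind(c(1,14),c(14,1))] =
        mod.arima.r.1$coef[13]
# Copy the first-column values along the band for rows and columns 2 to 14
for (i in 2:14) {
  cov.r.1[(i-1)+2:14, i] = cov.r.1[2:14, 1]
  cov.r.1[i, (i-1)+2:14] = cov.r.1[2:14, 1]
}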

Last, cov.r.1 is multiplied by the estimated variance of the innovations from the arima() model.

cov.r.1 = cov.r.1*mod.arima.r.1$sigma2
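For a moving-average model the covariance between two observations depends only on the lag between them, so the full 40 by 40 matrix is banded with constant diagonals. As a cross-check, the sketch below builds such a matrix with base R's toeplitz(). It assumes the first 13 coefficients of the fitted model are moving-average coefficients; `theta` and `sigma2` are made-up placeholder values, not estimates from mod.arima.r.1, and `psi`, `gamma`, and `cov.full` are illustrative names. The second loop above copies the first column into rows and columns 2 through 14 only, whereas this version fills the band in all 40 rows, so standard errors computed from the two matrices need not agree.

```r
# Placeholder MA(13) coefficients and innovation variance (assumptions for illustration).
theta  = c(0.3, rep(0, 11), -0.4)
sigma2 = 100
psi = c(1, theta)                 # weight 1 on the current innovation

# Autocovariance at lags 0 to 13, before scaling by sigma2:
# for lag k, sum the products psi[j] * psi[j + k] over the overlapping terms.
gamma = sapply(0:13, function(k) sum(psi[1:(14 - k)] * psi[(1 + k):14]))

# Banded 40 x 40 covariance matrix with constant diagonals.
cov.full = toeplitz(c(gamma, rep(0, 26))) * sigma2

cov.full[1, 1]                        # variance: sigma2 * (1 + sum(theta^2))
cov.full[1, 14]                       # lag-13 covariance: sigma2 * theta[13]
cov.full[20, 21] == cov.full[1, 2]    # same lag, same covariance
```

With real estimates, `theta` would be replaced by the fitted coefficients and `sigma2` by the model's sigma2 component.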

The average counts for five sets of months along with their standard errors are found. The matrix t.m is the matrix of multipliers for finding the means and standard errors. The five rows of t.m are for positive signs, negative signs, cardinal signs, fixed signs, and mutable signs.

First, the matrix t.m is created with five rows and filled with zeros. Row 1 selects the mainly positive months. Row 2 selects the mainly negative months. Row 3 selects the mainly cardinal months. Row 4 selects the mainly fixed months. Row 5 selects the mainly mutable months. Last, the means and standard errors are found and displayed.

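t.m = matrix(0,5,40)
t.m[1,] = rep(1:0, 20)             # mainly positive months
t.m[2,] = rep(0:1, 20)             # mainly negative months
t.m[3,] = c(rep(c(0,0,1),13),0)    # mainly cardinal months
t.m[4,] = c(rep(c(1,0,0),13),1)    # mainly fixed months
t.m[5,] = c(rep(c(0,1,0),13),0)    # mainly mutable months
# Group means: sum of the selected counts divided by the group size
m.m = t.m%*%div.a.ts/apply(t.m,1,sum)
# Standard errors of the group means
s.m = sqrt(diag(t.m%*%cov.r.1%*%t(t.m)))/
      apply(t.m,1,sum)
m.m
s.m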
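The standard errors rest on the identity Var(a'y) = a' S a for a weight vector a and covariance matrix S. A toy sketch with six observations shows the mechanics; the data `y`, the covariance `S`, and the selection matrix `w` are invented for illustration and have nothing to do with the divorce counts.

```r
# Toy data: six observations with variance 2 and lag-1 covariance 0.5 (made-up values).
y = c(10, 12, 9, 13, 11, 14)
S = toeplitz(c(2, 0.5, 0, 0, 0, 0))

# Each row of w selects a group of observations with 0/1 indicators.
w = rbind(odd = rep(c(1, 0), 3), even = rep(c(0, 1), 3))
n = rowSums(w)                            # group sizes

means = drop(w %*% y) / n                 # group means
ses = sqrt(diag(w %*% S %*% t(w))) / n    # standard errors of the group means
means
ses
```

In this toy case the selected observations are two steps apart, so none of the lag-1 covariances enter and each standard error reduces to sqrt(3 * 2) / 3.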

The four tests are done using a structure similar to the one used for the means. In the order of the rows of t.t, the four tests are positive vs negative, cardinal vs mutable, cardinal vs fixed, and fixed vs mutable.

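t.t = matrix(0,4,40)
t.t[1,] = rep(c(1/20,-1/20),20)                # positive minus negative
t.t[2,] = c(rep(c(0,-1/13,1/13), 13), 0)       # cardinal minus mutable
t.t[3,] = c(rep(c(-1/14,0,1/13), 13),-1/14)    # cardinal minus fixed
t.t[4,] = c(rep(c(1/14,-1/13,0), 13), 1/14)    # fixed minus mutable
# Differences between group means and their standard errors
m.t = t.t%*%div.a.ts
s.t = sqrt(diag(t.t%*%cov.r.1%*%t(t.t)))
m.t
s.t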
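Each row of t.t is the difference of two averaging rows built from t.m, so its weights sum to zero. The sketch below rebuilds both matrices from the definitions above and checks this; it needs no data. The names `w` and `chk` are illustrative.

```r
# Selection rows as defined for t.m above.
t.m = matrix(0, 5, 40)
t.m[1, ] = rep(1:0, 20)
t.m[2, ] = rep(0:1, 20)
t.m[3, ] = c(rep(c(0, 0, 1), 13), 0)
t.m[4, ] = c(rep(c(1, 0, 0), 13), 1)
t.m[5, ] = c(rep(c(0, 1, 0), 13), 0)
w = t.m / rowSums(t.m)            # each row now averages its group

# Contrast rows as defined for t.t above.
t.t = matrix(0, 4, 40)
t.t[1, ] = rep(c(1/20, -1/20), 20)
t.t[2, ] = c(rep(c(0, -1/13, 1/13), 13), 0)
t.t[3, ] = c(rep(c(-1/14, 0, 1/13), 13), -1/14)
t.t[4, ] = c(rep(c(1/14, -1/13, 0), 13), 1/14)

# Differences of averaging rows: positive - negative, cardinal - mutable,
# cardinal - fixed, fixed - mutable.
chk = rbind(w[1, ] - w[2, ], w[3, ] - w[5, ], w[3, ] - w[4, ], w[4, ] - w[5, ])
all.equal(t.t, chk, check.attributes = FALSE)   # TRUE if the contrasts match
rowSums(t.t)                                    # each contrast sums to (numerically) zero
```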

Finding p-values for the four test statistics shows that all but the difference between mainly cardinal and mainly fixed are significant at a level of 15% for two-sided tests. The four test statistics are -1.74 for positive versus negative, with a p-value of 0.0452; -1.51 for cardinal versus mutable, with a p-value of 0.0704; 0.47 for cardinal versus fixed, with a p-value of 0.6806; and -1.99 for fixed versus mutable, with a p-value of 0.0274. The p-values quoted are lower-tail probabilities, so a two-sided test at the 15% level rejects when the value is below 0.075 or above 0.925.

The code for the four test statistics and the code for the p-values are:

m.t/s.t           # test statistics
pnorm(m.t/s.t)    # lower-tail normal probabilities
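pnorm() applied to a statistic returns a lower-tail probability. For a two-sided test under the normal approximation, one common approach is to double the tail area beyond the absolute value of the statistic. The sketch below does this with the rounded statistics quoted above, so the results are approximate; the names `z` and `p.two` are illustrative.

```r
# Rounded test statistics as quoted in the text.
z = c(pos.vs.neg = -1.74, card.vs.mut = -1.51,
      card.vs.fix = 0.47, fix.vs.mut = -1.99)

p.two = 2 * pnorm(-abs(z))    # two-sided p-values under normality
round(p.two, 4)
p.two < 0.15                  # which tests are significant at the 15% level
```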

The negative direction of the lag of 13 could be explained by the cycles of the Moon. There are approximately 13 Moon cycles per year, so for a given month, the Moon will start in a positive sign one year and in a negative sign the next, over reasonably short time periods. There are 13 means from the 40 observations. The mean starting with the first observation averages four observations and each of the others averages three. The estimated standard error for the means over the 13 lags is 37.0, except for the mean starting with the first observation, which has an estimated standard error of 30.0.

The means and standard errors for the observations lagged by 13 are found, using a similar method as for the means and tests.

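vec.14 = c(1,1+seq(13,40,13))    # observations 1, 14, 27, 40
vec.13 = c(2,2+seq(13,26,13))    # observations 2, 15, 28
t.13 = matrix(0,13,40)
t.13[1,vec.14] = 1/4             # first mean averages four observations
# Means 2 to 13 each average three observations spaced 13 apart
for (i in 2:13) t.13[i,vec.13+(i-2)] = 1/3
m.13 = t.13 %*% div.a.ts
s.13 = sqrt(diag(t.13 %*% cov.r.1 %*% t(t.13)))
m.13
s.13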
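The averaging matrix for the lagged means can be inspected without any data. The sketch below rebuilds t.13 from the code above and lists which observations enter each mean; it only checks the index bookkeeping.

```r
# Rebuild the averaging matrix for the 13 lagged means, as in the code above.
vec.14 = c(1, 1 + seq(13, 40, 13))     # observations 1, 14, 27, 40
vec.13 = c(2, 2 + seq(13, 26, 13))     # observations 2, 15, 28
t.13 = matrix(0, 13, 40)
t.13[1, vec.14] = 1/4
for (i in 2:13) t.13[i, vec.13 + (i - 2)] = 1/3

# Observations that enter each mean (one list element per row).
apply(t.13 > 0, 1, which)

rowSums(t.13)          # each row's weights sum to 1
colSums(t.13 > 0)      # each observation is used in exactly one mean
```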

Conclusion

In this study, the sample size is small. A value of 0.15 for alpha is reasonable for such a small sample. According to time series theory, the distributions of the means and test statistics should be asymptotically normal and the standard errors should converge to their parametric value, so confidence intervals based on normality and z-tests for the test statistics may be appropriate, even though the sample is small. The diagnostic plots from part 2 indicate that the errors from the model follow a normal distribution quite closely, which supports using the assumption of normality.
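With alpha set to 0.15, normal-theory confidence intervals for the four differences can be sketched from the rounded figures reported earlier. This assumes the normal approximation discussed above and uses the published differences and standard errors as inputs; the names `d`, `se`, `z.crit`, and `ci` are illustrative.

```r
# Differences and standard errors as reported in the text.
d  = c(pos.vs.neg = -44.9, card.vs.mut = -53.9,
       card.vs.fix = 16.3, fix.vs.mut = -70.2)
se = c(25.8, 35.8, 34.4, 35.3)

alpha = 0.15
z.crit = qnorm(1 - alpha / 2)           # two-sided normal critical value
ci = cbind(lower = d - z.crit * se, upper = d + z.crit * se)
round(ci, 1)
ci[, "lower"] > 0 | ci[, "upper"] < 0   # TRUE where the interval excludes zero
```

An interval that excludes zero corresponds to a two-sided z-test that rejects at the 15% level.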

Of interest is the correspondence between divorce count means and astrological theory for cardinal, fixed, and mutable signs. In the sample, people were less likely to divorce in months that were mainly fixed than in months that were mainly cardinal or mutable, with the difference being largest for months that were 2/3 fixed and 1/3 mutable as compared to months that were 1/3 cardinal and 2/3 mutable.

For the entire model, the model with an increase in divorce counts as time passes after a shift in the astronomical point I call Vulcan fits the data best. The result is encouraging and deserves further study.
