RObservations #39: Uncovering A Stranger Side Of The Collatz Conjecture

[This article was first published on r – bensstats, and kindly contributed to R-bloggers.]

Introduction

The Collatz Conjecture is one of the most famous unsolved problems in mathematics, yet it takes only 4th grade math to understand.

This blog was initially intended to show how to code the Collatz conjecture as a function and visualize stopping times as well as the hailstone sequences for some positive integers. However, when I explored this further, things took a turn to the stranger side.

I’m not sure if these findings are meaningful, but I hope you enjoy them nonetheless!

What is the Collatz Conjecture?

The Collatz Conjecture (also known as the 3n+1 problem) states that taking any positive integer and repeating the following operations enough times will eventually reach the number 1.

  • If the number is even, divide it by two.
  • If the number is odd, triple it and add one.

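Using modular arithmetic, this can be represented as a function in the following form:

$$
f(n) = \begin{cases} n/2 & \text{if } n \equiv 0 \pmod{2} \\ 3n + 1 & \text{if } n \equiv 1 \pmod{2} \end{cases}
$$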

I have coded this in base R as a function below:

collatz_conjecture <- function(positive_number){
  # Reject anything that is not a single positive integer (0 would loop forever)
  if(length(positive_number)!=1 || is.na(positive_number) || positive_number<1 || positive_number%%1!=0){
    stop("The Collatz Conjecture only Applies to Positive Integers. For more information see: en.wikipedia.org/wiki/Collatz_conjecture")
  }

  n <- positive_number
  hailstone_sequence <- n   # the sequence starts at the input itself
  while(n!=1){
    if(n%%2==0){
      n <- n/2              # even: divide by two
    }else{
      n <- (3*n)+1          # odd: triple and add one
    }
    hailstone_sequence <- append(hailstone_sequence,n)
  }

  # One row per term; iterations is the 1-based position in the sequence
  data.frame(iterations=seq_along(hailstone_sequence),hailstone_sequence)
}

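The function returns a data frame with one row per term of the hailstone sequence: iterations gives the position in the sequence and hailstone_sequence gives the value at that position. For example, the sequence for 7 has 17 terms (the starting value plus 16 steps) before it terminates at 1.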

collatz_conjecture(7)


##    iterations hailstone_sequence
## 1           1                  7
## 2           2                 22
## 3           3                 11
## 4           4                 34
## 5           5                 17
## 6           6                 52
## 7           7                 26
## 8           8                 13
## 9           9                 40
## 10         10                 20
## 11         11                 10
## 12         12                  5
## 13         13                 16
## 14         14                  8
## 15         15                  4
## 16         16                  2
## 17         17                  1
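To pull summary numbers out of a sequence, such as how many steps it takes and how high it climbs, something like the sketch below may help. Note that `hailstone_values()` is a hypothetical helper introduced only for this illustration (it is not part of the original code); it returns the sequence as a plain vector instead of a data frame.

```r
# Hypothetical helper (not from the original post): returns the hailstone
# sequence for a positive integer n as a plain numeric vector.
hailstone_values <- function(n) {
  stopifnot(length(n) == 1, !is.na(n), n >= 1, n %% 1 == 0)
  values <- n
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    values <- c(values, n)
  }
  values
}

seq_7 <- hailstone_values(7)
length(seq_7)      # number of terms, starting value included: 17
length(seq_7) - 1  # number of steps needed to reach 1: 16
max(seq_7)         # highest value reached along the way: 52
```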

Visualizing Stopping Times

One of the most common visuals you will find when looking into the Collatz Conjecture is a scatter plot of stopping times for each number. Here the stopping time is measured as the number of rows returned by collatz_conjecture(). To take this a step further, I explored how stopping times look for odd and even numbers as well as for prime and non-prime numbers. To determine whether a number is prime, I used the isprime() function from the matlab package.

library(tidyverse)
library(ggthemes)
library(matlab)

dt<- lapply(c(2:10000),function(x) data.frame(n=x,
                                           iterations=collatz_conjecture(x) %>%  
                                           nrow())) %>% 
  do.call(rbind,.) %>% 
  mutate(odd_even=ifelse(n%%2==0,"Even Number","Odd Number"),
         is_prime = ifelse(isprime(n),"Prime Number","Non-Prime Number"))


dt %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_point(mapping=aes(x=n,y=iterations),alpha=0.7)+
  ggtitle("Collatz Conjecture Stopping Times For 2-10,000")+
  labs(x="n",
       y="Stopping Time")+
  theme(
        plot.title=element_text(hjust=0.5),
        axis.title.x = element_text(size=14),
        axis.title.y = element_text(size=14))

dt %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_point(mapping=aes(x=n,y=iterations,color=odd_even ),alpha=0.7)+
  facet_wrap(~odd_even)+
  ggtitle("Collatz Conjecture Stopping Times For 2-10,000")+
  labs(x="n",
       y="Stopping Time")+
  theme(legend.title=element_blank(),
        plot.title=element_text(hjust=0.5),
        axis.title.x = element_text(size=14),
        axis.title.y = element_text(size=14))

dt %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_point(mapping=aes(x=n,y=iterations,color=is_prime ),alpha=0.7)+
  facet_wrap(~is_prime)+
  ggtitle("Collatz Conjecture Stopping Times For 2-10,000")+
  labs(x="n",
       y="Stopping Time")+
  theme(legend.position="none",
        plot.title=element_text(hjust=0.5),
        axis.title.x = element_text(size=14),
        axis.title.y = element_text(size=14))
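Building a full data frame for each of the 9,999 starting values just to count its rows can be slow. If only the counts are needed, one lighter alternative is to count the terms directly. This is a sketch under that assumption: `count_terms()` is a hypothetical helper (not from the original code), and for the range used here it should give the same number as `nrow(collatz_conjecture(x))`.

```r
# Hypothetical helper (not from the original post): number of terms in the
# hailstone sequence of n, counting both the starting value and the final 1.
count_terms <- function(n) {
  terms <- 1
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    terms <- terms + 1
  }
  terms
}

n_values <- 2:10000
stopping_times <- data.frame(
  n = n_values,
  iterations = vapply(n_values, count_terms, numeric(1))
)

head(stopping_times, 3)
```

The resulting data frame has the same n and iterations columns as dt, so the odd/even and prime labels can be added with the same mutate() call.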

At this point, the visualizations of the Collatz conjecture appear random and lack any obvious pattern. However, when I explored the hailstone sequences, things took a turn for the bizarre.

Visualizing Hailstone Paths: Uncanny Similarities

Things started to look strange when I began visualizing hailstone sequence paths. From 2 to 26, the hailstone paths form a shape that appears to expand as more sequences are added.

dv <- lapply(c(2:26),function(x) collatz_conjecture(x) %>% 
                                  #slice(-1) %>% 
                                  mutate(n = x)) %>% 
      do.call(rbind,.)

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  ggtitle("Hailstone paths for 2-26")+
  theme(legend.position = "none")

However, when the Collatz conjecture is applied to 27, a completely different path is drawn.

dv <- lapply(c(2:27),function(x) collatz_conjecture(x) %>% 
                                  #slice(-1) %>% 
                                  mutate(n = x)) %>% 
      do.call(rbind,.)

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  ggtitle("Hailstone sequence for 2-27")+
  theme(legend.position = "none")
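To put numbers on how far 27 departs from its predecessors, the sketch below compares sequence length and peak value for 2-27. It relies on `hailstone_values()`, a hypothetical helper defined here for illustration only (it is not part of the original code).

```r
# Hypothetical helper (not from the original post): hailstone sequence of n
# as a plain numeric vector.
hailstone_values <- function(n) {
  values <- n
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    values <- c(values, n)
  }
  values
}

summary_2_27 <- data.frame(
  n     = 2:27,
  terms = sapply(2:27, function(x) length(hailstone_values(x))),
  peak  = sapply(2:27, function(x) max(hailstone_values(x)))
)

# Longest and highest paths among 2-26
max(summary_2_27$terms[summary_2_27$n <= 26])  # 24
max(summary_2_27$peak[summary_2_27$n <= 26])   # 160

# The same two numbers for 27 alone: 112 terms, peak 9232
summary_2_27[summary_2_27$n == 27, ]
```

The jump in both length and height is what flattens every other path in the combined plot.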

I wouldn’t make much of it, but the pattern seen for 2-26 returns for 28-30 and 32-40.

dv <- lapply(c(2:26,28:30,32:40),function(x) collatz_conjecture(x) %>% 
                                  #slice(-1) %>% 
                                  mutate(n = x)) %>% 
      do.call(rbind,.)

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  ggtitle("Hailstone paths for 2-26,28-30,32-40")+
  theme(legend.position = "none")

The pattern which appeared initially unique for 27 also held true for 31 and 41.

dv <- lapply(c(27,31,41),function(x) collatz_conjecture(x) %>% 
                                  #slice(-1) %>% 
                                  mutate(n = x)) %>% 
      do.call(rbind,.)

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  facet_wrap(~factor(n))+
  theme(legend.position = "none")

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  theme(legend.position = "none")
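One observation that may help explain the resemblance: 31 and 41 both occur inside 27's own hailstone sequence, so their paths are simply the tail end of 27's path. A minimal check, using a hypothetical helper `hailstone_values()` that is not part of the original code:

```r
# Hypothetical helper (not from the original post): hailstone sequence of n
# as a plain numeric vector.
hailstone_values <- function(n) {
  values <- n
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    values <- c(values, n)
  }
  values
}

path_27 <- hailstone_values(27)

c(31, 41) %in% path_27     # TRUE TRUE
match(c(41, 31), path_27)  # positions in 27's sequence: 3 6

# The path for 31 is exactly the tail of the path for 27
path_31 <- hailstone_values(31)
identical(tail(path_27, length(path_31)), path_31)  # TRUE
```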

Looking into these numbers started to take me down a rabbit hole I didn’t expect to enter.

What does Pi have to do with this?

Looking at these numbers, I first Googled what 27, 31 and 41 have to do with each other. This led me here, which told me that they are numbers found among the digits of pi. That took me into finding more hailstone paths for other numbers in the digits of pi.

The hailstone sequences for these numbers look similar.

dv <- lapply(c(31,41,27,97,62,95,94,55,83,62,108,182,171),function(x) collatz_conjecture(x) %>% 
                                  #slice(-1) %>% 
                                  mutate(n = x)) %>% 
      do.call(rbind,.)

dv %>% 
  ggplot()+
  theme_fivethirtyeight()+
  geom_line(mapping=aes(x=iterations,y=hailstone_sequence,color=factor(n)))+
  facet_wrap(~factor(n))+
  theme(legend.position = "none")
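One way to check how similar these paths really are is to compare their peaks. The sketch below does that for the starting values plotted above (with the repeated 62 listed once). `hailstone_values()` is a hypothetical helper, not part of the original code, and the check says nothing about whether the link to pi is meaningful; it only shows that these starting values all climb to the same maximum.

```r
# Hypothetical helper (not from the original post): hailstone sequence of n
# as a plain numeric vector.
hailstone_values <- function(n) {
  values <- n
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    values <- c(values, n)
  }
  values
}

pi_numbers <- c(31, 41, 27, 97, 62, 95, 94, 55, 83, 108, 182, 171)
peaks <- sapply(pi_numbers, function(x) max(hailstone_values(x)))

data.frame(n = pi_numbers, peak = peaks)
all(peaks == 9232)  # TRUE: every path reaches the same peak as 27
```

A shared peak is what one would expect if each of these paths joins the same stretch of values before the climb to 9232, which would account for the near-identical shapes in the facets.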

For the numbers 27 and 171, some rounding up was needed for the pattern to hold. Nevertheless, the number counting got very strange.

(Digits of pi from here.)

It was at this point I realized I needed to stop looking into this problem as it was getting strange very quickly. Looking into the Collatz conjecture with this approach likely stretches the limits of sanity.

So, for now, make of this information what you will. This is where I halted my explorations.

Conclusion

The Collatz Conjecture is a rabbit hole which one can get stuck in if one thinks about it for too long. Famous mathematician Terence Tao noted that the Collatz conjecture is notorious for absorbing massive amounts of time from both professional and amateur mathematicians. For a great resource on learning more about the Collatz conjecture, I highly recommend Veritasium's fascinating video on the topic.

I myself had to come to terms with the fact that exploring this further in a sane manner has met its limit as a productive endeavor. That being said, coding the conjecture and the findings along the way have definitely been amusing!

Thank you for reading!
