Kadane’s algorithm – finding maximum sum in contigous sub-array

[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Great algorithm for analyzing timeseries data or array of numbers.

An algorithm for finding the contiguous subarray within a one-dimensional array of numbers that has the largest sum. It is called Kadane’s algorithm.

Largest sum

How does the algorithm work? It looks for a global maximum of positive-sum on any sub-array, regardless of its starting position or its length. Numbers must be next or together in a sequence (without any number left out).

With formula as: local_maximum at index i is the maximum of A[i] and the sum of A[i] and local_maximum at index i-1. So, you calculate the sum of every possible subarray ending with the element array[n-1]. Then, we would calculate the sum of every possible subarray ending with array[n-2], array[n-3] and so on up to array[0].

With a simple function we can iterate through the array (vector) of values:

#sample vector
v <- c(-3,-8, 1, -2, 1, 5, -3, -4, 3, 10, -2, 4, 1)

kadane <- function(v){
 
  max_so_far = -999999999999999 #some obnoxiously big number
  max_ending_here = 0
  
  for (i in 1:length(v)) {
    max_ending_here = max_ending_here  + v[i]
    if (max_so_far < max_ending_here ){
      max_so_far = max_ending_here
    }
    if (max_ending_here < 0) {
      max_ending_here = 0
    }
  }
  return (max_so_far)
}

Run the function with:

# run the function
kadane(v)

The function can also return the starting and ending positions of the array.

kadane_with_position <- function(v){
  max_so_far = -999999999999999 #some obnoxiously big number
  max_ending_here = 0
  subarray_start = 0
  subarray_end = 0
  int_s = 0
  
  for (i in 1:length(v)) {
    max_ending_here = max_ending_here  + v[i]
    if (max_so_far < max_ending_here ){
      max_so_far = max_ending_here
      subarray_start = int_s
      subarray_end = i
    }
    if (max_ending_here < 0) {
      max_ending_here = 0
      int_s =  i + 1 
    }
  }
  cat("Sum is: ", max_so_far, " with starting position: ", subarray_start, " and ending: ", subarray_end)
}

# run the function
kadane_with_position(v)

Kadane’s algorithm has also an interesting time complexity function. With O(n) for an array, it can explode exponentially when using 2D matrices: O(n^3).

R library Adagio offers this function with ability to calculate over matrices.

As always, code is available at Github in the same Useless_R_function repository. Check Github for future updates.

Happy R-coding and stay healthy!“

To leave a comment for the author, please follow the link and comment on their blog: R – TomazTsql.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)