Thursday, March 31, 2011

Find the maximum subarray of an integer array

Question: given an unsorted array of integers, find the contiguous subarray that yields the largest sum. For instance, if the input is {5, 2, -1}, then the output is the subarray {5, 2} because its sum, 7, is larger than that of any other subarray: {5} gives 5, {2} gives 2, {-1} gives -1, {2, -1} gives 1 and {5, 2, -1} gives 6.

Solution: this problem can be solved with a modified version of Kadane's Algorithm. The strategy is to scan the array once, maintaining the sum of the subarray that ends at the current element, and to keep track of the sum, the start index and the end index of the subarray that gives the largest sum so far. We never recompute a sum from scratch: the sum of a subarray is the sum of the previous subarray plus one additional element. For example, to calculate the sum of {5, 2, -1}, we add the sum of {5, 2}, which is 7, to the value of the next element, which is -1, and get 6.
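To make the example concrete, here is a minimal brute-force sketch that checks every contiguous subarray. It is only a baseline for comparison, not the method of this post, and the name bruteForceMaxSum is an assumption made for illustration. It does use the same incremental idea: each inner step extends the previous sum by one element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): tries every contiguous
// subarray in O(n^2) time and returns the largest sum found.
int bruteForceMaxSum(const std::vector<int>& a)
{
  assert(!a.empty());  // assumption: the caller passes a non-empty array

  int best = a[0];
  for (std::size_t start = 0; start < a.size(); ++start)
  {
    int sum = 0;
    for (std::size_t end = start; end < a.size(); ++end)
    {
      // sum of a[start..end] = sum of a[start..end-1] + a[end]
      sum += a[end];
      if (sum > best)
        best = sum;
    }
  }
  return best;
}
```

For {5, 2, -1} the sums visited are 5, 7, 6, 2, 1 and -1, so the function returns 7.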

Here is our C++ method to solve the problem:

#include <iostream>
#include <stdexcept>

void findMaxSumSequence (int inputArray[], int size)
{
  if (size <= 0)
    throw std::invalid_argument("Array size must be positive");

  // Best subarray found so far
  int maxSum = inputArray[0];
  int maxStartIndex = 0;
  int maxEndIndex = 0;

  // Subarray currently being extended
  int curSum = inputArray[0];
  int curStartIndex = 0;

  for (int i = 1; i < size; i++)
  {
    // A negative running sum cannot help: start a new subarray at i
    if (curSum < 0)
    {
      curSum = 0;
      curStartIndex = i;
    }

    curSum = curSum + inputArray[i];

    if (curSum > maxSum)
    {
      maxSum = curSum;
      maxStartIndex = curStartIndex;
      maxEndIndex = i;
    }
  }

  std::cout << "Start index: " << maxStartIndex << " End index: "
            << maxEndIndex << " Max sum: " << maxSum << std::endl;
}

Explanation: this method accepts an array and its size as arguments. It prints out the start index, end index and the sum of the subarray that yields the max sum. Here are the main steps:

  1. First, check that the array is not empty. If it is, we throw an exception because there is no subarray to examine.
  2. Then, initialize variables: maxSum is the largest sum found. maxStartIndex and maxEndIndex are respectively the start index and end index of the subarray that yields the max sum. curSum is the sum of the current subarray that we're examining. curStartIndex is the start index of the current subarray we're checking.
  3. Next, we loop through the array and start calculating the sum of subarrays one after another:

    If the sum of the current subarray is less than 0, we reset curSum to 0 and start a new subarray at index i. Why? A negative running sum can only decrease the sum of any subarray that extends it. For example, if the previous subarray's sum is -2 and the next element is 3, starting over from 0 gives 3, whereas carrying the -2 forward gives only 1.

    curSum = curSum + inputArray[i] calculates the sum of the current subarray by adding the next value in the array to the sum of the previous subarray.

    After that, if the sum of the current subarray is greater than the max sum then we replace the max sum with the sum of the current subarray. We also change the maxStartIndex to the start index of the current subarray and the maxEndIndex to the current index i.

    When the loop ends, maxSum will contain the largest sum found. maxEndIndex and maxStartIndex contain respectively the end and start index of the subarray that gives the largest sum. Thus, we just have to print out those values.
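As a rough sketch of the same loop in a form that is easier to check, the variant below returns the result instead of printing it. The MaxSubarray struct and the findMaxSubarray name are assumptions made for this illustration and are not part of the method above. Because the best sum starts at the first element rather than at 0, tracing an all-negative input such as {-2, -1, -3} through the loop yields start index 1, end index 1 and sum -1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type, introduced for this sketch only.
struct MaxSubarray
{
  std::size_t start;  // index of the first element
  std::size_t end;    // index of the last element (inclusive)
  int sum;            // sum of a[start..end]
};

// Same loop as findMaxSumSequence, but returning the result.
MaxSubarray findMaxSubarray(const std::vector<int>& a)
{
  assert(!a.empty());  // assumption: the caller passes a non-empty array

  MaxSubarray best{0, 0, a[0]};
  int curSum = a[0];
  std::size_t curStart = 0;

  for (std::size_t i = 1; i < a.size(); ++i)
  {
    // Drop a negative running sum and start a new subarray at i.
    if (curSum < 0)
    {
      curSum = 0;
      curStart = i;
    }

    curSum += a[i];

    if (curSum > best.sum)
      best = MaxSubarray{curStart, i, curSum};
  }
  return best;
}
```

Calling findMaxSubarray on {5, 2, -1} gives start 0, end 1 and sum 7, matching the example in the question.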

If you have any comments or another solution to this problem, please post them in the comments below. I would love to learn about it. Thanks for reading and until the next post!

2 comments:

  1. Will this work for the following array:
    [-2,-1,-3]? If so, please tell me how.

    Replies
    1. This does not work. However, one cheap way to get it working is to keep track of the least negative single element, as a special case for "all negative numbers", and compare curSum to that element instead of to zero. I am sure there are a few other kinks that need to be ironed out to get the final answer.
