A linear solution

# Longest positive subarray

Find the longest interval with a non-negative sum

Given an array $a$ of $n$ integers, not all positive, find the pair of indices $(l, r) \in P$, where $P = \{(i, j) \mid 0 \leq i \leq j < n, \sum\limits_{k = i}^{j} a[k] \geq 0 \}$, such that $r-l \geq j-i\ \forall (i, j) \in P$.

With a naive approach it’s possible to solve the problem with a complexity of $O(n^3)$: try every pair $(l, r)$ and sum each interval from scratch.
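A minimal Python sketch of the naive approach is given here for reference. The function name `longest_nonneg_naive` and the convention of returning `None` when no valid interval exists are assumptions made for illustration, not part of the original statement.

```python
def longest_nonneg_naive(a):
    """Return the longest (l, r), inclusive, with sum(a[l..r]) >= 0, or None."""
    n = len(a)
    result = None
    for l in range(n):
        for r in range(l, n):
            # Sum the interval from scratch: this inner loop is what makes it O(n^3).
            total = 0
            for k in range(l, r + 1):
                total += a[k]
            if total >= 0 and (result is None or r - l > result[1] - result[0]):
                result = (l, r)
    return result
```

Ties are resolved in favor of the leftmost interval, since a new candidate only replaces the current one when it is strictly longer.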

Keeping the same algorithm but computing each interval sum from a prefix-sum array reduces the time complexity to $O(n^2)$.
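The prefix-sum variant could look like the sketch below. It assumes inclusive prefix sums, so that `prefix_sum[i]` equals `a[0] + ... + a[i]`; the function name is again only illustrative.

```python
from itertools import accumulate

def longest_nonneg_quadratic(a):
    """Same search as the naive version, with O(1) interval sums."""
    n = len(a)
    prefix_sum = list(accumulate(a))  # prefix_sum[i] = a[0] + ... + a[i]
    result = None
    for l in range(n):
        # Sum of everything strictly before l (0 when l is the first index).
        base = prefix_sum[l - 1] if l > 0 else 0
        for r in range(l, n):
            if prefix_sum[r] - base >= 0 and (result is None or r - l > result[1] - result[0]):
                result = (l, r)
    return result
```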

However, there is a linear solution to the problem, and two auxiliary arrays come in handy when coding it. The same $prefix\_sum$ array as before is used to compute the sum of any interval in constant time, and the $best$ array keeps track of the largest of those prefix sums from the end to the beginning, i.e. $best[i] = \max_{j \geq i} prefix\_sum[j]$ (it’s actually the array of suffix maximums of $prefix\_sum$). Basically you want to know whether it’s worth expanding the interval on the right or whether you should discard some elements by trimming the interval on the left.

The following example should clarify the reasoning.

Given the array $a = [3, -5, 8, 6, -9, 5, -3, -4, 2, -7]$, the prefix sums are $[3, -2, 6, 12, 3, 8, 5, 1, 3, -4]$ and $best$ becomes $[12, 12, 12, 12, 8, 8, 5, 3, 3, -4]$. Since the problem asks to maximize the length of the interval, we would never settle for the solution $(0, 2)$, because we can extend the interval to $(0, 3)$ and still have a valid solution.
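To reproduce the numbers in this example, the two auxiliary arrays can be built as in the sketch below. The helper name `build_arrays` is hypothetical, and inclusive prefix sums are assumed because they match the $best$ values listed above.

```python
from itertools import accumulate

def build_arrays(a):
    """Return (prefix_sum, best) where best[i] = max(prefix_sum[i:])."""
    prefix_sum = list(accumulate(a))  # inclusive prefix sums
    best = prefix_sum[:]
    # Sweep from the end to the beginning, carrying the running maximum.
    for i in range(len(a) - 2, -1, -1):
        best[i] = max(best[i], best[i + 1])
    return prefix_sum, best

prefix_sum, best = build_arrays([3, -5, 8, 6, -9, 5, -3, -4, 2, -7])
print(prefix_sum)  # [3, -2, 6, 12, 3, 8, 5, 1, 3, -4]
print(best)        # [12, 12, 12, 12, 8, 8, 5, 3, 3, -4]
```

Note that $best$ is non-increasing by construction, which is what allows the right index to move only forward in the linear algorithm.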

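The last part of the algorithm should be clear now. For the current pair $(l, r)$ we take $best[r]$ minus the prefix sum just before $l$, which is the largest sum of any subarray that starts at $l$ and ends at or after $r$. If it’s non-negative we try to expand the interval on the right; if it’s not, we trim the interval from the left.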
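One way to write the whole linear algorithm is sketched below. It is an illustrative reconstruction under the same assumptions as before (inclusive prefix sums, `None` when no interval qualifies), not necessarily the original code. The right bound `r` is kept exclusive and never moves backwards, so the two loops together perform $O(n)$ steps.

```python
from itertools import accumulate

def longest_nonneg_linear(a):
    """Longest (l, r), inclusive, with non-negative sum, in O(n)."""
    n = len(a)
    prefix_sum = list(accumulate(a))
    best = prefix_sum[:]  # best[i] = max(prefix_sum[i:])
    for i in range(n - 2, -1, -1):
        best[i] = max(best[i], best[i + 1])

    result = None
    r = 0  # exclusive right bound, shared across all values of l
    for l in range(n):
        base = prefix_sum[l - 1] if l > 0 else 0
        r = max(r, l)
        start = r
        # Expand while some end at or after r still gives a non-negative sum.
        while r < n and best[r] >= base:
            r += 1
        # If r advanced, a[l..r-1] is itself valid: best drops below base right after it.
        if r > start and (result is None or r - 1 - l > result[1] - result[0]):
            result = (l, r - 1)
    return result
```

When `r` does not advance for a given `l`, every valid interval starting at `l` ends before the right bound already reached from a smaller `l`, so it cannot be longer than the best one found so far and is safely skipped.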

A small optimization that doesn’t change the asymptotic complexity but can make the program run faster is the following: the first example shows that there are a lot of duplicate elements in the array $best$, and for every one of them the sum is computed with the same result. It’s enough to store only the unique values and make the index $r$ jump from one of those values to the next.

In this variant the $best$ array is replaced by the $best\_index$ stack, which holds the indices of those unique values.
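A possible sketch of this variant follows. The stack is filled from right to left, so its top is always the leftmost remaining candidate, and an index is pushed only when its prefix sum is strictly larger than every later one. Apart from the name `best_index`, which comes from the text, the details are assumptions.

```python
from itertools import accumulate

def longest_nonneg_stack(a):
    """Linear solution where r jumps between the distinct values of best."""
    n = len(a)
    prefix_sum = list(accumulate(a))

    # Indices whose prefix sum strictly exceeds all later prefix sums.
    best_index = []
    for j in range(n - 1, -1, -1):
        if not best_index or prefix_sum[j] > prefix_sum[best_index[-1]]:
            best_index.append(j)

    result = None
    for l in range(n):
        base = prefix_sum[l - 1] if l > 0 else 0
        r = -1
        # Pop every candidate end that still gives a non-negative sum from l.
        while best_index and prefix_sum[best_index[-1]] >= base:
            r = best_index.pop()
        # Candidates left of l may be popped too, so check r >= l before recording.
        if r >= l and (result is None or r - l > result[1] - result[0]):
            result = (l, r)
    return result
```

On the example array the stack initially holds the indices `[9, 8, 6, 5, 3]` (bottom to top), one per distinct value of $best$, so `r` visits five positions instead of ten.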