Solving a hard HackerRank problem — array manipulation

Since it’s a holiday and I don’t believe in zombie Jesus, I’ll publicly go through a 3rd HackerRank challenge. Today, something more challenging: array manipulation. This challenge is ranked hard, with ~50% success rate:

Starting with a 1-indexed array of zeros and a list of operations, for each operation add a value to each of the array element between two given indices, inclusive. Once all operations have been performed, return the maximum value in your array.

The list of operations — number[][] — can be iterated over as [indexStart, indexEnd, value] = operation[i]. The other argument is n, which gives us the length of the array over which to perform operations. I’m renaming it to containerSize because I dislike non-descriptive names.

So, what we want to do, is iterate over a list of operations, and for each operation, compute it over the container array. Basically:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
function main(containerSize: number, operations: number[][]): number[] {
let container = new Array(containerSize).fill(0)

for (let i = 0, len = operations.length; i < len; i++) {
container = performOperation(operations[i], container)
}

return Math.max(...container)
}

function performOperation(operation, container) {
const indexStart = operation[0]
const indexEnd = operation[1]
const operand = operation[2]

for (let i = indexStart - 1; i < indexEnd; i++) {
container[i] += operand
}

return container
}

Evidently, we’re running a loop-in-a-loop, for a terrible O(n^2) time complexity. Since variables are kept per-loop, except for container, the space complexity is O(n).

Stress testing O(n^2)

To no one’s surprise, a test case with numerous operations running on a large array is a stress test that breaks my solution. After all, O(n^2) is easy to break at scale.

Oh god, the recursion

To my surprise, JavaScript’s built-in Math.max is recursive. After unlocking one of the timed-out test cases from HackerRank, I decided to run it locally. The test isn’t dramatic enough to crash node itself, so all I had to do to pass the test is wait (and tell Jest to wait as well). However, when trying to return the maximum value in the result array, Math.max spirals out of control:

1
RangeError: Maximum call stack size exceeded

That error message should read “Your recursion is awful”. I was surprised to run into this one.

Okay, so we’re back to loops to create an iterative Math.max. Sigh.

My first approach to find the maximum value is quite obvious:

1
2
3
function customMathMax(arr: number[]): number {
return arr.reduce((max, v) => max >= v ? max : v, -Infinity)
}

It’s… sluggish. I decided to re-test my assumption about array traversal in JavaScript, so I built yet another loop comparison test on JSPerf. By using Array.reduce my code is taking a 90-100% performance hit. Depending on the browser, phase of the moon, and thermal throttling, your mileage may vary. In my case, the for loops performed the best in Safari, and they are arguably cleaner to look at. Chromium reports different results, but I like loops. Given that, my Math.max is becoming:

1
2
3
4
5
6
7
8
9
function customMathMax(arr: number[]): number {
let max = -Infinity

for (let i = arr.length - 1; i >= 0; i--) {
max = arr[i] > max ? arr[i] : max
}

return max
}

Boom. Call stack issue fixed.

Just a side note: in Chromium-based browsers (I tested Chromium and Brave), Array.reduce is much faster than the loops. None of this makes sense, as the engine can optimize for the same use case. In fact, I assume much of this post is not as correct as it seems, based on different browsers optimizing code differently (and idiomatic loops still taking a hit to reduce).

Changing the spec

To avoid the overhead of reading HackerRank’s test case, I decided to stress test arrayManipulation using a much simpler but equally demanding spec:

1
2
3
4
5
6
7
8
9
10
describe('Array manipulation', () => {
it('should be performant', () => {
const containerSize = 1000000 // 1M
const operationToRepeat = [1, containerSize, 1]
const operations = new Array(containerSize).fill(operationToRepeat)

const maxItem = main(containerSize, operations)
expect(maxItem).toEqual(containerSize) // 1M
})
})

Researching smart solutions

At this point, I’m not happy with my solution, but it works. It just takes a lot of time to finish on large stress cases. In fact, it takes so long that it fails some of HackerRank’s tests.

Fortunately, the world is filled with smart people. Seriously. If you navigate to this challenge’s discussion tab, you’ll find much better O(n·m) solutions. On the negative side, some of the solutions are also showcases of extreme compactness, to the point of become unreadable. Instead, I’ll translate the idea to plain English.

Every operation to run, i.e. an [a, b, k] instruction, defines a gradient k from a to b. Let me rephrase that: an operation [indexStart, indexEnd, value] increases all elements between indexStart and indexEnd by the given value. Sadly, this translates poorly into descriptive code. My dual-loop solution is inelegant, but its descriptiveness makes the code expressive and arguably easy to understand. What we’re trying to do, is the following:

Assuming a container with size 4, each line in the code below is a 1-indexed operation being performed on it:

1
2
3
4
5
6
------- | a b k | -------
0 0 0 0 → 1 2 4 → 4 4 0 0
4 4 0 0 → 1 1 1 → 5 4 0 0
5 4 0 0 → 3 4 2 → 5 4 2 2
5 4 2 2 → 1 1 3 → 8 4 2 2
8 4 2 2 → 2 4 1 → 8 5 3 3

In an abstract sense, each operation is modifying the gradient in a given range of the array. That is, when we add 4 to indexes 1 and 2, what we’re actually doing is increasing the gradient by 4 in the 1–2 range.

If you break this down, it means that each element in that range is now +4. But when you finish the operation and try to find the maximum gradient change, only the first element matters, because you can simply add the positive gradient changes. When the gradient stops increasing, it creates a negative mirrored change in the next element (b+1).

In essence:

1
2
3
4
5
6
----------- | a b k | -----------
+0 +0 +0 +0 → 1 2 4 → +4 +0 -4 +0
+4 +0 -4 +0 → 1 1 1 → +5 +0 -4 +0
+5 +0 -4 +0 → 3 4 2 → +5 +0 -2 +0
+5 +0 -2 +0 → 1 1 3 → +8 -3 -2 +0
+8 -3 -2 +0 → 2 4 1 → +8 -2 -1 +1

That leaves us with finding the maximum continuum gradient increase, which comes out at 8. Let’s test that assumption in another series of operations:

1
2
3
4
5
6
------- | a b k | -------
0 0 0 0 → 1 2 1 → 1 1 0 0
1 1 0 0 → 1 2 1 → 2 2 0 0
2 2 0 0 → 2 2 1 → 2 3 0 0
2 3 0 0 → 1 3 4 → 6 7 4 0
6 7 4 0 → 3 3 3 → 6 7 7 0

Or, in gradient differences:

1
2
3
4
5
6
----------- | a b k | -----------
+0 +0 +0 +0 → 1 2 1 → +1 +0 -1 +0
+1 +0 -1 +0 → 1 2 1 → +2 +0 -2 +0
+2 +0 -2 +0 → 2 2 1 → +2 +1 -3 +0
+2 +1 -3 +0 → 1 3 4 → +6 +1 -3 -4
+6 +1 -3 -4 → 3 3 3 → +6 +1 +0 -7

In this case, the maximum continuum gradient increase is [+6, +1], which adds up to 7. If you compare the gradient changes to the actual end result, that +7 maximum gradient is congruent with the maximum values in the resulting array (7, for 1-indexed elements 2 and 3).

Brilliant. We don’t have to keep track of the actual values, which means O(n) space complexity. We also don’t have to run 2 loops, which is how we arrive at the O(n·m) time complexity. There’s no way to avoid that second iteration over the gradients array to find the maximum, so that’s what we’re stuck with:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
function main(
containerSize: number,
operations: Array<Array<number>>
): number {
const container = new Array(containerSize).fill(0)
let maxPositiveGradient = 0

for (let i = 0; i < operations.length; i++) {
const [indexStart, indexEnd, value] = operations[i]

container[indexStart - 1] += value

if (indexEnd <= container.length) {
container[indexEnd] -= value
}
}

for (let i = 1; i < containerSize; i++) {
container[i] += container[i - 1]
maxPositiveGradient = Math.max(maxPositiveGradient, container[i - 1])
}

return maxPositiveGradient
}

Performing 1 million operations in a 1 million container takes ~2.5 seconds (in Jest). Bumping that to 10M, we can get the result in ~4 seconds. It’s not the best thing ever, but I couldn’t find a way to avoid iterating over both container and queries.

What I don’t like about the faster solution is that it’s not intuitive. It can’t be described in the same manner as the simpler, slower solution. That said, algorithmic optimizations tend to be rooted in mathematics and “serious” computer science.