DP Code Understanding Challenge (Leetcode 801)

Hey engineering fellows!

I think the intuition around understanding this problem is difficult so I thought it would be good to get a conversation around it.

For those that want to try the question first see this: https://leetcode.com/problems/minimum-swaps-to-make-sequences-increasing/.


Here is one of the solutions:

def minSwap(self, A: List[int], B: List[int]) -> int:
    swapped = [0] * len(A)
    not_swap = [0] * len(A)
    swapped[0] = 1
    for i in range(1, len(A)):
        if A[i] > A[i - 1] and B[i] > B[i - 1]:
            # is increasing
            if A[i] > B[i - 1] and B[i] > A[i - 1]:
                # a swap would work
                not_swap[i] = min(swapped[i - 1], not_swap[i - 1])
                swapped[i] = not_swap[i] + 1
                swapped[i] = swapped[i - 1] + 1
                not_swap[i] = not_swap[i - 1]
            # it's not increasing
            not_swap[i] = swapped[i - 1]
            swapped[i] = not_swap[i - 1] + 1
    return min(swapped[-1], not_swap[-1])

I’d love to see if someone can get to an intuitive explanation on why this works. Excited to hear your thoughts!



To understand a DP algorithm, we need to understand the subproblems and the recurrence equation that relates them. As it is often the case, for a problem where the input are arrays, the subproblems are prefixes of the input arrays. What makes this DP algorithm a bit more complicated than usual is that the subproblems are not exactly of the same form as the original problem. In the subproblems, we need to add an extra condition: whether the last position in the prefix is flipped or not. This is important in order to be able to write the recurrence equation.

Thus, we have two types of subproblems, defined as follows:

not_swap[i] is the minimum number of swaps that you need to make A[0…i] and B[0…i] increasing, without swapping position i.
swapped[i] is the exact same, but swapping position i.

For the base case, we have not_swap[0] = 0, swapped[0] = 1, which should be clear from the definition above.

To compute swapped[i] and not_swap[i] for i > 0, the code implements a recurrence equation. It is called like that because, to compute swapped[i] and not_swap[i], we need to know swapped[i-1] and not_swap[i-1].

swapped[i] and not_swap[i] have their own recurrence equations, which are similar but a bit different. In each case, the recurrence equation has four cases, depending on the following conditions (both can be true):
b1= A[i] > A[i-1] and B[i] > B[i-1] //pos i satisfies the property (i.e., is increasing) if pos i-1 has not been swapped.
b2 = A[i] > B[i-1] and B[i] > A[i-1] //pos i satisfies the property if pos i-1 has been swapped.

The recurrence equation for no_swap is as follows:
no_swap[i] =

  1. min(swapped[i-1], no_swap[i-1]) if b1 and b2
  2. no_swap[i-1] if b1 and not b2
  3. swapped[i-1] if not b1 and b2
  4. -1 if not b1 and not b2

Case 1 is when the sequence is increasing whether the previous position is flipped or not. Since we can choose, we use the min operator to get the best choice.
Case 2 is when the sequence is only increasing if we do not flip the previous position.
Similarly with Case 3. Case 4 is when there is no solution, which should never happen according to the statement.

The recurrence equation for swapped[i] is similar, only reversing the roles of b1 and b2, because the position i will be swapped, and with a “+1” for doing the swap at position i:

swapped[i] =

  1. 1 + min(swapped[i-1], no_swap[i-1]) if b1 and b2
  2. 1 + no_swap[i-1] if not b1 and b2
  3. 1 + swapped[i-1] if b1 and not b2
  4. -1 if not b1 and not b2

The code iterates through indices i to n-1, at iteration i it computes swapped[i] and not_swap[i] using the recurrence equation above (in a more succinct way).

Finally, we choose the best option among swapped[n-1] and not_swap[n-1], (where n is the size of the arrays). Thinking of the definition of these values, this corresponds to minimum number of swaps that you need to make A[0…n-1] and B[0…n-1] increasing, with or without swapping position i, which is what we want.

By the way, the non-input space complexity can be reduced to O(1), since, at each iteration, we only need the values for iteration i-1, and not for all the previous iterations.