A Deeper Look at the Small-to-Large Technique

The small-to-large technique is well known among competitive programmers, but most problems that require it are straightforward applications of set merging (the classic pattern is sketched below for reference). Here, we introduce a different way to think about the technique, one that applies to a wider variety of problems. Consider the problem Non-boring Sequences. In short, the problem asks to determine whether every contiguous subsequence of a sequence $A$ contains an element that appears exactly once within it (such sequences are termed “non-boring”). For example, $[1, 2, 1]$ is non-boring, while $[1, 2, 1, 2]$ is boring, since no element of the full sequence appears exactly once. Surely, after a quick read, we can start to brainstorm some segment tree or other data-structure-based solution, but what if we try some brute force approaches as well?
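
As a refresher, here is the classic set-merging form of small-to-large (a minimal sketch; the std::set container and the merge_sets name are illustrative choices of mine, not tied to any particular problem):

#include <set>
#include <utility>

// Merge set b into set a by always iterating over the smaller of the two.
// An element only moves when the set containing it at least doubles in size,
// so across all merges there are O(N log N) element moves in total.
void merge_sets(std::set<int>& a, std::set<int>& b) {
    if (a.size() < b.size()) std::swap(a, b); // O(1) swap; a is now the larger set
    for (int x : b) a.insert(x);              // pay only for the smaller set
    b.clear();
}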

Consider a recursive check function taking parameters $l$ and $r$ that determines whether the subsequence $A[l, r]$ is non-boring. For some $i$ in $[l, r]$, if $A_i$ is unique (the previous and next appearances of $A_i$ in $A$ occur outside of $[l, r]$), then all subsequences of $A[l, r]$ “crossing” $i$ are non-boring, since each of them contains $A_i$. Thus, it suffices to check that both $A[l, i-1]$ and $A[i+1, r]$ are non-boring with recursive calls to check (note that we only need to recurse for one such $i$ if it exists; think about why this is).

Naively, the algorithm above runs in $O(N^2)$, far too slow for the given constraint of $N \le 200000$. However, what if we try a different order of looping $i$ in the check function? Instead of looping $i$ in the order $[l, l+1, l+2, \ldots, r]$, we will loop $i$ “outside-in”, in the order $[l, r, l+1, r-1, l+2, r-2, \ldots]$. As it turns out, this yields an $O(N \log N)$ algorithm, a dramatic improvement over the $O(N^2)$ we expected!

To prove this, consider the recursion tree formed by the split points $i$ chosen by check. This is a binary tree in which the size of a node’s left subtree equals the size of the left side of our split, $A[l, i-1]$, and the size of its right subtree equals the size of $A[i+1, r]$. At each step of the algorithm, we are essentially “unmerging” a set of objects into the left and right children, giving each child a number of objects equal to its size. By nature of us looping outside-in, this unmerging takes time proportional to the size of the smaller child. However, viewed in reverse, this is exactly the process of small-to-large set merging, which totals $O(N \log N)$! Thus, we have obtained the correct complexity of our algorithm, and this problem is solved with barely any pain or book-code.
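
To make the bound precise, here is the same argument stated as a recurrence (a sketch of my own, not part of the original write-up): if check runs on a range of $n$ elements and splits it into sides of sizes $k$ and $n-1-k$, then

$$T(n) \le \max_{0 \le k < n} \Big( T(k) + T(n-1-k) + c \cdot \big(\min(k, n-1-k) + 1\big) \Big),$$

which solves to $T(N) = O(N \log N)$. Charge the $\min$ term to the elements of the smaller side: an element pays $O(1)$ each time it lands on the smaller side of a split, and reading the recursion bottom-up as merging, the set containing it at least doubles each time it pays, so each element pays at most $O(\log N)$ times. Below shows a C++ implementation of check, where lst and nxt store the index of the previous and next appearance of $A_i$ respectively: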

bool check(int l, int r) {
    if (l > r) return true; // an empty range is trivially non-boring
    // Scan from both ends simultaneously ("outside-in"), so that the work done
    // before a split is found is proportional to the smaller side of the split.
    for (int i = l, j = r; i <= j; i++, j--) {
        // A[i] (or A[j]) is unique within [l, r] iff its previous and next
        // occurrences in A fall outside the range.
        if (lst[i] < l && nxt[i] > r) return check(l, i - 1) && check(i + 1, r);
        if (lst[j] < l && nxt[j] > r) return check(l, j - 1) && check(j + 1, r);
    }
    return false; // no unique element exists, so A[l, r] itself is boring
}
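
For completeness, here is one way to precompute lst and nxt; this build helper is my own sketch rather than part of the original solution, and it assumes a 0-indexed global array $A$ with the sentinels $-1$ and $n$ chosen so that the comparisons lst[i] < l and nxt[i] > r in check behave correctly at the boundaries:

#include <unordered_map>

const int MAXN = 200000;
int A[MAXN], lst[MAXN], nxt[MAXN];

void build(int n) {
    std::unordered_map<int, int> pos; // value -> most recent index seen
    for (int i = 0; i < n; i++) {     // left-to-right pass fills lst
        auto it = pos.find(A[i]);
        lst[i] = (it == pos.end()) ? -1 : it->second;
        pos[A[i]] = i;
    }
    pos.clear();
    for (int i = n - 1; i >= 0; i--) { // right-to-left pass fills nxt
        auto it = pos.find(A[i]);
        nxt[i] = (it == pos.end()) ? n : it->second;
        pos[A[i]] = i;
    }
}

With lst and nxt built, the entire sequence is non-boring if and only if check(0, n - 1) returns true.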

In conclusion, it may be worth taking the time to consider seemingly brute-force solutions to some problems: as long as there is a merging or unmerging process whose cost is proportional to the size of the smaller set, the small-to-large technique applies, even when it seems like the last thing one could reach for.