A Deeper Look at the Small-to-Large Technique

Sep 01, 2021 Edward Xiao

The small-to-large technique is well known among the competitive programming community, but most problems that require it are straightforward applications of set merging. Here, we introduce a different way to think about the technique so that it may be applicable to a wider variety of problems. Consider the problem Non-boring Sequences. In short, the problem asks to determine whether all consecutive subsequences of a sequence contain a unique element (such sequences are termed “non-boring”). Surely, after a quick read, we can start to brainstorm some segment tree or other data structure based solution, but what if we try some brute force approaches as well?

Consider a recursive check function taking parameters and for whether the subsequence is non-boring. For some in , if is unique (the previous and next appearance of in occurs outside of ), then all subsequences of “crossing” are non-boring. Thus, it suffices to check that both and are non-boring with a recursive call to check (note that we only need to recurse for one such if it exists, think about why this is).

Naively, the algorithm above runs in , far too slow for the given constraint of . However, what if we try a different order of looping in the check function? Instead of looping in the order , we will loop “outside-in”, in the order . As it turns out, this provides us with an algorithm, which is a dramatic improvement from what we thought would be !

To prove this, consider the tree formed by our decisions for for each recursively chosen by check. This will be a binary tree, where the number of children in the left child is the size of the left side of our split , and the number of children in the right child is the size of . At each step of the algorithm, we are essentially “unmerging” a set of objects into the left and right children, giving each child the corresponding number of objects to its size. Note that this unmerging happens in a time complexity proportional to the size of the smaller child, by nature of us looping outside-in. However, considering the reverse process, this is exactly the process of small-to-large set merging, which is ! Thus, we have obtained the correct complexity of our algorithm, and this problem is solved with barely any pain or book-code. Below shows a C++ implementation of check, where lst and nxt store the index of the previous and next appearance of respectively:

bool check(int l, int r) {
    if (l > r) return 1;
    for (int i = l, j = r; i <= j; i++, j--) {
        if (lst[i] < l && nxt[i] > r) return check(l, i - 1) && check(i + 1, r);
        if (lst[j] < l && nxt[j] > r) return check(l, j - 1) && check(j + 1, r);
    }
    return 0;
}

In conclusion, it may be worth the time to consider seemingly brute force solutions to some problems, as long as there is a merging or unmerging process that can happen proportional to the size of the smaller set, capitalizing on the small-to-large technique when it seems like the last thing one could do.