|
3 | 3 | - Original |
4 | 4 | title: MEX (minimal excluded) of a sequence |
5 | 5 | --- |
6 | | -# MEX of a sequence in O(N) |
| 6 | +# MEX (minimal excluded) of a sequence |
7 | 7 |
|
8 | | -Given an array $A$ of size $N$. You have to find the minimal element that is not present in the array and is equal to or greater than $0$. See [Wikipedia](https://en.wikipedia.org/wiki/Mex_(mathematics)) for more information. |
| 8 | +Given an array $A$ of size $N$. You have to find the minimal non-negative element that is not present in the array. That number is commonly called the **MEX** (minimal excluded). |
9 | 9 |
|
10 | | -First we have to create a set of all elements in the array to quickly check for element presence in $O(1)$. Then check all numbers from $0$ to $N$. If the current number is not present in the set, return it. If all numbers are present in the set, return $N$, because all elements from $0$ to $N-1$ are present. |
| 10 | +$$ |
| 11 | +\begin{align} |
| 12 | +\text{mex}(\{0, 1, 2, 4, 5\}) &= 3 \\ |
| 13 | +\text{mex}(\{0, 1, 2, 3, 4\}) &= 5 \\ |
| 14 | +\text{mex}(\{1, 2, 3, 4, 5\}) &= 0 \\ |
| 15 | +\end{align} |
| 16 | +$$ |
11 | 17 |
|
12 | | -*NB*: This approach is fast, but works well only if your array doesn't change during program execution, e.g. it is not effective in problems with changing the initial array. If your array is changed by any type of array queries, use [O(N log N) approach](https://codeforces.com/blog/entry/81287?#comment-677837) instead. |
| 18 | +Notice, that the MEX of an array of size $N$ can never be bigger than $N$ itself. |
13 | 19 |
|
14 | | -## Implementation (C++): |
| 20 | +The easiest approach is to create a set of all elements in the array $A$, so that we can quickly check if a number is part of the array or not. |
| 21 | +Then we can check all numbers from $0$ to $N$, if the current number is not present in the set, return it. |
15 | 22 |
|
16 | | -```cpp |
17 | | -int mex(vector<int> a) { |
18 | | - set<int> b(a.begin(), a.end()); |
19 | | - for (int i=0; ; ++i) |
20 | | - if (!b.count(i)) |
21 | | - return i; |
| 23 | +## Implementation |
| 24 | + |
| 25 | +The following algorithm runs in $O(N \log N)$ time. |
| 26 | + |
| 27 | +```{.cpp file=mex_simple} |
| 28 | +int mex(vector<int> const& A) { |
| 29 | + set<int> b(A.begin(), A.end()); |
| 30 | + |
| 31 | + int result = 0; |
| 32 | + while (b.count(result)) |
| 33 | + ++result; |
| 34 | + return result; |
22 | 35 | } |
23 | 36 | ``` |
24 | 37 |
|
25 | | -If an algorithm requires fast $O(N)$ MEX computation, it is possible by computing an boolean vector of existing elements and then checking the first non-present one: |
| 38 | +If an algorithm requires a $O(N)$ MEX computation, it is possible by using a boolean vector instead of a set. |
| 39 | +Notice, that the array needs to be as big as the biggest possible array size. |
| 40 | +
|
26 | 41 |
|
27 | | -```cpp |
28 | | -int mex (const vector<int> & a) { |
29 | | - static bool used[D+1] = { 0 }; |
30 | | - int c = (int) a.size(); |
| 42 | +```{.cpp file=mex_linear} |
| 43 | +int mex(vector<int> const& A) { |
| 44 | + static bool used[MAX_N+1] = { 0 }; |
31 | 45 |
|
32 | | - for (int i=0; i<c; ++i) |
33 | | - if (a[i] <= D) |
34 | | - used[a[i]] = true; |
| 46 | + // mark the given numbers |
| 47 | + for (int x : A) { |
| 48 | + if (x <= MAX_N) |
| 49 | + used[x] = true; |
| 50 | + } |
35 | 51 |
|
36 | | - int result; |
37 | | - for (int i=0; ; ++i) |
38 | | - if (!used[i]) { |
39 | | - result = i; |
40 | | - break; |
41 | | - } |
| 52 | + // find the mex |
| 53 | + int result = 0; |
| 54 | + while (used[result]) |
| 55 | + ++result; |
42 | 56 | |
43 | | - for (int i=0; i<c; ++i) |
44 | | - if (a[i] <= D) |
45 | | - used[a[i]] = false; |
| 57 | + // clear the array again |
| 58 | + for (int x : A) { |
| 59 | + if (x <= MAX_N) |
| 60 | + used[x] = false; |
| 61 | + } |
46 | 62 |
|
47 | | - return result; |
| 63 | + return result; |
48 | 64 | } |
49 | 65 | ``` |
50 | 66 |
|
| 67 | +This approach is fast, but only works well if you have to compute the MEX once. |
| 68 | +If you need to compute the MEX over and over, e.g. because your array keeps changing, then it is not effective. |
| 69 | +For that, we need something better. |
| 70 | + |
| 71 | +## MEX with array updates |
| 72 | + |
| 73 | +In the problem you need to change individual numbers in the array, and compute the new MEX of the array after each such update. |
| 74 | + |
| 75 | +There is a need for a better data structure that handles such queries efficiently. |
| 76 | + |
| 77 | +One approach would be take the frequency of each number from $0$ to $N$, and build a tree-like data structure over it. |
| 78 | +E.g. a segment tree or a treap. |
| 79 | +Each node represents a range of numbers, and together to total frequency in the range, you additionally store the amount of distinct numbers in that range. |
| 80 | +It's possible to update this data structure in $O(\log N)$ time, and also find the MEX in $O(\log N)$ time, by doing a binary search for the MEX. |
| 81 | +If the node representing the range $[0, \lfloor N/2 \rfloor)$ doesn't contain $\lfloor N/2 \rfloor$ many distinct numbers, then one is missing and the MEX is smaller than $\lfloor N/2 \rfloor$, and you can recurse in the left branch of the tree. Otherwise it is at least $\lfloor N/2 \rfloor$, and you can recurse in the right branch of the tree. |
| 82 | + |
| 83 | +It's also possible to use the standard library data structures `map` and `set` (based on an approach explained [here](https://codeforces.com/blog/entry/81287?#comment-677837)). |
| 84 | +With a `map` we will remember the frequency of each number, and with the `set` we represent the numbers that are currently missing from the array. |
| 85 | +Since a `set` is ordered, `*set.begin()` will be the MEX. |
| 86 | +In total we need $O(N \log N)$ precomputation, and afterwards the MEX can be computed in $O(1)$ and an update can be performed in $O(\log N)$. |
| 87 | + |
| 88 | +```{.cpp file=mex_updates} |
| 89 | +class Mex { |
| 90 | +private: |
| 91 | + map<int, int> frequency; |
| 92 | + set<int> missing_numbers; |
| 93 | + vector<int> A; |
| 94 | + |
| 95 | +public: |
| 96 | + Mex(vector<int> const& A) : A(A) { |
| 97 | + for (int i = 0; i <= A.size(); i++) |
| 98 | + missing_numbers.insert(i); |
| 99 | + |
| 100 | + for (int x : A) { |
| 101 | + ++frequency[x]; |
| 102 | + missing_numbers.erase(x); |
| 103 | + } |
| 104 | + } |
| 105 | + |
| 106 | + int mex() { |
| 107 | + return *missing_numbers.begin(); |
| 108 | + } |
| 109 | + |
| 110 | + void update(int idx, int new_value) { |
| 111 | + if (--frequency[A[idx]] == 0) |
| 112 | + missing_numbers.insert(A[idx]); |
| 113 | + A[idx] = new_value; |
| 114 | + ++frequency[new_value]; |
| 115 | + missing_numbers.erase(new_value); |
| 116 | + } |
| 117 | +}; |
| 118 | +``` |
51 | 119 |
|
52 | 120 | ## Practice Problems |
53 | 121 |
|
54 | | -- [CODEFORCES: Replace by MEX](https://codeforces.com/contest/1375/problem/D) |
| 122 | +- [AtCoder: Neq Min](https://atcoder.jp/contests/hhkb2020/tasks/hhkb2020_c) |
| 123 | +- [Codeforces: Replace by MEX](https://codeforces.com/contest/1375/problem/D) |
| 124 | +- [Codeforces: Vitya and Strange Lesson](https://codeforces.com/problemset/problem/842/D) |
| 125 | +- [Codeforces: MEX Queries](https://codeforces.com/contest/817/problem/F) |
0 commit comments