Skip to content

Commit 2af17a1

Browse files
committed
improve MEX article
1 parent 8ccaca6 commit 2af17a1

File tree

2 files changed

+154
-30
lines changed

2 files changed

+154
-30
lines changed

src/sequences/mex.md

Lines changed: 101 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -3,52 +3,123 @@ tags:
33
- Original
44
title: MEX (minimal excluded) of a sequence
55
---
6-
# MEX of a sequence in O(N)
6+
# MEX (minimal excluded) of a sequence
77

8-
Given an array $A$ of size $N$. You have to find the minimal element that is not present in the array and is equal to or greater than $0$. See [Wikipedia](https://en.wikipedia.org/wiki/Mex_(mathematics)) for more information.
8+
Given an array $A$ of size $N$. You have to find the minimal non-negative element that is not present in the array. That number is commonly called the **MEX** (minimal excluded).
99

10-
First we have to create a set of all elements in the array to quickly check for element presence in $O(1)$. Then check all numbers from $0$ to $N$. If the current number is not present in the set, return it. If all numbers are present in the set, return $N$, because all elements from $0$ to $N-1$ are present.
10+
$$
11+
\begin{align}
12+
\text{mex}(\{0, 1, 2, 4, 5\}) &= 3 \\
13+
\text{mex}(\{0, 1, 2, 3, 4\}) &= 5 \\
14+
\text{mex}(\{1, 2, 3, 4, 5\}) &= 0 \\
15+
\end{align}
16+
$$
1117

12-
*NB*: This approach is fast, but works well only if your array doesn't change during program execution, e.g. it is not effective in problems with changing the initial array. If your array is changed by any type of array queries, use [O(N log N) approach](https://codeforces.com/blog/entry/81287?#comment-677837) instead.
18+
Notice, that the MEX of an array of size $N$ can never be bigger than $N$ itself.
1319

14-
## Implementation (C++):
20+
The easiest approach is to create a set of all elements in the array $A$, so that we can quickly check if a number is part of the array or not.
21+
Then we can check all numbers from $0$ to $N$, if the current number is not present in the set, return it.
1522

16-
```cpp
17-
int mex(vector<int> a) {
18-
set<int> b(a.begin(), a.end());
19-
for (int i=0; ; ++i)
20-
if (!b.count(i))
21-
return i;
23+
## Implementation
24+
25+
The following algorithm runs in $O(N \log N)$ time.
26+
27+
```{.cpp file=mex_simple}
28+
int mex(vector<int> const& A) {
29+
set<int> b(A.begin(), A.end());
30+
31+
int result = 0;
32+
while (b.count(result))
33+
++result;
34+
return result;
2235
}
2336
```
2437
25-
If an algorithm requires fast $O(N)$ MEX computation, it is possible by computing an boolean vector of existing elements and then checking the first non-present one:
38+
If an algorithm requires a $O(N)$ MEX computation, it is possible by using a boolean vector instead of a set.
39+
Notice, that the array needs to be as big as the biggest possible array size.
40+
2641
27-
```cpp
28-
int mex (const vector<int> & a) {
29-
static bool used[D+1] = { 0 };
30-
int c = (int) a.size();
42+
```{.cpp file=mex_linear}
43+
int mex(vector<int> const& A) {
44+
static bool used[MAX_N+1] = { 0 };
3145
32-
for (int i=0; i<c; ++i)
33-
if (a[i] <= D)
34-
used[a[i]] = true;
46+
// mark the given numbers
47+
for (int x : A) {
48+
if (x <= MAX_N)
49+
used[x] = true;
50+
}
3551
36-
int result;
37-
for (int i=0; ; ++i)
38-
if (!used[i]) {
39-
result = i;
40-
break;
41-
}
52+
// find the mex
53+
int result = 0;
54+
while (used[result])
55+
++result;
4256
43-
for (int i=0; i<c; ++i)
44-
if (a[i] <= D)
45-
used[a[i]] = false;
57+
// clear the array again
58+
for (int x : A) {
59+
if (x <= MAX_N)
60+
used[x] = false;
61+
}
4662
47-
return result;
63+
return result;
4864
}
4965
```
5066

67+
This approach is fast, but only works well if you have to compute the MEX once.
68+
If you need to compute the MEX over and over, e.g. because your array keeps changing, then it is not effective.
69+
For that, we need something better.
70+
71+
## MEX with array updates
72+
73+
In the problem you need to change individual numbers in the array, and compute the new MEX of the array after each such update.
74+
75+
There is a need for a better data structure that handles such queries efficiently.
76+
77+
One approach would be take the frequency of each number from $0$ to $N$, and build a tree-like data structure over it.
78+
E.g. a segment tree or a treap.
79+
Each node represents a range of numbers, and together to total frequency in the range, you additionally store the amount of distinct numbers in that range.
80+
It's possible to update this data structure in $O(\log N)$ time, and also find the MEX in $O(\log N)$ time, by doing a binary search for the MEX.
81+
If the node representing the range $[0, \lfloor N/2 \rfloor)$ doesn't contain $\lfloor N/2 \rfloor$ many distinct numbers, then one is missing and the MEX is smaller than $\lfloor N/2 \rfloor$, and you can recurse in the left branch of the tree. Otherwise it is at least $\lfloor N/2 \rfloor$, and you can recurse in the right branch of the tree.
82+
83+
It's also possible to use the standard library data structures `map` and `set` (based on an approach explained [here](https://codeforces.com/blog/entry/81287?#comment-677837)).
84+
With a `map` we will remember the frequency of each number, and with the `set` we represent the numbers that are currently missing from the array.
85+
Since a `set` is ordered, `*set.begin()` will be the MEX.
86+
In total we need $O(N \log N)$ precomputation, and afterwards the MEX can be computed in $O(1)$ and an update can be performed in $O(\log N)$.
87+
88+
```{.cpp file=mex_updates}
89+
class Mex {
90+
private:
91+
map<int, int> frequency;
92+
set<int> missing_numbers;
93+
vector<int> A;
94+
95+
public:
96+
Mex(vector<int> const& A) : A(A) {
97+
for (int i = 0; i <= A.size(); i++)
98+
missing_numbers.insert(i);
99+
100+
for (int x : A) {
101+
++frequency[x];
102+
missing_numbers.erase(x);
103+
}
104+
}
105+
106+
int mex() {
107+
return *missing_numbers.begin();
108+
}
109+
110+
void update(int idx, int new_value) {
111+
if (--frequency[A[idx]] == 0)
112+
missing_numbers.insert(A[idx]);
113+
A[idx] = new_value;
114+
++frequency[new_value];
115+
missing_numbers.erase(new_value);
116+
}
117+
};
118+
```
51119

52120
## Practice Problems
53121

54-
- [CODEFORCES: Replace by MEX](https://codeforces.com/contest/1375/problem/D)
122+
- [AtCoder: Neq Min](https://atcoder.jp/contests/hhkb2020/tasks/hhkb2020_c)
123+
- [Codeforces: Replace by MEX](https://codeforces.com/contest/1375/problem/D)
124+
- [Codeforces: Vitya and Strange Lesson](https://codeforces.com/problemset/problem/842/D)
125+
- [Codeforces: MEX Queries](https://codeforces.com/contest/817/problem/F)

test/test_mex.cpp

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
#include <cassert>
2+
#include <map>
3+
#include <set>
4+
#include <vector>
5+
using namespace std;
6+
7+
namespace MexSimple {
8+
#include "mex_simple.h"
9+
}
10+
11+
namespace MexLinear {
12+
constexpr int MAX_N = 100;
13+
#include "mex_linear.h"
14+
} // namespace MexLinear
15+
16+
namespace MexUpdate {
17+
#include "mex_updates.h"
18+
}
19+
20+
struct Example {
21+
vector<int> array;
22+
int mex;
23+
};
24+
25+
int main() {
26+
vector<Example> examples = {
27+
{{0, 1, 2}, 3},
28+
{{0, 1, 3}, 2},
29+
{{0, 2, 3}, 1},
30+
{{1, 2, 3}, 0},
31+
};
32+
33+
for (auto example : examples) {
34+
assert(MexSimple::mex(example.array) == example.mex);
35+
assert(MexLinear::mex(example.array) == example.mex);
36+
}
37+
38+
vector<int> A = {1, 2, 4, 1, 4};
39+
auto mex = MexUpdate::Mex(A);
40+
assert(mex.mex() == 0); // array = {1, 2, 4, 1, 4}
41+
mex.update(3, 0);
42+
assert(mex.mex() == 3); // array = {1, 2, 4, 0, 4}
43+
mex.update(2, 0);
44+
assert(mex.mex() == 3); // array = {1, 2, 0, 0, 4}
45+
mex.update(3, 3);
46+
assert(mex.mex() == 5); // array = {1, 2, 0, 3, 4}
47+
mex.update(4, 3);
48+
assert(mex.mex() == 4); // array = {1, 2, 0, 3, 3}
49+
mex.update(4, 9);
50+
assert(mex.mex() == 4); // array = {1, 2, 0, 9, 9}
51+
mex.update(3, 9);
52+
assert(mex.mex() == 3); // array = {1, 2, 0, 9, 9}
53+
}

0 commit comments

Comments
 (0)