|
| 1 | +# Z-algorithm |
| 2 | + |
| 3 | +The Z-algorithm finds occurrences of a "word" `W` |
| 4 | +within a main "text string" `T` in linear time. |
| 5 | + |
| 6 | +Given a string `S` of length `n`, the algorithm produces |
| 7 | +an array, `Z` where `Z[i]` represents the ongest substring |
| 8 | +starting from `S[i]` which is also a prefix of `S`. Finding |
| 9 | +`Z` for the string obtained by concatenating the word, `W` |
| 10 | +with a nonce character, say `$` followed by the text, `T`, |
| 11 | +helps with pattern matching, for if there is some index `i` |
| 12 | +such that `Z[i]` equals the pattern length, then the pattern |
| 13 | +must be present at that point. |
| 14 | + |
| 15 | +While the `Z` array can be computed with two nested loops, the |
| 16 | +following strategy shows how to obtain it in linear time, based |
| 17 | +on the idea that as we iterate over the letters in the string |
| 18 | +(index `i` from `1` to `n - 1`), we maintain an interval `[L, R]` |
| 19 | +which is the interval with maximum `R` such that `1 ≤ L ≤ i ≤ R` |
| 20 | +and `S[L...R]` is a prefix that is also a substring (if no such |
| 21 | +interval exists, just let `L = R = - 1`). For `i = 1`, we can |
| 22 | +simply compute `L` and `R` by comparing `S[0...]` to `S[1...]`. |
| 23 | + |
| 24 | +## Complexity |
| 25 | + |
| 26 | +- **Time:** `O(|W| + |T|)` |
| 27 | +- **Space:** `O(|W|)` |
0 commit comments