Commit ea463fa

github-actions committed
Sync regex_inverter example from pyparsing
1 parent afcbdac commit ea463fa

3 files changed

Lines changed: 596 additions & 0 deletions

dest/README.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# Regex Inverter

This directory contains a web-based Regex Inverter tool powered by [PyScript](https://pyscript.net/) and `pyparsing`.

## Overview

The Regex Inverter allows you to enter a regular expression and generate all possible strings that match it. It is particularly useful for visualizing the expansion of character classes, repetitions, and alternatives.

### Key Features:
- **Expansion of Regex Patterns:** Generates matching strings for patterns like `[A-Z]{3}\d{3}`.
- **Client-Side Processing:** All computations happen in your browser using PyScript, so no data is sent to a server.
- **Progress Tracking:** Shows the total count of possible matches, even when there are more matches than the display limit.

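
The progress-tracking behaviour can be sketched in plain Python. This is an illustration only, not the tool's actual code: `preview_and_count` is a hypothetical helper, and the generator below merely stands in for the strings that match `[01]{3}`.

```python
import itertools


def preview_and_count(strings, max_results):
    """Return the first max_results items and the total number of items.

    The iterator is consumed exactly once: islice takes the preview,
    then the remaining items are counted without being stored.
    """
    iterator = iter(strings)
    shown = list(itertools.islice(iterator, max_results))
    remaining = sum(1 for _ in iterator)
    return shown, len(shown) + remaining


# Stand-in for the matches of [01]{3} (assumed ordering, for illustration)
matches = ("".join(bits) for bits in itertools.product("01", repeat=3))
shown, total = preview_and_count(matches, 5)
print(shown)  # ['000', '001', '010', '011', '100']
print(total)  # 8
```

Counting the remainder without storing it keeps memory use bounded by the display limit, even when the total number of matches is much larger.
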
### Supported Syntax:
- Character sets: `[a-z]`, `[0-9A-F]`, `[^0-9]`
- Repetitions: `{n}`, `{min,max}`, `{,max}`
- Alternatives: `apple|orange`
- Groups: `(abc|def)`
- Macros: `\d`, `\w`, `\s`, `\D`, `\W`, `\S`
- Dot: `.` (matches printable characters)

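
As a rough illustration of what expanding a character set with a bounded repetition means, the sketch below enumerates every string for a set of characters repeated between `min_count` and `max_count` times. `expand_repetition` is a hypothetical name and is not part of `inv_regex.py`; the real tool may produce results in a different order.

```python
import itertools


def expand_repetition(chars, min_count, max_count):
    """Yield every string made of min_count..max_count characters from chars."""
    for length in range(min_count, max_count + 1):
        for combo in itertools.product(chars, repeat=length):
            yield "".join(combo)


# [ab]{1,2}: two strings of length 1, then four of length 2
print(list(expand_repetition("ab", 1, 2)))
# ['a', 'b', 'aa', 'ab', 'ba', 'bb']

# I{,3}: an omitted minimum means zero, so the empty string is included
print(list(expand_repetition("I", 0, 3)))
# ['', 'I', 'II', 'III']
```
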
### Constraints:
- **Unbounded operators `+` and `*` are not supported.** Use an explicit bounded repetition such as `{1,10}` instead of `+`. Unbounded operators would produce an infinite result set, and even bounded repetitions can produce result sets large enough to crash the browser.

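
To get a feel for how quickly result sets grow, the number of strings for a single character set under `{min,max}` can be computed directly. This is a back-of-the-envelope sketch; `count_repetition` is a hypothetical helper, not a function from this project.

```python
def count_repetition(alphabet_size, min_count, max_count):
    """Number of strings of length min_count..max_count over the alphabet."""
    return sum(alphabet_size ** n for n in range(min_count, max_count + 1))


# [A-Z]{1,max} for a few upper bounds
for max_count in (1, 2, 4, 10):
    print(max_count, count_repetition(26, 1, max_count))
# 1 26
# 2 702
# 4 475254
# 10 146813779479510
```

Each additional repetition multiplies the count by roughly the size of the character set, so `[A-Z]{1,10}` already yields about 1.5 × 10^14 strings.
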
## Files

- `index.html`: The web interface and PyScript configuration.
- `inv_regex.py`: The core inversion logic using `pyparsing`.

## How to Run Locally

To run the Regex Inverter on your own machine:

1. Open a terminal or command prompt.
2. Navigate to this directory:
   ```bash
   cd examples/regex_inverter
   ```
3. Start a local Python web server:
   ```bash
   python -m http.server
   ```
4. Open your web browser and go to:
   [http://localhost:8000](http://localhost:8000)

## Deployment

To deploy this to a web server:

1. Upload both `index.html` and `inv_regex.py` to the same directory on your web server.
2. Ensure your server is configured to serve `.html` files (most are by default) and to serve `inv_regex.py` as a plain static file, since the page fetches it at load time.
3. Access the `index.html` file through its URL.

Since this is a static site (using PyScript to run Python in the browser), you can host it on GitHub Pages or any other static site hosting service.

dest/index.html

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>PyParsing Regex Inverter</title>
    <link rel="stylesheet" href="https://pyscript.net/releases/2025.11.1/core.css">
    <script type="module" src="https://pyscript.net/releases/2025.11.1/core.js"></script>
    <style>
        body { font-family: sans-serif; margin: 2em; }
        #regex-input { width: 300px; }
        #output { width: 100%; height: 400px; margin-top: 1em; font-family: monospace; white-space: pre; border: 1px solid #ccc; padding: 0.5em; overflow: auto; }
        .controls { margin-bottom: 1em; }
        table { border-collapse: collapse; width: 100%; }
        table th, table td { border: 1px solid #ddd; padding: 8px; text-align: left; }
        table th { background-color: #f2f2f2; }
    </style>
</head>
<body>
    <h1>Regex Inverter</h1>
    <h4>by Paul McGuire, January 2026</h4>
    <details>
        <summary>Description</summary>
        <div>
            <p>This page allows you to invert a regular expression, generating all strings that match it.</p>
            <p><strong>Instructions:</strong> Enter a regular expression in the "Regex" field and specify the maximum number of results you want to see (up to 100,000,000). Click "Invert" or press Enter to generate the matching strings.</p>
            <p><strong>Constraints:</strong></p>
            <ul>
                <li>Unbounded repetition operators <code>+</code> and <code>*</code> are <strong>not supported</strong>.</li>
                <li>Replace <code>+</code> or <code>*</code> with an explicit <code>{n}</code> or <code>{min,max}</code> repetition operator (e.g., use <code>[A-Z]{1,4}</code> instead of <code>[A-Z]+</code>, or <code>[A-Z]{,4}</code> instead of <code>[A-Z]*</code>).</li>
            </ul>
            <p><strong>Note:</strong> Complex regular expressions or those with large repetition counts may take some time to process.</p>
        </div>
    </details>
    <p>
    <details>
        <summary>Examples</summary>
        <div>
            <p>Here are some example regular expressions to try:</p>
            <table>
                <thead>
                    <tr><th>Description</th><th>Regex</th></tr>
                </thead>
                <tr>
                    <td>One uppercase letter, a hyphen, and three digits</td>
                    <td><pre>[A-Z]-\d{3}</pre></td>
                </tr>
                <tr>
                    <td>Time of day (HH:MM:SS)</td>
                    <td><pre>(2[0-3]|[01]\d):([0-5]\d):([0-5]\d)</pre></td>
                </tr>
                <tr>
                    <td>8-bit binary numbers</td>
                    <td><pre>[01]{8}</pre></td>
                </tr>
                <tr>
                    <td>Integer from 0 to 99</td>
                    <td><pre>[1-9]?\d</pre></td>
                </tr>
                <tr>
                    <td>Integer from 0 to 255</td>
                    <td><pre>25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d</pre></td>
                </tr>
                <tr>
                    <td>Roman numerals up to 50</td>
                    <td><pre>(X{,3}|XL)(I{,3}|IV|VI{,3}|IX)|L</pre></td>
                </tr>
                <tr>
                    <td>IPv4 addresses in 192.168.0.0/16</td>
                    <td><pre>192\.168(\.((25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d))){2}</pre></td>
                </tr>
                <tr>
                    <td>MAC address</td>
                    <td><pre>[0-9A-Fa-f]{2}([:-][0-9A-Fa-f]{2}){5}</pre></td>
                </tr>
                <tr>
                    <td>UUID</td>
                    <td><pre>[0-9A-F]{8}(-[0-9A-F]{4}){3}-[0-9A-F]{12}</pre></td>
                </tr>
                <tr>
                    <td>Original US area codes</td>
                    <td><pre>[2-9][10][2-9]</pre></td>
                </tr>
            </table>
        </div>
    </details>
    <p>Enter a regular expression to see its matching strings.</p>
    <div class="controls">
        <label for="regex-input">Regex:</label>
        <input type="text" id="regex-input" placeholder="e.g. [A-Z]-\d{3}" value="">
        <label for="max-results">Max results:</label>
        <input type="number" id="max-results" value="200" min="1" style="width: 60px;">
        <button id="invert-btn" py-click="do_invert">Invert</button>
        <button id="cancel-btn" py-click="cancel_invert" style="display: none;">Cancel</button>
    </div>
    <div id="status"></div>
    <textarea id="output" readonly></textarea>
    <p>GitHub repo for this page: <a href="https://github.com/ptmcg/regex_inverter.git" target="_blank">regex-inverter</a></p>
    <p>Powered by <a href="https://pyscript.net" target="_blank">PyScript</a> and <a href="https://pypi.org/project/pyparsing/" target="_blank">pyparsing</a></p>

    <py-config>
        packages = ["pyparsing"]
        [[fetch]]
        files = ["inv_regex.py"]
    </py-config>

    <script type="py">
        from pyscript import document
        from inv_regex import invert, count
        import pyparsing
        import itertools
        import asyncio

        is_cancelled = False

        def cancel_invert(event):
            global is_cancelled
            is_cancelled = True

        async def do_invert(event):
            global is_cancelled
            is_cancelled = False

            regex = document.querySelector("#regex-input").value.strip()
            if not regex:
                return

            try:
                max_results = int(document.querySelector("#max-results").value)
            except ValueError:
                max_results = 200
            if max_results > 100_000_000:
                max_results = 100_000_000
                document.querySelector("#max-results").value = 100_000_000

            output_area = document.querySelector("#output")
            status_div = document.querySelector("#status")
            cancel_btn = document.querySelector("#cancel-btn")

            output_area.value = ""
            status_div.innerText = "Processing..."

            # Use a small delay to allow status to update in the UI
            await asyncio.sleep(0.1)

            try:
                # Get up to max_results items using an iterator
                invert_iter = invert(regex)
                results = list(itertools.islice(invert_iter, max_results))
                num_shown = len(results)
                output_area.value = "\n".join(results)

                if num_shown == max_results:
                    cancel_btn.style.display = "inline"

                await asyncio.sleep(0.1)

                # Count the remaining items in the iterator
                remaining_count = 0
                for i, _ in enumerate(invert_iter, 1):
                    remaining_count = i
                    if i % 100_000 == 0:
                        status_div.innerText = f"Counting matches... {num_shown + i:,} found so far"
                        await asyncio.sleep(0)
                        if is_cancelled:
                            status_div.innerText += " (cancelled)"
                            break

                total_count = num_shown + remaining_count

                if not is_cancelled:
                    status_div.innerText = f"Total matching strings: {total_count:,}"
                    if total_count > max_results:
                        status_div.innerText += f" (showing first {max_results:,})"

            except pyparsing.ParseBaseException as pe:
                status_div.innerText = "Error"
                output_area.value = f"Parse Error: {pe.msg}\n{pe.explain(depth=0)}"
            except Exception as e:
                status_div.innerText = "Error"
                output_area.value = f"Error: {str(e)}"
            finally:
                cancel_btn.style.display = "none"

        # Add event listener for Enter key in input box
        def on_keypress(event):
            if event.key == "Enter":
                asyncio.create_task(do_invert(None))

        document.querySelector("#regex-input").onkeypress = on_keypress
        document.querySelector("#max-results").onkeypress = on_keypress
    </script>
</body>
</html>
