Commit dac6ccd

add greedy quantifiers
1 parent 679998a commit dac6ccd

2 files changed

Lines changed: 27 additions & 0 deletions

python 2/koans/about_regex.py

Lines changed: 25 additions & 0 deletions
@@ -365,3 +365,28 @@ def using_intervals_at_least(self):
         #tip: You must match: several digits + colon + whitespace + $ + at least 3 digits + . + 2 digits (decimals)
         self.assertEquals(len(re.findall("__", string)),4, "Search all orders valued at 100$ or more.")
 
+    def using_intervals_preventing_over_mathing(self):
+        """
+        Lesson 4
+
+        Consider this example. The text that follows is part of a web page. The regular expression needs to match
+        any text within <B> tags.
+        text: This offer is not available to customers living in <B>AK</B> and <B>HI</B>
+        regex: <[Bb]>.*</[Bb]>
+        Result: <B>AK</B> and <B>HI</B>
+        Instead of two matches, only one was found. The .* matched everything after the first <B> up to the last
+        </B>, so that the text AK</B> and <B>HI was matched.
+        The reason for this is that metacharacters such as * and + are greedy. They look for the greatest
+        possible match as opposed to the smallest.
+        The solution is to use lazy versions of these quantifiers (they are referred to as being lazy because
+        they match the fewest characters instead of the most).
+
+        Lazy quantifiers are defined by appending a ?:
+        *?
+        +?
+        {n,}?
+        """
+        string = "This offer is not available to customers living in <B>AK</B> and <B>HI</B>"
+
+        self.assertEquals(len(re.findall(__, string)),2, "The regular expression needs to match any text within <B> tags.")
+
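
The behaviour described in the Lesson 4 docstring can be checked outside the koan with a short sketch like the one below. It uses only the standard `re` module and the string from the diff; the variable names `greedy` and `lazy` are illustrative and not part of the koan file.

```python
import re

# Same sample text as the koan.
string = "This offer is not available to customers living in <B>AK</B> and <B>HI</B>"

# Greedy: .* extends to the last </B>, so both tag pairs collapse into one match.
greedy = re.findall(r"<[Bb]>.*</[Bb]>", string)

# Lazy: .*? stops at the first </B> that lets the pattern succeed,
# so each tag pair is matched separately.
lazy = re.findall(r"<[Bb]>.*?</[Bb]>", string)

print(greedy)  # ['<B>AK</B> and <B>HI</B>']
print(lazy)    # ['<B>AK</B>', '<B>HI</B>']
```

The lazy list has two elements, which is the count the new assertion expects.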

python 2/koans/regex_solutions.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,3 +41,5 @@ using_intervals_range_intervals:
 \d{1,2}[-\/]\d{1,2}[-\/]\d{2,4}
 using_intervals_at_least:
 \d+: \$\d{3,}\.\d{2}
+using_intervals_preventing_over_mathing:
+<[Bb]>.*?</[Bb]>
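
The solution above relies on `*?`. The other lazy forms, `+?` and `{n,}?`, follow the same rule: each matches as few characters as its lower bound allows. A minimal sketch, assuming a made-up digit string and a hypothetical helper `first_match` (neither appears in the commit):

```python
import re

sample = "12345"  # hypothetical input, not from the koans


def first_match(pattern, text):
    """Return the text matched by pattern at the start of text."""
    return re.match(pattern, text).group()


results = {
    "*": first_match(r"\d*", sample),          # greedy: all five digits
    "*?": first_match(r"\d*?", sample),        # lazy: zero digits (empty string)
    "+": first_match(r"\d+", sample),          # greedy: all five digits
    "+?": first_match(r"\d+?", sample),        # lazy: exactly one digit
    "{2,}": first_match(r"\d{2,}", sample),    # greedy: all five digits
    "{2,}?": first_match(r"\d{2,}?", sample),  # lazy: exactly two digits
}

for quantifier, matched in results.items():
    print(quantifier, repr(matched))
```

With nothing after the quantifier to force a longer match, each lazy form stops at its minimum count. In the koan's pattern, the trailing `</[Bb]>` is what makes `.*?` extend just far enough to reach the first closing tag.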
