Merged
14 changes: 7 additions & 7 deletions 5-regular-expressions/09-regexp-groups/1-find-webcolor-3-or-6/solution.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
A regexp to search 3-digit color `#abc`: `pattern:/#[a-f0-9]{3}/i`.
查找三位颜色 `#abc` 的正则表达式为:`pattern:/#[a-f0-9]{3}/i`

We can add exactly 3 more optional hex digits. We don't need more or less. Either we have them or we don't.
我们可以添加额外三位 16 进制数,不多也不少。这三位可能有,也可能没有。

The simplest way to add them is to append them to the regexp: `pattern:/#[a-f0-9]{3}([a-f0-9]{3})?/i`
最简单的方式 —— 直接附加上去:`pattern:/#[a-f0-9]{3}([a-f0-9]{3})?/i`
最简单的方式 —— 在正则表达式后附加上去:


We can do it in a smarter way though: `pattern:/#([a-f0-9]{3}){1,2}/i`.
但是,还有一种更讨巧的方法:`pattern:/#([a-f0-9]{3}){1,2}/i`
Suggested wording: 我们可以用一种更加妙的方法: ("We can do it in a cleverer way:")
Candidates for "smarter": 妙 (clever) / 机智 (quick-witted) / 有智慧 (wise)

Contributor Author

『讨巧』 would work here too, wouldn't it?

Look it up in the dictionary yourself: http://www.iciba.com/smarter

Contributor Author

In Chinese, 『讨巧』 means "to take a shortcut, gaining an advantage without much effort", and I don't see anything wrong with using it here. A translation is meant to be read by people; if we have to follow the dictionary to the letter, why not just use machine translation?

In the word 讨巧, 讨 is a verb that means to crave or to beg for something, so I think the superior/subordinate relationship it implies is already wrong here.

@Moonliujk, please help proofread.

Member

No problem here; it's just a difference in translation style.

@calpa I looked it up: 讨巧 does mean "to gain an advantage without much effort". As for the "crave" or "beg" sense you mention, I haven't found a source for it. So I think the translation here is fine.


Here the regexp `pattern:[a-f0-9]{3}` is in parentheses to apply the quantifier `pattern:{1,2}` to it as a whole.
这里我们把正则 `pattern:[a-f0-9]{3}` 放置在括号内,并且应用量词 `pattern:{1,2}`

In action:
实际操作:

```js run
let reg = /#([a-f0-9]{3}){1,2}/gi;
let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
alert( str.match(reg) ); // #3f3 #AA00ef #abc
```

There's a minor problem here: the pattern found `match:#abc` in `subject:#abcd`. To prevent that we can add `pattern:\b` to the end:
不过这里有个小问题:这个模式会在 `subject:#abcd` 中找到 `match:#abc`。为了避免这种情况,我们可以在最后加上 `pattern:\b`

```js run
let reg = /#([a-f0-9]{3}){1,2}\b/gi;
// …
```
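As a quick check of the word-boundary fix, the sketch below runs both patterns on the string from the task. The variable names (`sample`, `withoutBoundary`, `withBoundary`) are illustrative and not part of the original solution.

```js
// Illustrative names; the string is the one used in the task.
const sample = "color: #3f3; background-color: #AA00ef; and: #abcd";

// Without \b the engine is satisfied with the first three digits of #abcd.
const withoutBoundary = sample.match(/#([a-f0-9]{3}){1,2}/gi);

// With \b the match must end at a word boundary, so #abcd is rejected:
// there is no boundary between "c" and "d".
const withBoundary = sample.match(/#([a-f0-9]{3}){1,2}\b/gi);

console.log(withoutBoundary); // [ '#3f3', '#AA00ef', '#abc' ]
console.log(withBoundary);    // [ '#3f3', '#AA00ef' ]
```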
8 changes: 4 additions & 4 deletions 5-regular-expressions/09-regexp-groups/1-find-webcolor-3-or-6/task.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# Find color in the format #abc or #abcdef
# 查找颜色,格式为 #abc 或 #abcdef

Write a regexp that matches colors in the format `#abc` or `#abcdef`. That is: `#` followed by 3 or 6 hexadecimal digits.
编写一个正则来匹配 `#abc` 或 `#abcdef` 格式的颜色。即:`#` 后接三位或六位 16 进制数。

Usage example:
使用案例:
```js
let reg = /your regexp/g;

let str = "color: #3f3; background-color: #AA00ef; and: #abcd";
alert( str.match(reg) ); // #3f3 #AA00ef
```

P.S. Should be exactly 3 or 6 hex digits: values like `#abcd` should not match.
注:必须为三位或六位,`#abcd` 这种不应该被匹配。
8 changes: 4 additions & 4 deletions 5-regular-expressions/09-regexp-groups/3-find-decimal-positive-numbers/solution.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@

An integer number is `pattern:\d+`.
`pattern:\d+` 可以匹配一个整数。

A decimal part is: `pattern:\.\d+`.
`pattern:\.\d+` 可以匹配小数部分。

Because the decimal part is optional, let's put it in parentheses with quantifier `pattern:'?'`.
因为小数部分不一定存在,所以我们将其放入捕获括号内,搭配量词 `pattern:'?'`

Finally we have the regexp: `pattern:\d+(\.\d+)?`:
最终我们得到这样一个正则表达式:`pattern:\d+(\.\d+)?`

```js run
let reg = /\d+(\.\d+)?/g;
// …
```
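Since the diff view collapses the rest of this block, here is a minimal sketch of the pattern in use. The test string and the names `positiveNumber` and `positiveFound` are assumptions chosen for illustration.

```js
// Integer part, then an optional group for the decimal part.
const positiveNumber = /\d+(\.\d+)?/g;

// "12." has a dot but no digits after it, so only "12" is matched;
// the trailing dot after "123.4" is likewise left out.
const positiveFound = "1.5 0 12. 123.4.".match(positiveNumber);

console.log(positiveFound); // [ '1.5', '0', '12', '123.4' ]
```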
6 changes: 3 additions & 3 deletions 5-regular-expressions/09-regexp-groups/3-find-decimal-positive-numbers/task.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# Find positive numbers
# 查找正数

Create a regexp that looks for positive numbers, including those without a decimal point.
编写一个能够匹配正数的正则,包括没有小数点的数。

An example of use:
使用案例:
```js
let reg = /your regexp/g;

// …
```
4 changes: 2 additions & 2 deletions 5-regular-expressions/09-regexp-groups/4-find-decimal-numbers/solution.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
A positive number with an optional decimal part is (per previous task): `pattern:\d+(\.\d+)?`.
回顾上个问题,`pattern:\d+(\.\d+)?` 可以匹配一个具有可选择小数部分的正数。

Let's add an optional `-` in the beginning:
那么我们只需要在最前面加上一个可选的负号 `-` 即可:

```js run
let reg = /-?\d+(\.\d+)?/g;
// …
```
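A short sketch of the signed pattern, again with an assumed test string and illustrative names (`signedNumber`, `signedFound`, `subtraction`):

```js
// Optional minus sign, integer part, optional decimal part.
const signedNumber = /-?\d+(\.\d+)?/g;

const signedFound = "-1.5 0 2 -123.4.".match(signedNumber);
console.log(signedFound); // [ '-1.5', '0', '2', '-123.4' ]

// The pattern cannot tell a sign from a subtraction operator.
const subtraction = "5-3".match(signedNumber);
console.log(subtraction); // [ '5', '-3' ]
```

The second call shows a limitation worth keeping in mind: in `5-3` the minus is read as the sign of `3`.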
6 changes: 3 additions & 3 deletions 5-regular-expressions/09-regexp-groups/4-find-decimal-numbers/task.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# Find all numbers
# 查找所有数字

Write a regexp that looks for all decimal numbers: integers, numbers with a floating point, and negative ones.
编写一条正则表达式来查找所有的数字,包括整数、浮点数和负数。

An example of use:
例如:

```js
let reg = /your regexp/g;
// …
```
34 changes: 17 additions & 17 deletions 5-regular-expressions/09-regexp-groups/5-parse-expression/solution.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,37 +1,37 @@
A regexp for a number is: `pattern:-?\d+(\.\d+)?`. We created it in previous tasks.
回顾之前的问题,我们用 `pattern:-?\d+(\.\d+)?` 来匹配数字。

An operator is `pattern:[-+*/]`. We put the dash `pattern:-` first, because in the middle it would denote a character range, which we don't need.
`pattern:[-+*/]` 匹配运算符。我们把 `pattern:-` 放在最前面,因为如果放在中间的话,则表示字符范围,这并不是我们想要的。

Note that a slash should be escaped inside a JavaScript regexp `pattern:/.../`.
注意,在 JavaScript 中,`pattern:/.../` 中的 `/` 需要被转义。
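To see why the position of the dash matters, the sketch below compares the operator class with a variant where the dash sits between two other characters. The names `operatorClass` and `accidentalRange` are illustrative.

```js
// Dash first: it is a literal "-". The slash is escaped for the /.../ literal.
const operatorClass = /[-+*\/]/;

// Dash in the middle: a range from "*" (0x2A) to "/" (0x2F),
// which also covers "+", ",", "-" and ".".
const accidentalRange = /[*-\/]/;

console.log(operatorClass.test("-"));   // true
console.log(operatorClass.test(","));   // false
console.log(accidentalRange.test(",")); // true, the comma falls inside the range
```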

We need a number, an operator, and then another number. And optional spaces between them.
我们需要匹配一个数字、一个运算符,还有另一个数字。除此以外,还有它们之间可能存在的空格。

The full regular expression: `pattern:-?\d+(\.\d+)?\s*[-+*/]\s*-?\d+(\.\d+)?`.
完整的正则表达式为:`pattern:-?\d+(\.\d+)?\s*[-+*/]\s*-?\d+(\.\d+)?`

To get a result as an array let's put parentheses around the data that we need: numbers and the operator: `pattern:(-?\d+(\.\d+)?)\s*([-+*/])\s*(-?\d+(\.\d+)?)`.
为了将得到的结果转化为数组,我们须将所需的数据:数字及运算符,包裹在括号中,对应的表达式为:`pattern:(-?\d+(\.\d+)?)\s*([-+*/])\s*(-?\d+(\.\d+)?)`

In action:
实际操作:

```js run
let reg = /(-?\d+(\.\d+)?)\s*([-+*\/])\s*(-?\d+(\.\d+)?)/;

alert( "1.2 + 12".match(reg) );
```

The result includes:
结果包括:

- `result[0] == "1.2 + 12"` (full match)
- `result[1] == "1.2"` (first parentheses -- the first number, including its decimal part)
- `result[2] == ".2"` (second parentheses -- the decimal part `(\.\d+)?`)
- `result[3] == "+"` (third parentheses -- the operator)
- `result[4] == "12"` (fourth parentheses -- the second number)
- `result[5] == undefined` (the last decimal part is absent, so it's undefined)
- `result[0] == "1.2 + 12"`(完整匹配)
- `result[1] == "1.2"`(第一个捕获组)
- `result[2] == ".2"`(第二个捕获组 —— 小数部分)
- `result[3] == "+"`(...)
- `result[4] == "12"`(...)
- `result[5] == undefined`(最后一个小数部分不存在,因此为 undefined)
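The group contents listed above can be checked directly. This is a small sketch that uses `console.log` instead of `alert`; `expressionPattern` and `groups` are illustrative names.

```js
const expressionPattern = /(-?\d+(\.\d+)?)\s*([-+*\/])\s*(-?\d+(\.\d+)?)/;
const groups = "1.2 + 12".match(expressionPattern);

console.log(groups.length); // 6: the full match plus five groups
console.log(groups[0]);     // 1.2 + 12
console.log(groups[1]);     // 1.2
console.log(groups[2]);     // .2
console.log(groups[3]);     // +
console.log(groups[4]);     // 12
console.log(groups[5]);     // undefined
```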

We need only numbers and the operator. We don't need decimal parts.
我们只需要数字和运算符,不需要小数部分。

So let's remove extra groups from capturing by adding `pattern:?:`, for instance: `pattern:(?:\.\d+)?`.
因此,我们可以加上 `pattern:?:` 来去除多余的捕获组,例如:`pattern:(?:\.\d+)?`

The final solution:
最终答案:

```js run
function parse(expr) {
  // …
```
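The body of `parse` is collapsed in the diff view, so the following is only a sketch of how such a function could look with non-capturing groups, not the author's solution. The name `parseExpression` and the choice to return `null` when nothing matches are assumptions.

```js
// Hypothetical helper; the collapsed solution may differ.
function parseExpression(expr) {
  // (?:...) keeps the decimal parts out of the result array.
  const pattern = /(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)/;
  const result = expr.match(pattern);

  // Assumption: report "no expression found" with null.
  if (!result) return null;

  // Skip result[0] (the full match) and keep the three groups.
  const [, first, operator, second] = result;
  return [first, operator, second];
}

console.log(parseExpression("-1.23 * 3.45")); // [ '-1.23', '*', '3.45' ]
console.log(parseExpression(" 1 - -2 "));     // [ '1', '-', '-2' ]
console.log(parseExpression("no numbers"));   // null
```

Because the pattern is not anchored, spaces at the beginning and end are simply ignored by the search.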
18 changes: 9 additions & 9 deletions 5-regular-expressions/09-regexp-groups/5-parse-expression/task.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,23 +1,23 @@
# Parse an expression
# 解析算数表达式

An arithmetical expression consists of 2 numbers and an operator between them, for instance:
一条算数表达式包括两个数字及其中间的一个运算符。例如:

- `1 + 2`
- `1.2 * 3.4`
- `-3 / -6`
- `-2 - 2`

The operator is one of: `"+"`, `"-"`, `"*"` or `"/"`.
运算符可能为:`"+"`、`"-"`、`"*"` 或 `"/"`

There may be extra spaces at the beginning, at the end or between the parts.
开头、结尾和中间可能存在额外的空格。

Create a function `parse(expr)` that takes an expression and returns an array of 3 items:
编写一个函数 `parse(expr)`。它接收一个表达式作为参数,并且返回一个包含以下三个值的数组:

1. The first number.
2. The operator.
3. The second number.
1. 第一个数。
2. 运算符。
3. 第二个数。

For example:
例如:

```js
let [a, op, b] = parse("1.2 * 3.4");
// …
```
107 changes: 53 additions & 54 deletions 5-regular-expressions/09-regexp-groups/article.md
100644 → 100755
Original file line number Diff line number Diff line change
@@ -1,46 +1,46 @@
# Capturing groups
# 捕获组

A part of the pattern can be enclosed in parentheses `pattern:(...)`. That's called a "capturing group".
正则模式的一部分可以用括号括起来 `pattern:(...)`,由此构成一个『捕获组』。

That has two effects:
这有两个作用:

1. It allows placing a part of the match into a separate array item when using the [String#match](mdn:js/String/match) or [RegExp#exec](mdn:/RegExp/exec) methods.
2. If we put a quantifier after the parentheses, it applies to the parentheses as a whole, not the last character.
1. 当使用 [String#match](mdn:js/String/match) 或 [RegExp#exec](mdn:/RegExp/exec) 方法时,它允许你把匹配到的部分放到一个独立的数组项里面。
2. 如果我们在括号之后加上量词,那么它会应用到这个整体,而非最后一个字符。

## Example
## 例子

In the example below the pattern `pattern:(go)+` finds one or more `match:'go'`:
以下例子中的模式 `pattern:(go)+` 将会查找一个或多个 `match:'go'`

```js run
alert( 'Gogogo now!'.match(/(go)+/i) ); // "Gogogo"
```

Without parentheses, the pattern `pattern:/go+/` means `subject:g`, followed by `subject:o` repeated one or more times. For instance, `match:goooo` or `match:gooooooooo`.
如果没有括号,模式 `pattern:/go+/` 则表示 `subject:g` 之后跟上一个或多个 `subject:o`。比如:`match:goooo` 或者 `match:gooooooooo`

Parentheses group the word `pattern:(go)` together.
捕获括号将 `pattern:(go)` 划为了一组。
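The difference between the two patterns can be seen side by side in a short sketch. The names `grouped` and `ungrouped` are illustrative.

```js
// With parentheses the quantifier repeats the whole word "go".
const grouped = 'Gogogo now!'.match(/(go)+/i);

// Without them the quantifier applies only to the letter "o".
const ungrouped = 'Gogogo now!'.match(/go+/i);

console.log(grouped[0]);              // Gogogo
console.log(ungrouped[0]);            // Go
console.log('goooo'.match(/go+/)[0]); // goooo
```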

Let's make something more complex -- a regexp to match an email.
让我们尝试一个更复杂的例子 —— 一个匹配 email 地址的正则表达式。

Examples of emails:
例如:

```
my@mail.com
john.smith@site.com.uk
```

The pattern: `pattern:[-.\w]+@([\w-]+\.)+[\w-]{2,20}`.
正则为:`pattern:[-.\w]+@([\w-]+\.)+[\w-]{2,20}`

- The first part before `@` may include word characters, a dot and a dash `pattern:[-.\w]+`, like `match:john.smith`.
- Then `pattern:@`
- And then the domain. It may be a second-level domain like `site.com` or include subdomains, like `host.site.com.uk`. We can match it as "a word followed by a dot" repeated one or more times for subdomains: `match:mail.` or `match:site.com.`, and then "a word" for the last part: `match:.com` or `match:.uk`.
- `@` 之前的第一部分 `pattern:[-.\w]+` 可以包括单字字符、点号和中划线,比如 `match:john.smith`
- 接着是 `pattern:@`
- 然后是域名。可能是个二级域名 `site.com` 或者包括子域名 `host.site.com.uk`。我们可以通过『单词之后接一个点号』并且重复至少一次来匹配子域名 `match:mail.` 或者 `match:site.com.`,再然后是一个单词用来表示最后一部分 `match:.com` 或者 `match:.uk`

The word followed by a dot is `pattern:(\w+\.)+` (repeated). The last word should not have a dot at the end, so it's just `\w{2,20}`. The quantifier `pattern:{2,20}` limits the length, because domain zones are like `.uk` or `.com` or `.museum`, but can't be longer than 20 characters.
`pattern:(\w+\.)+` 用于表示单词后接一个点号(可重复)。最后一个单词不应该以点号结尾,因此它就是 `\w{2,20}`。量词 `pattern:{2,20}` 限制了长度,因为顶级域名可能为 `.uk`、`.com`、`.museum` 等等,但是其长度不能超过 20 个字符。

So the domain pattern is `pattern:(\w+\.)+\w{2,20}`. Now we replace `\w` with `[\w-]`, because dashes are also allowed in domains, and we get the final result.
因此域名部分的匹配模式为 `pattern:(\w+\.)+\w{2,20}`。现在我们可以用 `[\w-]` 替换 `\w`,因为域名也可以包含中划线`-`。由此我们得到了最终的结果。
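The domain part can be tried in isolation. In this sketch the `^` and `$` anchors are an addition for testing whole strings, and the sample domains and the name `domainPattern` are assumptions.

```js
// One or more "word plus dot" groups, then a final word of 2 to 20 characters.
// The anchors are added here only so that the whole string must be a domain.
const domainPattern = /^([\w-]+\.)+[\w-]{2,20}$/;

console.log(domainPattern.test("site.com"));         // true
console.log(domainPattern.test("host.site.com.uk")); // true
console.log(domainPattern.test("my-site.org"));      // true
console.log(domainPattern.test("localhost"));        // false, there is no dot
```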

That regexp is not perfect, but usually works. It's short and good enough to fix errors or occasional mistypes.
这条正则并不完美,但是通常来说它是有效的。它很简短,并且足以让你修正错误,以及时常出现的拼写问题。

For instance, here we can find all emails in the string:
举个例子,这里我们可以找到字符串中所有的 email 地址:

```js run
let reg = /[-.\w]+@([\w-]+\.)+[\w-]{2,20}/g;
alert("my@mail.com @ his@site.com.uk".match(reg)); // my@mail.com,his@site.com.uk
```


## Contents of parentheses
## 捕获内容

Parentheses are numbered from left to right. The search engine remembers the content of each and allows referencing it in the pattern or in the replacement string.
捕获括号会按从左往右的顺序标上序号。查找引擎会记住每个括号内的内容,并且允许你在模式以及替换字符串中引用它。

For instance, we can find an HTML tag using a (simplified) pattern `pattern:<.*?>`. Usually we'd want to do something with the result afterwards.
举例来说,我们可以使用一个(简化版)的模式 `pattern:<.*?>` 来查找一个 HTML 标签。一般来说,我们会希望对这个结果做些什么。

If we enclose the inner contents of `<...>` into parentheses, then we can access it like this:
如果我们把 `<...>` 里面的内容放到一对捕获括号里,那么我们通过这种方法来引用它:

```js run
let str = '<h1>Hello, world!</h1>';
let reg = /<(.*?)>/;

alert( str.match(reg) ); // Array: ["<h1>", "h1"]
```
[String#match](mdn:js/String/match) 只有在正则表达式中没有 `pattern:/.../g` 标记时才会返回组。

The call to [String#match](mdn:js/String/match) returns groups only if the regexp has no `pattern:/.../g` flag.
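The effect of the `g` flag on `String#match` can be seen in a short comparison. The names `html`, `firstWithGroups` and `allWithoutGroups` are illustrative.

```js
const html = '<h1>Hello, world!</h1>';

// No g flag: the first match together with its groups.
const firstWithGroups = html.match(/<(.*?)>/);

// With the g flag: every full match, but no groups.
const allWithoutGroups = html.match(/<(.*?)>/g);

console.log(Array.from(firstWithGroups)); // [ '<h1>', 'h1' ]
console.log(allWithoutGroups);            // [ '<h1>', '</h1>' ]
```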

If we need all matches with their groups then we can use [RegExp#exec](mdn:js/RegExp/exec) method as described in <info:regexp-methods>:
如果我们需要查找所有的匹配组,那么我们可以使用在 <info:regexp-methods> 中介绍过的 [RegExp#exec](mdn:js/RegExp/exec) 方法:

```js run
let str = '<h1>Hello, world!</h1>';

// two matches: opening <h1> and closing </h1> tags
// 两组匹配:起始标签 <h1>和闭合标签</h1>
let reg = /<(.*?)>/g;

let match;

while (match = reg.exec(str)) {
// first shows the match: <h1>,h1
// then shows the match: </h1>,/h1
// 第一次显示匹配:<h1>,h1
// 之后显示匹配:</h1>,/h1
alert(match);
}
```

Here we have two matches for `pattern:<(.*?)>`, each of them is an array with the full match and groups.
如此我们便得到了 `pattern:<(.*?)>` 的两个匹配项,他们中的每一个都包括完整的匹配和对应的捕获组。
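If only the group contents are needed, the same loop can collect them into an array. This is a sketch; `tagPattern`, `text`, `tagContents` and `found` are illustrative names.

```js
const tagPattern = /<(.*?)>/g;
const text = '<h1>Hello, world!</h1>';

const tagContents = [];
let found;

// exec returns null when there are no more matches, which ends the loop.
while ((found = tagPattern.exec(text)) !== null) {
  tagContents.push(found[1]); // the first group of each match
}

console.log(tagContents); // [ 'h1', '/h1' ]
```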

## Nested groups
## 嵌套捕获组

Parentheses can be nested. In this case the numbering also goes from left to right.
捕获括号是可以嵌套的。在这种情况下,依然是从左往右编号。

For instance, when searching a tag in `subject:<span class="my">` we may be interested in:
举个例子,当对标签 `subject:<span class="my">` 进行查找时,我们可能感兴趣的有:

1. The tag content as a whole: `match:span class="my"`.
2. The tag name: `match:span`.
3. The tag attributes: `match:class="my"`.
1. 标签整体的内容:`match:span class="my"`
2. 标签名:`match:span`
3. 标签的属性:`match:class="my"`

Let's add parentheses for them:
让我们为它们加上捕获括号:

```js run
let str = '<span class="my">';
// …
let result = str.match(reg);
alert(result); // <span class="my">, span class="my", span, class="my"
```

Here's how groups look:
它看起来像这样:

![](regexp-nested-groups.png)

At the zero index of the `result` is always the full match.
`result` 的首项永远是完整的匹配结果。

Then groups, numbered from left to right. Whichever opens first gives the first group `result[1]`. Here it encloses the whole tag content.
之后就是各个捕获组,从左往右依次排开。第一个左括号将会匹配到第一个捕获组 `result[1]`。在这个例子中,它涵盖了整个标签的内容。

Then `result[2]` holds the group from the second opening `pattern:(` to its corresponding `pattern:)` -- the tag name. We don't group the spaces, but we do group the attributes, which go to `result[3]`.
`result[2]` 对应第二个左括号 `pattern:(` 到与其对应的右括号 `pattern:)` 之间的内容 —— 标签名。再然后,我们跳过空格,将所有属性划为一组,对应 `result[3]`
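The line that defines the pattern is collapsed in the diff view, so the sketch below uses an assumed pattern, `nestedPattern`, that yields the groups described above. It may differ from the one in the article.

```js
// Assumed pattern: outer group for the whole tag content,
// inner groups for the tag name and for the attributes.
const nestedPattern = /<(([a-z]+)\s*([^>]*))>/;
const nested = '<span class="my">'.match(nestedPattern);

console.log(nested[0]); // <span class="my">  (full match)
console.log(nested[1]); // span class="my"    (first opening parenthesis)
console.log(nested[2]); // span               (second opening parenthesis)
console.log(nested[3]); // class="my"         (third opening parenthesis)
```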

**If a group is optional and doesn't exist in the match, the corresponding `result` index is present (and equals `undefined`).**
**如果某个捕获组是可选的,且在匹配中没有找到对应项,那么在相应的匹配结果中,该项依然会存在(值为 `undefined`)。**

For instance, let's consider the regexp `pattern:a(z)?(c)?`. It looks for `"a"` optionally followed by `"z"` optionally followed by `"c"`.
让我们考虑这条正则表达式 `pattern:a(z)?(c)?`。它会查找字符 `"a"`,之后可能跟着一个 `"z"`,之后可能跟着一个 `"c"`

If we run it on the string with a single letter `subject:a`, then the result is:
如果我们对单个字符 `subject:a` 执行匹配,其结果为:

```js run
let match = 'a'.match(/a(z)?(c)?/);
// …
alert( match[1] ); // undefined
alert( match[2] ); // undefined
```

The array has the length of `3`, but all groups are empty.
该数组包含三项,但是所有的捕获组都为空。

And here's a more complex match for the string `subject:ack`:
对于 `subject:ack`,情况要更复杂一些:

```js run
let match = 'ack'.match(/a(z)?(c)?/)

alert( match.length ); // 3
alert( match[0] ); // ac (whole match)
alert( match[1] ); // undefined, because there's nothing for (z)?
alert( match[1] ); // undefined,因为 (z)? 没有匹配项
alert( match[2] ); // c
```

The array length stays the same: `3`. But there's nothing for the group `pattern:(z)?`, so the result is `["ac", undefined, "c"]`.
数组的长度依然是 `3`。但是因为捕获组 `pattern:(z)?` 没有对应项,所以结果为 `["ac", undefined, "c"]`
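Both cases can be compared in one sketch. The names `single` and `partial` are illustrative.

```js
const single = 'a'.match(/a(z)?(c)?/);
const partial = 'ack'.match(/a(z)?(c)?/);

// The length is fixed by the number of groups in the pattern.
console.log(single.length);  // 3
console.log(partial.length); // 3

console.log(Array.from(single));  // [ 'a', undefined, undefined ]
console.log(Array.from(partial)); // [ 'ac', undefined, 'c' ]

// An absent optional group should be checked before use.
console.log(partial[1] === undefined); // true
```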

## Non-capturing groups with ?:
## 非捕获组 ?:

Sometimes we need parentheses to correctly apply a quantifier, but we don't want their contents in the array.
某些时候,我们会希望使用括号来正确设置量词,但是并不希望其内容出现在结果数组中。

A group may be excluded by adding `pattern:?:` in the beginning.
你可以通过在开头加上 `pattern:?:` 从而在结果中排除该组。

For instance, if we want to find `pattern:(go)+`, but don't want to remember the contents (`go`) in a separate array item, we can write: `pattern:(?:go)+`.
例如,我们希望找到 `pattern:(go)+`,但是并不希望其内容(`go`)出现在单独的数组项中,那么我们可以这样写:`pattern:(?:go)+`

In the example below we only get the name "John" as a separate member of the `results` array:
下面的例子中,只有名字『John』会作为一个独立项出现在 `results` 数组里:

```js run
let str = "Gogo John!";
*!*
// exclude Gogo from capturing
// 避免捕获 Gogo
let reg = /(?:go)+ (\w+)/i;
*/!*

// …
```
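The rest of this block is collapsed in the diff view. As a sketch, the capturing and non-capturing variants can be compared on the same string; the names `greeting`, `capturing` and `nonCapturing` are illustrative.

```js
const greeting = "Gogo John!";

// Capturing group: "go" takes up a slot in the result.
const capturing = greeting.match(/(go)+ (\w+)/i);

// Non-capturing group: only the name is stored separately.
const nonCapturing = greeting.match(/(?:go)+ (\w+)/i);

console.log(Array.from(capturing));    // [ 'Gogo John', 'go', 'John' ]
console.log(Array.from(nonCapturing)); // [ 'Gogo John', 'John' ]
```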
Empty file modified 5-regular-expressions/09-regexp-groups/regexp-nested-groups.png
100644 → 100755