Commit b6ee914

Add ruleset style guide.
1 parent 24ca919 commit b6ee914

1 file changed

+84
-0
lines changed

ruleset-style.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
1+
# Ruleset style guide
2+
3+
Goal: Rules should be written in a way that is consistent, easy for humans to
4+
read and debug, and makes errors less likely.
5+
6+
To that end, here are some style guidelines for writing or modifying rulesets.
7+
They are intended to help and simplify in places where choices are ambiguous,
8+
but like all guidelines they can be broken if the circumstances require it.
9+
10+
Prefer listing explicit target hosts and a single rewrite from "^http:" to
11+
"^https:". This saves you time as a ruleset author because each explicit target
12+
host automatically creates a test URL, reducing the need to add your own test
13+
URLs.
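
As a minimal sketch of this pattern, assume a hypothetical site at example.com whose two hosts both support HTTPS. The ruleset name and hostnames are placeholders, not taken from any real ruleset:

```
<ruleset name="Example.com">
  <!-- Each explicit target host also serves as an automatic test URL. -->
  <target host="example.com" />
  <target host="www.example.com" />

  <!-- One plain rewrite covers every listed host. -->
  <rule from="^http:"
    to="https:" />
</ruleset>
```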
14+
15+
If all subdomains of a given domain support HTTPS, go ahead and use a
16+
left-wildcard, along with a plain rewrite from "^http:" to "^https:". Make sure
17+
to add a bunch of test URLs for the more important subdomains. If you're not
18+
sure what subdomains might exist, check the 'subdomain' tab on Wolfram Alpha:
19+
http://www.wolframalpha.com/input/?i=_YOUR_DOMAIN_GOES_HERE_.
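
A sketch of the left-wildcard form, again using the placeholder domain example.com. The three subdomains in the test URLs are assumptions chosen for illustration:

```
<ruleset name="Example.com">
  <target host="example.com" />
  <!-- Left-wildcard target for the subdomains of example.com. -->
  <target host="*.example.com" />

  <!-- The wildcard names no concrete host to test, so list the important ones. -->
  <test url="http://www.example.com/" />
  <test url="http://blog.example.com/" />
  <test url="http://shop.example.com/" />

  <rule from="^http:"
    to="https:" />
</ruleset>
```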
20+
21+
If there are a handful of tricky subdomains, but most subdomains can handle the
22+
plain rewrite from "^http:" to "^https:", specify the rules for the tricky
23+
subdomains first, and then the plain rule last. Earlier rules will take
24+
precedence, and processing stops at the first matching rule.
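
The ordering might look like the sketch below. It assumes a hypothetical downloads.example.com whose content is only available over HTTPS at secure.example.com; both hostnames and the path are made up for illustration:

```
<ruleset name="Example.com">
  <target host="example.com" />
  <target host="www.example.com" />
  <target host="downloads.example.com" />

  <!-- Tricky host first: this rule matches before the plain rule is tried. -->
  <rule from="^http://downloads\.example\.com/"
    to="https://secure.example.com/downloads/" />

  <!-- Plain rule last: it only applies to URLs the rule above did not match. -->
  <rule from="^http:"
    to="https:" />
</ruleset>
```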
25+
26+
Avoid regexes with long strings of subdomains, e.g. <rule
27+
from="^http://(foo|bar|baz|bananas).example.com" />. These are hard to read and
28+
maintain, and are usually better expressed with a longer list of target hosts,
29+
plus a plain rewrite from "^http:" to "^https:".
30+
31+
Prefer dashes over underscores in filenames.
32+
33+
When matching an arbitrary DNS label (a single component of a hostname), prefer
34+
`([\w-]+)` for a single label (e.g. www), or `([\w.-]+)` for multiple labels
35+
(e.g. www.beta). Avoid the more visually complicated `([^/:@\.]+\.)?`, seen in
36+
some rules.
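
To illustrate the single-label form, here is a sketch for a hypothetical site that moved per-user subdomains under a path. The host users.example.com and the user names in the test URLs are assumptions:

```
<ruleset name="Example.com user sites">
  <target host="*.users.example.com" />

  <test url="http://alice.users.example.com/" />
  <test url="http://bob-2.users.example.com/" />

  <!-- ([\w-]+) captures exactly one label, such as "alice" or "bob-2". -->
  <rule from="^http://([\w-]+)\.users\.example\.com/"
    to="https://users.example.com/$1/" />
</ruleset>
```

Because the character class contains no dot, a deeper host such as a.b.users.example.com is not matched by this rule.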
37+
38+
For `securecookie` tags, it's common to match any cookie name. For these, prefer
39+
`.+` over `.*`. They are functionally equivalent, but it's nice to be
40+
consistent.
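
A sketch of a `securecookie` tag in this style, assuming the placeholder hosts example.com and www.example.com:

```
<ruleset name="Example.com">
  <target host="example.com" />
  <target host="www.example.com" />

  <!-- name=".+" matches any cookie name set by the matched hosts. -->
  <securecookie host="^(www\.)?example\.com$" name=".+" />

  <rule from="^http:"
    to="https:" />
</ruleset>
```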
41+
42+
Avoid the negative lookahead operator `(?!...)`. This is almost always better
43+
expressed using positive rule tags and negative exclusion tags. Some rulesets
44+
have exclusion tags that contain negative lookahead operators, which is very
45+
confusing.
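
For example, a rule that would otherwise need a lookahead to skip one path can be split into an exclusion plus a plain rule. In this sketch the `/legacy/` path on the placeholder host example.com is assumed to break over HTTPS:

```
<ruleset name="Example.com">
  <target host="example.com" />

  <!-- Replaces a lookahead such as from="^http://example\.com/(?!legacy/)". -->
  <exclusion pattern="^http://example\.com/legacy/" />

  <!-- A test URL that exercises the exclusion. -->
  <test url="http://example.com/legacy/page" />

  <rule from="^http:"
    to="https:" />
</ruleset>
```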
46+
47+
Prefer capturing groups `(www\.)?` over non-capturing `(?:www\.)?`. The
48+
non-capturing form adds extra line noise that makes rules harder to read.
49+
Generally you can achieve the same effect by choosing a correspondingly higher
50+
index for your replacement group to account for the groups you don't care about.
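
A sketch of the index adjustment, using placeholder hosts and path names. The optional `www.` is captured but unused, so the replacement refers to the second group:

```
<ruleset name="Example.com">
  <target host="example.com" />
  <target host="www.example.com" />

  <!-- Group 1 is the optional "www." and is ignored; group 2 is the section, hence $2. -->
  <rule from="^http://(www\.)?example\.com/(blog|docs)/"
    to="https://www.example.com/$2/" />
</ruleset>
```

With the non-capturing form `(?:www\.)?` the same replacement would use `$1` instead.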
51+
52+
Here is an example of a ruleset as it exists today:
53+
54+
```
55+
<ruleset name="WHATWG.org">
56+
<target host="whatwg.org" />
57+
<target host="*.whatwg.org" />
58+
59+
<rule from="^http://((?:developers|html-differences|images|resources|\w+\.spec|wiki|www)\.)?whatwg\.org/"
60+
to="https://$1whatwg.org/" />
61+
62+
</ruleset>
63+
```
64+
65+
Here is how you could rewrite it according to these style guidelines, including
66+
test URLs:
67+
```
68+
<ruleset name="WHATWG.org">
69+
<target host="whatwg.org" />
70+
<target host="developers.whatwg.org" />
71+
<target host="html-differences.whatwg.org" />
72+
<target host="images.whatwg.org" />
73+
<target host="resources.whatwg.org" />
74+
<target host="*.spec.whatwg.org" />
75+
<target host="wiki.whatwg.org" />
76+
<target host="www.whatwg.org" />
77+
78+
<test url="http://html.spec.whatwg.org/" />
79+
<test url="http://fetch.spec.whatwg.org/" />
80+
81+
<rule from="^http:"
82+
to="https:" />
83+
84+
</ruleset>
```
