Commit 731e73c

Update rulesets HTML documentation.
1 parent 707531f commit 731e73c

3 files changed

+94
-18
lines changed

docs/rulesets.html

Lines changed: 85 additions & 6 deletions
@@ -1,9 +1,88 @@
-<p>This page describes how to write rulesets for <a href="https://eff.org/https-everywhere">HTTPS Everywhere</a>, the Firefox plugin that switches sites over from http to https automatically. HTTPS Everywhere comes with <a href="http://www.eff.org/https-everywhere/atlas/">thousands</a> of rulesets, but you might want to edit them, or write new ones. Rulesets are simple xml files. Here is a simplified version of <a href="https://gitweb.torproject.org/https-everywhere.git/blob/HEAD:/src/chrome/content/rules/Twitter.xml"><tt>Twitter.xml</tt></a>, from the plugin distribution:</p><pre>&lt;ruleset name="Twitter"&gt;
-&lt;target host="www.twitter.com" /&gt;
-&lt;target host="twitter.com" /&gt;
+<p>
+This page describes how to write rulesets for
+<a href="https://eff.org/https-everywhere">HTTPS Everywhere</a>,
+the Firefox add-on that switches sites over from HTTP
+to HTTPS automatically. HTTPS Everywhere comes with
+<a href="http://www.eff.org/https-everywhere/atlas/">thousands</a>
+of rulesets that tell HTTPS Everywhere which sites it should switch
+to HTTPS and how. If there is a site that HTTPS Everywhere doesn't
+switch to HTTPS and you would like to add it, this guide will explain how.
+</p>
 
-&lt;rule from="^http://(www\.)?twitter\.com/" to="https://twitter.com/"/&gt;
-&lt;/ruleset&gt;</pre><p>The "target" tag specifies which domains the ruleset might apply to. The target host tag does not use regular expressions. The content of a target tag should be the actual name of a web server to which the ruleset applies or partially applies, like <tt>www.eff.org</tt>, <tt>www.google.com</tt>, <tt>secure.wikimedia.org</tt>, and so on. If your rule applies to the bare domain (like "eff.org", not just "www.eff.org"), you need an additional target tag to say so. For example, the sample ruleset above is meant to apply to either <tt>www.twitter.com</tt> or <tt>twitter.com</tt>, so it has a separate target tag for each. A target may, however, contain a wildcard in one portion of the domain (like <tt>*.google.com</tt> or <tt>google.*</tt>, but <tt>*.google.*</tt> would not work). A wildcard on the left will match arbitrarily deep subdomains (for instance, <tt>*.facebook.com</tt> will match <tt>s-static.ak.facebook.com</tt>).Exception: currently this is not true for a target host that is less than three levels deep. <tt>&lt;target host="*.com"&gt;</tt> would match <tt>thing.com</tt> but not <tt>very.thing.com</tt>. We would consider changing that if anybody needs to use it. The "rule" does the actual rewriting work. The "from" and "to" clauses in each rule are <a href="http://www.regular-expressions.info/javascript.html">JavaScript regular expressions</a>. You can use them to rewrite URLs in more complicated ways. Here's a simplified (and now obsolete) example for Wikipedia:</p><pre>&lt;ruleset name="Wikipedia"&gt;
+
+<p>
+A ruleset is an XML file describing behavior for a site or group of sites.
+For example, here is
+<a href="https://github.com/efforg/https-everywhere/blob/master/src/chrome/content/rules/RabbitMQ.xml"><tt>RabbitMQ.xml</tt></a>,
+from the plugin distribution:
+</p>
+
+<pre>
+&lt;ruleset name="RabbitMQ"&gt;
+&lt;target host="rabbitmq.com" /&gt;
+&lt;target host="www.rabbitmq.com" /&gt;
+
+&lt;rule from="^http:"
+to="https:" /&gt;
+&lt;/ruleset&gt;
+</pre>
+
+<p>
+The "target" tag specifies which web sites the ruleset applies
+to. The "rule" tag specifies how URLs on those web sites should be
+modified. This rule says that any URLs on <tt>rabbitmq.com</tt> and
+<tt>www.rabbitmq.com</tt> should be modified by replacing "http:" with
+"https:".
+</p>
+
+<p>
+When Firefox loads a URL, HTTPS Everywhere splits out the host name
+(e.g. <tt>www.google.com</tt> out of
+"http://www.google.com/webhp"), and searches its ruleset database for
+rulesets that match that host name. Host names do not use regular
+expressions, but they support wildcards, described below.
+</p>
+
+<p>
+HTTPS Everywhere then checks the full URL against all the rules from
+rulesets matching the host name. Rules <em>do</em> use regular expressions.
+The "from" attribute is checked against the URL, and if it matches,
+the matched portion is replaced with the contents of the "to"
+attribute.
+</p>
+
+<p>
+Rules are applied to URLs on any of the target hosts. They are applied
+in order, and only the first rule that matches a URL (using the "from"
+attribute) is used.
+</p>
+
+
+<p>The target host tag does not use regular expressions. The content
+of a target tag should be the actual name of a web site to which
+the ruleset applies or partially applies, like <tt>www.eff.org</tt>,
+<tt>www.google.com</tt>, <tt>secure.wikimedia.org</tt>, and so
+on. If your rule applies to the bare domain (like "eff.org",
+not just "www.eff.org"), you need an additional target tag to
+say so. For example, the sample ruleset above is meant to apply
+to either <tt>www.rabbitmq.com</tt> or <tt>rabbitmq.com</tt>,
+so it has a separate target tag for each. A target may,
+however, contain a wildcard in one portion of the domain (like
+<tt>*.google.com</tt> or <tt>google.*</tt>, but <tt>*.google.*</tt>
+would not work). A wildcard on the left will match arbitrarily
+deep subdomains (for instance, <tt>*.facebook.com</tt> will
+match <tt>s-static.ak.facebook.com</tt>). Exception: currently
+this is not true for a target host that is less than three
+levels deep. <tt>&lt;target host="*.com"&gt;</tt> would match
+<tt>thing.com</tt> but not <tt>very.thing.com</tt>. We would consider
+changing that if anybody needs to use it. The "rule" does the actual
+rewriting work. The "from" and "to" clauses in each rule are <a
+href="http://www.regular-expressions.info/javascript.html">JavaScript
+regular expressions</a>. You can use them to rewrite URLs in more
+complicated ways. Here's a simplified (and now obsolete) example
+for Wikipedia:</p>
+
+<pre>&lt;ruleset name="Wikipedia"&gt;
 &lt;target host="*.wikipedia.org" /&gt;
 
 &lt;rule from="^http://([^@:/][^/:@])\.wikipedia\.org/wiki/"
@@ -24,4 +103,4 @@
 to="https://translate.googleapis.com/"/&gt;
 &lt;rule from="^http://translate\.google\.com/translate_a/element\.js"
 to="https://translate.google.com/translate_a/element.js"/&gt;
-&lt;/ruleset&gt;</pre><p>Platform is a space-delimited list of platforms on which the ruleset works. Currently anticipated values are "firefox", "chromium", "mixedcontent", "cacert" and "ipsca". The "mixedcontent" value is important, and should be used for sites that render badly in Chrome because that browser <a href="https://trac.torproject.org/projects/tor/ticket/6975">blocks HTTP content from loading in HTTPS pages</a> by default. If the platform attribute is present, but does not match the current platform, the ruleset will be treated as off-by-default. <strong>Update (12/12/13)</strong>: Firefox 23+ has enabled mixed content blocking by default. As a result, many rules that were previously labeled "firefox" should now be labeled "mixedcontent." <a name="downgrade"></a></p><h3>HTTPS-&gt;HTTP downgrade rules</h3><p>By default, HTTPS Everywhere will refuse to allow rules that would downgrade a URL from HTTPS to HTTP. Occasionally, this has turned out to be necessary because when HTTPS Everywhere is active some sites write broken relative links to HTTPS resources on their own domains that are not actually available over HTTPS. If you want to have a <tt>&lt;rule&gt;</tt> like that (<a href="https://gitweb.torproject.org/pde/https-everywhere.git/blob/4.0:/src/chrome/content/rules/BBC.xml">here's an example</a>), it will need a downgrade="1" attribute to make sure you really meant it, and to facilitate auditing of the ruleset library.</p>
+&lt;/ruleset&gt;</pre><p>Platform is a space-delimited list of platforms on which the ruleset works. Currently anticipated values are "firefox", "chromium", "mixedcontent", "cacert" and "ipsca". The "mixedcontent" value is important, and should be used for sites that render badly in Chrome because that browser <a href="https://trac.torproject.org/projects/tor/ticket/6975">blocks HTTP content from loading in HTTPS pages</a> by default. If the platform attribute is present, but does not match the current platform, the ruleset will be treated as off-by-default. <strong>Update (12/12/13)</strong>: Firefox 23+ has enabled mixed content blocking by default. As a result, many rules that were previously labeled "firefox" should now be labeled "mixedcontent." <a name="downgrade"></a></p><h3>HTTPS-&gt;HTTP downgrade rules</h3><p>By default, HTTPS Everywhere will refuse to allow rules that would downgrade a URL from HTTPS to HTTP. Occasionally, this has turned out to be necessary because when HTTPS Everywhere is active some sites write broken relative links to HTTPS resources on their own domains that are not actually available over HTTPS. If you want to have a <tt>&lt;rule&gt;</tt> like that (<a href="https://gitweb.torproject.org/pde/https-everywhere.git/blob/4.0:/src/chrome/content/rules/BBC.xml">here's an example</a>), it will need a downgrade="1" attribute to make sure you really meant it, and to facilitate auditing of the ruleset library.</p>
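The matching procedure described in the new documentation text (split out the host name, find rulesets whose targets match it, then apply the first rule whose "from" pattern matches) can be sketched roughly as follows. This is an illustration in Python, not HTTPS Everywhere's actual code: the dictionary layout and the names `host_matches` and `rewrite` are hypothetical, only exact targets and left-hand wildcards are handled, and the "less than three levels deep" exception and right-hand wildcards such as `google.*` are left out.

```python
import re
from urllib.parse import urlsplit

# Hypothetical in-memory form of the RabbitMQ ruleset from the documentation.
RABBITMQ = {
    "name": "RabbitMQ",
    "targets": ["rabbitmq.com", "www.rabbitmq.com"],
    "rules": [(r"^http:", "https:")],
}


def host_matches(target: str, host: str) -> bool:
    """Exact match, or a left-hand wildcard such as "*.example.com"."""
    if target.startswith("*."):
        # target[1:] keeps the leading dot, so the bare domain does not match.
        return host.endswith(target[1:])
    return target == host


def rewrite(url: str, rulesets: list) -> str:
    """Return the rewritten URL, or the URL unchanged if no rule applies."""
    host = urlsplit(url).hostname or ""
    for ruleset in rulesets:
        if not any(host_matches(t, host) for t in ruleset["targets"]):
            continue
        for pattern, replacement in ruleset["rules"]:
            if re.search(pattern, url):
                # Only the first matching rule is used; it replaces the
                # matched portion of the URL once.
                return re.sub(pattern, replacement, url, count=1)
    return url


print(rewrite("http://www.rabbitmq.com/download.html", [RABBITMQ]))
print(rewrite("http://lists.rabbitmq.com/", [RABBITMQ]))
```

Note that the real rules are JavaScript regular expressions, so a backreference written `$1` in a ruleset would need to be written `\1` in a Python translation like this one.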
Lines changed: 7 additions & 7 deletions
@@ -1,14 +1,14 @@
 <!--
 For other Pivotal coverage, see Pivotal.xml.
 
+https version of {lists,next}.rabbitmq.com serves content from homepage,
+not same content as http version
+
 -->
 <ruleset name="RabbitMQ">
-
 <target host="rabbitmq.com" />
-<target host="*.rabbitmq.com" />
-
-
-<rule from="^http://((?:lists|next|www)\.)?rabbitmq\.com/"
-to="https://$1rabbitmq.com/" />
+<target host="www.rabbitmq.com" />
 
-</ruleset>
+<rule from="^http:"
+to="https:" />
+</ruleset>
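To see what this change does in practice, the old and new versions of the ruleset can be compared on a few hosts. The following is a rough Python sketch under stated assumptions: the rules are hand-translated from the JavaScript syntax in the diff (`$1` becomes `\1`), and `covered` and `apply_rule` are hypothetical helpers that only model exact targets and left-hand wildcards.

```python
import re

# Old version: wildcard target, explicit list of subdomains in the rule.
OLD_TARGETS = ["rabbitmq.com", "*.rabbitmq.com"]
OLD_RULE = (r"^http://((?:lists|next|www)\.)?rabbitmq\.com/", r"https://\1rabbitmq.com/")

# New version: two explicit targets, trivial scheme rewrite.
NEW_TARGETS = ["rabbitmq.com", "www.rabbitmq.com"]
NEW_RULE = (r"^http:", "https:")


def covered(host, targets):
    """True if host equals a target or falls under a "*." wildcard target."""
    for target in targets:
        if target == host:
            return True
        if target.startswith("*.") and host.endswith(target[1:]):
            return True
    return False


def apply_rule(url, host, targets, rule):
    """Rewrite url with rule if host is covered by targets, else return it as-is."""
    if not covered(host, targets):
        return url
    pattern, replacement = rule
    return re.sub(pattern, replacement, url, count=1)


for host in ("rabbitmq.com", "www.rabbitmq.com", "lists.rabbitmq.com", "next.rabbitmq.com"):
    url = f"http://{host}/"
    old = apply_rule(url, host, OLD_TARGETS, OLD_RULE)
    new = apply_rule(url, host, NEW_TARGETS, NEW_RULE)
    print(f"{host}: old -> {old}, new -> {new}")
```

Under these assumptions, the old ruleset rewrites `lists.rabbitmq.com` and `next.rabbitmq.com` to HTTPS while the new one leaves them on HTTP, which is consistent with the comment added in this diff about those hosts serving homepage content over HTTPS.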
Lines changed: 2 additions & 5 deletions
@@ -1,10 +1,7 @@
 <ruleset name="zeromq (partial)">
-
 <target host="zeromq.org" />
 <target host="www.zeromq.org" />
 
-
-<rule from="^http://(?:www\.)?zeromq\.org/local--theme/"
-to="https://zeromq.wdfiles.com/local--theme/" />
-
+<rule from="^http:"
+to="https:" />
 </ruleset>
