Commit 0883df6

add entity extractor; haven't fixed entity handling as it's likely to change and I got all kinds of funky issues...
--HG-- extra : convert_revision : svn%3Aacbfec75-9323-0410-a652-858a13e371e0/trunk%40742
1 parent 6924dc8 commit 0883df6

1 file changed

Lines changed: 13 additions & 0 deletions

utils/extract-entities.py

@@ -0,0 +1,13 @@
+import urllib2
+entitiesPage = "http://www.whatwg.org/specs/web-apps/current-work/multipage/section-entities.html"
+output = ""
+for line in urllib2.urlopen(entitiesPage).readlines():
+    entityNameSig = " <td> <code title=\"\">"
+    entityValueSig = " </td><td> "
+    if line.startswith(entityNameSig):
+        x = len(entityNameSig)
+        output += " \"" + line[x:-8] + "\": "
+    elif line.startswith(entityValueSig):
+        x = len(entityValueSig)
+        output += "u\"" + line[x:-1].replace("U+", "\\u") + "\",\n"
+print output
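
For readers who want to try the extraction logic without fetching the spec page, here is a minimal Python 3 sketch of the same line-matching approach. It is not part of this commit. The two row signatures are copied from the script above; the function name `extract_entities`, the `sample` rows, and the choice to return a dict of decoded characters (instead of printing dict-literal source text) are assumptions made for illustration. Converting the code point with `int(..., 16)` sidesteps any dependence on how many hex digits follow `U+`.

```python
# Row signatures copied from the committed script; everything else is hypothetical.
NAME_SIG = ' <td> <code title="">'
VALUE_SIG = ' </td><td> '
NAME_END = '</code>'


def extract_entities(lines):
    """Map entity names to characters, given rows shaped like the ones the script matches."""
    entities = {}
    name = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(NAME_SIG):
            # Keep the text between the signature and the closing </code> tag.
            name = line[len(NAME_SIG):]
            if name.endswith(NAME_END):
                name = name[:-len(NAME_END)]
        elif line.startswith(VALUE_SIG) and name is not None:
            code_point = line[len(VALUE_SIG):].strip()
            if code_point.startswith("U+"):
                # Parse the hex digits directly rather than building a \u escape.
                entities[name] = chr(int(code_point[2:], 16))
            name = None
    return entities


# Hypothetical sample rows; the real page layout may differ.
sample = [
    ' <td> <code title="">AElig;</code>\n',
    ' </td><td> U+000C6\n',
    ' <td> <code title="">amp;</code>\n',
    ' </td><td> U+00026\n',
]

print(extract_entities(sample))
```

Taking an iterable of lines rather than a URL keeps the parsing step separate from the network fetch, so it can be exercised against saved copies of the page.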
