 html5lib is a pure-python library for parsing HTML. It is designed to
-conform to the Web Applications 1.0 specification, which has
-formalized the error handling algorithms of popular web browsers.
+conform to the HTML 5 specification, which has formalized the error handling
+algorithms of popular web browsers.
 
 = Installation =
 
@@ -36,3 +36,4 @@ http://code.google.com/p/html5lib/issues/list
 Contributions to code or documenation are actively encouraged. Submit
 patches to the issue tracker or discuss changes on irc in the #whatwg
 channel on freenode.net
+
@@ -297,11 +297,15 @@ def emitCurrentToken(self):
 
     def dataState(self):
         data = self.stream.char()
+
+        # Keep a charbuffer to handle the escapeFlag
         if self.contentModelFlag in \
           (contentModelFlags["CDATA"], contentModelFlags["RCDATA"]):
             if len(self.lastFourChars) == 4:
                 self.lastFourChars.pop(0)
             self.lastFourChars.append(data)
+
+        # The rest of the logic
         if data == "&" and self.contentModelFlag in \
           (contentModelFlags["PCDATA"], contentModelFlags["RCDATA"]) and not \
           self.escapeFlag:
@@ -328,8 +332,6 @@ def dataState(self):
             # Directly after emitting a token you switch back to the "data
             # state". At that point spaceCharacters are important so they are
             # emitted separately.
-            # XXX need to check if we don't need a special "spaces" flag on
-            # characters.
             self.tokenQueue.append({"type": "SpaceCharacters", "data":
               data + self.stream.charsUntil(spaceCharacters, True)})
         else:
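
The new comment in dataState says the character buffer exists to handle the escapeFlag. A minimal standalone sketch of that idea follows. It is not the tokenizer's code: the function name `track_escape_flag` is hypothetical, and the rule that `<!--` sets the flag and `-->` clears it is an assumption, since the part of dataState that consumes the buffer is outside this diff. Only the pop/append window update mirrors the lines shown above.

```python
def track_escape_flag(text):
    """Return the escape flag's value after each character of text."""
    last_four = []        # rolling window, mirrors self.lastFourChars
    escape_flag = False   # mirrors self.escapeFlag
    history = []
    for ch in text:
        # Same update as in the diff: drop the oldest char, append the newest.
        if len(last_four) == 4:
            last_four.pop(0)
        last_four.append(ch)
        window = "".join(last_four)
        # Assumption: "<!--" turns the flag on, "-->" turns it off.
        if not escape_flag and window == "<!--":
            escape_flag = True
        elif escape_flag and window.endswith("-->"):
            escape_flag = False
        history.append(escape_flag)
    return history
```

Because the window never holds more than four characters, the check costs constant time per character no matter how long the input stream is.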