handle line-ending tags; refactor payload breaks#743
Conversation
|
Thank you, @conorom! This looks good in principle; I'll probably not get around to taking a proper look until after 5.0 releases, though. Would you be able to add unit tests for this? If not, I can, but if you're comfortable with that, they'd definitely be helpful! |
|
@joedolson Actually at one point (for a prior attempt before I hit and fought more corner cases) I had added Jest to the project and did have unit tests in place for I sort of decided that adding that to the PR would be too much. Funny, at that time I was working off an out-of-date fork. I'm happy to add tests for the weird cases I can think of at this point (before I forget). One question: is there documentation on how the current test suite is run or anything like that? Thanks! |
|
The test suite is run using It's also prone to timeouts, so you may need to adjust the default timeout. I haven't resolved this yet. And I agree that adding a test suite is too much for the average PR; but since there are test runners already, I'd prefer to add tests along with functionality changes when possible. |
6794601 to
c6986c9
Compare
c6986c9 to
f022cd8
Compare
|
@joedolson Excuse the force-pushing. Thanks |
Fixes #348
... for real this time 🤞
I thought I'd take another shot at this bug. When I'd looked at it a few years ago, I was certain the problem was with the OR condition here, i.e. the
|| /^\s+$/.test(nextLine)) {bit.And it kind of is. That break-on-whitespace OR condition is always going to cause issues with unexpected whitespace use, and no small measure of confusion. In no way is any sort of non-line-ending whitespace a cue terminator per the WebVTT spec. Only a blank/empty line ends the payload/cue.
But it's not clear-cut stuff. Note also that a line ending with a tag and then a trailing whitespace causes slightly different behavior than a line simply ending in a tag. And example 2 in the issue description happens to have a trailing space, i.e.
<v Roger Bingham>I spent a lot of time working through this, admittedly chatting with Claude, and doing manual testing.
I'd already committed to removing the OR condition but once I examined the issue that caused it to be added in the first place I wanted to ensure I didn't cause a regression with multiple blank lines and empty cue payloads.
I won't bore you with any more details. The resulting change parses the below much-fiddled-with version of wwa_captions_en.vtt. I realize there are things in here that wouldn't be recommended in a real WebVTT file, but I'm happy that the parser doesn't break on them and the output is as I'd expect, as I loaded Video 1 in dev (over and over!).