markdown lexer fails on fenced code blocks of type raw #1616

@gerner

Description

For example, the following command produces the error shown below it:

$ echo '```raw
hi
```' | pygmentize -l md

*** Error while highlighting:
TypeError: cannot use a bytes pattern on a string-like object
   (file "/home/nick/src/pygments/pygments/lexers/special.py", line 87, in get_tokens_unprocessed)
*** If this is a bug you want to report, please rerun with -v.
```raw

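I believe this happens because the markdown lexer parses the fenced code block, sees the language "raw", and hands the block's contents to the RawTokenLexer, which I don't think is meant to be used like a normal lexer (see https://github.com/pygments/pygments/blob/master/pygments/lexers/special.py#L45-L49). The traceback fits that reading: the TypeError is raised in special.py, where a bytes pattern is applied to string input.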
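The type mismatch behind that TypeError can be reproduced with the standard library alone. The sketch below is not the Pygments source: `BYTES_LINE` and `match_line` are hypothetical names, and the pattern is only a stand-in for a regex compiled from bytes. It shows that such a pattern accepts bytes input but rejects the str input a nested code block would supply.

```python
import re

# Hypothetical stand-in for a lexer regex compiled from bytes.
# This is NOT the Pygments source; the pattern and names are illustrative.
BYTES_LINE = re.compile(rb"(.*?)\t(.*?)\n")


def match_line(text):
    """Apply the bytes pattern to `text`.

    Returns the TypeError message if the input type does not match
    the pattern type, otherwise None.
    """
    try:
        BYTES_LINE.match(text)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Bytes input has the same type as the pattern, so nothing is raised.
    print(match_line(b"Token.Text\t'hi'\n"))
    # Str input, like the body of a fenced block, triggers the TypeError.
    print(match_line("hi\n"))
```

If this reading is right, the failure does not depend on what the fenced block contains, only on the fact that its text reaches a bytes-based pattern as a str.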

You can see this in the wild using this README.md:

https://github.com/netdata/netdata/blob/master/collectors/perf.plugin/README.md

This repo has nearly 50k stars and over 4.5k forks, so it seems pretty reasonable to handle this kind of input.
