Unicode confusable characters are removed #142

@chuckyblack

Description

https://util.unicode.org/UnicodeJsps/confusables.jsp?a=a&r=None

Current result

from slugify import slugify

assert slugify("𝐚́́𝕒́") == ""

Expected result

from slugify import slugify

assert slugify("𝐚́́𝕒́") == "aa"

Possible solution

I think text = unicodedata.normalize("NFKC", text) should be called before text = unidecode.unidecode(text).
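To illustrate the ordering proposed above, here is a minimal standard-library sketch. It does not call slugify or unidecode. The function name fold_to_ascii is hypothetical, and its second step (decompose, drop combining marks, drop non-ASCII) is only a rough stand-in for transliteration, not the library's actual code path.

```python
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Hypothetical helper: NFKC first, then a crude ASCII fold."""
    # Step 1 (the change suggested in this issue): NFKC replaces
    # compatibility characters such as MATHEMATICAL BOLD SMALL A
    # (U+1D41A) with their plain counterparts.
    text = unicodedata.normalize("NFKC", text)
    # Step 2 (assumed stand-in for unidecode): decompose accented
    # letters, drop the combining marks, then discard anything that
    # is still outside ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "ignore").decode("ascii")


# The string from this issue, written with escapes so it survives copy/paste:
# bold "a" + two acute accents, then double-struck "a" + one acute accent.
sample = "\U0001D41A\u0301\u0301\U0001D552\u0301"
print(fold_to_ascii(sample))
```

Note that NFKC only covers compatibility equivalents. Confusables from other scripts, for example Cyrillic "а" (U+0430), are not mapped to Latin letters by normalization, so this would address the styled-letter case reported here rather than confusables in general.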
