Segfaults and wrong parse errors when using object_hook #48

@lightblu

First, I am not entirely sure about my diagnosis because the problem is not easy to reproduce, but I have run tests often and long enough to be fairly confident. I am seeing segfaults or irreproducible parse errors ("ValueError: Parse error at offset 2881568: Terminate parsing due to Handler error.") when using rapidjson.loads with an object_hook callback. By irreproducible I mean that feeding the same JSON in again parses just fine.

With Python's faulthandler enabled, the segfaults do not come from any one definite place. The offset at which the parse error occurs is always at the "}," between two records (so right after the object_hook callback has been called?).
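To make the "parses fine the second time" observation checkable, here is a minimal sketch of a wrapper that parses once and, on a ValueError, pulls the offset out of the message, captures the text around it, and immediately parses the same input again. The names `hook`, `parse_with_retry` and `OFFSET_RE` are made up for illustration, and the fallback to the standard-library `json` module is only there so the sketch runs where python-rapidjson is not installed; it is not part of the original setup.

```python
import faulthandler
import json
import re

try:
    import rapidjson  # python-rapidjson, the library this report is about
    loads = rapidjson.loads
except ImportError:
    loads = json.loads  # stand-in (assumption) so the sketch still runs

# Matches the "offset N" part of messages such as
# "Parse error at offset 2881568: Terminate parsing due to Handler error."
OFFSET_RE = re.compile(r"offset (\d+)")


def hook(obj):
    """Hypothetical object_hook: returns every decoded object unchanged."""
    return obj


def parse_with_retry(text):
    """Parse text once; on ValueError, record context and parse it again.

    Returns (result, None) on success, or (None, info) on failure, where
    info["reproducible"] says whether the second attempt failed as well.
    """
    try:
        return loads(text, object_hook=hook), None
    except ValueError as exc:
        context = None
        match = OFFSET_RE.search(str(exc))
        if match:
            offset = int(match.group(1))
            # Keep a few characters on either side of the reported offset.
            context = text[max(0, offset - 20):offset + 20]
        try:
            loads(text, object_hook=hook)
            reproducible = False
        except ValueError:
            reproducible = True
        return None, {
            "error": str(exc),
            "context": context,
            "reproducible": reproducible,
        }


if __name__ == "__main__":
    # Dump Python tracebacks for all threads if the process segfaults.
    faulthandler.enable()
    print(parse_with_retry('[{"a": 1}, {"b": 2}]'))
```

A failure that comes back with `reproducible` set to False would match the behaviour described above, while a genuinely malformed document fails on both attempts.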

I tried this on Alpine Linux 3.4.4 with its Python 3.5 apks, with a freshly compiled Python 3.5.2, and also with a freshly compiled Python 3.5.2 on the latest Debian. python-rapidjson is 0.0.6 from PyPI.

My application is an AWS Kinesis client app that just parses JSON documents from the incoming stream and writes them to S3. I cannot reproduce the problem without writing to S3, so it seems that boto3/urllib3/socket interaction is required for it to happen. The chances of seeing it increase under load (continuously parsing ~9 MB JSON documents) and with multiple threads, although I have also seen these faults when running completely single-threaded.
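The load pattern described above (many records per document, several threads, a hook call per record) can be approximated with a small harness like the one below. This is only a sketch under stated assumptions: `make_payload`, `parse_rounds` and `stress` are hypothetical names, the payload is synthetic, and there is no S3 or socket traffic, so by the observation above it may well not trigger the fault on its own. The parser is passed in as a parameter, so `rapidjson.loads` can be swapped in for the `json.loads` used in the demo.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def make_payload(n_records):
    """Build a JSON array of n_records small objects (synthetic data)."""
    records = [{"id": i, "data": "x" * 32} for i in range(n_records)]
    return json.dumps(records)


def parse_rounds(loads, payload, rounds):
    """Parse payload `rounds` times; count hook calls and ValueErrors."""
    calls = 0
    failures = 0

    def hook(obj):
        # Called once per decoded JSON object (dict).
        nonlocal calls
        calls += 1
        return obj

    for _ in range(rounds):
        try:
            loads(payload, object_hook=hook)
        except ValueError:
            failures += 1
    return calls, failures


def stress(loads, payload, threads=4, rounds=10):
    """Run parse_rounds concurrently in `threads` worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(parse_rounds, loads, payload, rounds)
            for _ in range(threads)
        ]
        results = [f.result() for f in futures]
    total_calls = sum(c for c, _ in results)
    total_failures = sum(f for _, f in results)
    return total_calls, total_failures


if __name__ == "__main__":
    # Raise n_records to approach the ~9 MB documents mentioned above.
    payload = make_payload(1000)
    print(stress(json.loads, payload, threads=4, rounds=10))
```

With a correct parser the failure count stays at zero and the hook-call count equals records × rounds × threads; any deviation on identical input would point at the parser or the hook machinery rather than at the data.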
