Sending to clients failing with multiple clients after an extended period of time.  #889

@Bernstern

Describe the bug
We have started noticing a bug in our backend setup: after an extended period of time, data sent to clients is no longer delivered. Calling send(...) raises no errors, but the log line showing that the underlying emit was called stops appearing.

At least, we don't think the emit is actually occurring, because the following log line in the pubsub manager never executes:

self._get_logger().info('pubsub message: {}'.format(...))

And clients who are still receiving ping-pong messages from the backend never receive the emitted message.

To Reproduce
Steps to reproduce the behavior:

  1. We have not found a consistent way to trigger the bug; it seems to appear after a week or so of uptime between restarts. We will add reproduction steps if we find a reliable trigger.

Expected behavior
Emits should work with multiple clients that maintain a WebSocket connection.

Logs
When it is working we see logs like so:

socketio.server INFO    emitting event "message" to <room name> [/<namespace>]
socketio.server INFO    pubsub message: emit

Whenever the bug occurs we no longer see the pubsub message.
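
To pin down when the pubsub log line stops appearing, the captured server log can be scanned for "emitting event" lines that are never followed by a "pubsub message: emit" line. The following is a minimal sketch, assuming the log format shown above; `find_unconfirmed_emits` is a hypothetical helper name, not part of python-socketio, and the pairing is naive (several emits in flight at once, or logs merged from multiple servers, could be paired incorrectly).

```python
import re

# Patterns based on the two log lines quoted in this report.
EMIT_RE = re.compile(r'emitting event "[^"]+"')
PUBSUB_RE = re.compile(r"pubsub message: emit")


def find_unconfirmed_emits(lines):
    """Return emit log lines that were not followed by a pubsub log line.

    Hypothetical helper: pairs each emit line with the next pubsub line.
    """
    unconfirmed = []
    pending = None  # most recent emit line still waiting for its pubsub line
    for line in lines:
        if EMIT_RE.search(line):
            if pending is not None:
                # A new emit arrived before the previous one was confirmed.
                unconfirmed.append(pending)
            pending = line.rstrip("\n")
        elif PUBSUB_RE.search(line) and pending is not None:
            pending = None
    if pending is not None:
        unconfirmed.append(pending)
    return unconfirmed
```

The first line this returns gives an approximate timestamp for when the listener stopped processing messages, which can then be compared against Redis or load balancer events around that time.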

Additional context
We are using Flask-SocketIO on AWS Elastic Beanstalk (behind an Application Load Balancer), with Redis as the message queue and gevent as the async mode, set up as follows:

socketio = SocketIO(
    app,
    async_mode="gevent",
    cors_allowed_origins="*",
    manage_session=False,
    message_queue="redis://...",
    logger=True,
    engine_logger=True,
)

We are using the following package versions:

python-socketio==5.3.0
gevent-websocket==0.10.1
Flask-SocketIO==5.1.0
redis==3.5.3
gevent==21.1.1

We have also found that restarting the Elastic Beanstalk environment fixes the issue temporarily, but we are not sure why. We originally thought the cause was Amazon replacing EC2 instances in our EB environment, but we still see this behavior with that disabled. We also verified that clients are using WebSockets rather than long polling.

Edit: This issue only started appearing after we switched from a regular EC2 instance to Elastic Beanstalk with a remote Redis.
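
A possible check to run the next time the bug occurs is to ask Redis how many subscribers the message queue channel still has. This is only a sketch under assumptions: `count_subscribers` is a hypothetical helper, `client` is expected to be a redis-py client (or anything exposing `pubsub_numsub`), and the channel name `flask-socketio` is assumed to be the Flask-SocketIO default since no `channel` argument is passed in the setup above.

```python
def count_subscribers(client, channel="flask-socketio"):
    """Return the number of subscribers Redis reports for `channel`.

    Hypothetical helper. `client` is assumed to behave like a redis-py
    client, whose pubsub_numsub() returns (channel, count) pairs.
    """
    for name, count in client.pubsub_numsub(channel):
        # Channel names come back as bytes unless decode_responses is set.
        if isinstance(name, bytes):
            name = name.decode()
        if name == channel:
            return int(count)
    return 0
```

If the count is lower than the number of running server processes, at least one process has lost its subscription, which would be consistent with the missing "pubsub message" log line. With redis-py, a client could be created with `redis.Redis.from_url("redis://...")` using the same URL as `message_queue`.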
