I have a case where a WebFlux application needs to perform some blocking work. Therefore, I use .subscribeOn(aBoundedElasticScheduler) in the stream.
When doing so, I've noticed that sometimes during heavy load, inbound requests are not fully read. After doing a lot of debugging, I believe I've narrowed it down to AbstractListenerReadPublisher publishing an onComplete signal before an onNext signal in some cases, leading to the onNext signal being dropped.
- spring-boot 2.4.5
- spring 5.3.6
- tomcat 9.0.45
Example logging for a request that is not fully read, since AbstractListenerReadPublisher sends the onComplete signal before the onNext signal.
00:25:52,141 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2b683da6] UNSUBSCRIBED -> SUBSCRIBING
00:25:52,141 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2b683da6] SUBSCRIBING -> NO_DEMAND
00:25:52,142 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.request:262] - [2b683da6] request 1
00:25:52,142 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2b683da6] NO_DEMAND -> DEMAND
00:25:52,142 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.onDataAvailable:117] - [2b683da6] onDataAvailable
00:25:52,142 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2b683da6] DEMAND -> READING
// onDataAvailable callback comes in from tomcat
00:25:52,141 [tomcat-1] [_.s.h.s.r.AbstractListenerReadPublisher.onDataAvailable:117] - [2b683da6] onDataAvailable
// meanwhile, reading is still being performed on a boundedElastic thread
00:25:52,142 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.logBytesRead:263] - [2b683da6] Read 938 bytes
// onAllDataRead callback comes in from tomcat
00:25:52,143 [tomcat-1] [_.s.h.s.r.AbstractListenerReadPublisher.onAllDataRead:128] - [2b683da6] onAllDataRead [READING]
00:25:52,143 [tomcat-1] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2b683da6] READING -> COMPLETED
// AbstractListenerReadPublisher.State.onAllDataRead sends onComplete signal
00:25:52,143 [sync-185] [_.s.h.s.r.AbstractListenerReadPublisher.readAndPublish:198] - [2b683da6] Publishing DefaultDataBuffer
// AbstractListenerReadPublisher.readAndPublish sends onNext signal.
// This signal is eventually dropped, since it occurs after the onComplete sent above
For comparison, here is an example of a working request:
00:25:52,135 [tomcat-1] [_.s.h.s.r.AbstractListenerReadPublisher.onDataAvailable:117] - [2af765a0] onDataAvailable
00:25:52,137 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] UNSUBSCRIBED -> SUBSCRIBING
00:25:52,138 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] SUBSCRIBING -> NO_DEMAND
00:25:52,138 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.request:262] - [2af765a0] request 1
00:25:52,138 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] NO_DEMAND -> DEMAND
00:25:52,139 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.onDataAvailable:117] - [2af765a0] onDataAvailable
00:25:52,139 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] DEMAND -> READING
00:25:52,139 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.logBytesRead:263] - [2af765a0] Read 938 bytes
00:25:52,139 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.readAndPublish:198] - [2af765a0] Publishing DefaultDataBuffer
00:25:52,139 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] READING -> NO_DEMAND
00:25:52,140 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.request:262] - [2af765a0] request 1
00:25:52,140 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] NO_DEMAND -> DEMAND
00:25:52,141 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.onAllDataRead:128] - [2af765a0] onAllDataRead [DEMAND]
00:25:52,141 [sync-193] [_.s.h.s.r.AbstractListenerReadPublisher.changeState:215] - [2af765a0] DEMAND -> COMPLETED
00:25:52,203 [tomcat-2] [_.s.h.s.r.AbstractListenerReadPublisher.onAllDataRead:128] - [2af765a0] onAllDataRead [COMPLETED]
This is a pretty complex app, and I haven't tried to isolate it down into a small reproducible app (yet).
@rstoyanchev, I know you've been in this code a lot recently for another issue I filed (#26407). Do you have any ideas off of the top of your head before I dive in deeper?
I have a case where a WebFlux application needs to perform some blocking work. Therefore, I use
.subscribeOn(aBoundedElasticScheduler)in the stream.When doing so, I've noticed that sometimes during heavy load, inbound requests are not fully read. After doing a lot of debugging, I believe I've narrowed it down to
AbstractListenerReadPublisherpublishing anonCompletesignal before anonNextsignal in some cases, leading to theonNextsignal being dropped.Example logging for a request that is not fully read, since
AbstractListenerReadPublishersends theonCompletesignal before theonNextsignal.For comparison, here is an example of a working request:
This is a pretty complex app, and I haven't tried to isolate it down into a small reproducible app (yet).
@rstoyanchev, I know you've been in this code a lot recently for another issue I filed (#26407). Do you have any ideas off of the top of your head before I dive in deeper?