🐛 Fix line splitting in format_sse_event to comply with SSE spec #15515
Open
Zawwarsami16 wants to merge 4 commits into
…g in format_sse_event

splitlines() drops trailing empty strings and treats 8 extra characters (\v, \f, \x1c-\x1e, \x85, U+2028, U+2029) as line breaks. SSE only recognizes \n, \r\n, and \r per the spec, and trailing empty data lines are part of the payload; silently dropping them corrupts the stream. Both the data: and the comment branch were affected.

Adds 8 unit tests covering trailing-newline preservation, CRLF/CR normalization, and the splitlines() quirks (U+2028, vertical tab) staying inside the payload.

Closes fastapi#15500
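The two `splitlines()` behaviors described in the commit message can be reproduced with plain Python. This is an illustrative sketch only, not code from the PR, and the variable names are made up for the example:

```python
# Illustration only: why str.splitlines() does not match SSE's line rules.
payload = "Hello\n"

# 1. splitlines() drops the trailing empty string; split("\n") keeps it.
dropped = payload.splitlines()   # ["Hello"]
kept = payload.split("\n")       # ["Hello", ""]

# 2. splitlines() also breaks on characters that SSE treats as ordinary data.
extra_breaks = ["\v", "\f", "\x1c", "\x1d", "\x1e", "\x85", "\u2028", "\u2029"]
split_counts = {ch: len(f"A{ch}B".splitlines()) for ch in extra_breaks}

print(dropped, kept)
print(split_counts)  # every value is 2: each character acted as a line break
```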
YuriiMotov
approved these changes
May 18, 2026
Member
YuriiMotov
left a comment
LGTM in general!
I would only simplify comments and tests a bit (see suggested changes) - IMO, comments are too verbose.
Also tested with more real-life tests (see the details below)
Details
```python
from collections.abc import AsyncIterable

from fastapi import FastAPI
from fastapi.sse import EventSourceResponse, ServerSentEvent
from fastapi.testclient import TestClient

app = FastAPI()

# Payloads with trailing newlines, CRLF, and a Unicode line separator
multiline_items = [
    "Hello\n",
    "\n",
    "Hello\n\n",
    "Hello\r\nWorld",
    "A\u2028B",
]


@app.get("/multiline-items/stream-json", response_class=EventSourceResponse)
async def sse_multiline_items_json() -> AsyncIterable[str]:
    for item in multiline_items:
        yield item


@app.get("/multiline-items/stream-raw", response_class=EventSourceResponse)
async def sse_multiline_items_raw():
    for item in multiline_items:
        yield ServerSentEvent(raw_data=item)


def test_sse_multiline_items_json() -> None:
    client = TestClient(app)
    response = client.get("/multiline-items/stream-json")
    assert response.status_code == 200
    assert response.headers["content-type"] == "text/event-stream; charset=utf-8"
    # JSON-encoded items: newlines are escaped, so each event has one data line
    assert response.text == (
        'data: "Hello\\n"\n'
        "\n"
        'data: "\\n"\n'
        "\n"
        'data: "Hello\\n\\n"\n'
        "\n"
        'data: "Hello\\r\\nWorld"\n'
        "\n"
        'data: "A\u2028B"\n'
        "\n"
    )


def test_sse_multiline_items_raw() -> None:
    client = TestClient(app)
    response = client.get("/multiline-items/stream-raw")
    assert response.status_code == 200
    assert response.headers["content-type"] == "text/event-stream; charset=utf-8"
    # Raw items: each line break becomes a separate data line, trailing ones included
    assert response.text == (
        "data: Hello\n"
        "data: \n"
        "\n"
        "data: \n"
        "data: \n"
        "\n"
        "data: Hello\n"
        "data: \n"
        "data: \n"
        "\n"
        "data: Hello\n"
        "data: World\n"
        "\n"
        "data: A\u2028B\n"
        "\n"
    )
```

@Zawwarsami16, thanks!
Author
Thanks @YuriiMotov, pushed 866e577 addressing all three:
YuriiMotov
approved these changes
May 27, 2026
Member
YuriiMotov
left a comment
LGTM!
@Zawwarsami16, thank you!
Passing this to Sebastian for the final review
Closes #15500.

What's happening

`format_sse_event` in `fastapi/sse.py` uses `str.splitlines()` to split `data_str` (and `comment`) into SSE lines. `splitlines()` is wrong on two counts here:

1. It drops trailing empty strings: `"Hello\n".splitlines()` is `["Hello"]`, so the trailing `data:` line gets eaten, corrupting any payload that ends with a newline. This is the bug in the linked issue.
2. It treats 8 extra characters as line breaks: `\v`, `\f`, `\x1c`, `\x1d`, `\x1e`, `\x85`, U+2028, U+2029, per the Python docs. SSE only recognizes `\n`, `\r\n`, and `\r` per the spec, so a payload like `"A\u2028B"` was being silently split into two `data:` lines.

The fix

A small helper, `_split_sse_lines`, normalizes `\r\n` and lone `\r` to `\n`, then splits on `\n`. That preserves trailing empties (because `split("\n")` keeps them), and only recognizes the three line terminators that SSE actually defines.

Both `data:` and `comment:` branches now go through the helper.

Tests

8 new unit tests in `tests/test_sse.py` that directly exercise `format_sse_event`:

| Test | Input | Expected output |
|---|---|---|
| `preserves_trailing_newline` | `"Hello\n"` | `b"data: Hello\ndata: \n\n"` |
| `preserves_trailing_double_newline` | `"Hello\n\n"` | `b"data: Hello\ndata: \ndata: \n\n"` |
| `single_newline_data` | `"\n"` | `b"data: \ndata: \n\n"` |
| `crlf_normalizes_to_lf` | `"Hello\r\nWorld"` | `b"data: Hello\ndata: World\n\n"` |
| `bare_cr_treated_as_line_break` | `"Hello\rWorld"` | `b"data: Hello\ndata: World\n\n"` |
| `unicode_line_separator_not_split` | `"A\u2028B"` | `b"data: A\u2028B\n\n"` (NOT split) |
| `vertical_tab_not_split` | `"A\vB"` | `b"data: A\vB\n\n"` (NOT split) |
| `comment_preserves_trailing_newline` | `comment="hi\n"` | `b": hi\n: \n\n"` |

The U+2028 / `\v` tests are the regression catchers: they distinguish a correct fix (`split("\n")` after normalizing) from the naive fix (`splitlines()`) and from a wrong fix that splits on every Unicode line break.

All 26 tests in `tests/test_sse.py` pass:

```
tests/test_sse.py .......................... [100%]
26 passed in 1.94s
```

No production code outside `format_sse_event` was touched.
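For readers who want to see the normalize-then-split approach in isolation, here is a minimal sketch. It is not the PR's actual code: `split_sse_lines_sketch`, `format_data_sketch`, and `decode_data_sketch` are hypothetical names, the formatter handles only the `data:` field, and the decoder assumes every data line starts with exactly `data: `.

```python
def split_sse_lines_sketch(value: str) -> list[str]:
    """Hypothetical stand-in for the helper: split on \\r\\n, \\r, and \\n only."""
    # Normalize CRLF first so that it is not counted as two line breaks.
    normalized = value.replace("\r\n", "\n").replace("\r", "\n")
    # split("\n") keeps trailing empty strings, unlike splitlines().
    return normalized.split("\n")


def format_data_sketch(value: str) -> bytes:
    """Emit one data line per payload line, then the blank line ending the event."""
    lines = [f"data: {line}\n" for line in split_sse_lines_sketch(value)]
    return ("".join(lines) + "\n").encode("utf-8")


def decode_data_sketch(event: bytes) -> str:
    """Simplified client side: join the data line values with a newline."""
    prefix = "data: "
    lines = event.decode("utf-8").split("\n")
    values = [line[len(prefix):] for line in lines if line.startswith(prefix)]
    return "\n".join(values)


for payload in ["Hello\n", "\n", "Hello\r\nWorld", "A\u2028B", "A\vB"]:
    wire = format_data_sketch(payload)
    print(repr(payload), "->", wire, "->", repr(decode_data_sketch(wire)))
```

In this sketch, a payload ending in a newline survives the round trip because the trailing empty `data:` line is kept on the wire. CRLF and bare CR come back as `\n`, which matches the normalization described above.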