
🐛 Fix line splitting in format_sse_event to comply with SSE spec #15515

Open
Zawwarsami16 wants to merge 4 commits into fastapi:master from Zawwarsami16:fix/sse-trailing-newlines

Conversation

@Zawwarsami16

Closes #15500.

What's happening

format_sse_event in fastapi/sse.py uses str.splitlines() to split data_str (and comment) into SSE lines. splitlines() is wrong on two counts here:

  1. Drops trailing empty strings. "Hello\n".splitlines() is ["Hello"], so the trailing data: line gets eaten, corrupting any payload that ends with a newline. This is the bug in the linked issue.
  2. Treats 8 extra characters as line breaks: \v, \f, \x1c, \x1d, \x1e, \x85, U+2028, U+2029 (per the Python docs). SSE only recognizes \n, \r\n, and \r per the spec, so a payload like "A\u2028B" was being silently split into two data: lines.
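
To make the two problems above concrete, here is a minimal sketch using only built-in string methods. The variable names are illustrative and do not come from fastapi/sse.py.

```python
# Two payloads from the description: one ending in "\n", one containing U+2028.
trailing = "Hello\n"
unicode_sep = "A\u2028B"

# splitlines() drops the trailing empty string and also splits on U+2028.
splitlines_trailing = trailing.splitlines()
splitlines_unicode = unicode_sep.splitlines()
assert splitlines_trailing == ["Hello"]
assert splitlines_unicode == ["A", "B"]

# split("\n") keeps the trailing empty string and leaves U+2028 untouched.
split_trailing = trailing.split("\n")
split_unicode = unicode_sep.split("\n")
assert split_trailing == ["Hello", ""]
assert split_unicode == ["A\u2028B"]
```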

The fix

A small helper, _split_sse_lines, normalizes \r\n and lone \r to \n, then splits on \n. That preserves trailing empties (because split("\n") keeps them), and only recognizes the three line terminators that SSE actually defines.

def _split_sse_lines(value: str) -> list[str]:
    return value.replace("\r\n", "\n").replace("\r", "\n").split("\n")

Both data: and comment: branches now go through the helper.

Tests

8 new unit tests in tests/test_sse.py that directly exercise format_sse_event:

| Test | Input | Output |
| --- | --- | --- |
| preserves_trailing_newline | `"Hello\n"` | `b"data: Hello\ndata: \n\n"` |
| preserves_trailing_double_newline | `"Hello\n\n"` | `b"data: Hello\ndata: \ndata: \n\n"` |
| single_newline_data | `"\n"` | `b"data: \ndata: \n\n"` |
| crlf_normalizes_to_lf | `"Hello\r\nWorld"` | `b"data: Hello\ndata: World\n\n"` |
| bare_cr_treated_as_line_break | `"Hello\rWorld"` | `b"data: Hello\ndata: World\n\n"` |
| unicode_line_separator_not_split | `"A\u2028B"` | `b"data: A\u2028B\n\n"` (NOT split) |
| vertical_tab_not_split | `"A\vB"` | `b"data: A\vB\n\n"` (NOT split) |
| comment_preserves_trailing_newline | `comment="hi\n"` | `b": hi\n: \n\n"` |

The U+2028 / \v tests are the regression catchers: they distinguish a correct fix (split("\n") after normalizing) from the naive fix (splitlines()) and from a wrong fix that splits on every Unicode line break.
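
To see why the extra `data: ` lines matter to a consumer, here is a simplified sketch of the client-side rule in the SSE spec: data lines within one event are joined with "\n". The `decode_sse_data` function is hypothetical, handles only the data field, and is not a full EventSource parser.

```python
def decode_sse_data(stream: str) -> list[str]:
    """Simplified sketch: return the data payload of each event in a stream."""
    events: list[str] = []
    buffer: list[str] = []
    # SSE recognizes only CRLF, CR, and LF as line terminators.
    for line in stream.replace("\r\n", "\n").replace("\r", "\n").split("\n"):
        if line == "":
            # Blank line: dispatch the event if any data line was seen.
            if buffer:
                events.append("\n".join(buffer))
            buffer = []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "data":
            buffer.append(value)
        # Comment lines (empty field name) and other fields are ignored here.
    return events


fixed_output = decode_sse_data("data: Hello\ndata: \n\n")
old_output = decode_sse_data("data: Hello\n\n")
assert fixed_output == ["Hello\n"]  # trailing newline survives the round trip
assert old_output == ["Hello"]      # trailing newline was lost
```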

All 26 tests in tests/test_sse.py pass:

```
tests/test_sse.py .......................... [100%]
26 passed in 1.94s
```

No production code outside format_sse_event was touched.

Zawwarsami16 and others added 2 commits May 14, 2026 03:41
…g in format_sse_event

splitlines() drops trailing empty strings and treats 8 extra characters
(\v, \f, \x1c-\x1e, \x85, U+2028, U+2029) as line breaks. SSE only
recognizes \n, \r\n, and \r per the spec, and trailing empty data lines
are part of the payload — silently dropping them corrupts the stream.

Both the data: and the comment branch were affected. Adds 8 unit tests
covering trailing-newline preservation, CRLF/CR normalization, and the
splitlines() quirks (U+2028, vertical tab) staying inside the payload.

Closes fastapi#15500
@codspeed-hq

codspeed-hq Bot commented May 14, 2026

Merging this PR will not alter performance

✅ 20 untouched benchmarks


Comparing Zawwarsami16:fix/sse-trailing-newlines (3a87839) with master (57535ef) [1]


Footnotes

  1. No successful run was found on master (ad09734) during the generation of this report, so 57535ef was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

Member

@YuriiMotov YuriiMotov left a comment


LGTM in general!
I would only simplify comments and tests a bit (see suggested changes) - IMO, comments are too verbose.

Also tested with more real-life tests (see the details)

Details
from collections.abc import AsyncIterable

from fastapi import FastAPI
from fastapi.sse import EventSourceResponse, ServerSentEvent
from fastapi.testclient import TestClient

app = FastAPI()


multiline_items = [
    "Hello\n",
    "\n",
    "Hello\n\n",
    "Hello\r\nWorld",
    "A\u2028B",
]


@app.get("/multiline-items/stream-json", response_class=EventSourceResponse)
async def sse_multiline_items_json() -> AsyncIterable[str]:
    for item in multiline_items:
        yield item


@app.get("/multiline-items/stream-raw", response_class=EventSourceResponse)
async def sse_multiline_items_raw():
    for item in multiline_items:
        yield ServerSentEvent(raw_data=item)


def test_sse_multiline_items_json() -> None:

    client = TestClient(app)
    response = client.get("/multiline-items/stream-json")
    assert response.status_code == 200
    assert response.headers["content-type"] == "text/event-stream; charset=utf-8"

    assert response.text == (
        'data: "Hello\\n"\n'
        "\n"
        'data: "\\n"\n'
        "\n"
        'data: "Hello\\n\\n"\n'
        "\n"
        'data: "Hello\\r\\nWorld"\n'
        "\n"
        'data: "A\u2028B"\n'
        "\n"
    )


def test_sse_multiline_items_raw() -> None:

    client = TestClient(app)
    response = client.get("/multiline-items/stream-raw")
    assert response.status_code == 200
    assert response.headers["content-type"] == "text/event-stream; charset=utf-8"

    assert response.text == (
        "data: Hello\n"
        "data: \n"
        "\n"
        "data: \n"
        "data: \n"
        "\n"
        "data: Hello\n"
        "data: \n"
        "data: \n"
        "\n"
        "data: Hello\n"
        "data: World\n"
        "\n"
        "data: A\u2028B\n"
        "\n"
    )

@Zawwarsami16, thanks!

Comment thread fastapi/sse.py Outdated
Comment thread tests/test_sse.py Outdated
@YuriiMotov YuriiMotov changed the title fix(sse): preserve trailing newlines + use spec-correct line splitting in format_sse_event 🐛 Fix line splitting in format_sse_event to comply with SSE spec May 18, 2026
@YuriiMotov YuriiMotov added the bug Something isn't working label May 18, 2026
@YuriiMotov

This comment was marked as resolved.

@Zawwarsami16
Author

Zawwarsami16 commented May 27, 2026

Thanks @YuriiMotov — pushed 866e577 addressing all three:

  • Shortened the comment in _split_sse_lines to a two-line summary.
  • Folded the per-case data tests into one @pytest.mark.parametrize and rewrote the literal U+2028 as \u2028 so VS Code stops flagging the file.
  • Added test_format_sse_event_keeps_empty_data_line from "fix(sse): preserve empty data lines when formatting SSE" (#15618); format_sse_event(data_str="") now has explicit coverage that it emits b"data: \n\n".

tests/test_sse.py is green locally (27 passed in 1.18s).
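
The empty-string case in the last bullet comes down to the same difference between the two methods. A minimal sketch follows; the variable names are illustrative, and the wire format is rebuilt by hand rather than by calling format_sse_event.

```python
empty_payload = ""

# splitlines() yields no lines at all, so no data: line would be emitted.
lines_via_splitlines = empty_payload.splitlines()
# split("\n") yields one empty line, which becomes a single "data: " line.
lines_via_split = empty_payload.split("\n")

assert lines_via_splitlines == []
assert lines_via_split == [""]

# Prefix each line and add the terminating blank line.
wire = "".join(f"data: {line}\n" for line in lines_via_split) + "\n"
assert wire == "data: \n\n"
```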

@github-actions github-actions Bot removed the waiting label May 27, 2026
Member

@YuriiMotov YuriiMotov left a comment


LGTM!

@Zawwarsami16, thank you!
Passing this to Sebastian for the final review

@YuriiMotov

This comment was marked as resolved.

@Zawwarsami16

This comment was marked as resolved.
