multipart/form-data parsing appears to truncate a file part by 1 byte #3077

@swell-d

Guys, sorry, I'm emotional because I spent 7 hours chasing a bug, and it turned out that the latest Werkzeug update, 3.1.4, was to blame. Below is text from ChatGPT explaining the problem:

In Werkzeug 3.1.4, multipart/form-data parsing appears to truncate a file part by 1 byte in a reproducible edge case. The same code and client request work correctly with Werkzeug 3.1.3.

This manifests during chunked uploads from the browser: a specific chunk at a specific byte offset is parsed with a length of expected - 1 bytes, while a raw (non-multipart) upload of the exact same Blob is read at its full length. This suggests a regression in multipart parsing in 3.1.4.

Steps to reproduce:

  1. Run the minimal Flask app below.
  2. Use the HTML test page below to select a large file and run the probe.
  3. Observe that with Werkzeug 3.1.4 the multipart endpoint reports one chunk as 1 byte shorter, while the raw endpoint reports the correct size.
  4. Downgrade to Werkzeug 3.1.3 and repeat: the multipart endpoint reports correct sizes for all tested chunks.

Minimal reproducible example (server):

from flask import Flask, request, jsonify
from flask_login import LoginManager

app = Flask(__name__)
app.secret_key = "test"

# Flask-Login is initialized as in the original app; no route here requires a login.
login_manager = LoginManager(app)

@app.route("/upload_probe_raw", methods=["POST"])
def upload_probe_raw():
    offset = request.headers.get("X-Offset", type=int)
    expected = request.headers.get("X-Expected", type=int)
    data = request.get_data(cache=False) or b""
    return jsonify({
        "offset": offset,
        "expected": expected,
        "content_length": request.content_length,
        "data_len": len(data),
    })

@app.route("/upload_probe_form", methods=["POST"])
def upload_probe_form():
    offset = request.form.get("offset", type=int)
    expected = request.form.get("expected", type=int)

    f = request.files.get("file")
    if not f:
        return jsonify({"error": "no file"}), 400

    data = f.read() or b""

    return jsonify({
        "offset": offset,
        "expected": expected,
        "content_length": request.content_length,
        "file_len": len(data),
    })

if __name__ == "__main__":
    app.run(port=5000, debug=True)

Minimal reproducible example (client):

<input type="file" id="fileInput">
<button id="probeBtn">Probe upload</button>
<pre id="out"></pre>

<script>
const chunkSize = 10 * 1024 * 1024;
const badIndex = 25;

function log(s) {
  document.getElementById("out").textContent += s + "\n";
}

async function sendRaw(blob, offset, expected) {
  const r = await fetch("/upload_probe_raw", {
    method: "POST",
    headers: {
      "Content-Type": "application/octet-stream",
      "X-Offset": String(offset),
      "X-Expected": String(expected)
    },
    body: blob
  });
  return await r.json();
}

async function sendForm(blob, offset, expected) {
  const fd = new FormData();
  fd.append("offset", String(offset));
  fd.append("expected", String(expected));
  fd.append("file", blob, "chunk.bin");

  const r = await fetch("/upload_probe_form", {
    method: "POST",
    body: fd
  });
  return await r.json();
}

document.getElementById("probeBtn").onclick = async () => {
  const out = document.getElementById("out");
  out.textContent = "";

  const f = document.getElementById("fileInput").files[0];
  if (!f) {
    log("No file selected");
    return;
  }

  const indices = [badIndex - 1, badIndex, badIndex + 1];

  log(`File: ${f.name}`);
  log(`File size: ${f.size}`);
  log(`Chunk size: ${chunkSize}`);
  log("");

  for (const idx of indices) {
    const offset = idx * chunkSize;
    const end = Math.min(offset + chunkSize, f.size);
    const expected = end - offset;
    const blob = f.slice(offset, end);

    log(`Index ${idx} offset ${offset}`);
    log(`slice.size ${blob.size} expected ${expected}`);

    const rawRes = await sendRaw(blob, offset, expected);
    log(`RAW -> data_len ${rawRes.data_len} content_length ${rawRes.content_length}`);

    const formRes = await sendForm(blob, offset, expected);
    log(`FORM -> file_len ${formRes.file_len} content_length ${formRes.content_length}`);

    log("");
  }
};
</script>
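
The client above needs a browser. As a rough sketch, the same three-part form (`offset`, `expected`, `file`) can be built and posted with the Python standard library alone. The names `build_multipart` and `post_chunk` and the `----PyProbe` boundary prefix are hypothetical and not part of the original report; a browser may emit a different boundary, so this is an approximation of the request, not a byte-for-byte copy.

```python
import json
import urllib.request
import uuid


def build_multipart(offset, expected, chunk, boundary=None):
    """Return (content_type, body) for a form with 'offset', 'expected' and 'file' parts."""
    # Hypothetical default boundary; browsers generate their own.
    boundary = boundary or f"----PyProbe{uuid.uuid4().hex}"
    delim = f"--{boundary}\r\n".encode()
    parts = []
    # Two plain text fields, in the same order as the FormData in the HTML page.
    for name, value in (("offset", offset), ("expected", expected)):
        header = f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(delim + header.encode() + str(value).encode() + b"\r\n")
    # The file part carrying the raw chunk bytes.
    parts.append(
        delim
        + b'Content-Disposition: form-data; name="file"; filename="chunk.bin"\r\n'
        + b"Content-Type: application/octet-stream\r\n\r\n"
        + chunk
        + b"\r\n"
    )
    # Closing delimiter.
    parts.append(f"--{boundary}--\r\n".encode())
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


def post_chunk(url, offset, expected, chunk):
    """POST one chunk to the probe endpoint and return the decoded JSON reply."""
    content_type, body = build_multipart(offset, expected, chunk)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

To use it, read 10485760 bytes at offset 262144000 from the test file and call `post_chunk("http://127.0.0.1:5000/upload_probe_form", 262144000, 10485760, chunk)`. With a 38-character boundary, the envelope built here adds 402 bytes to the payload, which equals the difference between `content_length` and the chunk size in the reported output. Whether this script triggers the truncation has not been verified; the bug may depend on the exact boundary the browser generates.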

Observed output with Werkzeug 3.1.4:

Index 24 offset 251658240
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485760 content_length 10486162

Index 25 offset 262144000
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485759 content_length 10486162

Index 26 offset 272629760
slice.size 10485760 expected 10485760
RAW -> data_len 10485760 content_length 10485760
FORM -> file_len 10485760 content_length 10486162

With Werkzeug 3.1.3, FORM -> file_len matches the expected size for all tested chunks, including index 25.
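
A quick check of the numbers above, as a sketch: the multipart `content_length` is the same for all three requests, so the same number of bytes reached the server each time, and only the parsed length of index 25 is short. The `observed` dictionary simply restates the logged values; `summarize` is a hypothetical helper name.

```python
# Values copied from the 3.1.4 output in this report.
observed = {
    24: {"expected": 10485760, "file_len": 10485760, "content_length": 10486162},
    25: {"expected": 10485760, "file_len": 10485759, "content_length": 10486162},
    26: {"expected": 10485760, "file_len": 10485760, "content_length": 10486162},
}


def summarize(rows):
    """Per chunk: multipart envelope size and number of bytes missing after parsing."""
    return {
        idx: {
            "envelope": row["content_length"] - row["expected"],
            "missing": row["expected"] - row["file_len"],
        }
        for idx, row in rows.items()
    }


summary = summarize(observed)
for idx, s in summary.items():
    print(f"index {idx}: envelope {s['envelope']} bytes, missing {s['missing']} byte(s)")
```

The envelope is 402 bytes in every case and only index 25 is missing a byte, which points at the parsing step and not at the transfer.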

There is no exception/traceback; this is a silent data truncation of 1 byte.

Expected behavior:

Multipart/form-data parsing should yield the exact byte sequence sent by the client. The uploaded file part length should match the expected chunk length. Specifically, for offset 262144000 with a 10 MiB chunk size, the parsed file length should be 10485760, not 10485759.
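
Because the truncation is silent, a server that reassembles chunks may want to reject a short part explicitly. The following is a minimal sketch of such a guard, assuming the client sends the expected length (as the probe already does) and optionally a SHA-256 hex digest. `ChunkMismatch` and `check_chunk` are hypothetical names and not part of Flask or Werkzeug.

```python
import hashlib


class ChunkMismatch(ValueError):
    """Raised when a received chunk does not match what the client announced."""


def check_chunk(data, expected_len, expected_sha256=None):
    """Raise ChunkMismatch if the length (and optional SHA-256 digest) does not match."""
    if len(data) != expected_len:
        raise ChunkMismatch(f"chunk length {len(data)} != expected {expected_len}")
    if expected_sha256 is not None:
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected_sha256.lower():
            raise ChunkMismatch("chunk SHA-256 digest does not match")
```

In `upload_probe_form`, calling `check_chunk(data, expected)` after `f.read()` would turn the 10485759-byte case into an explicit error that the view can report instead of a silently corrupted upload.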

Environment:

  • Python version: 3.13.10
  • Werkzeug version: 3.1.4 (regression), 3.1.3 (works)
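
To confirm which versions are actually installed in the environment that runs the app, a small sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package list is an assumption based on the imports in the server example.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("python", sys.version.split()[0])
for name in ("werkzeug", "flask", "flask-login"):
    print(name, installed_version(name) or "not installed")
```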
