Skip to content

gh-143927: Normalize all line endings (CR, CRLF, and LF) in configparser#143929

Merged
encukou merged 1 commit into
python:mainfrom
sethmlarson:normalize-newlines-configparser
Jun 23, 2026
Merged

gh-143927: Normalize all line endings (CR, CRLF, and LF) in configparser#143929
encukou merged 1 commit into
python:mainfrom
sethmlarson:normalize-newlines-configparser

Conversation

@sethmlarson

@sethmlarson sethmlarson commented Jan 16, 2026

Copy link
Copy Markdown
Contributor

Comment thread Lib/configparser.py
value)
if value is not None or not self._allow_no_value:
value = delimiter + str(value).replace('\n', '\n\t')
# Convert all possible line-endings into '\n\t'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the exact underlying security issue. As is, I think this will break round tripping on Windows; Text I/O used by the open expects to encode/decode newlines in particular formats on Windows by default (https://github.com/sethmlarson/cpython/blob/2a122fd420f5b425fd39848a48f5eb196cee2aa7/Lib/configparser.py#L753). See #143428 (comment) for a recent related case + more background links.

In the open calls can we use newline="" potentially?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Byte-for-byte round-tripping is already kind of broken for newlines. This is a text file; relying on newline format (or e.g. trailing whitespace) is a bad idea to begin with.

In the open calls can we use newline="" potentially?

Potentially, but it would change the behavior for all users on Windows.
And it wouldn't solve the issue, since users can pass in a file they opened themselves.

@github-actions

github-actions Bot commented May 5, 2026

Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale Stale PR or inactive for long period of time. label May 5, 2026
@serhiy-storchaka serhiy-storchaka added the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label May 30, 2026
@github-actions github-actions Bot removed the stale Stale PR or inactive for long period of time. label Jun 2, 2026
Comment thread Lib/configparser.py
value)
if value is not None or not self._allow_no_value:
value = delimiter + str(value).replace('\n', '\n\t')
# Convert all possible line-endings into '\n\t'

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Byte-for-byte round-tripping is already kind of broken for newlines. This is a text file; relying on newline format (or e.g. trailing whitespace) is a bad idea to begin with.

In the open calls can we use newline="" potentially?

Potentially, but it would change the behavior for all users on Windows.
And it wouldn't solve the issue, since users can pass in a file they opened themselves.

Comment thread Lib/configparser.py
Comment on lines +989 to +990
value = (delimiter + str(value).replace('\r\n', '\n')
.replace('\r', '\n').replace('\n', '\n\t'))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my system, re is significantly faster than three replaces for reasonable-sized input.

Suggested change
value = (delimiter + str(value).replace('\r\n', '\n')
.replace('\r', '\n').replace('\n', '\n\t'))
value = delimiter + re.sub('\r\n?|\n', '\n\t', str(value))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, re is about four times slower than three replacess on my macOS:

p --version --version
Python 3.15.0b2 (v3.15.0b2:94a64bbc6ce, Jun  2 2026, 11:57:29) [Clang 21.0.0 (clang-2100.1.1.101)]p -m timeit -s "v = 'lorem ipsum dolor sit amet '*5" "v.replace('\r\n', '\n').replace('\r', '\n').replace('\n', '\n\t')"
2000000 loops, best of 5: 189 nsec per loopp -m timeit -s "import re; v = 'lorem ipsum dolor sit amet '*5" "re.sub('\r\n?|\n', '\n\t', v)"
500000 loops, best of 5: 773 nsec per loopp -m timeit -s "v = ('lorem ipsum dolor sit amet '*5 + '\n')*5" "v.replace('\r\n', '\n').replace('\r', '\n').replace('\n', '\n\t')"
500000 loops, best of 5: 936 nsec per loopp -m timeit -s "import re; v = ('lorem ipsum dolor sit amet '*5 + '\n')*5" "re.sub('\r\n?|\n', '\n\t', v)"
100000 loops, best of 5: 3.43 usec per loopp -m timeit -s "v = 'a\r\nb\nc\rd ' * 200" "v.replace('\r\n', '\n').replace('\r', '\n').replace('\n', '\n\t')"
20000 loops, best of 5: 10.2 usec per loopp -m timeit -s "import re; v = 'a\r\nb\nc\rd ' * 200" "re.sub('\r\n?|\n', '\n\t', v)"
5000 loops, best of 5: 43.1 usec per loopp -m timeit -s "v = 'a\r\nb\nc\rd ' * 10000" "v.replace('\r\n', '\n').replace('\r', '\n').replace('\n', '\n\t')"
500 loops, best of 5: 516 usec per loopp -m timeit -s "import re; v = 'a\r\nb\nc\rd ' * 10000" "re.sub('\r\n?|\n', '\n\t', v)"
100 loops, best of 5: 2.07 msec per loop

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's keep the replaces.

@encukou encukou merged commit 5858e42 into python:main Jun 23, 2026
70 checks passed
@miss-islington-app

Copy link
Copy Markdown

Thanks @sethmlarson for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11, 3.12, 3.13, 3.14, 3.15.
🐍🍒⛏🤖

@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152002 is a backport of this pull request to the 3.15 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label Jun 23, 2026
@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152003 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.14 bugs and security fixes label Jun 23, 2026
@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152004 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.13 bugs and security fixes label Jun 23, 2026
@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152005 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.12 only security fixes label Jun 23, 2026
@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152006 is a backport of this pull request to the 3.11 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.11 only security fixes label Jun 23, 2026
@bedevere-app

bedevere-app Bot commented Jun 23, 2026

Copy link
Copy Markdown

GH-152007 is a backport of this pull request to the 3.10 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.10 only security fixes label Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type-security A security issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants