Skip to content

Fix TAP output#2320

Open
pks-gitlab wants to merge 9 commits into
git:masterfrom
pks-gitlab:b4/pks-t7527-fix-tap-output
Open

Fix TAP output#2320
pks-gitlab wants to merge 9 commits into
git:masterfrom
pks-gitlab:b4/pks-t7527-fix-tap-output

Conversation

@pks-gitlab
Copy link
Copy Markdown

Test PR to check what breaks.

Hi,

this small patch series fixes another instance of broken TAP output that
has landed via 4d11b9c (Merge branch 'pt/fsmonitor-linux', 2026-05-31).

As this has happened multiple times by now I decided to have a look at
whether we can fix this class of issues a bit more holistically. So this
series also contains a change that makes prove bail out when it sees
invalid TAP output, which uncovers a small set of preexisting issues in
our test suite.

Changes in v3:
  - Fix a test gap for AlmaLinux and Debian in GitLab CI, which uncovers
    an issue flagged by Peff.
  - Fix TAP breakage in t7810.
  - Link to v2: https://patch.msgid.link/20260603-pks-t7527-fix-tap-output-v2-0-cf3af5694e20@pks.im

Changes in v2:
  - Fix waiting for p4d, and deduplicate the logic that does this.
  - Link to v1: https://patch.msgid.link/20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im

Thanks!

Patrick

To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>

--- b4-submit-tracking ---
# This section is used internally by b4 prep for tracking purposes.
{
  "series": {
    "revision": 3,
    "change-id": "20260601-pks-t7527-fix-tap-output-105da1d73df0",
    "prefixes": [],
    "presubject": "",
    "history": {
      "v1": [
        "20260602-pks-t7527-fix-tap-output-v1-0-db3da2a1b137@pks.im"
      ],
      "v2": [
        "20260603-pks-t7527-fix-tap-output-v2-0-cf3af5694e20@pks.im"
      ]
    }
  }
}
@gitgitgadget-git
Copy link
Copy Markdown

There are issues in commit 9b72149:
Changes in v3:

  • Commit not signed off
  • The first line must be separated from the rest by an empty line
  • Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
    Indented lines, and lines without whitespace, are exempt

@pks-gitlab pks-gitlab force-pushed the b4/pks-t7527-fix-tap-output branch from 1c12411 to b59fc32 Compare June 4, 2026 07:38
@gitgitgadget-git
Copy link
Copy Markdown

There are issues in commit 9b72149:
Changes in v3:

  • Commit not signed off
  • The first line must be separated from the rest by an empty line
  • Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
    Indented lines, and lines without whitespace, are exempt

@gitgitgadget-git
Copy link
Copy Markdown

There is an issue in commit ccc1270:
ci: unify Linux images across GitLab and GitHub

  • Commit not signed off

@pks-gitlab pks-gitlab force-pushed the b4/pks-t7527-fix-tap-output branch from b59fc32 to f75f6e9 Compare June 4, 2026 07:41
@gitgitgadget-git
Copy link
Copy Markdown

There is an issue in commit e4a45d7:
t: fix broken TAP output

  • Commit not signed off

pks-t added 8 commits June 4, 2026 09:53
Rearrange the order of Linux jobs that we have defined in GitLab CI so
that it matches the order on GitHub's side. This makes it easier to
compare whether the list of jobs actually matches on both sides.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
The GitLab CI definitions are missing jobs for AlmaLinux and Debian,
both of which exist in GitHub Workflows. Plug this gap.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
The image for the "linux-breaking-changes" job has drifted apart across
GitHub and GitLab. Adapt it to use "ubuntu:rolling" on both systems.

With this change there's only one difference remaining: GitHub uses
"ubuntu:focal" for the "linux32" job while GitLab uses "ubuntu:20.04".
These are different names for the same image, so there is no actual
difference here. Adjust GitHub to use the "20.04" tag -- this matches
all the other jobs which use version numbers, and you don't have to
learn Ubuntu's release names by heart.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Before running the tests in t7527 we first verify whether the fsmonitor
even works, which seems to depend on the actual filesystem that is in
use. The verification executes outside of any prerequisite or test body,
so its stdout/stderr is not being redirected.

The consequence of this is that any command that prints to stdout/stderr
may break the TAP specification by printing invalid lines. And in fact
we already do that, as git-init(1) prints the path to the created Git
repository by default.

Fix this issue by moving the logic into a lazy prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
In t7810 we verify whether the system has proper multibyte locale
support by executing `test-tool regex` with a unicode character. When
this check fails though we'll output an error that breaks the TAP
format.

Fix this issue by turning the logic into a lazy prerequisite.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
When tests have finished we clean up the trash directory via `rm -rf`.
On Windows this can fail with EBUSY in cases where a process still holds
some of the files open, for example when we have spawned a daemonized
process that wasn't properly terminated. We thus retry several times,
but every failure will result in error messages being printed, and that
in turn breaks the TAP output format.

One such case where this is causing issues is in t921x, which contains
tests related to Scalar. Some tests spawn the fsmonitor daemon, and we
never properly terminate it.

The obvious fix would be to ensure that we never leak any processes, but
that gets ugly fast. Instead, let's work around the issue by silencing
error messages printed by the `rm -rf` calls. We already know to print
an error when the retry loop fails, so we don't loose much.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
When stopping the p4d watchdog process via "kill -9", the shell may
print a job-control notification like:

  ./test-lib.sh: line 1269: 57960 Killed: 9               while true; do
      if test $nr_tries_left -eq 0; then
          kill -9 $p4d_pid; exit 1;
      fi; sleep 1; nr_tries_left=$(($nr_tries_left - 1));
  done 2> /dev/null 4>&2  (wd: ~)

This message is printed asynchronously by the shell when it reaps the
process. While harmless right now, this will cause breakage once we
enable strict parsing of the TAP protocol in a subsequent commit.

Fix this by using `wait` so that we can synchronously reap the watchdog
process and swallow the diagnostic.

While at it, deduplicate the logic we have in `stop_p4d_and_watchdog ()`
and `stop_and_cleanup_p4d ()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
To make the result of our tests accessible we use the TAP protocol. This
protocol is parsed by either prove or by Meson. Unfortunately, these two
tools differ when it comes to their strictness when parsing the
protocol:

  - Prove by default happily accepts lines not specified by the
    protocol.

  - Meson will also accept such lines, but prints a big and ugly warning
    message.

We have fixed our test suite in the past to not print invalid TAP lines
anymore via b1dc2e7 (Merge branch 'ps/meson-tap-parse', 2025-06-17).
But as none of our tools perform a strict check it's still possible for
broken tests to sneak back in, like for example in 362f695 (Merge
branch 'ps/t1006-tap-fix', 2025-07-16). This doesn't hurt at all when
using prove, but it's quite annoying when using Meson due to the
generated warnings.

Unfortunately, there doesn't seem to be a portable way to make all tools
complain about violations of the TAP format. The TAP 14 specification
has added pragmas to the protocol that would allow us to say `pragma
+strict`, and the effect of that would be to treat invalid TAP lines as
a test failure. But the release of TAP 14 is still rather recent, and
Test-Harness for example only gained support for it in version 3.48,
which was released in 2023.

In fact though, this pragma was already introduced as an inofficial
extension of the TAP protocol with Test-Harness 3.10, released in 2008.
So while not all tools understand the pragma, at least prove does for a
long time.

Unconditionally enable the pragma when using prove so that we'll detect
tests that emit broken TAP output right away. This would have detected
the issues fixed in preceding commits:

    $ prove t7527-builtin-fsmonitor.sh
    t7527-builtin-fsmonitor.sh .. All 69 subtests passed
            (less 6 skipped subtests: 63 okay)

    Test Summary Report
    -------------------
    t7527-builtin-fsmonitor.sh (Wstat: 0 Tests: 69 Failed: 0)
      Parse errors: Unknown TAP token: "Initialized empty Git repository in /tmp/git/test_fsmonitor_smoke/.git/"

Signed-off-by: Patrick Steinhardt <ps@pks.im>
@pks-gitlab pks-gitlab force-pushed the b4/pks-t7527-fix-tap-output branch from f75f6e9 to 012c89b Compare June 4, 2026 07:54
@gitgitgadget-git
Copy link
Copy Markdown

There is an issue in commit e4a45d7:
t: fix broken TAP output

  • Commit not signed off

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants