Skip to content

sparse-index: require commands to declare sparse-index compatibility#2143

Draft
spkrka wants to merge 3 commits into
gitgitgadget:masterfrom
spkrka:krka/sparse-index-default
Draft

sparse-index: require commands to declare sparse-index compatibility#2143
spkrka wants to merge 3 commits into
gitgitgadget:masterfrom
spkrka:krka/sparse-index-default

Conversation

@spkrka

@spkrka spkrka commented Jun 9, 2026

Copy link
Copy Markdown

This series continues the sparse-index migration work started by Derrick
Stolee in 2021 (19a0acc, "sparse-index: add guard to ensure full index").

The original plan was for each command to opt in to sparse-index
compatibility by setting command_requires_full_index = 0, and to eventually
remove the field once all commands had been audited. Most commands were
converted over the following years, but the final step was never completed.

This series finishes that work:

  1. Change the default from 1 (expand) to -1 (unset) and add an accessor
    that dies when a command reads the index without declaring its
    requirement. This turns silent fallback into a loud failure, making it
    obvious which commands still need attention.

  2. The test suite surfaces four undeclared commands: git-am,
    git-submodule--helper, git-mv, and git-grep (submodule path).

  3. After analyzing each command's index access patterns, all four are
    marked as sparse-index compatible (= 0):

    • git-am: uses index_name_pos() which auto-expands sparse directories
      on demand; three-way merge uses merge-ort which works from tree OIDs.

    • git-mv: in-cone moves cannot hit sparse directory entries by
      definition of the cone. Out-of-cone moves (--sparse flag) get a
      targeted ensure_full_index().

    • git-submodule--helper: S_ISGITLINK entries are never collapsed into
      sparse directory entries (invariant in convert_to_sparse_rec).

    • git-grep (submodule): grep_cache() explicitly handles S_ISSPARSEDIR
      by recursing via grep_tree().

Code coverage analysis across the full test suite confirms that the
ensure_full_index() calls gated on command_requires_full_index never
execute -- the field is effectively dead code today. A follow-up series
could remove it entirely once this has been in the tree for a while.

Coverage summary from a full test run with gcov instrumentation:

  • The accessor is called 12,678 times; 12,455 take the fast path
    (value explicitly set to 0), 223 hit the die (tests exercising
    undeclared commands).
  • ensure_full_index() gated on the accessor in repo_read_index(),
    do_read_index(), and unpack_trees() is never executed (0% branch
    taken across all 19,553 calls).
  • All four new declaration sites are exercised (am: 7x, mv: 5x,
    submodule: 11x, grep-subrepo: 6x).

spkrka added 2 commits June 9, 2026 15:38
Change the default value of command_requires_full_index from 1 to -1
(unset) and add an accessor that dies when the value has not been
explicitly set by the command.  This catches commands that read the
index without first declaring whether they need the full index
expanded.

Each command that has already been audited for sparse-index
compatibility sets command_requires_full_index = 0.  Commands that
have not been audited will now fail loudly instead of silently
expanding the index, making it clear which commands still need
attention.

Set GIT_ALLOW_SPARSE_INDEX_WITHOUT_DECLARATION=1 to fall back to the
old behavior (expand full index) while investigating failures.
…ands

These four code paths were found to never declare their
command_requires_full_index value.  Set them to 1 (require full index
expansion) as a safe default, with comments noting they have not yet
been verified for sparse-index compatibility.

Found by changing the default to -1 (unset) and adding an accessor
that dies when a command reads the index without first declaring its
sparse-index requirement.
@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There is an issue in commit 3585a31:
sparse-index: require commands to declare sparse-index compatibility

  • Commit not signed off

@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There is an issue in commit b6b4a96:
sparse-index: declare command_requires_full_index for undeclared commands

  • Commit not signed off

@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There are issues in commit ad770c4:
sparse-index: mark am, mv, submodule, grep as sparse-index compatible

  • Commit not signed off
  • Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
    Indented lines, and lines without whitespace, are exempt

These commands were found to have never declared command_requires_full_index.
After analysis, all four are safe with sparse index (= 0).

git-am: The patch application path uses index_name_pos() which auto-expands
sparse directory entries on demand when a specific file path is looked up.
The three-way merge fallback uses merge-ort, which works from tree OIDs and
handles sparse directories natively.  refresh_index() explicitly skips
S_ISSPARSEDIR entries.  git-rebase --apply invokes git-am as a child
process; rebase itself already declares command_requires_full_index = 0.
Tested by t1092.42 (merge, cherry-pick, and rebase) which exercises
rebase --apply through the sparse-index repo.

git-mv: For in-cone moves, all index lookups use index_name_pos() on
paths that cannot be ancestors of sparse directory entries (by
definition of the sparse cone), so no expansion is triggered.
index_range_of_same_dir() iterates only in-cone entries in this case.
For out-of-cone moves (--sparse flag), a targeted ensure_full_index()
is needed because the bulk cache[] iteration and rename_index_entry_at()
assume individual file entries, not collapsed directory stubs.
Add tests in t1092.59 (sparse-index is not expanded) to verify that
in-cone file and directory renames do not trigger index expansion.

git-submodule--helper: Submodule entries (S_ISGITLINK) are never collapsed
into sparse directory entries -- convert_to_sparse_rec() in sparse-index.c
explicitly blocks directories containing gitlinks from being collapsed.
module_list_compute() filters on S_ISGITLINK, skipping any sparse
directory entries.  module_add() already calls ensure_full_index()
unconditionally before its index iteration.
Tested by t1092.55 (submodule handling).

grep (submodule path): grep_cache() explicitly handles S_ISSPARSEDIR
entries by reading the tree object and recursing via grep_tree(), so
sparse directory contents are grepped correctly.  The subrepo index
is independent of the superproject.
Tested by t1092.85 (grep sparse directory within submodules).
@spkrka spkrka force-pushed the krka/sparse-index-default branch from ad770c4 to 0179289 Compare June 9, 2026 15:13
@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There is an issue in commit 3585a31:
sparse-index: require commands to declare sparse-index compatibility

  • Commit not signed off

@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There is an issue in commit b6b4a96:
sparse-index: declare command_requires_full_index for undeclared commands

  • Commit not signed off

@gitgitgadget

gitgitgadget Bot commented Jun 9, 2026

Copy link
Copy Markdown

There are issues in commit 0179289:
sparse-index: mark am, mv, submodule, grep as sparse-index compatible

  • Commit not signed off
  • Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
    Indented lines, and lines without whitespace, are exempt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant