Skip to content

Update CRLF filtering to match modern git#4904

Merged
ethomson merged 11 commits intomasterfrom
ethomson/crlf
Jan 3, 2019
Merged

Update CRLF filtering to match modern git#4904
ethomson merged 11 commits intomasterfrom
ethomson/crlf

Conversation

@ethomson
Copy link
Copy Markdown
Member

@ethomson ethomson commented Dec 3, 2018

#3285 indicated that we had issues with compatibility when CRLF-filtering files when adding them to the index. This takes that pull request with a few modifications:

  1. Move the existing crlf_data folders from posix and windows to have a suffix indicating that they're the "to workdir" direction. This will let us add the "to odb" folders in the same location.
  2. Instead of creating a second script to generate a corpus of CRLF test data, this modifies the existing test script. This means that instead of four invocations (the to-workdir and to-odb scripts on both POSIX and Windows) we now only have two invocations (the CRLF script on both POSIX and Windows).
  3. I've tried to simplify the tests that use this corpus a bit - and especially important for debugging, adding a message that indicates which combination of autocrlf / safecrlf / attributes settings is failing.
  4. Instead of trying to reverse engineer git's settings, I've just taken the changes that they've made. There were a number of changes made by Torsten Bögershausen, and since he gave permission for us to use his changes we can simple examine what he did to git and replicate it ourselves. I've done that here.
  5. Cherry-pick the remaining test cases that libgit2 does not work the same way as vanilla git does for adding files to index #3285 added.

Big thanks to @csware for all the work on this. Having the test corpus from git is a huge help to ensure that we are getting this right.

Move the crlf_data folders reponsible for holding the state of the
filters going into the working directory to "to_workdir" variations of
the folder name to accommodate future growth into the "to odb" filter
variation.  Update the script to create these new folders as appopriate.
@ethomson
Copy link
Copy Markdown
Member Author

ethomson commented Dec 3, 2018

There were some questions in #3285 about the value of checking in all this test data. Yes, it is obnoxious but very useful. It's hard to generate this on-the-fly since:

  1. We would require a dependency on having git on the remote system. And although it's a pretty safe bet that git is installed on anybody's machine who's developing this (or any build system that's running a CI build), it's not clear what version of git is installed. This system lets us decide what version of git we want to emulate.
  2. More importantly, creating all these files ends up with some disk I/O and a lot of execs of various git commands. In other words, it's slow on Windows. Really slow. Checking in the data is definitely preferred.

We could do something like check in a zip that we extract, which would have the benefit of only having a single file checked in to the repository, but that feels like the only benefit and otherwise feels icky to me.

ethomson and others added 9 commits December 3, 2018 17:01
After sandboxing the crlf directory, remove the working directory
contents.  This allows us to package data within the crlf directory
(for simplicity, this allows us to script the to-odb and to-workdir
crlf filter conversion data in a single location).
Include a shell script that will generate the expected data of OIDs and
failures for calling git.git to capture its output as a test resource.

Right now, there is no need to differentiate different systems as git behaves
the same on all systems IIRC.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Re-use the existing crlf data generation script for creating the to-odb
dataset.  Also, store the actual file contents instead of the ID so that
we can identify differences instead of detecting that differences exist.
Use the crlf data scripts to produce a corpus of known-good data in
"git" format (aka ODB format) from a variety of files with different
line endings.  `git` created these files running `git add` to stage the
contents then extracting the data from the repository.

We'll use these to ensure that we create identical contents when we add
files into the index.
Given a variety of combinations of core.autocrlf, core.safecrlf settings
and attributes settings, test that we add files to index the same way
(regarding OIDs and fatal errors) as a known-good test resource created
by git.git.

Signed-off-by: Sven Strickroth <email@cs-ware.de>
Don't simply fail when the expected output does not match the data in
the index; instead, provide a detailed output about the system, file,
and settings that caused the failure so that developers can better
isolate the problem(s).
Examine the recent CRLF changes to git by Torsten Bögershausen and
include similar changes to update our CRLF logic to match.

Note: Torsten Bögershausen has previously agreed to allow his changes to
be included in libgit2.
This is a cherry-pick of the tests from the following commits:

core.autocrlf=true and core.safecrlf=true did not fail on LF-only file as vanilla git does
Adding a CRLF-file with core.autocrlf=input and core.safecrlf=true does not fail as with vanilla git
Make files with #CR!=#CRLF not fail with core.safecrlf=true

Reported-by: Yue Lin Ho <b8732003@student.nsysu.edu.tw>
Signed-off-by: Sven Strickroth <email@cs-ware.de>
Comment thread tests/checkout/crlf.c Outdated
Comment thread tests/checkout/crlf.c
* check out over it.
*/
git_buf_puts(&reponame, "crlf");
cl_git_pass(git_path_direach(&reponame, 0, unlink_file, NULL));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use your nifty new workdir iterator for that?

Comment thread tests/index/crlf.c
Comment thread tests/index/crlf.c Outdated
Comment thread tests/index/crlf.c
Comment thread tests/index/crlf.c Outdated
Comment thread tests/index/crlf.c
cl_assert_equal_i(GITERR_FILTER, giterr_last()->klass);
cl_assert_equal_s(expected_contents.ptr, giterr_last()->message);
} else {
cl_fail("unexpected index failure");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. Okay, so you do already fix up a lot of these. Maybe want to squash those two commits together?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I did that so that @csware's commits kept him as the author to make sure that he got credit in the log.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks ;)

Comment thread tests/index/crlf.c
Comment thread tests/index/crlf.c
Comment thread tests/index/crlf.c
Comment thread src/crlf.c
filename);
else
giterr_set(
GITERR_FILTER, "CRLF would be replaced by LF");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, how I wish that git'd just track the files you add, without ever caring about their contents at all and leaving that problem to the user instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With SafeCRLF enabled Git must care for the content.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, that's my point ;)

@ethomson
Copy link
Copy Markdown
Member Author

Thanks, @pks-t, I'm going through and resolving your comments as I fix them locally. I'm stepping away now but I'll push up fixes shortly.

Wrap function calls in the `checkout::crlf` tests with `cl_git_pass`,
`cl_assert`, etc. to ensure that they're successful.
@ethomson
Copy link
Copy Markdown
Member Author

OK, I added some missing cl_git_pass and cl_asserts. I did it in a new commit instead of trying to fixup the prior commits - again, to keep authorship straight (and because it's a bit easier). 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants