- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 6.2k
Fix yet another bug with diff file names #12771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix yet another bug with diff file names #12771
Conversation
Following further testing it has become apparent that the diff line cannot be used to determine filenames for diffs with any sort of predictability the answer therefore is to use the other lines that are provided with a diff Fix go-gitea#12768 Signed-off-by: Andrew Thornton <[email protected]>
| Codecov Report
 @@            Coverage Diff             @@
##           master   #12771      +/-   ##
==========================================
+ Coverage   43.21%   43.24%   +0.02%     
==========================================
  Files         649      649              
  Lines       72000    72046      +46     
==========================================
+ Hits        31117    31156      +39     
- Misses      35839    35844       +5     
- Partials     5044     5046       +2     
 Continue to review full report at Codecov. 
 | 
| zeripath/gitea/tree/fix-12768-I-hate-our-diff-infrastructure 😁 Haha, yes, i was very surprised about the buggy function of getting the filenames from diff, one of the most basic functionality of a git-gui. Your current approach look great. You must not forget to trim the tab-character at the end of the filename in the +++/--- lines, it results currently in broken links at "View file" . | 
| 
 What do you mean tab character? There is no tab character. | 
| oh that's interesting... a file with spacing gets \t at the end of the line. WHY GIT WHY?! | 
|  | 
Signed-off-by: Andrew Thornton <[email protected]>
| Hi @zeripath GNU itself says: blanks are not supported: https://www.gnu.org/software/diffutils/manual/html_node/Unusual-File-Names.html#Unusual%20File%20Names They append a timestamp after the filename: https://www.gnu.org/software/diffutils/manual/html_node/Example-Unified.html#Example-Unified - but this cannot be seen at GIT GIT adds a 'tab' character after the ---/+++ filename, if it contains a blank Take a look: https://github.com/git/git/blob/master/diff.c#L1504 	case DIFF_SYMBOL_FILEPAIR_MINUS:
		meta = diff_get_color_opt(o, DIFF_METAINFO);
		reset = diff_get_color_opt(o, DIFF_RESET);
		fprintf(o->file, "%s%s--- %s%s%s\n", diff_line_prefix(o), meta,
			line, reset,
			strchr(line, ' ') ? "\t" : "");
		break;test: cd testrepo 
date > 'testfile'
date > 'testfile A'
git add testfile*
git commit -m 'test"
git log -p -n 1 | grep "+++" | xxd00000000: 2b2b 2b20 622f 7465 7374 6669 6c65 0a2b  +++ b/testfile.+
00000010: 2b2b 2062 2f74 6573 7466 696c 6520 4109  ++ b/testfile A.
00000020: 0a   After the filename 'testfile A' there is a tab-character (0x09) | 
| yeah that's what I've found from looking at the git source code myself. This whole section of code needs rewriting. | 
| Hi @zeripath you were faster. Maybe my better testing: fashberg@8c6231c It iterates all variants and adds a timestamp after \t (timestamp has to be removed) | 
Signed-off-by: Andrew Thornton <[email protected]>
| @lunny I've added several testcases to the ParsePatch test and renamed it singlefile and fixed a few more issues in the PR | 
| 
 So the reason why I went for the if  			strchr(line, ' ') ? "\t" : "");I didn't want to consider what would happen if say git decided that \t was not a reason to quote a filename. In general I'm not really a fan of Split(...) because it creates (potentially n-1) strings unnecessarily, and if you really just want the 0th element you should just use Index and grab the preceding bit of the string i.e.: index := strings.Index(foo, "\t")
if index > -1 {
  foo = foo[:index]
}that way if the rest of foo contains mulitple \t you don't end up with n individual strings. (It's also worth realising that go actually copies the bytes arrays around for each of those new strings.) As I've said above this whole section of code needs rewriting with proper reference to what git does and can do. It's highly inefficient, makes bad assumptions about sizes, copies this data around multiple times and the resulting AST it creates is rather large. | 
Backport go-gitea#12771 Following further testing it has become apparent that the diff line cannot be used to determine filenames for diffs with any sort of predictability the answer therefore is to use the other lines that are provided with a diff Fix go-gitea#12768 Signed-off-by: Andrew Thornton <[email protected]>
Backport #12771 Following further testing it has become apparent that the diff line cannot be used to determine filenames for diffs with any sort of predictability the answer therefore is to use the other lines that are provided with a diff Fix #12768 Signed-off-by: Andrew Thornton <[email protected]>
go-gitea#12771 attempted to fix diff by avoiding the git diff line as it is possible to have an ambiguous line here. go-gitea#12254 attempted to fix diff by assuming that names would quoted if they needed to be and if one was quoted then both would be. Both of these were wrong. I have now discovered `--src-prefix` and `--dst-prefix` which means that we can set this in such a way to force the git diff to always be unambiguous. Therefore this PR rollsback most of the changes in go-gitea#12771 and uses these options to fix this. Signed-off-by: Andrew Thornton <[email protected]>
* Finally fix diff names #12771 attempted to fix diff by avoiding the git diff line as it is possible to have an ambiguous line here. #12254 attempted to fix diff by assuming that names would quoted if they needed to be and if one was quoted then both would be. Both of these were wrong. I have now discovered `--src-prefix` and `--dst-prefix` which means that we can set this in such a way to force the git diff to always be unambiguous. Therefore this PR rollsback most of the changes in #12771 and uses these options to fix this. Signed-off-by: Andrew Thornton <[email protected]> * Update services/gitdiff/gitdiff.go * Update services/gitdiff/gitdiff.go * Update modules/repofiles/temp_repo.go * fix test Signed-off-by: Andrew Thornton <[email protected]> Co-authored-by: Lauris BH <[email protected]>
Backport go-gitea#13136 it is possible to have an ambiguous line here. if they needed to be and if one was quoted then both would be. Both of these were wrong. I have now discovered `--src-prefix` and `--dst-prefix` which means that we can set this in such a way to force the git diff to always be unambiguous. Therefore this PR rollsback most of the changes in go-gitea#12771 and uses these options to fix this. Signed-off-by: Andrew Thornton <[email protected]>
Backport #13136 it is possible to have an ambiguous line here. if they needed to be and if one was quoted then both would be. Both of these were wrong. I have now discovered `--src-prefix` and `--dst-prefix` which means that we can set this in such a way to force the git diff to always be unambiguous. Therefore this PR rollsback most of the changes in #12771 and uses these options to fix this. Signed-off-by: Andrew Thornton <[email protected]>
Following further testing it has become apparent that the diff line
cannot be used to determine filenames for diffs with any sort of predictability
the answer therefore is to use the other lines that are provided with a diff
Fix #12768
Closes #12769
Replaces #12769
Signed-off-by: Andrew Thornton [email protected]