Message ID | c698805f088e0643e5faf027d4eaa6de14d6c1ff.1739918546.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | gitk: Fixing file name encoding issues. | expand |
"Kazuhiro Kato via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Kazuhiro Kato <kazuhiro.kato@hotmail.co.jp> Here is a place to give a bit more context. In what way the current code is wrong, what end-user visible symptoms are brought due to that wrongness, what is the correct way to implement it, etc. > Signed-off-by: Kazuhiro Kato <kazuhiro.kato@hotmail.co.jp> > --- > gitk-git/gitk | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/gitk-git/gitk b/gitk-git/gitk > index 88951ed2384..f4f8dbd5fad 100755 > --- a/gitk-git/gitk > +++ b/gitk-git/gitk > @@ -8205,12 +8205,13 @@ proc parseblobdiffline {ids line} { > > if {$type eq "--cc"} { > # start of a new file in a merge diff > - set fname [string range $line 10 end] > + set fname_raw [string range $line 10 end] > + set fname [encoding convertfrom $fname_raw] Is this "the Tcl read from git things as sequence of bytes, not characters, so somebody needs to pass the bytes to "encoding" function to turn them into a sequence of characters? Unless everything is US-ASCII, that is. If that is the case, presumably the $line has a sequence of bytes, so it may be wrong to chop it at 10th position (presumably that's 10th byte, not 10th character) when we are trying to teach the code to deal with non-ASCII data, no? I am reasonably sure that [string length "diff --git"] is where 10 comes from, and that prefix will always be in ASCII, but it feels safer and kosher if we converted the whole line first and then chopped off the prefix. The patch title says Japanese, but I would imagine this applies to anything non-ASCII, so it would be better to retitle the patch to say "non-ASCII" instead to signal that the issue the patch fixes applies more widely. Thanks.
diff --git a/gitk-git/gitk b/gitk-git/gitk index 88951ed2384..f4f8dbd5fad 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -8205,12 +8205,13 @@ proc parseblobdiffline {ids line} { if {$type eq "--cc"} { # start of a new file in a merge diff - set fname [string range $line 10 end] + set fname_raw [string range $line 10 end] + set fname [encoding convertfrom $fname_raw] if {[lsearch -exact $treediffs($ids) $fname] < 0} { lappend treediffs($ids) $fname add_flist [list $fname] } - + set fname $fname_raw } else { set line [string range $line 11 end] # If the name hasn't changed the length will be odd, @@ -8310,6 +8311,7 @@ proc parseblobdiffline {ids line} { set diffinhdr 0 return } + set line [encoding convertfrom $line] $ctext insert end "$line\n" filesep } else {