diff mbox series

diff.c: use utf8_strwidth() instead of strlen() for display width

Message ID pull.1668.git.1708281443289.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series diff.c: use utf8_strwidth() instead of strlen() for display width | expand

Commit Message

Chandra Pratap Feb. 18, 2024, 6:37 p.m. UTC
From: Chandra Pratap <chandrapratap3519@gmail.com>

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    diff.c: use utf8_strwidth() instead of strlen() for display width

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1668%2FChand-ra%2Fdiff-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1668/Chand-ra/diff-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1668

 diff.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)


base-commit: 2996f11c1d11ab68823f0939b6469dedc2b9ab90

Comments

Patrick Steinhardt Feb. 19, 2024, 6:23 a.m. UTC | #1
On Sun, Feb 18, 2024 at 06:37:23PM +0000, Chandra Pratap via GitGitGadget wrote:
> From: Chandra Pratap <chandrapratap3519@gmail.com>
> 
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     diff.c: use utf8_strwidth() instead of strlen() for display width
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1668%2FChand-ra%2Fdiff-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1668/Chand-ra/diff-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1668
> 
>  diff.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/diff.c b/diff.c
> index ccfa1fca0d0..02d60af6749 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2712,13 +2712,8 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  	 * making the line longer than the maximum width.
>  	 */
>  
> -	/*
> -	 * NEEDSWORK: line_prefix is often used for "log --graph" output
> -	 * and contains ANSI-colored string.  utf8_strnwidth() should be
> -	 * used to correctly count the display width instead of strlen().
> -	 */
>  	if (options->stat_width == -1)
> -		width = term_columns() - strlen(line_prefix);
> +		width = term_columns() - utf8_strwidth(line_prefix);

It would be nice to add a test demonstrating that this indeed fixes an
issue. This would also help to keep this from regressing in the future.

Also, do you know why we didn't use `utf8_strwidth()` right from the
start? It would have saved the writer some time to just use
`utf8_strwidth()` instead of writing a whole paragraph explaining that
we should do it eventually. Makes me wonder whether there is anything
else going on here.

Patrick

>  	else
>  		width = options->stat_width ? options->stat_width : 80;
>  	number_width = decimal_width(max_change) > number_width ?
> 
> base-commit: 2996f11c1d11ab68823f0939b6469dedc2b9ab90
> -- 
> gitgitgadget
>
Junio C Hamano Feb. 19, 2024, 7:01 a.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> Also, do you know why we didn't use `utf8_strwidth()` right from the
> start? It would have saved the writer some time to just use
> `utf8_strwidth()` instead of writing a whole paragraph explaining that
> we should do it eventually. Makes me wonder whether there is anything
> else going on here.

I suspect that it is because it is not just that single strlen()
call.  The code assumed that byte count and display width were
interchangeable, and use of strlen() there was merely an example.
Starting there, the value returned by strlen() is treated as if it
were interchangeable with a display width, and then later used to
count how many bytes to trim from either end of the string so that
the trimmed string would eventually fit a given display width, which
means that the code to compute how much to trim (which is not that
strlen() the patch in question is touching) would have to compute
the reverse, i.e. "if we need to recover N display columns, how many
bytes do we need to trim from that string?"
diff mbox series

Patch

diff --git a/diff.c b/diff.c
index ccfa1fca0d0..02d60af6749 100644
--- a/diff.c
+++ b/diff.c
@@ -2712,13 +2712,8 @@  static void show_stats(struct diffstat_t *data, struct diff_options *options)
 	 * making the line longer than the maximum width.
 	 */
 
-	/*
-	 * NEEDSWORK: line_prefix is often used for "log --graph" output
-	 * and contains ANSI-colored string.  utf8_strnwidth() should be
-	 * used to correctly count the display width instead of strlen().
-	 */
 	if (options->stat_width == -1)
-		width = term_columns() - strlen(line_prefix);
+		width = term_columns() - utf8_strwidth(line_prefix);
 	else
 		width = options->stat_width ? options->stat_width : 80;
 	number_width = decimal_width(max_change) > number_width ?