[GSOC] apply: address -Wsign-comparison warnings

Message ID	20250205014055.737190-1-jelly.zhao.42@gmail.com (mailing list archive)
State	New
Headers	show Received: from mail-pj1-f66.google.com (mail-pj1-f66.google.com [209.85.216.66]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19C6625A63B for <git@vger.kernel.org>; Wed, 5 Feb 2025 01:42:42 +0000 (UTC) From: Zejun Zhao <jelly.zhao.42@gmail.com> To: git@vger.kernel.org Cc: Zejun Zhao <jelly.zhao.42@gmail.com>, Junio C Hamano <gitster@pobox.com>, Elijah Newren <newren@gmail.com>, Patrick Steinhardt <ps@pks.im> Subject: [GSOC][PATCH] apply: address -Wsign-comparison warnings Date: Wed, 5 Feb 2025 01:40:55 +0000 Message-ID: <20250205014055.737190-1-jelly.zhao.42@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[GSOC] apply: address -Wsign-comparison warnings \| expand [GSOC] apply: address -Wsign-comparison warnings

Message ID

20250205014055.737190-1-jelly.zhao.42@gmail.com (mailing list archive)

State

New

Headers

From: Zejun Zhao <jelly.zhao.42@gmail.com>
To: git@vger.kernel.org
Cc: Zejun Zhao <jelly.zhao.42@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Elijah Newren <newren@gmail.com>,
	Patrick Steinhardt <ps@pks.im>
Subject: [GSOC][PATCH] apply: address -Wsign-comparison warnings
Date: Wed,  5 Feb 2025 01:40:55 +0000
Message-ID: <20250205014055.737190-1-jelly.zhao.42@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

[GSOC] apply: address -Wsign-comparison warnings | expand

Commit Message

Zejun Zhao Feb. 5, 2025, 1:40 a.m. UTC

There are several -Wsign-comparison warnings in "apply.c", which can be
classified into the following three types:

  1. comparing a `length` of `size_t` type with a result of pointer
  arithmetic of `int` type

  2. comparing a `length` of `size_t` type with a `length` of `int` type

  3. comparing a loop count `i` of `int` type with an unsigned loop
  bound

Fix these warnings following one basic principle: do not touch the
relevant logics and keep the behaviors of the code. Adopt three
different strategies for each of the above three types:

  1. cast the result of pointer arithmetic to `size_t` type

  2. try to change the type of the `length` to `size_t` (may fall back
  to Strategy 1 if the variable is not guaranteed to be unsigned)

  3. use a loop count `i` of `size_t` type

Signed-off-by: Zejun Zhao <jelly.zhao.42@gmail.com>
---
 apply.c | 73 +++++++++++++++++++++++++++------------------------------
 apply.h |  6 ++---
 2 files changed, 38 insertions(+), 41 deletions(-)


base-commit: bc204b742735ae06f65bb20291c95985c9633b7f

Comments

Patrick Steinhardt Feb. 5, 2025, 7:47 a.m. UTC | #1

On Wed, Feb 05, 2025 at 01:40:55AM +0000, Zejun Zhao wrote:
> There are several -Wsign-comparison warnings in "apply.c", which can be
> classified into the following three types:
> 
>   1. comparing a `length` of `size_t` type with a result of pointer
>   arithmetic of `int` type
> 
>   2. comparing a `length` of `size_t` type with a `length` of `int` type
> 
>   3. comparing a loop count `i` of `int` type with an unsigned loop
>   bound
> 
> Fix these warnings following one basic principle: do not touch the
> relevant logics and keep the behaviors of the code. Adopt three
> different strategies for each of the above three types:
> 
>   1. cast the result of pointer arithmetic to `size_t` type
> 
>   2. try to change the type of the `length` to `size_t` (may fall back
>   to Strategy 1 if the variable is not guaranteed to be unsigned)
> 
>   3. use a loop count `i` of `size_t` type

You should split up this patch into a series, as it is really hard to
follow what's going on. There are a couple of things happening:

  - You change types in `struct apply_state`, which bubbles up.

  - You adapt `git_hdr_len()` to receive different inputs, which bubbles
    up.

  - You perform small fixes in several places.

It might also be a good idea to split out the loop counters into a
separate commit, as those are trivially correct.

> Signed-off-by: Zejun Zhao <jelly.zhao.42@gmail.com>
> ---
>  apply.c | 73 +++++++++++++++++++++++++++------------------------------
>  apply.h |  6 ++---
>  2 files changed, 38 insertions(+), 41 deletions(-)
> 
> diff --git a/apply.c b/apply.c
> index 4a7b6120ac..0c7b89dc3a 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -8,7 +8,6 @@
>   */
>  
>  #define USE_THE_REPOSITORY_VARIABLE
> -#define DISABLE_SIGN_COMPARE_WARNINGS
>  
>  #include "git-compat-util.h"
>  #include "abspath.h"
> @@ -540,7 +539,7 @@ static size_t date_len(const char *line, size_t len)
>  	    !isdigit(*p++) || !isdigit(*p++))	/* Not a date. */
>  		return 0;
>  
> -	if (date - line >= strlen("19") &&
> +	if ((size_t) (date - line) >= strlen("19") &&

We know that `date` is always bigger than or equal to `line`, so this is
correct.

> @@ -1087,11 +1086,11 @@ static int gitdiff_index(struct gitdiff_data *state,
>  	 * and optional space with octal mode.
>  	 */
>  	const char *ptr, *eol;
> -	int len;
> -	const unsigned hexsz = the_hash_algo->hexsz;
> +	size_t len;
> +	const size_t hexsz = the_hash_algo->hexsz;

The change to `hexsz` shouldn't be needed, even if it makes us match the
type of `hexsz` as declared in `git_hash_algo`.

>  	ptr = strchr(line, '.');
> -	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
> +	if (!ptr || ptr[1] != '.' || hexsz < (size_t) (ptr - line))

`ptr` is the reline of `strrchr(line)`, so it must be either `NULL` or
greater than or equal to `line`, so this cast is fine.

> @@ -1158,7 +1157,7 @@ static const char *skip_tree_prefix(int p_value,
>   */
>  static char *git_header_name(int p_value,
>  			     const char *line,
> -			     int llen)
> +			     size_t llen)
>  {
>  	const char *name;
>  	const char *second = NULL;

It would make sense to split this change out into a separate commit, as
it bubbles up into calling functions, as well.

> @@ -1207,7 +1206,7 @@ static char *git_header_name(int p_value,
>  		cp = skip_tree_prefix(p_value, second, line + llen - second);
>  		if (!cp)
>  			goto free_and_fail1;
> -		if (line + llen - cp != first.len ||
> +		if ((size_t) (line + llen - cp) != first.len ||
>  		    memcmp(first.buf, cp, first.len))
>  			goto free_and_fail1;
>  		return strbuf_detach(&first, NULL);

This cast should be fine, too.

> @@ -1240,7 +1239,7 @@ static char *git_header_name(int p_value,
>  				goto free_and_fail2;
>  
>  			len = sp.buf + sp.len - np;
> -			if (len < second - name &&
> +			if (len < (size_t) (second - name) &&
>  			    !strncmp(np, name, len) &&
>  			    isspace(name[len])) {
>  				/* Good */

This one, too. `second` is iterating through `name`, so it's always
greater or equal to `name`.

> @@ -1371,14 +1370,13 @@ int parse_git_diff_header(struct strbuf *root,
>  			{ "index ", gitdiff_index },
>  			{ "", gitdiff_unrecognized },
>  		};
> -		int i;
>  
>  		len = linelen(line, size);
>  		if (!len || line[len-1] != '\n')
>  			break;
> -		for (i = 0; i < ARRAY_SIZE(optable); i++) {
> +		for (size_t i = 0; i < ARRAY_SIZE(optable); i++) {
>  			const struct opentry *p = optable + i;
> -			int oplen = strlen(p->str);
> +			size_t oplen = strlen(p->str);
>  			int res;
>  			if (len < oplen || memcmp(p->str, line, oplen))
>  				continue;

Makes sense.

> @@ -1592,7 +1590,7 @@ static int find_header(struct apply_state *state,
>  								size, patch);
>  			if (git_hdr_len < 0)
>  				return -128;
> -			if (git_hdr_len <= len)
> +			if ((size_t) git_hdr_len <= len)
>  				continue;
>  			*hdrsize = git_hdr_len;
>  			return offset;

We've asserted that `git_hdr_len` isn't negative, so the cast is fine.

> @@ -2185,7 +2182,7 @@ static int parse_chunk(struct apply_state *state, char *buffer, unsigned long si
>  			};
>  			int i;

This may arguably be `size_t`, as well.

> @@ -2257,12 +2255,12 @@ static void show_stats(struct apply_state *state, struct patch *patch)
>  	}
>  
>  	if (patch->is_binary) {
> -		printf(" %-*s |  Bin\n", max, qname.buf);
> +		printf(" %-*s |  Bin\n", (int) max, qname.buf);
>  		strbuf_release(&qname);
>  		return;
>  	}
>  
> -	printf(" %-*s |", max, qname.buf);
> +	printf(" %-*s |", (int) max, qname.buf);
>  	strbuf_release(&qname);
>  
>  	/*

Do we _know_ that `max` fits into an `int`?

Patrick

Junio C Hamano Feb. 5, 2025, 12:58 p.m. UTC | #2

> @@ -1087,11 +1086,11 @@ static int gitdiff_index(struct gitdiff_data *state,
>  	 * and optional space with octal mode.
>  	 */
>  	const char *ptr, *eol;
> -	int len;
> -	const unsigned hexsz = the_hash_algo->hexsz;
> +	size_t len;
> +	const size_t hexsz = the_hash_algo->hexsz;

I thought that I already saw this discussed in another thread.

The .hexsz of any hash algorithm would never be larger than what a
platform natural "unsigned" integer type can hold, so using size_t
for the member _is_ the wrong thing to do and the fix may be the
other way around, no?

There are genuinely good changes (correcting assigned variable from
int to size_t when the value ultimately came from a system function
that returns size_t) in this patch, but there are other annoying
ones like these, so I am not sure.

>  	ptr = strchr(line, '.');
> -	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
> +	if (!ptr || ptr[1] != '.' || hexsz < (size_t) (ptr - line))

Is this about -Wsign-compare complaining about size_t vs ptrdiff_t?

> @@ -1207,7 +1206,7 @@ static char *git_header_name(int p_value,
>  		cp = skip_tree_prefix(p_value, second, line + llen - second);
>  		if (!cp)
>  			goto free_and_fail1;
> -		if (line + llen - cp != first.len ||
> +		if ((size_t) (line + llen - cp) != first.len ||

Ditto.

> -			if (len < second - name &&
> +			if (len < (size_t) (second - name) &&

Ditto.

I said this before, but I am not sure if being strict about
"-Wsign-compare" is really buying us much.  If we are getting so
many false positives that need to be squelched by churning the code
with so many typecasts in order to find a new and real problem,
is it really worth it?

Zejun Zhao Feb. 5, 2025, 3:42 p.m. UTC | #3

Thank you for reviewing this patch.

On Wed, Feb 05, 2025 08:47:06 +0100, Patrick Steinhardt wrote:
> You should split up this patch into a series, as it is really hard to
> follow what's going on. There are a couple of things happening:
> 
>   - You change types in `struct apply_state`, which bubbles up.
> 
>   - You adapt `git_hdr_len()` to receive different inputs, which bubbles
>     up.
> 
>   - You perform small fixes in several places.
> 
> It might also be a good idea to split out the loop counters into a
> separate commit, as those are trivially correct.

Sure I'll come up with a v2 patch series, in which each kind of fixes 
will be put in a single commit and I'll state why I believe the type 
cast/change is safe for every single fix in the commit message.

> > @@ -1087,11 +1086,11 @@ static int gitdiff_index(struct gitdiff_data *state,
> >  	 * and optional space with octal mode.
> >  	 */
> >  	const char *ptr, *eol;
> > -	int len;
> > -	const unsigned hexsz = the_hash_algo->hexsz;
> > +	size_t len;
> > +	const size_t hexsz = the_hash_algo->hexsz;
> 
> The change to `hexsz` shouldn't be needed, even if it makes us match the
> type of `hexsz` as declared in `git_hash_algo`.

Yes it's not necessary here to change the type. And for the `hexsz` stuff,
on Wed, Feb 05 2025 04:58:57 -0800, Junio C Hamano wrote,
> I thought that I already saw this discussed in another thread.
> 
> The .hexsz of any hash algorithm would never be larger than what a
> platform natural "unsigned" integer type can hold, so using size_t
> for the member _is_ the wrong thing to do and the fix may be the
> other way around, no?

I found the discussion mentioned at [1]. It seems like the change here 
only makes things worse so I'll see if I'd leave it untouched or change 
the type of `.hexsz` member to `int` or something.

[1] https://lore.kernel.org/git/xmqqttaqw2eb.fsf@gitster.g/

> > @@ -2185,7 +2182,7 @@ static int parse_chunk(struct apply_state *state, char *buffer, unsigned long si
> >  			};
> >  			int i;
> 
> This may arguably be `size_t`, as well.

It's OK for me to use a `size_t` loop count everywhere but I tried to 
keep the changes in this patch minimal (forget about the `hexsz` thing).
I could apply this change if you insist.

> > @@ -2257,12 +2255,12 @@ static void show_stats(struct apply_state *state, struct patch *patch)
> >  	}
> >  
> >  	if (patch->is_binary) {
> > -		printf(" %-*s |  Bin\n", max, qname.buf);
> > +		printf(" %-*s |  Bin\n", (int) max, qname.buf);
> >  		strbuf_release(&qname);
> >  		return;
> >  	}
> >  
> > -	printf(" %-*s |", max, qname.buf);
> > +	printf(" %-*s |", (int) max, qname.buf);
> >  	strbuf_release(&qname);
> >  
> >  	/*
> 
> Do we _know_ that `max` fits into an `int`?

Yep we've set an upper bound for `max` before:
> /*
>  * "scale" the filename
>  */
> max = state->max_len;
> if (max > 50)
> 	      max = 50;
so it must fit into an `int`.

Zejun Zhao Feb. 5, 2025, 4:49 p.m. UTC | #4

Thank you for reviewing this patch.

On Wed, Feb 05 2025 04:58:57 -0800, Junio C Hamano wrote,
> > @@ -1087,11 +1086,11 @@ static int gitdiff_index(struct gitdiff_data *state,
> >  	 * and optional space with octal mode.
> >  	 */
> >  	const char *ptr, *eol;
> > -	int len;
> > -	const unsigned hexsz = the_hash_algo->hexsz;
> > +	size_t len;
> > +	const size_t hexsz = the_hash_algo->hexsz;
> 
> I thought that I already saw this discussed in another thread.
> 
> The .hexsz of any hash algorithm would never be larger than what a
> platform natural "unsigned" integer type can hold, so using size_t
> for the member _is_ the wrong thing to do and the fix may be the
> other way around, no?

Sorry I didn't saw the comment at [1]. You are right about this. I 
prefer changing types of the `git_hash_algo::*sz` family to `unsigned`.

[1] https://lore.kernel.org/git/xmqqttaqw2eb.fsf@gitster.g/


> >  	ptr = strchr(line, '.');
> > -	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
> > +	if (!ptr || ptr[1] != '.' || hexsz < (size_t) (ptr - line))
> 
> Is this about -Wsign-compare complaining about size_t vs ptrdiff_t?
> 
> > @@ -1207,7 +1206,7 @@ static char *git_header_name(int p_value,
> >  		cp = skip_tree_prefix(p_value, second, line + llen - second);
> >  		if (!cp)
> >  			goto free_and_fail1;
> > -		if (line + llen - cp != first.len ||
> > +		if ((size_t) (line + llen - cp) != first.len ||
> 
> Ditto.
> 
> > -			if (len < second - name &&
> > +			if (len < (size_t) (second - name) &&
> 
> Ditto.

Yes it's comparing unsigned (`size_t`) with signed (`ptrdiff_t`).

> I said this before, but I am not sure if being strict about
> "-Wsign-compare" is really buying us much.  If we are getting so
> many false positives that need to be squelched by churning the code
> with so many typecasts in order to find a new and real problem,
> is it really worth it?

Well, I contributed most before to a Rust OS kernel project so 
honestly I'm used to the idea of explicit type casting when necessary 
to avoid possible bugs. However, this is non-trivial. People should 
understand each relevant variable's semantics to decide what its type 
should be and do typecast on top of that. Even if it's already made a 
gradual process, it may still cost much from both the contributors and 
the reviewers but brings benefits that do not match up. Maybe a clear 
commit message that states why each type cast/change makes sense can 
help with the iteration? I'm not quite sure about that.

diff --git a/apply.c b/apply.c
index 4a7b6120ac..0c7b89dc3a 100644
--- a/apply.c
+++ b/apply.c
@@ -8,7 +8,6 @@ 
  */
 
 #define USE_THE_REPOSITORY_VARIABLE
-#define DISABLE_SIGN_COMPARE_WARNINGS
 
 #include "git-compat-util.h"
 #include "abspath.h"
@@ -540,7 +539,7 @@  static size_t date_len(const char *line, size_t len)
 	    !isdigit(*p++) || !isdigit(*p++))	/* Not a date. */
 		return 0;
 
-	if (date - line >= strlen("19") &&
+	if ((size_t) (date - line) >= strlen("19") &&
 	    isdigit(date[-1]) && isdigit(date[-2]))	/* 4-digit year */
 		date -= strlen("19");
 
@@ -1087,11 +1086,11 @@  static int gitdiff_index(struct gitdiff_data *state,
 	 * and optional space with octal mode.
 	 */
 	const char *ptr, *eol;
-	int len;
-	const unsigned hexsz = the_hash_algo->hexsz;
+	size_t len;
+	const size_t hexsz = the_hash_algo->hexsz;
 
 	ptr = strchr(line, '.');
-	if (!ptr || ptr[1] != '.' || hexsz < ptr - line)
+	if (!ptr || ptr[1] != '.' || hexsz < (size_t) (ptr - line))
 		return 0;
 	len = ptr - line;
 	memcpy(patch->old_oid_prefix, line, len);
@@ -1158,7 +1157,7 @@  static const char *skip_tree_prefix(int p_value,
  */
 static char *git_header_name(int p_value,
 			     const char *line,
-			     int llen)
+			     size_t llen)
 {
 	const char *name;
 	const char *second = NULL;
@@ -1207,7 +1206,7 @@  static char *git_header_name(int p_value,
 		cp = skip_tree_prefix(p_value, second, line + llen - second);
 		if (!cp)
 			goto free_and_fail1;
-		if (line + llen - cp != first.len ||
+		if ((size_t) (line + llen - cp) != first.len ||
 		    memcmp(first.buf, cp, first.len))
 			goto free_and_fail1;
 		return strbuf_detach(&first, NULL);
@@ -1240,7 +1239,7 @@  static char *git_header_name(int p_value,
 				goto free_and_fail2;
 
 			len = sp.buf + sp.len - np;
-			if (len < second - name &&
+			if (len < (size_t) (second - name) &&
 			    !strncmp(np, name, len) &&
 			    isspace(name[len])) {
 				/* Good */
@@ -1317,7 +1316,7 @@  int parse_git_diff_header(struct strbuf *root,
 			  int *linenr,
 			  int p_value,
 			  const char *line,
-			  int len,
+			  size_t len,
 			  unsigned int size,
 			  struct patch *patch)
 {
@@ -1371,14 +1370,13 @@  int parse_git_diff_header(struct strbuf *root,
 			{ "index ", gitdiff_index },
 			{ "", gitdiff_unrecognized },
 		};
-		int i;
 
 		len = linelen(line, size);
 		if (!len || line[len-1] != '\n')
 			break;
-		for (i = 0; i < ARRAY_SIZE(optable); i++) {
+		for (size_t i = 0; i < ARRAY_SIZE(optable); i++) {
 			const struct opentry *p = optable + i;
-			int oplen = strlen(p->str);
+			size_t oplen = strlen(p->str);
 			int res;
 			if (len < oplen || memcmp(p->str, line, oplen))
 				continue;
@@ -1592,7 +1590,7 @@  static int find_header(struct apply_state *state,
 								size, patch);
 			if (git_hdr_len < 0)
 				return -128;
-			if (git_hdr_len <= len)
+			if ((size_t) git_hdr_len <= len)
 				continue;
 			*hdrsize = git_hdr_len;
 			return offset;
@@ -2097,7 +2095,6 @@  static void add_name_limit(struct apply_state *state,
 static int use_patch(struct apply_state *state, struct patch *p)
 {
 	const char *pathname = p->new_name ? p->new_name : p->old_name;
-	int i;
 
 	/* Paths outside are not touched regardless of "--include" */
 	if (state->prefix && *state->prefix) {
@@ -2107,7 +2104,7 @@  static int use_patch(struct apply_state *state, struct patch *p)
 	}
 
 	/* See if it matches any of exclude/include rule */
-	for (i = 0; i < state->limit_by_name.nr; i++) {
+	for (size_t i = 0; i < state->limit_by_name.nr; i++) {
 		struct string_list_item *it = &state->limit_by_name.items[i];
 		if (!wildmatch(it->string, pathname, 0))
 			return (it->util != NULL);
@@ -2185,7 +2182,7 @@  static int parse_chunk(struct apply_state *state, char *buffer, unsigned long si
 			};
 			int i;
 			for (i = 0; binhdr[i]; i++) {
-				int len = strlen(binhdr[i]);
+				size_t len = strlen(binhdr[i]);
 				if (len < size - hd &&
 				    !memcmp(binhdr[i], buffer + hd, len)) {
 					state->linenr++;
@@ -2238,7 +2235,8 @@  static void show_stats(struct apply_state *state, struct patch *patch)
 {
 	struct strbuf qname = STRBUF_INIT;
 	char *cp = patch->new_name ? patch->new_name : patch->old_name;
-	int max, add, del;
+	size_t max;
+	int add, del;
 
 	quote_c_style(cp, &qname, NULL, 0);
 
@@ -2257,12 +2255,12 @@  static void show_stats(struct apply_state *state, struct patch *patch)
 	}
 
 	if (patch->is_binary) {
-		printf(" %-*s |  Bin\n", max, qname.buf);
+		printf(" %-*s |  Bin\n", (int) max, qname.buf);
 		strbuf_release(&qname);
 		return;
 	}
 
-	printf(" %-*s |", max, qname.buf);
+	printf(" %-*s |", (int) max, qname.buf);
 	strbuf_release(&qname);
 
 	/*
@@ -2319,7 +2317,7 @@  static void update_pre_post_images(struct image *preimage,
 {
 	struct image fixed_preimage = IMAGE_INIT;
 	size_t insert_pos = 0;
-	int i, ctx, reduced;
+	size_t ctx, reduced;
 	const char *fixed;
 
 	/*
@@ -2328,7 +2326,7 @@  static void update_pre_post_images(struct image *preimage,
 	 * free "oldlines".
 	 */
 	image_prepare(&fixed_preimage, buf, len, 1);
-	for (i = 0; i < fixed_preimage.line_nr; i++)
+	for (size_t i = 0; i < fixed_preimage.line_nr; i++)
 		fixed_preimage.line[i].flag = preimage->line[i].flag;
 	image_clear(preimage);
 	*preimage = fixed_preimage;
@@ -2337,7 +2335,7 @@  static void update_pre_post_images(struct image *preimage,
 	/*
 	 * Adjust the common context lines in postimage.
 	 */
-	for (i = reduced = ctx = 0; i < postimage->line_nr; i++) {
+	for (size_t i = reduced = ctx = 0; i < postimage->line_nr; i++) {
 		size_t l_len = postimage->line[i].len;
 
 		if (!(postimage->line[i].flag & LINE_COMMON)) {
@@ -2417,9 +2415,9 @@  static int line_by_line_fuzzy_match(struct image *img,
 				    struct image *postimage,
 				    unsigned long current,
 				    int current_lno,
-				    int preimage_limit)
+				    size_t preimage_limit)
 {
-	int i;
+	size_t i;
 	size_t imgoff = 0;
 	size_t preoff = 0;
 	size_t extra_chars;
@@ -2486,12 +2484,12 @@  static int match_fragment(struct apply_state *state,
 			  unsigned ws_rule,
 			  int match_beginning, int match_end)
 {
-	int i;
+	size_t i;
 	const char *orig, *target;
 	struct strbuf fixed = STRBUF_INIT;
 	char *fixed_buf;
 	size_t fixed_len;
-	int preimage_limit;
+	size_t preimage_limit;
 	int ret;
 
 	if (preimage->line_nr + current_lno <= img->line_nr) {
@@ -2663,12 +2661,11 @@  static int match_fragment(struct apply_state *state,
 	for ( ; i < preimage->line_nr; i++) {
 		size_t fixstart = fixed.len; /* start of the fixed preimage */
 		size_t oldlen = preimage->line[i].len;
-		int j;
 
 		/* Try fixing the line in the preimage */
 		ws_fix_copy(&fixed, orig, oldlen, ws_rule, NULL);
 
-		for (j = fixstart; j < fixed.len; j++) {
+		for (size_t j = fixstart; j < fixed.len; j++) {
 			if (!isspace(fixed.buf[j])) {
 				ret = 0;
 				goto out;
@@ -2705,7 +2702,7 @@  static int find_pos(struct apply_state *state,
 {
 	int i;
 	unsigned long backwards, forwards, current;
-	int backwards_lno, forwards_lno, current_lno;
+	size_t backwards_lno, forwards_lno, current_lno;
 
 	/*
 	 * When running with --allow-overlap, it is possible that a hunk is
@@ -2790,7 +2787,7 @@  static int find_pos(struct apply_state *state,
  */
 static void update_image(struct apply_state *state,
 			 struct image *img,
-			 int applied_pos,
+			 size_t applied_pos,
 			 struct image *preimage,
 			 struct image *postimage)
 {
@@ -2798,11 +2795,11 @@  static void update_image(struct apply_state *state,
 	 * remove the copy of preimage at offset in img
 	 * and replace it with postimage
 	 */
-	int i, nr;
+	int nr;
 	size_t remove_count, insert_count, applied_at = 0;
 	size_t result_alloc;
 	char *result;
-	int preimage_limit;
+	size_t preimage_limit;
 
 	/*
 	 * If we are removing blank lines at the end of img,
@@ -2817,11 +2814,11 @@  static void update_image(struct apply_state *state,
 	if (preimage_limit > img->line_nr - applied_pos)
 		preimage_limit = img->line_nr - applied_pos;
 
-	for (i = 0; i < applied_pos; i++)
+	for (size_t i = 0; i < applied_pos; i++)
 		applied_at += img->line[i].len;
 
 	remove_count = 0;
-	for (i = 0; i < preimage_limit; i++)
+	for (size_t i = 0; i < preimage_limit; i++)
 		remove_count += img->line[applied_pos + i].len;
 	insert_count = postimage->buf.len;
 
@@ -2850,7 +2847,7 @@  static void update_image(struct apply_state *state,
 			   img->line_nr - (applied_pos + preimage_limit));
 	COPY_ARRAY(img->line + applied_pos, postimage->line, postimage->line_nr);
 	if (!state->allow_overlap)
-		for (i = 0; i < postimage->line_nr; i++)
+		for (size_t i = 0; i < postimage->line_nr; i++)
 			img->line[applied_pos + i].flag |= LINE_PATCHED;
 	img->line_nr = nr;
 }
@@ -4287,19 +4284,19 @@  static void summary_patch_list(struct patch *patch)
 
 static void patch_stats(struct apply_state *state, struct patch *patch)
 {
-	int lines = patch->lines_added + patch->lines_deleted;
+	unsigned lines = patch->lines_added + patch->lines_deleted;
 
 	if (lines > state->max_change)
 		state->max_change = lines;
 	if (patch->old_name) {
-		int len = quote_c_style(patch->old_name, NULL, NULL, 0);
+		size_t len = quote_c_style(patch->old_name, NULL, NULL, 0);
 		if (!len)
 			len = strlen(patch->old_name);
 		if (len > state->max_len)
 			state->max_len = len;
 	}
 	if (patch->new_name) {
-		int len = quote_c_style(patch->new_name, NULL, NULL, 0);
+		size_t len = quote_c_style(patch->new_name, NULL, NULL, 0);
 		if (!len)
 			len = strlen(patch->new_name);
 		if (len > state->max_len)
diff --git a/apply.h b/apply.h
index 90e887ec0e..f53fbb1777 100644
--- a/apply.h
+++ b/apply.h
@@ -90,8 +90,8 @@  struct apply_state {
 	 * we've seen, and the longest filename. That allows us to do simple
 	 * scaling.
 	 */
-	int max_change;
-	int max_len;
+	unsigned max_change;
+	size_t max_len;
 
 	/*
 	 * Records filenames that have been touched, in order to handle
@@ -170,7 +170,7 @@  int parse_git_diff_header(struct strbuf *root,
 			  int *linenr,
 			  int p_value,
 			  const char *line,
-			  int len,
+			  size_t len,
 			  unsigned int size,
 			  struct patch *patch);

[GSOC] apply: address -Wsign-comparison warnings

Commit Message

Comments

Patch