diff mbox series

git: use U to denote unsigned to prevent UB

Message ID pull.1849.git.git.1734488549111.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series git: use U to denote unsigned to prevent UB | expand

Commit Message

Seija Kijin Dec. 18, 2024, 2:22 a.m. UTC
From: Seija Kijin <doremylover123@gmail.com>

1 << can be UB if 1 ends up overflowing and
being assigned to an unsigned int or long.

Signed-off-by: Seija Kijin <doremylover123@gmail.com>
---
    git: use U to denote unsigned to prevent UB
    
    1 << can be UB if 1 ends up overflowing and being assigned to an
    unsigned int or long.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1849%2FAreaZR%2F1U-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1849/AreaZR/1U-v1
Pull-Request: https://github.com/git/git/pull/1849

 builtin/checkout.c     |  2 +-
 builtin/merge-tree.c   |  4 ++--
 builtin/receive-pack.c |  2 +-
 color.c                |  4 ++--
 delta-islands.c        |  2 +-
 diff-delta.c           |  2 +-
 diff.c                 |  2 +-
 help.c                 |  2 +-
 imap-send.c            |  2 +-
 merge-ort.c            | 18 +++++++++---------
 xdiff/xhistogram.c     |  2 +-
 xdiff/xprepare.c       |  4 ++--
 12 files changed, 23 insertions(+), 23 deletions(-)


base-commit: 063bcebf0c917140ca0e705cbe0fdea127e90086

Comments

Jonathan Nieder Jan. 2, 2025, 3:43 p.m. UTC | #1
Hi,

Seija Kijin wrote:

> 1 << can be UB if 1 ends up overflowing and
> being assigned to an unsigned int or long.
>
> Signed-off-by: Seija Kijin <doremylover123@gmail.com>
> ---
>  builtin/checkout.c     |  2 +-
>  builtin/merge-tree.c   |  4 ++--
>  builtin/receive-pack.c |  2 +-
>  color.c                |  4 ++--
>  delta-islands.c        |  2 +-
>  diff-delta.c           |  2 +-
>  diff.c                 |  2 +-
>  help.c                 |  2 +-
>  imap-send.c            |  2 +-
>  merge-ort.c            | 18 +++++++++---------
>  xdiff/xhistogram.c     |  2 +-
>  xdiff/xprepare.c       |  4 ++--
>  12 files changed, 23 insertions(+), 23 deletions(-)

That said, most of these don't overflow, so it's not obvious this
results in higher quality or more readable code than before the patch.

By "not obvious" I don't mean that it _doesn't_, by the way, but just
that we don't have enough information to evaluate it here.  What
motivated writing this patch?  Is there a style guideline about it
that will remind us not to backslide in the future, for example?  Or
is there a tool that notices?  Was there an example you ran into that
led you to look for more examples?

This kind of information about context will make it easier for other
in the project to ensure the patch does what it intends, and even more
importantly, to see if there are additional checks to add or other
instances that also need updating.

Thanks and hope that helps,
Jonathan
Junio C Hamano Jan. 2, 2025, 9:33 p.m. UTC | #2
"AreaZR via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Seija Kijin <doremylover123@gmail.com>
>
> 1 << can be UB if 1 ends up overflowing and
> being assigned to an unsigned int or long.

 * Spell out what you meant by "UB".

 * "1 ends up overflowing"?  One is one; as long as you have two
   bits, it won't overflow.  This needs rewriting.

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 5e5afa0f267..a636e71e05c 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -223,7 +223,7 @@ static int check_stages(unsigned stages, const struct cache_entry *ce, int pos)
>  		ce = the_repository->index->cache[pos];
>  		if (strcmp(name, ce->name))
>  			break;
> -		seen |= (1 << ce_stage(ce));
> +		seen |= (1U << ce_stage(ce));

Here "seen" is "unsigned" initialized to 0.  Matching the type of
the value that is assigned with an explicit U does make sense, but
as Jonathan already pointed out elsewhere, as ce_stage() cannot be
more than 3, this would never overflow.

> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index c5ed472967a..d0104dfa0c7 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -270,13 +270,13 @@ static void unresolved(const struct traverse_info *info, struct name_entry n[3])
>  	unsigned dirmask = 0, mask = 0;
>  
>  	for (i = 0; i < 3; i++) {
> -		mask |= (1 << i);
> +		mask |= (1U << i);

Ditto.

>  		/*
>  		 * Treat missing entries as directories so that we return
>  		 * after unresolved_directory has handled this.
>  		 */
>  		if (!n[i].mode || S_ISDIR(n[i].mode))
> -			dirmask |= (1 << i);
> +			dirmask |= (1U << i);

Ditto.

There are a few instances that left shifts by 31-bit, which requires
(1 * 2 ** 31) to be representable in the result type of signed int
if we want to avoid an undefined behaviour,, but even if your signed
int were wider than 32-bit, it is a good hygiene to write your bit
shift as (1U << shift_count).

So I am not opposed to these changes.  The guiding principle should
probably be "bit patterns should by unsigned by default, unless you
have a strong and valid reason to use signed" (and the only single
plausible reason being when you take advantage of fact that the sign
bit is propagated if you shift right); as most of the changed code
paths do deal with a signed result that is representable and does
not risk any undefined behaviour, it is an inappropriate rationale
to justify this particular patch, I would think.
diff mbox series

Patch

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 5e5afa0f267..a636e71e05c 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -223,7 +223,7 @@  static int check_stages(unsigned stages, const struct cache_entry *ce, int pos)
 		ce = the_repository->index->cache[pos];
 		if (strcmp(name, ce->name))
 			break;
-		seen |= (1 << ce_stage(ce));
+		seen |= (1U << ce_stage(ce));
 		pos++;
 	}
 	if ((stages & seen) != stages)
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index c5ed472967a..d0104dfa0c7 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -270,13 +270,13 @@  static void unresolved(const struct traverse_info *info, struct name_entry n[3])
 	unsigned dirmask = 0, mask = 0;
 
 	for (i = 0; i < 3; i++) {
-		mask |= (1 << i);
+		mask |= (1U << i);
 		/*
 		 * Treat missing entries as directories so that we return
 		 * after unresolved_directory has handled this.
 		 */
 		if (!n[i].mode || S_ISDIR(n[i].mode))
-			dirmask |= (1 << i);
+			dirmask |= (1U << i);
 	}
 
 	unresolved_directory(info, n);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 9d2c07f68da..b958eeee8fe 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1306,7 +1306,7 @@  static int update_shallow_ref(struct command *cmd, struct shallow_info *si)
 	struct shallow_lock shallow_lock = SHALLOW_LOCK_INIT;
 	struct oid_array extra = OID_ARRAY_INIT;
 	struct check_connected_options opt = CHECK_CONNECTED_INIT;
-	uint32_t mask = 1 << (cmd->index % 32);
+	uint32_t mask = 1U << (cmd->index % 32);
 	int i;
 
 	trace_printf_key(&trace_shallow,
diff --git a/color.c b/color.c
index 227a5ab2f42..ab9a3d2a097 100644
--- a/color.c
+++ b/color.c
@@ -317,7 +317,7 @@  int color_parse_mem(const char *value, int value_len, char *dst)
 		}
 		val = parse_attr(word, wordlen);
 		if (0 <= val)
-			attr |= (1 << val);
+			attr |= (1U << val);
 		else
 			goto bad;
 	}
@@ -340,7 +340,7 @@  int color_parse_mem(const char *value, int value_len, char *dst)
 			sep++;
 
 		for (i = 0; attr; i++) {
-			unsigned bit = (1 << i);
+			unsigned bit = (1U << i);
 			if (!(attr & bit))
 				continue;
 			attr &= ~bit;
diff --git a/delta-islands.c b/delta-islands.c
index 84435512593..a041cfa1ab3 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -78,7 +78,7 @@  static int island_bitmap_is_subset(struct island_bitmap *self,
 }
 
 #define ISLAND_BITMAP_BLOCK(x) (x / 32)
-#define ISLAND_BITMAP_MASK(x) (1 << (x % 32))
+#define ISLAND_BITMAP_MASK(x) (1U << (x % 32))
 
 static void island_bitmap_set(struct island_bitmap *self, uint32_t i)
 {
diff --git a/diff-delta.c b/diff-delta.c
index 77fea08dfb0..fbdfec7037f 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -156,7 +156,7 @@  struct delta_index * create_delta_index(const void *buf, unsigned long bufsize)
 	}
 	hsize = entries / 4;
 	for (i = 4; (1u << i) < hsize; i++);
-	hsize = 1 << i;
+	hsize = 1u << i;
 	hmask = hsize - 1;
 
 	/* allocate lookup index */
diff --git a/diff.c b/diff.c
index 266ddf18e73..021df059e0b 100644
--- a/diff.c
+++ b/diff.c
@@ -4815,7 +4815,7 @@  static void prepare_filter_bits(void)
 
 	if (!filter_bit[DIFF_STATUS_ADDED]) {
 		for (i = 0; diff_status_letters[i]; i++)
-			filter_bit[(int) diff_status_letters[i]] = (1 << i);
+			filter_bit[(int) diff_status_letters[i]] = (1U << i);
 	}
 }
 
diff --git a/help.c b/help.c
index 8a830ba35c6..839596156fe 100644
--- a/help.c
+++ b/help.c
@@ -394,7 +394,7 @@  void list_cmds_by_category(struct string_list *list,
 
 	for (i = 0; category_names[i]; i++) {
 		if (!strcmp(cat, category_names[i])) {
-			cat_id = 1UL << i;
+			cat_id = 1U << i;
 			break;
 		}
 	}
diff --git a/imap-send.c b/imap-send.c
index 25c68fd90d7..fdb9e658e70 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -644,7 +644,7 @@  static void parse_capability(struct imap *imap, char *cmd)
 	while ((arg = next_arg(&cmd)))
 		for (i = 0; i < ARRAY_SIZE(cap_list); i++)
 			if (!strcmp(cap_list[i], arg))
-				imap->caps |= 1 << i;
+				imap->caps |= 1U << i;
 	imap->rcaps = imap->caps;
 }
 
diff --git a/merge-ort.c b/merge-ort.c
index 11029c10be3..5a99bec7a04 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -1218,7 +1218,7 @@  static void collect_rename_info(struct merge_options *opt,
 		return;
 
 	for (side = MERGE_SIDE1; side <= MERGE_SIDE2; ++side) {
-		unsigned side_mask = (1 << side);
+		unsigned side_mask = (1U << side);
 
 		/* Check for deletion on side */
 		if ((filemask & 1) && !(filemask & side_mask))
@@ -2026,7 +2026,7 @@  static void initialize_attr_index(struct merge_options *opt)
 
 		ASSIGN_AND_VERIFY_CI(ci, mi);
 		for (stage = 0; stage < 3; stage++) {
-			unsigned stage_mask = (1 << stage);
+			unsigned stage_mask = (1U << stage);
 
 			if (!(ci->filemask & stage_mask))
 				continue;
@@ -2362,7 +2362,7 @@  static char *handle_path_level_conflicts(struct merge_options *opt,
 	 */
 	if (c_info->reported_already) {
 		clean = 0;
-	} else if (path_in_way(&opt->priv->paths, new_path, 1 << side_index)) {
+	} else if (path_in_way(&opt->priv->paths, new_path, 1U << side_index)) {
 		c_info->reported_already = 1;
 		strbuf_add_separated_string_list(&collision_paths, ", ",
 						 &c_info->source_files);
@@ -2747,7 +2747,7 @@  static void apply_directory_rename_modifications(struct merge_options *opt,
 		ci->filemask = 0;
 		ci->merged.clean = 1;
 		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
-			if (ci->dirmask & (1 << i))
+			if (ci->dirmask & (1U << i))
 				continue;
 			/* zero out any entries related to files */
 			ci->stages[i].mode = 0;
@@ -2915,7 +2915,7 @@  static int process_renames(struct merge_options *opt,
 				assert(side1 == side2);
 				memcpy(&side1->stages[0], &base->stages[0],
 				       sizeof(merged));
-				side1->filemask |= (1 << MERGE_BASE);
+				side1->filemask |= (1U << MERGE_BASE);
 				/* Mark base as resolved by removal */
 				base->merged.is_null = 1;
 				base->merged.clean = 1;
@@ -3002,7 +3002,7 @@  static int process_renames(struct merge_options *opt,
 		target_index = pair->score; /* from collect_renames() */
 		assert(target_index == 1 || target_index == 2);
 		other_source_index = 3 - target_index;
-		old_sidemask = (1 << other_source_index); /* 2 or 4 */
+		old_sidemask = (1U << other_source_index); /* 2 or 4 */
 		source_deleted = (oldinfo->filemask == 1);
 		collision = ((newinfo->filemask & old_sidemask) != 0);
 		type_changed = !source_deleted &&
@@ -3116,7 +3116,7 @@  static int process_renames(struct merge_options *opt,
 			 */
 			memcpy(&newinfo->stages[0], &oldinfo->stages[0],
 			       sizeof(newinfo->stages[0]));
-			newinfo->filemask |= (1 << MERGE_BASE);
+			newinfo->filemask |= (1U << MERGE_BASE);
 			newinfo->pathnames[0] = oldpath;
 			if (type_changed) {
 				/* rename vs. typechange */
@@ -3139,7 +3139,7 @@  static int process_renames(struct merge_options *opt,
 				memcpy(&newinfo->stages[other_source_index],
 				       &oldinfo->stages[other_source_index],
 				       sizeof(newinfo->stages[0]));
-				newinfo->filemask |= (1 << other_source_index);
+				newinfo->filemask |= (1U << other_source_index);
 				newinfo->pathnames[other_source_index] = oldpath;
 			}
 		}
@@ -3990,7 +3990,7 @@  static int process_entry(struct merge_options *opt,
 		ci->match_mask = (ci->match_mask & ~ci->dirmask);
 		ci->dirmask = 0;
 		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
-			if (ci->filemask & (1 << i))
+			if (ci->filemask & (1U << i))
 				continue;
 			ci->stages[i].mode = 0;
 			oidcpy(&ci->stages[i].oid, null_oid());
diff --git a/xdiff/xhistogram.c b/xdiff/xhistogram.c
index 16a8fe2f3f3..18a037a3ba8 100644
--- a/xdiff/xhistogram.c
+++ b/xdiff/xhistogram.c
@@ -265,7 +265,7 @@  static int find_lcs(xpparam_t const *xpp, xdfenv_t *env,
 	index.rcha.head = NULL;
 
 	index.table_bits = xdl_hashbits(count1);
-	index.records_size = 1 << index.table_bits;
+	index.records_size = 1U << index.table_bits;
 	if (!XDL_CALLOC_ARRAY(index.records, index.records_size))
 		goto cleanup;
 
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index c84549f6c50..18c176462ec 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -72,7 +72,7 @@  static int xdl_init_classifier(xdlclassifier_t *cf, long size, long flags) {
 	cf->flags = flags;
 
 	cf->hbits = xdl_hashbits((unsigned int) size);
-	cf->hsize = 1 << cf->hbits;
+	cf->hsize = 1U << cf->hbits;
 
 	if (xdl_cha_init(&cf->ncha, sizeof(xdlclass_t), size / 4 + 1) < 0) {
 
@@ -174,7 +174,7 @@  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 		goto abort;
 
 	hbits = xdl_hashbits((unsigned int) narec);
-	hsize = 1 << hbits;
+	hsize = 1U << hbits;
 	if (!XDL_CALLOC_ARRAY(rhash, hsize))
 		goto abort;