diff mbox series

sideband.c: replace int with size_t for clarity

Message ID pull.1625.git.1703264469238.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series sideband.c: replace int with size_t for clarity | expand

Commit Message

Chandra Pratap Dec. 22, 2023, 5:01 p.m. UTC
From: Chandra Pratap <chandrapratap3519@gmail.com>

Replace int with size_t for clarity and remove the
'NEEDSWORK' tag associated with it.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    sideband.c: replace int with size_t for clarity

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1625

 sideband.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)


base-commit: 1a87c842ece327d03d08096395969aca5e0a6996

Comments

Torsten Bögershausen Dec. 22, 2023, 5:57 p.m. UTC | #1
On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> Replace int with size_t for clarity and remove the
> 'NEEDSWORK' tag associated with it.
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     sideband.c: replace int with size_t for clarity
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1625
>
>  sideband.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/sideband.c b/sideband.c
> index 6cbfd391c47..1599e408d1b 100644
> --- a/sideband.c
> +++ b/sideband.c
> @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
>   * of the line. This should be called for a single line only, which is
>   * passed as the first N characters of the SRC array.
>   *
> - * NEEDSWORK: use "size_t n" instead for clarity.
>   */
> -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
>  {
>  	int i;
>

Thanks for working on this.
There is, however, more potential for improvements.
First of all: the "int i" from above.
Further down, we read
	for (i = 0; i < ARRAY_SIZE(keywords); i++) {

However, a size of an array can never be negative, so that
an unsigned data type is a better choice than a signed.
And, arrays can have more elements than an int can address,
at least in theory.
For a reader it makes more sense, to replace
int i;
with
size_t i;


And further down, there is another place for improvments:

		int len = strlen(p->keyword);
		if (n < len)
			continue;

Even here, a strlen is never negative. And a size_t is the choice for len,
since all "modern" implementations declare strlen() to return size_t

Do you think that you can have a look at these changes ?

I will be happy to do a review, and possibly other people as well.
Chandra Pratap Dec. 22, 2023, 6:27 p.m. UTC | #2
Thanks for the feedback, Torsten. I was working on improving the rest of
sideband.c as you suggested when I encountered this snippet on line 82:

while (0 < n && isspace(*src)) {
strbuf_addch(dest, *src);
src++;
n--;
}

Here, we are decreasing the value of an unsigned type to potentially below
0 which may lead to overflow and result in some nasty bug. In this case,
is it wise to continue with replacing 'int n' with 'size_t n' as the
NEEDSWORK tag suggests or should we improve upon the rest of the file
and revert 'size_t n' to 'int n' ?

On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
>
> On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > From: Chandra Pratap <chandrapratap3519@gmail.com>
> >
> > Replace int with size_t for clarity and remove the
> > 'NEEDSWORK' tag associated with it.
> >
> > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > ---
> >     sideband.c: replace int with size_t for clarity
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> >
> >  sideband.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/sideband.c b/sideband.c
> > index 6cbfd391c47..1599e408d1b 100644
> > --- a/sideband.c
> > +++ b/sideband.c
> > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> >   * of the line. This should be called for a single line only, which is
> >   * passed as the first N characters of the SRC array.
> >   *
> > - * NEEDSWORK: use "size_t n" instead for clarity.
> >   */
> > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> >  {
> >       int i;
> >
>
> Thanks for working on this.
> There is, however, more potential for improvements.
> First of all: the "int i" from above.
> Further down, we read
>         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;
>
>
> And further down, there is another place for improvments:
>
>                 int len = strlen(p->keyword);
>                 if (n < len)
>                         continue;
>
> Even here, a strlen is never negative. And a size_t is the choice for len,
> since all "modern" implementations declare strlen() to return size_t
>
> Do you think that you can have a look at these changes ?
>
> I will be happy to do a review, and possibly other people as well.
Torsten Bögershausen Dec. 22, 2023, 6:39 p.m. UTC | #3
(PLease, avoid top-posting on this list)
On Fri, Dec 22, 2023 at 11:57:25PM +0530, Chandra Pratap wrote:
> Thanks for the feedback, Torsten. I was working on improving the rest of
> sideband.c as you suggested when I encountered this snippet on line 82:
>
> while (0 < n && isspace(*src)) {
> strbuf_addch(dest, *src);
> src++;
> n--;
> }
>
> Here, we are decreasing the value of an unsigned type to potentially below
> 0 which may lead to overflow and result in some nasty bug. In this case,
> is it wise to continue with replacing 'int n' with 'size_t n' as the
> NEEDSWORK tag suggests or should we improve upon the rest of the file
> and revert 'size_t n' to 'int n' ?

Yes, that could be a solution.
But.
In general, we are are striving to use size_t for all objects that live in
memory, and that is a long term thing.
Careful review is needed, and that is what you just did.

If we look at this code again:
while (0 < n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}

When n is signed, it makes sense to use "0 < n".
However, if I think about it, it should work for unsigned as well,
wouldn't it ?
We can leave it as is, or replace it by

while (n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}




>
> On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
> >
> > On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > > From: Chandra Pratap <chandrapratap3519@gmail.com>
> > >
> > > Replace int with size_t for clarity and remove the
> > > 'NEEDSWORK' tag associated with it.
> > >
> > > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > > ---
> > >     sideband.c: replace int with size_t for clarity
> > >
> > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> > >
> > >  sideband.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/sideband.c b/sideband.c
> > > index 6cbfd391c47..1599e408d1b 100644
> > > --- a/sideband.c
> > > +++ b/sideband.c
> > > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> > >   * of the line. This should be called for a single line only, which is
> > >   * passed as the first N characters of the SRC array.
> > >   *
> > > - * NEEDSWORK: use "size_t n" instead for clarity.
> > >   */
> > > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> > >  {
> > >       int i;
> > >
> >
> > Thanks for working on this.
> > There is, however, more potential for improvements.
> > First of all: the "int i" from above.
> > Further down, we read
> >         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> >
> > However, a size of an array can never be negative, so that
> > an unsigned data type is a better choice than a signed.
> > And, arrays can have more elements than an int can address,
> > at least in theory.
> > For a reader it makes more sense, to replace
> > int i;
> > with
> > size_t i;
> >
> >
> > And further down, there is another place for improvments:
> >
> >                 int len = strlen(p->keyword);
> >                 if (n < len)
> >                         continue;
> >
> > Even here, a strlen is never negative. And a size_t is the choice for len,
> > since all "modern" implementations declare strlen() to return size_t
> >
> > Do you think that you can have a look at these changes ?
> >
> > I will be happy to do a review, and possibly other people as well.
>
Junio C Hamano Dec. 22, 2023, 7:01 p.m. UTC | #4
Torsten Bögershausen <tboegi@web.de> writes:

Just this part.

> Further down, we read
> 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;

It is a very good discipline to use size_t to index into an array
whose size is externally controled (e.g., we slurp what the end user
or the server on the other end of the connection gave us into a
piece of memory we allocate) to avoid integer overflows as "int" is
often narrower than "size_t".  But this particular one is a Meh; the
keywords[] is a small hardcoded array whose size and contents are
totally under our control.
Junio C Hamano Dec. 26, 2023, 5:14 p.m. UTC | #5
"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)

Changing the type of the paramter alone might be a good start, but
does not really help anybody, as (1) the callers are not taught to
handle wider integral types for the values they pass and (2) the
function uses "int len" it computes internally to compare with "n".

There are three callers in demultiplex_sideband(), one of whose
paramters is "int len" and is passed to this function in one of
these calls.  Among the other two, one uses "int linelen" derived
from scanning the piece of memory read from sideband via strpbrk(),
and the other uses strlen(b) which is the length of the "remainder"
of the same buffer after the loop that processes one line at a time
using the said strpbrk() is done with the complete lines in the
early part.

The buffer involved in all of the above stores bytes taken from a
packet received via the pkt-line interface, which is capable of
transferring only 64kB at a time.

I _think_ the most productive use of our time is to replace the
NEEDSWORK with a comment saying why it is fine to use "int" here to
avoid tempting the next developer to come up with this patch
again---they will waste their time to do so without thinking it
through if we leave the (incomplete) NEEDSWORK comment, I am afraid.
Taylor Blau Jan. 2, 2024, 10:31 p.m. UTC | #6
On Fri, Dec 22, 2023 at 11:01:37AM -0800, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
> Just this part.
>
> > Further down, we read
> > 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> >
> > However, a size of an array can never be negative, so that
> > an unsigned data type is a better choice than a signed.
> > And, arrays can have more elements than an int can address,
> > at least in theory.
> > For a reader it makes more sense, to replace
> > int i;
> > with
> > size_t i;
>
> It is a very good discipline to use size_t to index into an array
> whose size is externally controled (e.g., we slurp what the end user
> or the server on the other end of the connection gave us into a
> piece of memory we allocate) to avoid integer overflows as "int" is
> often narrower than "size_t".  But this particular one is a Meh; the
> keywords[] is a small hardcoded array whose size and contents are
> totally under our control.

I certainly agree in theory, though I've always erred on the side of
always using size_t for indexing into arrays, even if they're small. It
removes a potential pitfall if you are working with an
externally-controlled array and happen to forget to use size_t.

But if there is an existing index variable with type "int", and we can
easily validate that it's small, I probably wouldn't bother changing it
if I was editing nearby code.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/sideband.c b/sideband.c
index 6cbfd391c47..1599e408d1b 100644
--- a/sideband.c
+++ b/sideband.c
@@ -69,9 +69,8 @@  void list_config_color_sideband_slots(struct string_list *list, const char *pref
  * of the line. This should be called for a single line only, which is
  * passed as the first N characters of the SRC array.
  *
- * NEEDSWORK: use "size_t n" instead for clarity.
  */
-static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
 {
 	int i;