[3/3] reftable tests: avoid "int" overflow, use "uint64_t"

Message ID	patch-3.3-93112305523-20220111T163908Z-avarab@gmail.com (mailing list archive)
State	Accepted
Commit	22d2f70e85e767abba2e284e32c0edb7f749e29c
Headers	From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>; To: git@vger.kernel.org; Cc: Junio C Hamano <gitster@pobox.com>, Johannes Schindelin <Johannes.Schindelin@gmx.de>, Han-Wen Nienhuys <hanwen@google.com>, Ævar Arnfjörð Bjarmason <avarab@gmail.com>; Subject: [PATCH 3/3] reftable tests: avoid "int" overflow, use "uint64_t"; Date: Tue, 11 Jan 2022 17:40:23 +0100; In-Reply-To: <cover-0.3-00000000000-20220111T163908Z-avarab@gmail.com>
Series	Fix SunCC compiler complaints new in v2.35.0-rc0
	[0/3] Fix SunCC compiler complaints new in v2.35.0-rc0
	[1/3] test-tool genzeros: initialize "zeros" to avoid SunCC warning
	[2/3] reftable: remove unreachable "return" statements
	[3/3] reftable tests: avoid "int" overflow, use "uint64_t"

Ævar Arnfjörð Bjarmason Jan. 11, 2022, 4:40 p.m. UTC

Change code added in 1ae2b8cda84 (reftable: add merged table view,
2021-10-07) to consistently use the "uint64_t" type. These "min" and
"max" variables get passed in the body of this function to a function
whose prototype is:

    [...] reftable_writer_set_limits([...], uint64_t min, uint64_t max);

This avoids the following warning on SunCC 12.5 on
gcc211.fsffrance.org:

    "reftable/merged_test.c", line 27: warning: initializer does not fit or is out of range: 0xffffffff

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 reftable/merged_test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
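
For illustration, the warning quoted in the commit message can be reproduced in spirit with a minimal C sketch. It assumes a platform where "int" is 32 bits wide; the helper names are hypothetical and do not appear in reftable. The constant 0xffffffff is larger than INT_MAX there, so it cannot be stored in an "int" without a conversion, while a "uint64_t" holds it exactly.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if 0xffffffff is representable as "int". */
static int fits_in_int(void)
{
	return 0xffffffffULL <= (unsigned long long)INT_MAX;
}

/* Mirrors the patched declaration: the initializer now fits its type. */
static uint64_t patched_min(void)
{
	uint64_t min = 0xffffffff;

	assert(min == UINT32_MAX); /* the value is stored exactly */
	return min;
}
```

With a 32-bit "int", fits_in_int() returns 0, which is presumably the condition SunCC is reporting for the old "int min = 0xffffffff;" line.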

Taylor Blau Jan. 11, 2022, 7:28 p.m. UTC | #1

On Tue, Jan 11, 2022 at 05:40:23PM +0100, Ævar Arnfjörð Bjarmason wrote:
> diff --git a/reftable/merged_test.c b/reftable/merged_test.c
> index 24461e8a802..b87ff495dfd 100644
> --- a/reftable/merged_test.c
> +++ b/reftable/merged_test.c
> @@ -24,8 +24,8 @@ license that can be found in the LICENSE file or at
>  static void write_test_table(struct strbuf *buf,
>  			     struct reftable_ref_record refs[], int n)
>  {
> -	int min = 0xffffffff;
> -	int max = 0;
> +	uint64_t min = 0xffffffff;
> +	uint64_t max = 0;

Han-Wen: it looks like the loop below the context here is to set the
min/max of update_index over all of the ref records?

If so, making these comparisons all unsigned makes sense to me. In
practice it's probably fine at least from a signedness perspective,
since the compiler _should_ be coercing both operands to unsigned.

But perhaps not so from a width perspective, if sizeof(int) != 8 (though
I suspect in practice that we are unlikely to have enough possible
values of update_index for that to matter).

In any case, you're only setting the lower half of `min` high. Maybe:

    uint64_t min = ~0ul;

instead?

Thanks,
Taylor

Han-Wen Nienhuys Jan. 11, 2022, 7:31 p.m. UTC | #2

On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Tue, Jan 11, 2022 at 05:40:23PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > diff --git a/reftable/merged_test.c b/reftable/merged_test.c
> > index 24461e8a802..b87ff495dfd 100644
> > --- a/reftable/merged_test.c
> > +++ b/reftable/merged_test.c
> > @@ -24,8 +24,8 @@ license that can be found in the LICENSE file or at
> >  static void write_test_table(struct strbuf *buf,
> >                            struct reftable_ref_record refs[], int n)
> >  {
> > -     int min = 0xffffffff;
> > -     int max = 0;
> > +     uint64_t min = 0xffffffff;
> > +     uint64_t max = 0;
>
> Han-Wen: it looks like the loop below the context here is to set the
> min/max of update_index over all of the ref records?

correct.

> But perhaps not so from a width perspective, if sizeof(int) != 8 (though
> I suspect in practice that we are unlikely to have enough possible
> values of update_index for that to matter).

correct.

> In any case, you're only setting the lower half of `min` high. Maybe:
>
>     uint64_t min = ~0ul;

yeah, that works.

Taylor Blau Jan. 11, 2022, 7:41 p.m. UTC | #3

On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
> > In any case, you're only setting the lower half of `min` high. Maybe:
> >
> >     uint64_t min = ~0ul;
>
> yeah, that works.

I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
somebody more confident than I in this area would be welcome :).

Thanks,
Taylor

Johannes Sixt Jan. 11, 2022, 8:08 p.m. UTC | #4

Am 11.01.22 um 20:41 schrieb Taylor Blau:
> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
>>> In any case, you're only setting the lower half of `min` high. Maybe:
>>>
>>>     uint64_t min = ~0ul;
>>
>> yeah, that works.
> 
> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
> somebody more confident than I in this area would be welcome :).

It does not work on Windows: unsigned long is 32 bits wide. You have to
make it

   uint64_t min = ~(uint64_t)0;

-- Hannes
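
A minimal sketch of the difference between the two spellings, with hypothetical helper names that are not part of Git. The point is where the complement is computed: "~0ul" is evaluated as "unsigned long" and only widened afterwards, whereas the cast form is evaluated in 64 bits from the start.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Widening ~0ul yields ULONG_MAX, whatever the width of unsigned long is. */
static uint64_t all_ones_via_ulong(void)
{
	uint64_t min = ~0ul;
	return min;
}

/* Casting first makes the complement happen in 64 bits. */
static uint64_t all_ones_via_cast(void)
{
	uint64_t min = ~(uint64_t)0;

	assert(min + 1 == 0); /* wraps to zero: every bit was set */
	return min;
}

/* True only where unsigned long is itself 64 bits wide. */
static int ulong_form_is_full_width(void)
{
	return all_ones_via_ulong() == all_ones_via_cast();
}
```

On an LP64 system such as 64-bit Linux both helpers return the same value; where "unsigned long" is 32 bits, as on Windows, the first one returns only 0xffffffff.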

Taylor Blau Jan. 11, 2022, 8:18 p.m. UTC | #5

On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
> Am 11.01.22 um 20:41 schrieb Taylor Blau:
> > On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
> >> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
> >>> In any case, you're only setting the lower half of `min` high. Maybe:
> >>>
> >>>     uint64_t min = ~0ul;
> >>
> >> yeah, that works.
> >
> > I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
> > somebody more confident than I in this area would be welcome :).
>
> It does not work on Windows: unsigned long is 32 bits wide. You have to
> make it
>
>    uint64_t min = ~(uint64_t)0;

Perfect; this is exactly what I was looking for. Thanks!

Taylor

Johannes Sixt Jan. 11, 2022, 8:21 p.m. UTC | #6

Am 11.01.22 um 21:18 schrieb Taylor Blau:
> On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
>> Am 11.01.22 um 20:41 schrieb Taylor Blau:
>>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
>>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
>>>>> In any case, you're only setting the lower half of `min` high. Maybe:
>>>>>
>>>>>     uint64_t min = ~0ul;
>>>>
>>>> yeah, that works.
>>>
>>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
>>> somebody more confident than I in this area would be welcome :).
>>
>> It does not work on Windows: unsigned long is 32 bits wide. You have to
>> make it
>>
>>    uint64_t min = ~(uint64_t)0;
> 
> Perfect; this is exactly what I was looking for. Thanks!

Actually, on second thought, UINT64_MAX would be even better.

-- Hannes

Taylor Blau Jan. 11, 2022, 8:24 p.m. UTC | #7

On Tue, Jan 11, 2022 at 09:21:11PM +0100, Johannes Sixt wrote:
> Am 11.01.22 um 21:18 schrieb Taylor Blau:
> > On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
> >> Am 11.01.22 um 20:41 schrieb Taylor Blau:
> >>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
> >>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
> >>>>> In any case, you're only setting the lower half of `min` high. Maybe:
> >>>>>
> >>>>>     uint64_t min = ~0ul;
> >>>>
> >>>> yeah, that works.
> >>>
> >>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
> >>> somebody more confident than I in this area would be welcome :).
> >>
> >> It does not work on Windows: unsigned long is 32 bits wide. You have to
> >> make it
> >>
> >>    uint64_t min = ~(uint64_t)0;
> >
> > Perfect; this is exactly what I was looking for. Thanks!
>
> Actually, on second thought, UINT64_MAX would be even better.

:-). I think that either is probably fine; I couldn't remember if
UINT64_MAX was part of POSIX or not (and clearly didn't bother to check!)

Thanks,
Taylor

Johannes Schindelin Jan. 12, 2022, 2:18 p.m. UTC | #8

Hi Taylor,

On Tue, 11 Jan 2022, Taylor Blau wrote:

> On Tue, Jan 11, 2022 at 09:21:11PM +0100, Johannes Sixt wrote:
> > Am 11.01.22 um 21:18 schrieb Taylor Blau:
> > > On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
> > >> Am 11.01.22 um 20:41 schrieb Taylor Blau:
> > >>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
> > >>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
> > >>>>> In any case, you're only setting the lower half of `min` high. Maybe:
> > >>>>>
> > >>>>>     uint64_t min = ~0ul;
> > >>>>
> > >>>> yeah, that works.
> > >>>
> > >>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
> > >>> somebody more confident than I in this area would be welcome :).
> > >>
> > >> It does not work on Windows: unsigned long is 32 bits wide. You have to
> > >> make it
> > >>
> > >>    uint64_t min = ~(uint64_t)0;
> > >
> > > Perfect; this is exactly what I was looking for. Thanks!
> >
> > Actually, on second thought, UINT64_MAX would be even better.
>
> :-). I think that either is probably fine; I couldn't remember if
> UINT64_MAX was part of POSIX or not (and clearly didn't bother to check!)

The best solution, of course, would be to `git grep` through the code and
see that UINT64_MAX is not used at all.

And that brings us to the question whether we really need to ensure that
exactly, precisely 64 bits are used for this variable? The answer is: no.
We may need it to be larger than 32-bit, so why not go for `uintmax_t` and
`UINTMAX_MAX`, both of which _are_ already used in Git's source code?

Ciao,
Dscho

Junio C Hamano Jan. 12, 2022, 7:02 p.m. UTC | #9

Johannes Sixt <j6t@kdbg.org> writes:

> Am 11.01.22 um 21:18 schrieb Taylor Blau:
>> On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
>>> Am 11.01.22 um 20:41 schrieb Taylor Blau:
>>>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
>>>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
>>>>>> In any case, you're only setting the lower half of `min` high. Maybe:
>>>>>>
>>>>>>     uint64_t min = ~0ul;
>>>>>
>>>>> yeah, that works.
>>>>
>>>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
>>>> somebody more confident than I in this area would be welcome :).
>>>
>>> It does not work on Windows: unsigned long is 32 bits wide. You have to
>>> make it
>>>
>>>    uint64_t min = ~(uint64_t)0;
>> 
>> Perfect; this is exactly what I was looking for. Thanks!

That sounds perfect.

> Actually, on second thought, UINT64_MAX would be even better.

I wouldn't introduce use of UINT64_MAX, which "git grep" does not
produce any hits for.

Unless it is very early in a development cycle, that is, in which
case we have enough time to help platforms that are not quite POSIX.

Taylor Blau Jan. 12, 2022, 7:07 p.m. UTC | #10

On Wed, Jan 12, 2022 at 11:02:05AM -0800, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
>
> > Am 11.01.22 um 21:18 schrieb Taylor Blau:
> >> On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
> >>> Am 11.01.22 um 20:41 schrieb Taylor Blau:
> >>>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
> >>>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
> >>>>>> In any case, you're only setting the lower half of `min` high. Maybe:
> >>>>>>
> >>>>>>     uint64_t min = ~0ul;
> >>>>>
> >>>>> yeah, that works.
> >>>>
> >>>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
> >>>> somebody more confident than I in this area would be welcome :).
> >>>
> >>> It does not work on Windows: unsigned long is 32 bits wide. You have to
> >>> make it
> >>>
> >>>    uint64_t min = ~(uint64_t)0;
> >>
> >> Perfect; this is exactly what I was looking for. Thanks!
>
> That sounds perfect.
>
> > Actually, on second thought, UINT64_MAX would be even better.
>
> I wouldn't introduce use of UINT64_MAX, which "git grep" does not
> produce any hits for.

> Unless it is very early in a development cycle, that is, in which
> case we have enough time to help platforms that are not quite POSIX.

Yep, I agree that avoiding introducing the first instance of UINT64_MAX
in our tree is worth doing (probably in general, but certainly now that
we're past even -rc0).

Either `~(uint64_t)0` or `UINTMAX_MAX` would be fine with me.

Thanks,
Taylor

Ævar Arnfjörð Bjarmason Jan. 13, 2022, 10:04 a.m. UTC | #11

On Wed, Jan 12 2022, Taylor Blau wrote:

> On Wed, Jan 12, 2022 at 11:02:05AM -0800, Junio C Hamano wrote:
>> Johannes Sixt <j6t@kdbg.org> writes:
>>
>> > Am 11.01.22 um 21:18 schrieb Taylor Blau:
>> >> On Tue, Jan 11, 2022 at 09:08:46PM +0100, Johannes Sixt wrote:
>> >>> Am 11.01.22 um 20:41 schrieb Taylor Blau:
>> >>>> On Tue, Jan 11, 2022 at 08:31:47PM +0100, Han-Wen Nienhuys wrote:
>> >>>>> On Tue, Jan 11, 2022 at 8:28 PM Taylor Blau <me@ttaylorr.com> wrote:
>> >>>>>> In any case, you're only setting the lower half of `min` high. Maybe:
>> >>>>>>
>> >>>>>>     uint64_t min = ~0ul;
>> >>>>>
>> >>>>> yeah, that works.
>> >>>>
>> >>>> I'm pretty sure this is OK on 32-bit systems, too, but confirmation from
>> >>>> somebody more confident than I in this area would be welcome :).
>> >>>
>> >>> It does not work on Windows: unsigned long is 32 bits wide. You have to
>> >>> make it
>> >>>
>> >>>    uint64_t min = ~(uint64_t)0;
>> >>
>> >> Perfect; this is exactly what I was looking for. Thanks!
>>
>> That sounds perfect.
>>
>> > Actually, on second thought, UINT64_MAX would be even better.
>>
>> I wouldn't introduce use of UINT64_MAX, which "git grep" does not
>> produce any hits for.
>
>> Unless it is very early in a development cycle, that is, in which
>> case we have enough time to help platforms that are not quite POSIX.
>
> Yep, I agree that avoiding introducing the first instance of UINT64_MAX
> in our tree is worth doing (probably in general, but certainly now that
> we're past even -rc0).
>
> Either `~(uint64_t)0` or `UINTMAX_MAX` would be fine with me.

The reason I left it at 0xffffffff is because the current test clearly
doesn't care about using the maximum width of the type, and I was just
trying to get rid of the associated compiler warning.

So I'll leave it to Han-Wen to comment on whether initializing "min" to
the maximum of the type is actually important here.

As far as what we'd pick to get the maximum type value goes, we should
just prefer whatever we use for that already in that codebase, and we've
got this in a related file there:
    
    reftable/generic.c:     struct reftable_log_record log = {
    reftable/generic.c-             .refname = (char *)name,
    reftable/generic.c-             .update_index = ~((uint64_t)0),
    reftable/generic.c-     };

(Which is what Johannes Sixt independently suggested upthread in
<45baffd7-c9f3-cc52-47b4-ea0fee0182a8@kdbg.org>).
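
To make the point about the starting value concrete, here is a rough sketch of the kind of min/max loop discussed upthread. It is not the code from reftable/merged_test.c: the struct is a hypothetical stand-in that models only "update_index", and the "start_min" parameter exists only so that both candidate initializers can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct reftable_ref_record; one field only. */
struct ref_sketch {
	uint64_t update_index;
};

/*
 * Compute the smallest and largest update_index in refs[0..n-1].
 * "start_min" seeds the running minimum; it only has to be at least
 * as large as every update_index that will be seen.
 */
static void index_limits(const struct ref_sketch *refs, size_t n,
			 uint64_t start_min, uint64_t *min, uint64_t *max)
{
	size_t i;

	assert(min && max);
	*min = start_min;
	*max = 0;
	for (i = 0; i < n; i++) {
		if (refs[i].update_index < *min)
			*min = refs[i].update_index;
		if (refs[i].update_index > *max)
			*max = refs[i].update_index;
	}
}
```

As long as every update_index in the test data is below 0xffffffff, seeding with 0xffffffff or with ~((uint64_t)0) gives the same limits; the two only differ for values that need more than 32 bits.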

Junio C Hamano Jan. 13, 2022, 9:38 p.m. UTC | #12

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> The reason I left it at 0xffffffff is because the current test clearly
> doesn't care about using the maximum width of the type, and I was just
> trying to get rid of the associated compiler warning.

Sounds sane.  Let's take the original one you sent out, and those
who want to make things consistent can swap it with ~((uint64_t)0)
after the release.

Thanks.
