Message ID | RFC-patch-2.2-364d1194a95-20220415T101740Z-avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | reftable: remove poor man's SANITIZE=address, fix a memset() bug | expand |
Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: > Return NULL instead of possibly feeding a NULL to memset() in > reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: > > reftable/publicbasics.c: In function ‘reftable_calloc’: > reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] > 43 | memset(p, 0, sz); > | ^~~~~~~~~~~~~~~~ > [...] > > This bug has been with us ever since this code was added in > ef8a6c62687 (reftable: utility functions, 2021-10-07). Not sure it's a bug, or limited to this specific location. AFAICS it's more a general lack of handling of allocation failures in the reftable code. Perhaps it was a conscious decision to let the system deal with them via segfaults? When the code is used by Git eventually it should use xmalloc and xrealloc instead of calling malloc(3) and realloc(3) directly, to handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. This will calm down the analyzer and extend the safety to the rest of the reftable code, not just this memset call. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > reftable/publicbasics.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c > index 0ad7d5c0ff2..a18167f5ab7 100644 > --- a/reftable/publicbasics.c > +++ b/reftable/publicbasics.c > @@ -40,6 +40,8 @@ void reftable_free(void *p) > void *reftable_calloc(size_t sz) > { > void *p = reftable_malloc(sz); > + if (!p) > + return NULL; > memset(p, 0, sz); This function is calloc(3) reimplemented, except it can't make use of pre-zero'd blocks of memory and doesn't multiply element size and count for the caller while checking for overflow, making it slower and harder to use securely. :-/ > return p; > }
On Fri, Apr 15 2022, René Scharfe wrote: > Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >> Return NULL instead of possibly feeding a NULL to memset() in >> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >> >> reftable/publicbasics.c: In function ‘reftable_calloc’: >> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >> 43 | memset(p, 0, sz); >> | ^~~~~~~~~~~~~~~~ >> [...] >> >> This bug has been with us ever since this code was added in >> ef8a6c62687 (reftable: utility functions, 2021-10-07). > > Not sure it's a bug, or limited to this specific location. AFAICS it's > more a general lack of handling of allocation failures in the reftable > code. Perhaps it was a conscious decision to let the system deal with > them via segfaults? I think it's just buggy, it tries to deal with malloc returning NULL in other contexts. > When the code is used by Git eventually it should use xmalloc and > xrealloc instead of calling malloc(3) and realloc(3) directly, to > handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. > This will calm down the analyzer and extend the safety to the rest of > the reftable code, not just this memset call. Yeah, its whole custom malloc handling should go away. >> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> --- >> reftable/publicbasics.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c >> index 0ad7d5c0ff2..a18167f5ab7 100644 >> --- a/reftable/publicbasics.c >> +++ b/reftable/publicbasics.c >> @@ -40,6 +40,8 @@ void reftable_free(void *p) >> void *reftable_calloc(size_t sz) >> { >> void *p = reftable_malloc(sz); >> + if (!p) >> + return NULL; >> memset(p, 0, sz); > > This function is calloc(3) reimplemented, except it can't make use of > pre-zero'd blocks of memory and doesn't multiply element size and count > for the caller while checking for overflow, making it slower and harder > to use securely. :-/ *nod*, this is really just re-arranging the deck chairs. Maybe Han-Wen had something in mind, but I really don't see the point of having it use malloc wrappers at all, as opposed to just having the linker point it to the right "malloc". So if it needs to be used outside of git.git it would just need the trivial xmalloc() etc. wrappers.
Hi Ævar and René On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote: > > On Fri, Apr 15 2022, René Scharfe wrote: > >> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >>> Return NULL instead of possibly feeding a NULL to memset() in >>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >>> >>> reftable/publicbasics.c: In function ‘reftable_calloc’: >>> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >>> 43 | memset(p, 0, sz); >>> | ^~~~~~~~~~~~~~~~ >>> [...] >>> >>> This bug has been with us ever since this code was added in >>> ef8a6c62687 (reftable: utility functions, 2021-10-07). >> >> Not sure it's a bug, or limited to this specific location. AFAICS it's >> more a general lack of handling of allocation failures in the reftable >> code. Perhaps it was a conscious decision to let the system deal with >> them via segfaults? > > I think it's just buggy, it tries to deal with malloc returning NULL in > other contexts. Does it? I just quickly scanned the output of git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master and I didn't see a single NULL check >> When the code is used by Git eventually it should use xmalloc and >> xrealloc instead of calling malloc(3) and realloc(3) directly, to >> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. >> This will calm down the analyzer and extend the safety to the rest of >> the reftable code, not just this memset call. > > Yeah, its whole custom malloc handling should go away. Isn't it there so the same code can be used by libgit2 and other things that let the caller specify a custom allocator? Best Wishes Phillip
On Fri, Apr 15 2022, Phillip Wood wrote: > Hi Ævar and René > > On 15/04/2022 14:53, Ævar Arnfjörð Bjarmason wrote: >> On Fri, Apr 15 2022, René Scharfe wrote: >> >>> Am 15.04.22 um 12:21 schrieb Ævar Arnfjörð Bjarmason: >>>> Return NULL instead of possibly feeding a NULL to memset() in >>>> reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: >>>> >>>> reftable/publicbasics.c: In function ‘reftable_calloc’: >>>> reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] >>>> 43 | memset(p, 0, sz); >>>> | ^~~~~~~~~~~~~~~~ >>>> [...] >>>> >>>> This bug has been with us ever since this code was added in >>>> ef8a6c62687 (reftable: utility functions, 2021-10-07). >>> >>> Not sure it's a bug, or limited to this specific location. AFAICS it's >>> more a general lack of handling of allocation failures in the reftable >>> code. Perhaps it was a conscious decision to let the system deal with >>> them via segfaults? >> I think it's just buggy, it tries to deal with malloc returning NULL >> in >> other contexts. > > Does it? I just quickly scanned the output of > > git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master > > and I didn't see a single NULL check I think you're right, I retrieved that information from wetware. Looking again I think I was confusing it with the if (x) free(x) fixes in 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). >>> When the code is used by Git eventually it should use xmalloc and >>> xrealloc instead of calling malloc(3) and realloc(3) directly, to >>> handle allocation failures explicitly, and to support GIT_ALLOC_LIMIT. >>> This will calm down the analyzer and extend the safety to the rest of >>> the reftable code, not just this memset call. >> Yeah, its whole custom malloc handling should go away. > > Isn't it there so the same code can be used by libgit2 and other > things that let the caller specify a custom allocator? I don't think so, but I may be wrong. I think it's a case of over-generalization where we'd be better off with something simpler, i.e. just using our normal allocation functions. That still allows for taking this code and plugging it into any custom allocator. If you simply want to compile some code such as this to use another allocator you can easily do that via the linker, as e.g. git.git's build process allows you to do with nedmalloc. So presumably if libgit2 would want to use this they'd just do that via their build process. I.e. they'd have "malloc" point to a symbol that would resolve to a shimmy layer that would dispatch to functions using this: https://github.com/libgit2/libgit2/blob/main/src/util/alloc.c The reason you'd use these sorts of pluggable malloc functions is, I think, a few: 1. You are a library like libgit2, as opposed to libreftable itself, so you want to provide this sort of "set the alloc pointers to this" now. But in this case we either always want malloc/free (git.git) or the "parent" can provide these wrappers (libgit2). 2. You want to support switching dynamically (or not-globally), or have N instances with different malloc, as e.g. the PCREv2 API does: cbe81e653fa (grep/pcre2: move back to thread-only PCREv2 structures, 2021-02-18). For this code I think that's thoroughly in YAGNI territory. 3. Your distribution model can't assume the user can adjust this via linking, e.g. C source that's intended to be #included as-is (although there you could use defines...) etc. But maybe I'm missing something. But a really good reason not to do this and just rely on the link-time overriding is: A. Things look the same, and benefit from more general fixes (although to be fair the x*alloc() wrappers would work either way..) B. You don't get subtle bugs where you forget some_custom_malloc() and then use a generic free(). Having looked just now reftable/ has at least one such submarine-segfault waiting to happen. C. If you don't actually need the hyper-customize pluggable model of say PCREv2 where the library is trying to support having two patterns in memory managed by different allocators, or other such advanced use-cases it's just extra complexity. Some of this was discussed for xdiff/* in this thread: https://lore.kernel.org/git/220216.86leybszht.gmgdl@evledraar.gmail.com/
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> Does it? I just quickly scanned the output of >> >> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master >> >> and I didn't see a single NULL check > > I think you're right, I retrieved that information from wetware. Looking > again I think I was confusing it with the if (x) free(x) fixes in > 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). True. Initially I hoped that these RFC changes gives us a better solution that comes from stepping back and rethinking the issues around the original "why are we calling memset() with NULL?", and if it were only "well, in _return_block() functions, we clear the block before calling _free()---that shouldn't be necessary, if the calling custom malloc-free pair cares, they can do the clearing, and plain vanilla free() certainly doesn't, so let's stop calling memset()", it certainly would have fallen into that category, but everything these RFC patches do beyond that seems "oh, it does not look important to me, so let's rip it out to simplify", without knowing enough to say if that is sensible. But ... >> Isn't it there so the same code can be used by libgit2 and other >> things that let the caller specify a custom allocator? > > I don't think so, but I may be wrong. I think it's a case of > over-generalization where we'd be better off with something simpler, > i.e. just using our normal allocation functions. ... many points that was raised on the reftable code in this thread may deserve response to explain the rationale behind the current code and design, as nobody can improve, without breaking the intended way to be used, without knowing what it is. Thanks for marking the patches with RFC. I think the "patch" is a bit too dense a form to point out the issues in the existing code, but the discussion that follows by quoting the points in the original code and suggested changes is a good way to think about the code before these RFC patches. Ideally, it would have been much better if these points were raised during the review before the code was accepted to the code base, but it is better to have one now, rather than never ;-) THanks.
On Fri, Apr 15, 2022 at 6:23 PM Junio C Hamano <gitster@pobox.com> wrote: > >> git grep -e 'reftable_[mc]alloc' -e 'reftable_realloc' -A2 origin/master > >> > >> and I didn't see a single NULL check > > > > I think you're right, I retrieved that information from wetware. Looking > > again I think I was confusing it with the if (x) free(x) fixes in > > 34230514b83 (Merge branch 'hn/reftable-coverity-fixes', 2022-02-16). > > True. Initially I hoped that these RFC changes gives us a better > solution that comes from stepping back and rethinking the issues > around the original "why are we calling memset() with NULL?", and memset with NULL is an oversight. The malloc routines are pluggable so the code could be reused for libgit2. However, as use within Git itself is still not imminent, they could just as well be removed as they are just a premature generalization right now. > if it were only "well, in _return_block() functions, we clear the > block before calling _free()---that shouldn't be necessary, if the > calling custom malloc-free pair cares, they can do the clearing, and > plain vanilla free() certainly doesn't, so let's stop calling The memset() calls on free() (eg. in are there to tease out use-after-free bugs in the unittests, but they should probably be removed from file_return_block() as that is production code.
diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c index 0ad7d5c0ff2..a18167f5ab7 100644 --- a/reftable/publicbasics.c +++ b/reftable/publicbasics.c @@ -40,6 +40,8 @@ void reftable_free(void *p) void *reftable_calloc(size_t sz) { void *p = reftable_malloc(sz); + if (!p) + return NULL; memset(p, 0, sz); return p; }
Return NULL instead of possibly feeding a NULL to memset() in reftable_calloc(). This issue was noted by GCC 12's -fanalyzer: reftable/publicbasics.c: In function ‘reftable_calloc’: reftable/publicbasics.c:43:9: error: use of possibly-NULL ‘p’ where non-null expected [CWE-690] [-Werror=analyzer-possible-null-argument] 43 | memset(p, 0, sz); | ^~~~~~~~~~~~~~~~ [...] This bug has been with us ever since this code was added in ef8a6c62687 (reftable: utility functions, 2021-10-07). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- reftable/publicbasics.c | 2 ++ 1 file changed, 2 insertions(+)