diff mbox series

[v2,4/4] xdiff: introduce XDL_ALLOC_GROW()

Message ID 8c24cd7737b29d461788b71f6a94eb74c468ad33.1657297520.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit f7b587bf656574103a0a2ad9c2337b0d15c0e92d
Headers show
Series xdiff: introduce memory allocation macros | expand

Commit Message

Phillip Wood July 8, 2022, 4:25 p.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

Add a helper to grow an array. This is analogous to ALLOC_GROW() in
the rest of the codebase but returns −1 on allocation failure to
accommodate other users of libxdiff such as libgit2. It will also
return a error if the multiplication overflows while calculating the
new allocation size. Note that this keeps doubling on reallocation
like the code it is replacing rather than increasing the existing size
by half like ALLOC_GROW(). It does however copy ALLOC_GROW()'s trick
of adding a small amount to the new allocation to avoid a lot of
reallocations at small sizes.

Note that xdl_alloc_grow_helper() uses long rather than size_t for
`nr` and `alloc` to match the existing code.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 xdiff/xmacros.h  | 10 ++++++++++
 xdiff/xprepare.c | 19 ++++---------------
 xdiff/xutils.c   | 17 +++++++++++++++++
 xdiff/xutils.h   |  3 ++-
 4 files changed, 33 insertions(+), 16 deletions(-)

Comments

Ævar Arnfjörð Bjarmason July 8, 2022, 10:17 p.m. UTC | #1
On Fri, Jul 08 2022, Phillip Wood via GitGitGadget wrote:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> Add a helper to grow an array. This is analogous to ALLOC_GROW() in
> the rest of the codebase but returns −1 on allocation failure to
> accommodate other users of libxdiff such as libgit2. 
> [...]
> +		if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
> +void* xdl_alloc_grow_helper(void *p, long nr, long *alloc, size_t size)
> +{
> +	void *tmp = NULL;
> +	size_t n = ((LONG_MAX - 16) / 2 >= *alloc) ? 2 * *alloc + 16 : LONG_MAX;
> +	if (nr > n)
> +		n = nr;
> +	if (SIZE_MAX / size >= n)
> +		tmp = xdl_realloc(p, n * size);
> +	if (tmp) {
> +		*alloc = n;
> +	} else {
> +		xdl_free(p);
> +		*alloc = 0;
> +	}
> +	return tmp;
> +}

Whether you agree with the entire approach in my series-on-top[1] or not
(and it looks like not), this way of doing it seems needlessly complex.

I.e. the whole "pass the size" business is here because you're wanting
to use this as an expression, which ALLOC_GROW() isn't able to do.

But you've also changed every single callsite anyway.

So why not just change:

    if (XDL_ALLOC_GROW(recs, ...))

To:

    XDL_ALLOC_GROW(recs, ...);
    if (!recs)

And do away with needing to pass this through a function where we get
the size, and thus losing the type information before we can call
sizeof().

Then, as that series shows the implementation here could be pretty much
an exact line-for-line copy of what we have in cache.h, including the
same alloc_nr(), all without casts etc.

All of which seems much simpler than trying to maintain the constraint
that this must be usable in an expression.

Your commit message sort-of addresses this by mentioning that this
"returns −1 on allocation failure to accommodate other users of libxdiff
such as libgit2".

But since libgit2 won't use this, but only a copy of this xdiff code in
libgit2 I don't see how that makes sense. We're only talking about using
this in the xdiff code we maintain, and even if libgit2 wanted to use it
it could deal with not being able to use it in an expression, no?
    	
1. https://lore.kernel.org/git/patch-5.7-3665576576f-20220708T140354Z-avarab@gmail.com/
Phillip Wood July 11, 2022, 10 a.m. UTC | #2
On 08/07/2022 23:17, Ævar Arnfjörð Bjarmason wrote:
> 
> On Fri, Jul 08 2022, Phillip Wood via GitGitGadget wrote:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> Add a helper to grow an array. This is analogous to ALLOC_GROW() in
>> the rest of the codebase but returns −1 on allocation failure to
>> accommodate other users of libxdiff such as libgit2.
>> [...]
>> +		if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
>> +void* xdl_alloc_grow_helper(void *p, long nr, long *alloc, size_t size)
>> +{
>> +	void *tmp = NULL;
>> +	size_t n = ((LONG_MAX - 16) / 2 >= *alloc) ? 2 * *alloc + 16 : LONG_MAX;
>> +	if (nr > n)
>> +		n = nr;
>> +	if (SIZE_MAX / size >= n)
>> +		tmp = xdl_realloc(p, n * size);
>> +	if (tmp) {
>> +		*alloc = n;
>> +	} else {
>> +		xdl_free(p);
>> +		*alloc = 0;
>> +	}
>> +	return tmp;
>> +}
> 
> Whether you agree with the entire approach in my series-on-top[1] or not
> (and it looks like not), this way of doing it seems needlessly complex.
> 
> I.e. the whole "pass the size" business is here because you're wanting
> to use this as an expression, which ALLOC_GROW() isn't able to do.
> 
> But you've also changed every single callsite anyway.
> 
> So why not just change:
> 
>      if (XDL_ALLOC_GROW(recs, ...))
> 
> To:
> 
>      XDL_ALLOC_GROW(recs, ...);
>      if (!recs)

Because I think it's ugly, we'd never define a function as void(int* 
result, args...) and I don't think we should do that for a macro where 
we can avoid it. The existing ALLOC_* macros make sense as statements 
because they die on failure.

Best Wishes

Phillip

> And do away with needing to pass this through a function where we get
> the size, and thus losing the type information before we can call
> sizeof().
> 
> Then, as that series shows the implementation here could be pretty much
> an exact line-for-line copy of what we have in cache.h, including the
> same alloc_nr(), all without casts etc.
 >
> All of which seems much simpler than trying to maintain the constraint
> that this must be usable in an expression.
> 
> Your commit message sort-of addresses this by mentioning that this
> "returns −1 on allocation failure to accommodate other users of libxdiff
> such as libgit2".
> 
> But since libgit2 won't use this, but only a copy of this xdiff code in
> libgit2 I don't see how that makes sense. We're only talking about using
> this in the xdiff code we maintain, and even if libgit2 wanted to use it
> it could deal with not being able to use it in an expression, no?
>      	
> 1. https://lore.kernel.org/git/patch-5.7-3665576576f-20220708T140354Z-avarab@gmail.com/
Jeff King July 12, 2022, 7:19 a.m. UTC | #3
On Mon, Jul 11, 2022 at 11:00:06AM +0100, Phillip Wood wrote:

> > But you've also changed every single callsite anyway.
> > 
> > So why not just change:
> > 
> >      if (XDL_ALLOC_GROW(recs, ...))
> > 
> > To:
> > 
> >      XDL_ALLOC_GROW(recs, ...);
> >      if (!recs)
> 
> Because I think it's ugly, we'd never define a function as void(int* result,
> args...) and I don't think we should do that for a macro where we can avoid
> it. The existing ALLOC_* macros make sense as statements because they die on
> failure.

Those macros are already unlike functions, though. They take a variable
_name_, not a pointer to a variable, and assign to it. A version which
looked like a function would have to be:

  if (XDL_ALLOC_GROW(&recs, ...))

So in my head I read them as assignment statements already, making the
expression-like form a bit jarring.

Just my two cents. I don't care _too_ much either way, but I'd probably
be swayed if one implementation is much simpler than the other.

-Peff
Phillip Wood July 13, 2022, 9:38 a.m. UTC | #4
Hi Peff

It's great to see you back on the list

On 12/07/2022 08:19, Jeff King wrote:
> On Mon, Jul 11, 2022 at 11:00:06AM +0100, Phillip Wood wrote:
> 
>>> But you've also changed every single callsite anyway.
>>>
>>> So why not just change:
>>>
>>>       if (XDL_ALLOC_GROW(recs, ...))
>>>
>>> To:
>>>
>>>       XDL_ALLOC_GROW(recs, ...);
>>>       if (!recs)
>>
>> Because I think it's ugly, we'd never define a function as void(int* result,
>> args...) and I don't think we should do that for a macro where we can avoid
>> it. The existing ALLOC_* macros make sense as statements because they die on
>> failure.
> 
> Those macros are already unlike functions, though. They take a variable
> _name_, not a pointer to a variable, and assign to it. A version which
> looked like a function would have to be:
> 
>    if (XDL_ALLOC_GROW(&recs, ...))
> 
> So in my head I read them as assignment statements already, making the
> expression-like form a bit jarring.

I see what you're saying, I agree it is not exactly analogous. I tend to 
think of assignment as an expression because it returns the value that's 
being assigned, though here it's a bit muddy because the assignment is 
hidden inside the macro and there is no tell-tale '&' as there would be 
if it were calling function.

> Just my two cents. I don't care _too_ much either way, but I'd probably
> be swayed if one implementation is much simpler than the other.

I don't think there is much difference in the complexity in this case. 
In general I think that using a function with a macro wrapper can make 
the implementation more readable as the function it is not littered with 
the extra parentheses and backslashes that the same code in a macro 
requires.

Best Wishes

Phillip
diff mbox series

Patch

diff --git a/xdiff/xmacros.h b/xdiff/xmacros.h
index 0977d1615ac..8487bb396fa 100644
--- a/xdiff/xmacros.h
+++ b/xdiff/xmacros.h
@@ -58,4 +58,14 @@  do { \
 /* Allocate an array of nr zeroed out elements, returns NULL on failure */
 #define XDL_CALLOC_ARRAY(p, nr)	((p) = xdl_calloc(nr, sizeof(*(p))))
 
+/*
+ * Ensure array p can accommodate at least nr elements, growing the
+ * array and updating alloc (which is the number of allocated
+ * elements) as necessary. Frees p and returns -1 on failure, returns
+ * 0 on success
+ */
+#define XDL_ALLOC_GROW(p, nr, alloc)	\
+	(-!((nr) <= (alloc) ||		\
+	    ((p) = xdl_alloc_grow_helper((p), (nr), &(alloc), sizeof(*(p))))))
+
 #endif /* #if !defined(XMACROS_H) */
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index b016570c488..c84549f6c50 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -111,7 +111,6 @@  static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 	long hi;
 	char const *line;
 	xdlclass_t *rcrec;
-	xdlclass_t **rcrecs;
 
 	line = rec->ptr;
 	hi = (long) XDL_HASHLONG(rec->ha, cf->hbits);
@@ -127,14 +126,8 @@  static int xdl_classify_record(unsigned int pass, xdlclassifier_t *cf, xrecord_t
 			return -1;
 		}
 		rcrec->idx = cf->count++;
-		if (cf->count > cf->alloc) {
-			cf->alloc *= 2;
-			if (!(rcrecs = (xdlclass_t **) xdl_realloc(cf->rcrecs, cf->alloc * sizeof(xdlclass_t *)))) {
-
+		if (XDL_ALLOC_GROW(cf->rcrecs, cf->count, cf->alloc))
 				return -1;
-			}
-			cf->rcrecs = rcrecs;
-		}
 		cf->rcrecs[rcrec->idx] = rcrec;
 		rcrec->line = line;
 		rcrec->size = rec->size;
@@ -163,7 +156,7 @@  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 	unsigned long hav;
 	char const *blk, *cur, *top, *prev;
 	xrecord_t *crec;
-	xrecord_t **recs, **rrecs;
+	xrecord_t **recs;
 	xrecord_t **rhash;
 	unsigned long *ha;
 	char *rchg;
@@ -190,12 +183,8 @@  static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_
 		for (top = blk + bsize; cur < top; ) {
 			prev = cur;
 			hav = xdl_hash_record(&cur, top, xpp->flags);
-			if (nrec >= narec) {
-				narec *= 2;
-				if (!(rrecs = (xrecord_t **) xdl_realloc(recs, narec * sizeof(xrecord_t *))))
-					goto abort;
-				recs = rrecs;
-			}
+			if (XDL_ALLOC_GROW(recs, nrec + 1, narec))
+				goto abort;
 			if (!(crec = xdl_cha_alloc(&xdf->rcha)))
 				goto abort;
 			crec->ptr = prev;
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 115b2b1640b..9e36f24875d 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -432,3 +432,20 @@  int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 
 	return 0;
 }
+
+void* xdl_alloc_grow_helper(void *p, long nr, long *alloc, size_t size)
+{
+	void *tmp = NULL;
+	size_t n = ((LONG_MAX - 16) / 2 >= *alloc) ? 2 * *alloc + 16 : LONG_MAX;
+	if (nr > n)
+		n = nr;
+	if (SIZE_MAX / size >= n)
+		tmp = xdl_realloc(p, n * size);
+	if (tmp) {
+		*alloc = n;
+	} else {
+		xdl_free(p);
+		*alloc = 0;
+	}
+	return tmp;
+}
diff --git a/xdiff/xutils.h b/xdiff/xutils.h
index fba7bae03c7..fd0bba94e8b 100644
--- a/xdiff/xutils.h
+++ b/xdiff/xutils.h
@@ -42,6 +42,7 @@  int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
 int xdl_fall_back_diff(xdfenv_t *diff_env, xpparam_t const *xpp,
 		       int line1, int count1, int line2, int count2);
 
-
+/* Do not call this function, use XDL_ALLOC_GROW instead */
+void* xdl_alloc_grow_helper(void* p, long nr, long* alloc, size_t size);
 
 #endif /* #if !defined(XUTILS_H) */