diff mbox series

[1/9] lib/sort.c: implement sort() variant taking context argument

Message ID 20190619121540.29320-2-boris.brezillon@collabora.com (mailing list archive)
State New, archived
Headers show
Series media: hantro: Add support for H264 decoding | expand

Commit Message

Boris Brezillon June 19, 2019, 12:15 p.m. UTC
From: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Our list_sort() utility has always supported a context argument that
is passed through to the comparison routine. Now there's a use case
for the similar thing for sort().

This implements sort_r by simply extending the existing sort function
in the obvious way. To avoid code duplication, we want to implement
sort() in terms of sort_r(). The naive way to do that is

static int cmp_wrapper(const void *a, const void *b, const void *ctx)
{
  int (*real_cmp)(const void*, const void*) = ctx;
  return real_cmp(a, b);
}

sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }

but this would do two indirect calls for each comparison. Instead, do
as is done for the default swap functions - that only adds a cost of a
single easily predicted branch to each comparison call.

Aside from introducing support for the context argument, this also
serves as preparation for patches that will eliminate the indirect
comparison calls in common cases.

Requested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
Hi all,

Andrew, you acked the first version of this patch, but Rasmus proposed
a better solution and posted a v2. Can you review/ack this version.

Hans, Mauro, Andrew suggested to have this patch applied along with
its first user (the H264 backend of the hantro codec), so here it is.
Note that, if possible, I'd like to have this patch queued for the next
release even if the H264 bits don't get accepted as is. The rationale
here being that Rasmus told me he was planning to further improve the
sort() logic after the next -rc1 is out, and I fear his changes will
conflict with this patch, which might involve some kind synchronisation
(a topic branch) between the media maintainers and Andrew.

Let me know how you want to proceed with that.

Regards,

Boris
---
 include/linux/sort.h |  5 +++++
 lib/sort.c           | 34 ++++++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 6 deletions(-)

Comments

Boris Brezillon July 25, 2019, 12:40 p.m. UTC | #1
Hi Andrew,

On Wed, 19 Jun 2019 14:15:32 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> 
> Our list_sort() utility has always supported a context argument that
> is passed through to the comparison routine. Now there's a use case
> for the similar thing for sort().
> 
> This implements sort_r by simply extending the existing sort function
> in the obvious way. To avoid code duplication, we want to implement
> sort() in terms of sort_r(). The naive way to do that is
> 
> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
> {
>   int (*real_cmp)(const void*, const void*) = ctx;
>   return real_cmp(a, b);
> }
> 
> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
> 
> but this would do two indirect calls for each comparison. Instead, do
> as is done for the default swap functions - that only adds a cost of a
> single easily predicted branch to each comparison call.
> 
> Aside from introducing support for the context argument, this also
> serves as preparation for patches that will eliminate the indirect
> comparison calls in common cases.
> 
> Requested-by: Boris Brezillon <boris.brezillon@collabora.com>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
> Hi all,
> 
> Andrew, you acked the first version of this patch, but Rasmus proposed
> a better solution and posted a v2. Can you review/ack this version.

Hans is planning to take that patch soon. Would you mind adding your
Ack back (assuming you're okay with the new version of course)?

Thanks,

Boris

> 
> Hans, Mauro, Andrew suggested to have this patch applied along with
> its first user (the H264 backend of the hantro codec), so here it is.
> Note that, if possible, I'd like to have this patch queued for the next
> release even if the H264 bits don't get accepted as is. The rationale
> here being that Rasmus told me he was planning to further improve the
> sort() logic after the next -rc1 is out, and I fear his changes will
> conflict with this patch, which might involve some kind synchronisation
> (a topic branch) between the media maintainers and Andrew.
> 
> Let me know how you want to proceed with that.
> 
> Regards,
> 
> Boris
> ---
>  include/linux/sort.h |  5 +++++
>  lib/sort.c           | 34 ++++++++++++++++++++++++++++------
>  2 files changed, 33 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/sort.h b/include/linux/sort.h
> index 2b99a5dd073d..61b96d0ebc44 100644
> --- a/include/linux/sort.h
> +++ b/include/linux/sort.h
> @@ -4,6 +4,11 @@
>  
>  #include <linux/types.h>
>  
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp)(const void *, const void *, const void *),
> +	    void (*swap)(void *, void *, int),
> +	    const void *priv);
> +
>  void sort(void *base, size_t num, size_t size,
>  	  int (*cmp)(const void *, const void *),
>  	  void (*swap)(void *, void *, int));
> diff --git a/lib/sort.c b/lib/sort.c
> index cf408aec3733..d54cf97e9548 100644
> --- a/lib/sort.c
> +++ b/lib/sort.c
> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>  		swap_func(a, b, (int)size);
>  }
>  
> +typedef int (*cmp_func_t)(const void *, const void *);
> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)
> +
> +static int do_cmp(const void *a, const void *b,
> +		  cmp_r_func_t cmp, const void *priv)
> +{
> +	if (cmp == _CMP_WRAPPER)
> +		return ((cmp_func_t)(priv))(a, b);
> +	return cmp(a, b, priv);
> +}
> +
>  /**
>   * parent - given the offset of the child, find the offset of the parent.
>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>  }
>  
>  /**
> - * sort - sort an array of elements
> + * sort_r - sort an array of elements
>   * @base: pointer to data to sort
>   * @num: number of elements
>   * @size: size of each element
>   * @cmp_func: pointer to comparison function
>   * @swap_func: pointer to swap function or NULL
> + * @priv: third argument passed to comparison function
>   *
>   * This function does a heapsort on the given array.  You may provide
>   * a swap_func function if you need to do something more than a memory
> @@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>   * O(n*n) worst-case behavior and extra memory requirements that make
>   * it less suitable for kernel use.
>   */
> -void sort(void *base, size_t num, size_t size,
> -	  int (*cmp_func)(const void *, const void *),
> -	  void (*swap_func)(void *, void *, int size))
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp_func)(const void *, const void *, const void *),
> +	    void (*swap_func)(void *, void *, int size),
> +	    const void *priv)
>  {
>  	/* pre-scale counters for performance */
>  	size_t n = num * size, a = (num/2) * size;
> @@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
>  		 * average, 3/4 worst-case.)
>  		 */
>  		for (b = a; c = 2*b + size, (d = c + size) < n;)
> -			b = cmp_func(base + c, base + d) >= 0 ? c : d;
> +			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
>  		if (d == n)	/* Special case last leaf with no sibling */
>  			b = c;
>  
>  		/* Now backtrack from "b" to the correct location for "a" */
> -		while (b != a && cmp_func(base + a, base + b) >= 0)
> +		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
>  			b = parent(b, lsbit, size);
>  		c = b;			/* Where "a" belongs */
>  		while (b != a) {	/* Shift it into place */
> @@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
>  		}
>  	}
>  }
> +EXPORT_SYMBOL(sort_r);
> +
> +void sort(void *base, size_t num, size_t size,
> +	  int (*cmp_func)(const void *, const void *),
> +	  void (*swap_func)(void *, void *, int size))
> +{
> +	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
> +}
>  EXPORT_SYMBOL(sort);
Andrew Morton July 26, 2019, 12:05 a.m. UTC | #2
On Wed, 19 Jun 2019 14:15:32 +0200 Boris Brezillon <boris.brezillon@collabora.com> wrote:

> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> 
> Our list_sort() utility has always supported a context argument that
> is passed through to the comparison routine. Now there's a use case
> for the similar thing for sort().
> 
> This implements sort_r by simply extending the existing sort function
> in the obvious way. To avoid code duplication, we want to implement
> sort() in terms of sort_r(). The naive way to do that is
> 
> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
> {
>   int (*real_cmp)(const void*, const void*) = ctx;
>   return real_cmp(a, b);
> }
> 
> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
> 
> but this would do two indirect calls for each comparison. Instead, do
> as is done for the default swap functions - that only adds a cost of a
> single easily predicted branch to each comparison call.
> 
> Aside from introducing support for the context argument, this also
> serves as preparation for patches that will eliminate the indirect
> comparison calls in common cases.

Acked-by: Andrew Morton <akpm@linux-foundation.org>

> --- a/lib/sort.c
> +++ b/lib/sort.c
> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>  		swap_func(a, b, (int)size);
>  }
>  
> +typedef int (*cmp_func_t)(const void *, const void *);
> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)

Although I can't say I'm a fan of _CMP_WRAPPER.  I don't understand
what the name means.  Why not simply open-code NULL in the two sites?

> +static int do_cmp(const void *a, const void *b,
> +		  cmp_r_func_t cmp, const void *priv)
> +{
> +	if (cmp == _CMP_WRAPPER)
> +		return ((cmp_func_t)(priv))(a, b);
> +	return cmp(a, b, priv);
> +}
> +
>  /**
>   * parent - given the offset of the child, find the offset of the parent.
>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>  }
>  
>  /**
> - * sort - sort an array of elements
> + * sort_r - sort an array of elements
>   * @base: pointer to data to sort
>   * @num: number of elements
>   * @size: size of each element
>   * @cmp_func: pointer to comparison function
>   * @swap_func: pointer to swap function or NULL
> + * @priv: third argument passed to comparison function

Passing priv==NULLis part of the interface and should be documented?

>   *
>   * This function does a heapsort on the given array.  You may provide
>   * a swap_func function if you need to do something more than a memory
> @@ -188,9 +201,10 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>   * O(n*n) worst-case behavior and extra memory requirements that make
>   * it less suitable for kernel use.
>   */
> -void sort(void *base, size_t num, size_t size,
> -	  int (*cmp_func)(const void *, const void *),
> -	  void (*swap_func)(void *, void *, int size))
> +void sort_r(void *base, size_t num, size_t size,
> +	    int (*cmp_func)(const void *, const void *, const void *),
> +	    void (*swap_func)(void *, void *, int size),
> +	    const void *priv)
>  {
>  	/* pre-scale counters for performance */
>  	size_t n = num * size, a = (num/2) * size;
> @@ -238,12 +252,12 @@ void sort(void *base, size_t num, size_t size,
>  		 * average, 3/4 worst-case.)
>  		 */
>  		for (b = a; c = 2*b + size, (d = c + size) < n;)
> -			b = cmp_func(base + c, base + d) >= 0 ? c : d;
> +			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
>  		if (d == n)	/* Special case last leaf with no sibling */
>  			b = c;
>  
>  		/* Now backtrack from "b" to the correct location for "a" */
> -		while (b != a && cmp_func(base + a, base + b) >= 0)
> +		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
>  			b = parent(b, lsbit, size);
>  		c = b;			/* Where "a" belongs */
>  		while (b != a) {	/* Shift it into place */
> @@ -252,4 +266,12 @@ void sort(void *base, size_t num, size_t size,
>  		}
>  	}
>  }
> +EXPORT_SYMBOL(sort_r);
> +
> +void sort(void *base, size_t num, size_t size,
> +	  int (*cmp_func)(const void *, const void *),
> +	  void (*swap_func)(void *, void *, int size))
> +{
> +	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
> +}
>  EXPORT_SYMBOL(sort);
Rasmus Villemoes July 26, 2019, 6:59 a.m. UTC | #3
On 26/07/2019 02.05, Andrew Morton wrote:
> On Wed, 19 Jun 2019 14:15:32 +0200 Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
>> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>>
>> Our list_sort() utility has always supported a context argument that
>> is passed through to the comparison routine. Now there's a use case
>> for the similar thing for sort().
>>
>> This implements sort_r by simply extending the existing sort function
>> in the obvious way. To avoid code duplication, we want to implement
>> sort() in terms of sort_r(). The naive way to do that is
>>
>> static int cmp_wrapper(const void *a, const void *b, const void *ctx)
>> {
>>   int (*real_cmp)(const void*, const void*) = ctx;
>>   return real_cmp(a, b);
>> }
>>
>> sort(..., cmp) { sort_r(..., cmp_wrapper, cmp) }
>>
>> but this would do two indirect calls for each comparison. Instead, do
>> as is done for the default swap functions - that only adds a cost of a
>> single easily predicted branch to each comparison call.
>>
>> Aside from introducing support for the context argument, this also
>> serves as preparation for patches that will eliminate the indirect
>> comparison calls in common cases.
> 
> Acked-by: Andrew Morton <akpm@linux-foundation.org>
> 
>> --- a/lib/sort.c
>> +++ b/lib/sort.c
>> @@ -144,6 +144,18 @@ static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
>>  		swap_func(a, b, (int)size);
>>  }
>>  
>> +typedef int (*cmp_func_t)(const void *, const void *);
>> +typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
>> +#define _CMP_WRAPPER ((cmp_r_func_t)0L)
> 
> Although I can't say I'm a fan of _CMP_WRAPPER.  I don't understand
> what the name means.  Why not simply open-code NULL in the two sites?

That's the preparation part. Once I find time to tie up the loose ends,
there'll be a

  sort_by_key(base, num, swap, key)

where base must be a pointer to (array of) struct foobar, and key is the
name of an u32 or u64 (more can be added) member. Internally, that will
work by calling sort_r with a sentinel _CMP_SORT_U32 (or _CMP_SORT_U64,
...) as cmp function and offsetof(typeof(*base), key) as the priv argument.

In do_cmp, we then check whether the cmp function is a small
non-negative integer and then do the appropriate comparison directly
instead of doing an indirect call.

And this infrastructure will be shared with list_sort which will grow a
similar list_sort_by_key(). I think the actual value of _CMP_WRAPPER
will change to simplify that part, so for that reason alone I won't
hard-code NULL.


>> +static int do_cmp(const void *a, const void *b,
>> +		  cmp_r_func_t cmp, const void *priv)
>> +{
>> +	if (cmp == _CMP_WRAPPER)
>> +		return ((cmp_func_t)(priv))(a, b);
>> +	return cmp(a, b, priv);
>> +}
>> +

i.e., this becomes something like

if ((unsigned long)cmp <= ...) {
  if (cmp == CMP_WRAPPER)
     // called from sort(), priv is the original cmp_func pointer
     return ((cmp_func_t)(priv))(a, b);
  // must be called from sort_by_key, priv is the offset in each struct
  long offset = (long)priv;
  a += offset;
  b += offset;
  if (cmp == CMP_U32)
    return *(u32*)a > *(u32*)b;
  if (cmp == CMP_u64)
    return *(u64*)a > *(u64*)b;
  WARN_ONCE() // should be eliminated by smart enough compiler
  return 0;
}
return cmp(a, b, priv);

>>  /**
>>   * parent - given the offset of the child, find the offset of the parent.
>>   * @i: the offset of the heap element whose parent is sought.  Non-zero.
>> @@ -171,12 +183,13 @@ static size_t parent(size_t i, unsigned int lsbit, size_t size)
>>  }
>>  
>>  /**
>> - * sort - sort an array of elements
>> + * sort_r - sort an array of elements
>>   * @base: pointer to data to sort
>>   * @num: number of elements
>>   * @size: size of each element
>>   * @cmp_func: pointer to comparison function
>>   * @swap_func: pointer to swap function or NULL
>> + * @priv: third argument passed to comparison function
> 
> Passing priv==NULLis part of the interface and should be documented?

No, to sort_r() as a public function @priv is completely opaque, and the
user can pass anything he likes. Only when sort_r() is called
"internally" with a sentinel value of cmp_func (e.g. from sort() or
sort_by_key()) does the priv argument have any meaning, but that's
implementation details that should absolutely not be documented.

Rasmus
diff mbox series

Patch

diff --git a/include/linux/sort.h b/include/linux/sort.h
index 2b99a5dd073d..61b96d0ebc44 100644
--- a/include/linux/sort.h
+++ b/include/linux/sort.h
@@ -4,6 +4,11 @@ 
 
 #include <linux/types.h>
 
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp)(const void *, const void *, const void *),
+	    void (*swap)(void *, void *, int),
+	    const void *priv);
+
 void sort(void *base, size_t num, size_t size,
 	  int (*cmp)(const void *, const void *),
 	  void (*swap)(void *, void *, int));
diff --git a/lib/sort.c b/lib/sort.c
index cf408aec3733..d54cf97e9548 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -144,6 +144,18 @@  static void do_swap(void *a, void *b, size_t size, swap_func_t swap_func)
 		swap_func(a, b, (int)size);
 }
 
+typedef int (*cmp_func_t)(const void *, const void *);
+typedef int (*cmp_r_func_t)(const void *, const void *, const void *);
+#define _CMP_WRAPPER ((cmp_r_func_t)0L)
+
+static int do_cmp(const void *a, const void *b,
+		  cmp_r_func_t cmp, const void *priv)
+{
+	if (cmp == _CMP_WRAPPER)
+		return ((cmp_func_t)(priv))(a, b);
+	return cmp(a, b, priv);
+}
+
 /**
  * parent - given the offset of the child, find the offset of the parent.
  * @i: the offset of the heap element whose parent is sought.  Non-zero.
@@ -171,12 +183,13 @@  static size_t parent(size_t i, unsigned int lsbit, size_t size)
 }
 
 /**
- * sort - sort an array of elements
+ * sort_r - sort an array of elements
  * @base: pointer to data to sort
  * @num: number of elements
  * @size: size of each element
  * @cmp_func: pointer to comparison function
  * @swap_func: pointer to swap function or NULL
+ * @priv: third argument passed to comparison function
  *
  * This function does a heapsort on the given array.  You may provide
  * a swap_func function if you need to do something more than a memory
@@ -188,9 +201,10 @@  static size_t parent(size_t i, unsigned int lsbit, size_t size)
  * O(n*n) worst-case behavior and extra memory requirements that make
  * it less suitable for kernel use.
  */
-void sort(void *base, size_t num, size_t size,
-	  int (*cmp_func)(const void *, const void *),
-	  void (*swap_func)(void *, void *, int size))
+void sort_r(void *base, size_t num, size_t size,
+	    int (*cmp_func)(const void *, const void *, const void *),
+	    void (*swap_func)(void *, void *, int size),
+	    const void *priv)
 {
 	/* pre-scale counters for performance */
 	size_t n = num * size, a = (num/2) * size;
@@ -238,12 +252,12 @@  void sort(void *base, size_t num, size_t size,
 		 * average, 3/4 worst-case.)
 		 */
 		for (b = a; c = 2*b + size, (d = c + size) < n;)
-			b = cmp_func(base + c, base + d) >= 0 ? c : d;
+			b = do_cmp(base + c, base + d, cmp_func, priv) >= 0 ? c : d;
 		if (d == n)	/* Special case last leaf with no sibling */
 			b = c;
 
 		/* Now backtrack from "b" to the correct location for "a" */
-		while (b != a && cmp_func(base + a, base + b) >= 0)
+		while (b != a && do_cmp(base + a, base + b, cmp_func, priv) >= 0)
 			b = parent(b, lsbit, size);
 		c = b;			/* Where "a" belongs */
 		while (b != a) {	/* Shift it into place */
@@ -252,4 +266,12 @@  void sort(void *base, size_t num, size_t size,
 		}
 	}
 }
+EXPORT_SYMBOL(sort_r);
+
+void sort(void *base, size_t num, size_t size,
+	  int (*cmp_func)(const void *, const void *),
+	  void (*swap_func)(void *, void *, int size))
+{
+	return sort_r(base, num, size, _CMP_WRAPPER, swap_func, cmp_func);
+}
 EXPORT_SYMBOL(sort);