
[bpf-next,6/7] bpf: Only call maybe_wait_bpf_programs() when at least one map operation succeeds

Message ID 20231208102355.2628918-7-houtao@huaweicloud.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series: bpf: Fixes for maybe_wait_bpf_programs()

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 7874 this patch: 7874
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 2577 this patch: 2577
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8409 this patch: 8409
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 210 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-3 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-15 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-9 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Hou Tao Dec. 8, 2023, 10:23 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

There is no need to call maybe_wait_bpf_programs() if all operations in
a batched update, deletion, or lookup_and_delete fail. So only call
maybe_wait_bpf_programs() if at least one map operation succeeds.

Similar to uattr->batch.count, which is used to return the number of
successful map operations to the userspace application, use
attr->batch.count to record the number of successful map operations in
the kernel. These two numbers may sometimes differ. For example, in
__htab_map_lookup_and_delete_batch() with do_delete=true, it is possible
that 10 items in the current bucket have been successfully deleted, but
copying the deleted keys to the userspace application fails; in that case
attr->batch.count will be 10 while uattr->batch.count will be 0.

Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 include/linux/bpf.h  | 14 +++++++-------
 kernel/bpf/hashtab.c | 20 +++++++++++---------
 kernel/bpf/syscall.c | 21 ++++++++++++++-------
 3 files changed, 32 insertions(+), 23 deletions(-)
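
For context, maybe_wait_bpf_programs() is a small helper in
kernel/bpf/syscall.c; at the base of this series it reads roughly as
follows (paraphrased, not part of this patch), so what this patch changes
is only whether the helper gets called at all when every batched
operation failed:

static void maybe_wait_bpf_programs(struct bpf_map *map)
{
        /* Wait for any running BPF programs to complete so that
         * userspace, when we return to it, knows that all programs
         * that could be running use the new map value. The wait only
         * matters for outer (map-in-map) maps.
         */
        if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
            map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
                synchronize_rcu();
}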

Comments

Yonghong Song Dec. 8, 2023, 10:55 p.m. UTC | #1
On 12/8/23 2:23 AM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> There is no need to call maybe_wait_bpf_programs() if all operations in
> a batched update, deletion, or lookup_and_delete fail. So only call
> maybe_wait_bpf_programs() if at least one map operation succeeds.
>
> Similar to uattr->batch.count, which is used to return the number of
> successful map operations to the userspace application, use
> attr->batch.count to record the number of successful map operations in
> the kernel. These two numbers may sometimes differ. For example, in
> __htab_map_lookup_and_delete_batch() with do_delete=true, it is possible
> that 10 items in the current bucket have been successfully deleted, but
> copying the deleted keys to the userspace application fails; in that case
> attr->batch.count will be 10 while uattr->batch.count will be 0.
>
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>   include/linux/bpf.h  | 14 +++++++-------
>   kernel/bpf/hashtab.c | 20 +++++++++++---------
>   kernel/bpf/syscall.c | 21 ++++++++++++++-------
>   3 files changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index f7aa255c634f..a0c4d696a231 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -81,17 +81,17 @@ struct bpf_map_ops {
>   	int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
>   	void (*map_release_uref)(struct bpf_map *map);
>   	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
> -	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
> +	int (*map_lookup_batch)(struct bpf_map *map, union bpf_attr *attr,
>   				union bpf_attr __user *uattr);
>   	int (*map_lookup_and_delete_elem)(struct bpf_map *map, void *key,
>   					  void *value, u64 flags);
>   	int (*map_lookup_and_delete_batch)(struct bpf_map *map,
> -					   const union bpf_attr *attr,
> +					   union bpf_attr *attr,
>   					   union bpf_attr __user *uattr);
>   	int (*map_update_batch)(struct bpf_map *map, struct file *map_file,
> -				const union bpf_attr *attr,
> +				union bpf_attr *attr,
>   				union bpf_attr __user *uattr);
> -	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
> +	int (*map_delete_batch)(struct bpf_map *map, union bpf_attr *attr,
>   				union bpf_attr __user *uattr);
>   
>   	/* funcs callable from userspace and from eBPF programs */
> @@ -2095,13 +2095,13 @@ void bpf_map_area_free(void *base);
>   bool bpf_map_write_active(const struct bpf_map *map);
>   void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
>   int  generic_map_lookup_batch(struct bpf_map *map,
> -			      const union bpf_attr *attr,
> +			      union bpf_attr *attr,
>   			      union bpf_attr __user *uattr);
>   int  generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> -			      const union bpf_attr *attr,
> +			      union bpf_attr *attr,
>   			      union bpf_attr __user *uattr);
>   int  generic_map_delete_batch(struct bpf_map *map,
> -			      const union bpf_attr *attr,
> +			      union bpf_attr *attr,
>   			      union bpf_attr __user *uattr);
>   struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
>   struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 5b9146fa825f..b777bd8d4f8d 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -1673,7 +1673,7 @@ static int htab_lru_percpu_map_lookup_and_delete_elem(struct bpf_map *map,
>   
>   static int
>   __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> -				   const union bpf_attr *attr,
> +				   union bpf_attr *attr,
>   				   union bpf_attr __user *uattr,
>   				   bool do_delete, bool is_lru_map,
>   				   bool is_percpu)
> @@ -1708,6 +1708,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>   	if (!max_count)
>   		return 0;
>   
> +	attr->batch.count = 0;
>   	if (put_user(0, &uattr->batch.count))
>   		return -EFAULT;
>   
> @@ -1845,6 +1846,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
>   		}
>   		dst_key += key_size;
>   		dst_val += value_size;
> +		attr->batch.count++;
>   	}
>   
>   	htab_unlock_bucket(htab, b, batch, flags);

[...]

> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index efda2353a7d5..d2641e51a1a7 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1695,7 +1695,7 @@ static int map_get_next_key(union bpf_attr *attr)
>   }
>   
>   int generic_map_delete_batch(struct bpf_map *map,
> -			     const union bpf_attr *attr,
> +			     union bpf_attr *attr,
>   			     union bpf_attr __user *uattr)
>   {
>   	void __user *keys = u64_to_user_ptr(attr->batch.keys);
> @@ -1715,6 +1715,7 @@ int generic_map_delete_batch(struct bpf_map *map,
>   	if (!max_count)
>   		return 0;
>   
> +	attr->batch.count = 0;
>   	if (put_user(0, &uattr->batch.count))
>   		return -EFAULT;
>   
> @@ -1742,6 +1743,8 @@ int generic_map_delete_batch(struct bpf_map *map,
>   			break;
>   		cond_resched();
>   	}
> +
> +	attr->batch.count = cp;
>   	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
>   		err = -EFAULT;
>   
> @@ -1751,7 +1754,7 @@ int generic_map_delete_batch(struct bpf_map *map,
>   }
>   
>   int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> -			     const union bpf_attr *attr,
> +			     union bpf_attr *attr,
>   			     union bpf_attr __user *uattr)
>   {
>   	void __user *values = u64_to_user_ptr(attr->batch.values);
> @@ -1774,6 +1777,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>   	if (!max_count)
>   		return 0;
>   
> +	attr->batch.count = 0;
>   	if (put_user(0, &uattr->batch.count))
>   		return -EFAULT;
>   
> @@ -1802,6 +1806,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>   		cond_resched();
>   	}
>   
> +	attr->batch.count = cp;
>   	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
>   		err = -EFAULT;
>   
> @@ -1813,9 +1818,8 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>   
>   #define MAP_LOOKUP_RETRIES 3
>   
> -int generic_map_lookup_batch(struct bpf_map *map,
> -				    const union bpf_attr *attr,
> -				    union bpf_attr __user *uattr)
> +int generic_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
> +			     union bpf_attr __user *uattr)
>   {
>   	void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch);
>   	void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
> @@ -1838,6 +1842,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>   	if (!max_count)
>   		return 0;
>   
> +	attr->batch.count = 0;
>   	if (put_user(0, &uattr->batch.count))
>   		return -EFAULT;
>   
> @@ -1903,6 +1908,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>   	if (err == -EFAULT)
>   		goto free_buf;
>   
> +	attr->batch.count = cp;

You don't need to change generic_map_lookup_batch() here. It won't trigger
maybe_wait_bpf_programs().
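
(Context: in bpf_map_do_batch() the wait is gated on a write-style
command, roughly

        bool has_write = cmd != BPF_MAP_LOOKUP_BATCH;
        ...
        if (has_write)
                maybe_wait_bpf_programs(map);

so a plain BPF_MAP_LOOKUP_BATCH never reaches maybe_wait_bpf_programs(),
while BPF_MAP_LOOKUP_AND_DELETE_BATCH still does.)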

>   	if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
>   		    (cp && copy_to_user(uobatch, prev_key, map->key_size))))
>   		err = -EFAULT;
> @@ -4926,7 +4932,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
>   		err = fn(__VA_ARGS__);		\
>   	} while (0)
>   
> -static int bpf_map_do_batch(const union bpf_attr *attr,
> +static int bpf_map_do_batch(union bpf_attr *attr,
>   			    union bpf_attr __user *uattr,
>   			    int cmd)
>   {
> @@ -4966,7 +4972,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
>   		BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr);
>   err_put:
>   	if (has_write) {
> -		maybe_wait_bpf_programs(map);
> +		if (attr->batch.count)
> +			maybe_wait_bpf_programs(map);

Your code logic sounds correct, but I feel you are optimizing for an
extreme corner case. In reality, especially in a production environment,
a fault from something like copy_to_user() should be extremely rare. So
in my opinion, this optimization is not needed.

>   		bpf_map_write_active_dec(map);
>   	}
>   	fdput(f);
Alexei Starovoitov Dec. 10, 2023, 2:11 a.m. UTC | #2
On Fri, Dec 8, 2023 at 2:55 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
>
> On 12/8/23 2:23 AM, Hou Tao wrote:
> > From: Hou Tao <houtao1@huawei.com>
> >
> > There is no need to call maybe_wait_bpf_programs() if all operations in
> > a batched update, deletion, or lookup_and_delete fail. So only call
> > maybe_wait_bpf_programs() if at least one map operation succeeds.
> >
> > Similar to uattr->batch.count, which is used to return the number of
> > successful map operations to the userspace application, use
> > attr->batch.count to record the number of successful map operations in
> > the kernel. These two numbers may sometimes differ. For example, in
> > __htab_map_lookup_and_delete_batch() with do_delete=true, it is possible
> > that 10 items in the current bucket have been successfully deleted, but
> > copying the deleted keys to the userspace application fails; in that case
> > attr->batch.count will be 10 while uattr->batch.count will be 0.
> >
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > ---
> >   include/linux/bpf.h  | 14 +++++++-------
> >   kernel/bpf/hashtab.c | 20 +++++++++++---------
> >   kernel/bpf/syscall.c | 21 ++++++++++++++-------
> >   3 files changed, 32 insertions(+), 23 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index f7aa255c634f..a0c4d696a231 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -81,17 +81,17 @@ struct bpf_map_ops {
> >       int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
> >       void (*map_release_uref)(struct bpf_map *map);
> >       void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
> > -     int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
> > +     int (*map_lookup_batch)(struct bpf_map *map, union bpf_attr *attr,
> >                               union bpf_attr __user *uattr);
> >       int (*map_lookup_and_delete_elem)(struct bpf_map *map, void *key,
> >                                         void *value, u64 flags);
> >       int (*map_lookup_and_delete_batch)(struct bpf_map *map,
> > -                                        const union bpf_attr *attr,
> > +                                        union bpf_attr *attr,
> >                                          union bpf_attr __user *uattr);
> >       int (*map_update_batch)(struct bpf_map *map, struct file *map_file,
> > -                             const union bpf_attr *attr,
> > +                             union bpf_attr *attr,
> >                               union bpf_attr __user *uattr);
> > -     int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
> > +     int (*map_delete_batch)(struct bpf_map *map, union bpf_attr *attr,
> >                               union bpf_attr __user *uattr);
> >
> >       /* funcs callable from userspace and from eBPF programs */
> > @@ -2095,13 +2095,13 @@ void bpf_map_area_free(void *base);
> >   bool bpf_map_write_active(const struct bpf_map *map);
> >   void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
> >   int  generic_map_lookup_batch(struct bpf_map *map,
> > -                           const union bpf_attr *attr,
> > +                           union bpf_attr *attr,
> >                             union bpf_attr __user *uattr);
> >   int  generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> > -                           const union bpf_attr *attr,
> > +                           union bpf_attr *attr,
> >                             union bpf_attr __user *uattr);
> >   int  generic_map_delete_batch(struct bpf_map *map,
> > -                           const union bpf_attr *attr,
> > +                           union bpf_attr *attr,
> >                             union bpf_attr __user *uattr);
> >   struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
> >   struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 5b9146fa825f..b777bd8d4f8d 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -1673,7 +1673,7 @@ static int htab_lru_percpu_map_lookup_and_delete_elem(struct bpf_map *map,
> >
> >   static int
> >   __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> > -                                const union bpf_attr *attr,
> > +                                union bpf_attr *attr,
> >                                  union bpf_attr __user *uattr,
> >                                  bool do_delete, bool is_lru_map,
> >                                  bool is_percpu)
> > @@ -1708,6 +1708,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >       if (!max_count)
> >               return 0;
> >
> > +     attr->batch.count = 0;
> >       if (put_user(0, &uattr->batch.count))
> >               return -EFAULT;
> >
> > @@ -1845,6 +1846,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >               }
> >               dst_key += key_size;
> >               dst_val += value_size;
> > +             attr->batch.count++;
> >       }
> >
> >       htab_unlock_bucket(htab, b, batch, flags);
>
> [...]
>
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index efda2353a7d5..d2641e51a1a7 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -1695,7 +1695,7 @@ static int map_get_next_key(union bpf_attr *attr)
> >   }
> >
> >   int generic_map_delete_batch(struct bpf_map *map,
> > -                          const union bpf_attr *attr,
> > +                          union bpf_attr *attr,
> >                            union bpf_attr __user *uattr)
> >   {
> >       void __user *keys = u64_to_user_ptr(attr->batch.keys);
> > @@ -1715,6 +1715,7 @@ int generic_map_delete_batch(struct bpf_map *map,
> >       if (!max_count)
> >               return 0;
> >
> > +     attr->batch.count = 0;
> >       if (put_user(0, &uattr->batch.count))
> >               return -EFAULT;
> >
> > @@ -1742,6 +1743,8 @@ int generic_map_delete_batch(struct bpf_map *map,
> >                       break;
> >               cond_resched();
> >       }
> > +
> > +     attr->batch.count = cp;
> >       if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
> >               err = -EFAULT;
> >
> > @@ -1751,7 +1754,7 @@ int generic_map_delete_batch(struct bpf_map *map,
> >   }
> >
> >   int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> > -                          const union bpf_attr *attr,
> > +                          union bpf_attr *attr,
> >                            union bpf_attr __user *uattr)
> >   {
> >       void __user *values = u64_to_user_ptr(attr->batch.values);
> > @@ -1774,6 +1777,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> >       if (!max_count)
> >               return 0;
> >
> > +     attr->batch.count = 0;
> >       if (put_user(0, &uattr->batch.count))
> >               return -EFAULT;
> >
> > @@ -1802,6 +1806,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> >               cond_resched();
> >       }
> >
> > +     attr->batch.count = cp;
> >       if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
> >               err = -EFAULT;
> >
> > @@ -1813,9 +1818,8 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> >
> >   #define MAP_LOOKUP_RETRIES 3
> >
> > -int generic_map_lookup_batch(struct bpf_map *map,
> > -                                 const union bpf_attr *attr,
> > -                                 union bpf_attr __user *uattr)
> > +int generic_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
> > +                          union bpf_attr __user *uattr)
> >   {
> >       void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch);
> >       void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
> > @@ -1838,6 +1842,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
> >       if (!max_count)
> >               return 0;
> >
> > +     attr->batch.count = 0;
> >       if (put_user(0, &uattr->batch.count))
> >               return -EFAULT;
> >
> > @@ -1903,6 +1908,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
> >       if (err == -EFAULT)
> >               goto free_buf;
> >
> > +     attr->batch.count = cp;
>
> You don't need to change generic_map_lookup_batch() here. It won't trigger
> maybe_wait_bpf_programs().
>
> >       if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
> >                   (cp && copy_to_user(uobatch, prev_key, map->key_size))))
> >               err = -EFAULT;
> > @@ -4926,7 +4932,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
> >               err = fn(__VA_ARGS__);          \
> >       } while (0)
> >
> > -static int bpf_map_do_batch(const union bpf_attr *attr,
> > +static int bpf_map_do_batch(union bpf_attr *attr,
> >                           union bpf_attr __user *uattr,
> >                           int cmd)
> >   {
> > @@ -4966,7 +4972,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
> >               BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr);
> >   err_put:
> >       if (has_write) {
> > -             maybe_wait_bpf_programs(map);
> > +             if (attr->batch.count)
> > +                     maybe_wait_bpf_programs(map);
>
> Your code logic sounds correct, but I feel you are optimizing for an
> extreme corner case. In reality, especially in a production environment,
> a fault from something like copy_to_user() should be extremely rare. So
> in my opinion, this optimization is not needed.

+1
the code is fine as-is.
Hou Tao Dec. 11, 2023, 1:11 a.m. UTC | #3
Hi

On 12/10/2023 10:11 AM, Alexei Starovoitov wrote:
> On Fri, Dec 8, 2023 at 2:55 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> On 12/8/23 2:23 AM, Hou Tao wrote:
>>> From: Hou Tao <houtao1@huawei.com>
>>>
>>> There is no need to call maybe_wait_bpf_programs() if all operations in
>>> a batched update, deletion, or lookup_and_delete fail. So only call
>>> maybe_wait_bpf_programs() if at least one map operation succeeds.
>>>

SNIP
>>> +     attr->batch.count = 0;
>>>       if (put_user(0, &uattr->batch.count))
>>>               return -EFAULT;
>>>
>>> @@ -1903,6 +1908,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>>>       if (err == -EFAULT)
>>>               goto free_buf;
>>>
>>> +     attr->batch.count = cp;
>> You don't need to change generic_map_lookup_batch() here. It won't trigger
>> maybe_wait_bpf_programs().
>>
>>>       if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
>>>                   (cp && copy_to_user(uobatch, prev_key, map->key_size))))
>>>               err = -EFAULT;
>>> @@ -4926,7 +4932,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
>>>               err = fn(__VA_ARGS__);          \
>>>       } while (0)
>>>
>>> -static int bpf_map_do_batch(const union bpf_attr *attr,
>>> +static int bpf_map_do_batch(union bpf_attr *attr,
>>>                           union bpf_attr __user *uattr,
>>>                           int cmd)
>>>   {
>>> @@ -4966,7 +4972,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
>>>               BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr);
>>>   err_put:
>>>       if (has_write) {
>>> -             maybe_wait_bpf_programs(map);
>>> +             if (attr->batch.count)
>>> +                     maybe_wait_bpf_programs(map);
>> Your code logic sounds correct, but I feel you are optimizing for an
>> extreme corner case. In reality, especially in a production environment,
>> a fault from something like copy_to_user() should be extremely rare. So
>> in my opinion, this optimization is not needed.
> +1
> the code is fine as-is.

Thanks for the suggestions. It is indeed an over-optimization for a rare
scenario. Will drop it.
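
For reference, the uattr->batch.count discussed in the commit message is
what userspace sees through libbpf's batch helpers. A minimal sketch of a
caller (assumes a hash map with 4-byte keys; the delete_keys() wrapper and
its error handling are illustrative only):

#include <stdio.h>
#include <bpf/bpf.h>

/* Delete up to 'n' keys in one BPF_MAP_DELETE_BATCH syscall. The kernel
 * updates 'count' to the number of elements it actually deleted, even
 * when the syscall itself returns an error part-way through the batch.
 */
static int delete_keys(int map_fd, __u32 *keys, __u32 n)
{
        LIBBPF_OPTS(bpf_map_batch_opts, opts);
        __u32 count = n;
        int err;

        err = bpf_map_delete_batch(map_fd, keys, &count, &opts);
        printf("deleted %u of %u keys, err=%d\n", count, n, err);
        return err;
}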

Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f7aa255c634f..a0c4d696a231 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -81,17 +81,17 @@  struct bpf_map_ops {
 	int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
 	void (*map_release_uref)(struct bpf_map *map);
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
-	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
+	int (*map_lookup_batch)(struct bpf_map *map, union bpf_attr *attr,
 				union bpf_attr __user *uattr);
 	int (*map_lookup_and_delete_elem)(struct bpf_map *map, void *key,
 					  void *value, u64 flags);
 	int (*map_lookup_and_delete_batch)(struct bpf_map *map,
-					   const union bpf_attr *attr,
+					   union bpf_attr *attr,
 					   union bpf_attr __user *uattr);
 	int (*map_update_batch)(struct bpf_map *map, struct file *map_file,
-				const union bpf_attr *attr,
+				union bpf_attr *attr,
 				union bpf_attr __user *uattr);
-	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
+	int (*map_delete_batch)(struct bpf_map *map, union bpf_attr *attr,
 				union bpf_attr __user *uattr);
 
 	/* funcs callable from userspace and from eBPF programs */
@@ -2095,13 +2095,13 @@  void bpf_map_area_free(void *base);
 bool bpf_map_write_active(const struct bpf_map *map);
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
 int  generic_map_lookup_batch(struct bpf_map *map,
-			      const union bpf_attr *attr,
+			      union bpf_attr *attr,
 			      union bpf_attr __user *uattr);
 int  generic_map_update_batch(struct bpf_map *map, struct file *map_file,
-			      const union bpf_attr *attr,
+			      union bpf_attr *attr,
 			      union bpf_attr __user *uattr);
 int  generic_map_delete_batch(struct bpf_map *map,
-			      const union bpf_attr *attr,
+			      union bpf_attr *attr,
 			      union bpf_attr __user *uattr);
 struct bpf_map *bpf_map_get_curr_or_next(u32 *id);
 struct bpf_prog *bpf_prog_get_curr_or_next(u32 *id);
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 5b9146fa825f..b777bd8d4f8d 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1673,7 +1673,7 @@  static int htab_lru_percpu_map_lookup_and_delete_elem(struct bpf_map *map,
 
 static int
 __htab_map_lookup_and_delete_batch(struct bpf_map *map,
-				   const union bpf_attr *attr,
+				   union bpf_attr *attr,
 				   union bpf_attr __user *uattr,
 				   bool do_delete, bool is_lru_map,
 				   bool is_percpu)
@@ -1708,6 +1708,7 @@  __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 	if (!max_count)
 		return 0;
 
+	attr->batch.count = 0;
 	if (put_user(0, &uattr->batch.count))
 		return -EFAULT;
 
@@ -1845,6 +1846,7 @@  __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 		}
 		dst_key += key_size;
 		dst_val += value_size;
+		attr->batch.count++;
 	}
 
 	htab_unlock_bucket(htab, b, batch, flags);
@@ -1900,7 +1902,7 @@  __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 }
 
 static int
-htab_percpu_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+htab_percpu_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
 			     union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
@@ -1909,7 +1911,7 @@  htab_percpu_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
 
 static int
 htab_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
-					const union bpf_attr *attr,
+					union bpf_attr *attr,
 					union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
@@ -1917,7 +1919,7 @@  htab_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
 }
 
 static int
-htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+htab_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
 		      union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
@@ -1926,7 +1928,7 @@  htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
 
 static int
 htab_map_lookup_and_delete_batch(struct bpf_map *map,
-				 const union bpf_attr *attr,
+				 union bpf_attr *attr,
 				 union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
@@ -1935,7 +1937,7 @@  htab_map_lookup_and_delete_batch(struct bpf_map *map,
 
 static int
 htab_lru_percpu_map_lookup_batch(struct bpf_map *map,
-				 const union bpf_attr *attr,
+				 union bpf_attr *attr,
 				 union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
@@ -1944,7 +1946,7 @@  htab_lru_percpu_map_lookup_batch(struct bpf_map *map,
 
 static int
 htab_lru_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
-					    const union bpf_attr *attr,
+					    union bpf_attr *attr,
 					    union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
@@ -1952,7 +1954,7 @@  htab_lru_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
 }
 
 static int
-htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+htab_lru_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
 			  union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
@@ -1961,7 +1963,7 @@  htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
 
 static int
 htab_lru_map_lookup_and_delete_batch(struct bpf_map *map,
-				     const union bpf_attr *attr,
+				     union bpf_attr *attr,
 				     union bpf_attr __user *uattr)
 {
 	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index efda2353a7d5..d2641e51a1a7 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1695,7 +1695,7 @@  static int map_get_next_key(union bpf_attr *attr)
 }
 
 int generic_map_delete_batch(struct bpf_map *map,
-			     const union bpf_attr *attr,
+			     union bpf_attr *attr,
 			     union bpf_attr __user *uattr)
 {
 	void __user *keys = u64_to_user_ptr(attr->batch.keys);
@@ -1715,6 +1715,7 @@  int generic_map_delete_batch(struct bpf_map *map,
 	if (!max_count)
 		return 0;
 
+	attr->batch.count = 0;
 	if (put_user(0, &uattr->batch.count))
 		return -EFAULT;
 
@@ -1742,6 +1743,8 @@  int generic_map_delete_batch(struct bpf_map *map,
 			break;
 		cond_resched();
 	}
+
+	attr->batch.count = cp;
 	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
 		err = -EFAULT;
 
@@ -1751,7 +1754,7 @@  int generic_map_delete_batch(struct bpf_map *map,
 }
 
 int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
-			     const union bpf_attr *attr,
+			     union bpf_attr *attr,
 			     union bpf_attr __user *uattr)
 {
 	void __user *values = u64_to_user_ptr(attr->batch.values);
@@ -1774,6 +1777,7 @@  int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 	if (!max_count)
 		return 0;
 
+	attr->batch.count = 0;
 	if (put_user(0, &uattr->batch.count))
 		return -EFAULT;
 
@@ -1802,6 +1806,7 @@  int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 		cond_resched();
 	}
 
+	attr->batch.count = cp;
 	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
 		err = -EFAULT;
 
@@ -1813,9 +1818,8 @@  int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
 
 #define MAP_LOOKUP_RETRIES 3
 
-int generic_map_lookup_batch(struct bpf_map *map,
-				    const union bpf_attr *attr,
-				    union bpf_attr __user *uattr)
+int generic_map_lookup_batch(struct bpf_map *map, union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
 {
 	void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch);
 	void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
@@ -1838,6 +1842,7 @@  int generic_map_lookup_batch(struct bpf_map *map,
 	if (!max_count)
 		return 0;
 
+	attr->batch.count = 0;
 	if (put_user(0, &uattr->batch.count))
 		return -EFAULT;
 
@@ -1903,6 +1908,7 @@  int generic_map_lookup_batch(struct bpf_map *map,
 	if (err == -EFAULT)
 		goto free_buf;
 
+	attr->batch.count = cp;
 	if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
 		    (cp && copy_to_user(uobatch, prev_key, map->key_size))))
 		err = -EFAULT;
@@ -4926,7 +4932,7 @@  static int bpf_task_fd_query(const union bpf_attr *attr,
 		err = fn(__VA_ARGS__);		\
 	} while (0)
 
-static int bpf_map_do_batch(const union bpf_attr *attr,
+static int bpf_map_do_batch(union bpf_attr *attr,
 			    union bpf_attr __user *uattr,
 			    int cmd)
 {
@@ -4966,7 +4972,8 @@  static int bpf_map_do_batch(const union bpf_attr *attr,
 		BPF_DO_BATCH(map->ops->map_delete_batch, map, attr, uattr);
 err_put:
 	if (has_write) {
-		maybe_wait_bpf_programs(map);
+		if (attr->batch.count)
+			maybe_wait_bpf_programs(map);
 		bpf_map_write_active_dec(map);
 	}
 	fdput(f);