diff mbox series

[net,v4] bpf, net: Check cgroup_bpf_enabled() only once in do_sock_getsockopt()

Message ID 20240821093016.2533-1-Tze-nan.Wu@mediatek.com (mailing list archive)
State New, archived
Headers show
Series [net,v4] bpf, net: Check cgroup_bpf_enabled() only once in do_sock_getsockopt() | expand

Commit Message

Tze-nan Wu Aug. 21, 2024, 9:30 a.m. UTC
The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can change
between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
`BPF_CGROUP_RUN_PROG_GETSOCKOPT`.

If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
"true" between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
`BPF_CGROUP_RUN_PROG_GETSOCKOPT`, `BPF_CGROUP_RUN_PROG_GETSOCKOPT` will
receive an -EFAULT from `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
due to `get_user()` was not reached in `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.

Scenario shown as below:

           `process A`                      `process B`
           -----------                      ------------
  BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
                                            enable CGROUP_GETSOCKOPT
  BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)

To prevent this, invoke `cgroup_bpf_enabled()` only once and cache the
result in a newly added local variable `enabled`.
Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then check their
condition using the same `enabled` variable as the condition variable,
instead of using the return values from `cgroup_bpf_enabled` called by
themselves as the condition variable(which could yield different results).
This ensures that either both `BPF_CGROUP_*` macros pass the condition
or neither does.

Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
---

Chagnes from v1 to v2: https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
  Instead of using cgroup_lock in the fastpath, invoke cgroup_bpf_enabled
  only once and cache the value in the newly added variable `enabled`.
  `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check their
  condition with the new variable `enable`, ensuring that either they both
  passing the condition or both do not.

Chagnes from v2 to v3: https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
  Hide cgroup_bpf_enabled in the macro, and some modifications to adapt
  the coding style.

Chagnes from v3 to v4: https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
  Add bpf tag to subject, and Fixes tag in body.

---
 include/linux/bpf-cgroup.h | 15 ++++++++-------
 net/socket.c               |  5 +++--
 2 files changed, 11 insertions(+), 9 deletions(-)

Comments

Yonghong Song Aug. 21, 2024, 6:44 p.m. UTC | #1
On 8/21/24 2:30 AM, Tze-nan Wu wrote:
> The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can change
> between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
>
> If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
> "true" between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`, `BPF_CGROUP_RUN_PROG_GETSOCKOPT` will
> receive an -EFAULT from `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> due to `get_user()` was not reached in `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
>
> Scenario shown as below:
>
>             `process A`                      `process B`
>             -----------                      ------------
>    BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
>                                              enable CGROUP_GETSOCKOPT
>    BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
>
> To prevent this, invoke `cgroup_bpf_enabled()` only once and cache the
> result in a newly added local variable `enabled`.
> Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then check their
> condition using the same `enabled` variable as the condition variable,
> instead of using the return values from `cgroup_bpf_enabled` called by
> themselves as the condition variable(which could yield different results).
> This ensures that either both `BPF_CGROUP_*` macros pass the condition
> or neither does.
>
> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---
>
> Chagnes from v1 to v2: https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
>    Instead of using cgroup_lock in the fastpath, invoke cgroup_bpf_enabled
>    only once and cache the value in the newly added variable `enabled`.
>    `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check their
>    condition with the new variable `enable`, ensuring that either they both
>    passing the condition or both do not.
>
> Chagnes from v2 to v3: https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
>    Hide cgroup_bpf_enabled in the macro, and some modifications to adapt
>    the coding style.
>
> Chagnes from v3 to v4: https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
>    Add bpf tag to subject, and Fixes tag in body.
>
> ---
>   include/linux/bpf-cgroup.h | 15 ++++++++-------
>   net/socket.c               |  5 +++--
>   2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index fb3c3e7181e6..5afa2ac76aae 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -390,20 +390,20 @@ static inline bool cgroup_bpf_sock_enabled(struct sock *sk,
>   	__ret;								       \
>   })
>   
> -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)			       \
> +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled)		       \
>   ({									       \
>   	int __ret = 0;							       \
> -	if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))			       \
> +	enabled = cgroup_bpf_enabled(CGROUP_GETSOCKOPT);		       \
> +	if (enabled)							       \
>   		copy_from_sockptr(&__ret, optlen, sizeof(int));		       \
>   	__ret;								       \
>   })
>   
>   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, optlen,   \
> -				       max_optlen, retval)		       \
> +				       max_optlen, retval, enabled)	       \
>   ({									       \
>   	int __ret = retval;						       \
> -	if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&			       \
> -	    cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))		       \
> +	if (enabled && cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))       \
>   		if (!(sock)->sk_prot->bpf_bypass_getsockopt ||		       \
>   		    !INDIRECT_CALL_INET_1((sock)->sk_prot->bpf_bypass_getsockopt, \
>   					tcp_bpf_bypass_getsockopt,	       \
> @@ -518,9 +518,10 @@ static inline int bpf_percpu_cgroup_storage_update(struct bpf_map *map,
>   #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; })
>   #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(atype, major, minor, access) ({ 0; })
>   #define BPF_CGROUP_RUN_PROG_SYSCTL(head,table,write,buf,count,pos) ({ 0; })
> -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) ({ 0; })
> +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled) ({ 0; })
>   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, \
> -				       optlen, max_optlen, retval) ({ retval; })
> +				       optlen, max_optlen, retval, \
> +				       enabled) ({ retval; })
>   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sock, level, optname, optval, \
>   					    optlen, retval) ({ retval; })
>   #define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \
> diff --git a/net/socket.c b/net/socket.c
> index fcbdd5bc47ac..0b465dc8a789 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2363,6 +2363,7 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
>   		       int optname, sockptr_t optval, sockptr_t optlen)
>   {
>   	int max_optlen __maybe_unused;
> +	bool enabled __maybe_unused;
>   	const struct proto_ops *ops;
>   	int err;
>   
> @@ -2371,7 +2372,7 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
>   		return err;
>   
>   	if (!compat)
> -		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> +		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled);

Here, 'enabled' is actually assigned with a value in the macro. I am not sure
whether this is a common practice or not. At least from macro, it is not clear
about this.

Maybe we can do
	max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, &enabled);

The &enabled signals that its value could change. And indeed
the macro will store the proper value to &enabled properly.

Just my 2 cents.

>   
>   	ops = READ_ONCE(sock->ops);
>   	if (level == SOL_SOCKET) {
> @@ -2390,7 +2391,7 @@ int do_sock_getsockopt(struct socket *sock, bool compat, int level,
>   	if (!compat)
>   		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
>   						     optval, optlen, max_optlen,
> -						     err);
> +						     err, enabled);
>   
>   	return err;
>   }
Alexei Starovoitov Aug. 21, 2024, 9:01 p.m. UTC | #2
On Wed, Aug 21, 2024 at 2:30 AM Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
>
> The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can change
> between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
>
> If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
> "true" between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> `BPF_CGROUP_RUN_PROG_GETSOCKOPT`, `BPF_CGROUP_RUN_PROG_GETSOCKOPT` will
> receive an -EFAULT from `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> due to `get_user()` was not reached in `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
>
> Scenario shown as below:
>
>            `process A`                      `process B`
>            -----------                      ------------
>   BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
>                                             enable CGROUP_GETSOCKOPT
>   BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
>
> To prevent this, invoke `cgroup_bpf_enabled()` only once and cache the
> result in a newly added local variable `enabled`.
> Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then check their
> condition using the same `enabled` variable as the condition variable,
> instead of using the return values from `cgroup_bpf_enabled` called by
> themselves as the condition variable(which could yield different results).
> This ensures that either both `BPF_CGROUP_*` macros pass the condition
> or neither does.
>
> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---
>
> Chagnes from v1 to v2: https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
>   Instead of using cgroup_lock in the fastpath, invoke cgroup_bpf_enabled
>   only once and cache the value in the newly added variable `enabled`.
>   `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check their
>   condition with the new variable `enable`, ensuring that either they both
>   passing the condition or both do not.
>
> Chagnes from v2 to v3: https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
>   Hide cgroup_bpf_enabled in the macro, and some modifications to adapt
>   the coding style.
>
> Chagnes from v3 to v4: https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
>   Add bpf tag to subject, and Fixes tag in body.
>
> ---
>  include/linux/bpf-cgroup.h | 15 ++++++++-------
>  net/socket.c               |  5 +++--
>  2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
> index fb3c3e7181e6..5afa2ac76aae 100644
> --- a/include/linux/bpf-cgroup.h
> +++ b/include/linux/bpf-cgroup.h
> @@ -390,20 +390,20 @@ static inline bool cgroup_bpf_sock_enabled(struct sock *sk,
>         __ret;                                                                 \
>  })
>
> -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)                              \
> +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled)                     \
>  ({                                                                            \
>         int __ret = 0;                                                         \
> -       if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))                             \
> +       enabled = cgroup_bpf_enabled(CGROUP_GETSOCKOPT);                       \
> +       if (enabled)


I suspect the compiler generates slow code after such a patch.
pw-bot: cr

What is the problem with double cgroup_bpf_enabled() check?
yes it might return two different values, so?
Tze-nan Wu Aug. 22, 2024, 3:16 a.m. UTC | #3
On Wed, 2024-08-21 at 14:01 -0700, Alexei Starovoitov wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On Wed, Aug 21, 2024 at 2:30 AM Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> wrote:
> >
> > The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can
> change
> > between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
> >
> > If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
> > "true" between the invocations of
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`, `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
> will
> > receive an -EFAULT from
> `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> > due to `get_user()` was not reached in
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
> >
> > Scenario shown as below:
> >
> >            `process A`                      `process B`
> >            -----------                      ------------
> >   BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
> >                                             enable
> CGROUP_GETSOCKOPT
> >   BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
> >
> > To prevent this, invoke `cgroup_bpf_enabled()` only once and cache
> the
> > result in a newly added local variable `enabled`.
> > Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then check
> their
> > condition using the same `enabled` variable as the condition
> variable,
> > instead of using the return values from `cgroup_bpf_enabled` called
> by
> > themselves as the condition variable(which could yield different
> results).
> > This ensures that either both `BPF_CGROUP_*` macros pass the
> condition
> > or neither does.
> >
> > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt
> hooks")
> > Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> > Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> > Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > ---
> >
> > Chagnes from v1 to v2: 
> https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
> >   Instead of using cgroup_lock in the fastpath, invoke
> cgroup_bpf_enabled
> >   only once and cache the value in the newly added variable
> `enabled`.
> >   `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check
> their
> >   condition with the new variable `enable`, ensuring that either
> they both
> >   passing the condition or both do not.
> >
> > Chagnes from v2 to v3: 
> https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
> >   Hide cgroup_bpf_enabled in the macro, and some modifications to
> adapt
> >   the coding style.
> >
> > Chagnes from v3 to v4: 
> https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
> >   Add bpf tag to subject, and Fixes tag in body.
> >
> > ---
> >  include/linux/bpf-cgroup.h | 15 ++++++++-------
> >  net/socket.c               |  5 +++--
> >  2 files changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-
> cgroup.h
> > index fb3c3e7181e6..5afa2ac76aae 100644
> > --- a/include/linux/bpf-cgroup.h
> > +++ b/include/linux/bpf-cgroup.h
> > @@ -390,20 +390,20 @@ static inline bool
> cgroup_bpf_sock_enabled(struct sock *sk,
> >         __ret;                                                     
>             \
> >  })
> >
> > -#define
> BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)                             
>  \
> > +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> enabled)                     \
> >  ({                                                                
>             \
> >         int __ret =
> 0;                                                         \
> > -       if
> (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))                             \
> > +       enabled =
> cgroup_bpf_enabled(CGROUP_GETSOCKOPT);                       \
> > +       if (enabled)
> 
> 
> I suspect the compiler generates slow code after such a patch.
> pw-bot: cr
> 
> What is the problem with double cgroup_bpf_enabled() check?
> yes it might return two different values, so?

Depending on where the -EFAULT occurs, the problem could be different.
In our case, the -EFAULT is returned from getsockopt() during a
"bootup-critical property setting" flow in Android. As a result, the
property setting fails due it get -EFAULT from getsockopt(), causing
the device to fail the boot process.

Should the userspace caller always anticipate an -EFAULT from
getsockopt() if there's another process enables CGROUP_GETSOCKOPT
(possibly through the bpf() syscall) at the same time?
If that's the case, then I will handle this in userspace.

Thanks,
--tze-nan
Tze-nan Wu Aug. 22, 2024, 3:28 a.m. UTC | #4
On Wed, 2024-08-21 at 11:44 -0700, Yonghong Song wrote:
>  	 
>  
> On 8/21/24 2:30 AM, Tze-nan Wu wrote:
> > The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can
> change
> > between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
> >
> > If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false" to
> > "true" between the invocations of
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`, `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
> will
> > receive an -EFAULT from
> `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> > due to `get_user()` was not reached in
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
> >
> > Scenario shown as below:
> >
> >             `process A`                      `process B`
> >             -----------                      ------------
> >    BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
> >                                              enable
> CGROUP_GETSOCKOPT
> >    BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
> >
> > To prevent this, invoke `cgroup_bpf_enabled()` only once and cache
> the
> > result in a newly added local variable `enabled`.
> > Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then check
> their
> > condition using the same `enabled` variable as the condition
> variable,
> > instead of using the return values from `cgroup_bpf_enabled` called
> by
> > themselves as the condition variable(which could yield different
> results).
> > This ensures that either both `BPF_CGROUP_*` macros pass the
> condition
> > or neither does.
> >
> > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt
> hooks")
> > Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> > Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> > Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > ---
> >
> > Chagnes from v1 to v2: 
> https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
> >    Instead of using cgroup_lock in the fastpath, invoke
> cgroup_bpf_enabled
> >    only once and cache the value in the newly added variable
> `enabled`.
> >    `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check
> their
> >    condition with the new variable `enable`, ensuring that either
> they both
> >    passing the condition or both do not.
> >
> > Chagnes from v2 to v3: 
> https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
> >    Hide cgroup_bpf_enabled in the macro, and some modifications to
> adapt
> >    the coding style.
> >
> > Chagnes from v3 to v4: 
> https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
> >    Add bpf tag to subject, and Fixes tag in body.
> >
> > ---
> >   include/linux/bpf-cgroup.h | 15 ++++++++-------
> >   net/socket.c               |  5 +++--
> >   2 files changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-
> cgroup.h
> > index fb3c3e7181e6..5afa2ac76aae 100644
> > --- a/include/linux/bpf-cgroup.h
> > +++ b/include/linux/bpf-cgroup.h
> > @@ -390,20 +390,20 @@ static inline bool
> cgroup_bpf_sock_enabled(struct sock *sk,
> >   __ret;       \
> >   })
> >   
> > -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)       \
> > +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled)       \
> >   ({       \
> >   int __ret = 0;       \
> > -if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))       \
> > +enabled = cgroup_bpf_enabled(CGROUP_GETSOCKOPT);       \
> > +if (enabled)       \
> >   copy_from_sockptr(&__ret, optlen, sizeof(int));       \
> >   __ret;       \
> >   })
> >   
> >   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> optval, optlen,   \
> > -       max_optlen, retval)       \
> > +       max_optlen, retval, enabled)       \
> >   ({       \
> >   int __ret = retval;       \
> > -if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&       \
> > -    cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))       \
> > +if (enabled && cgroup_bpf_sock_enabled(sock,
> CGROUP_GETSOCKOPT))       \
> >   if (!(sock)->sk_prot->bpf_bypass_getsockopt ||       \
> >       !INDIRECT_CALL_INET_1((sock)->sk_prot->bpf_bypass_getsockopt, 
> \
> >   tcp_bpf_bypass_getsockopt,       \
> > @@ -518,9 +518,10 @@ static inline int
> bpf_percpu_cgroup_storage_update(struct bpf_map *map,
> >   #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; })
> >   #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(atype, major, minor,
> access) ({ 0; })
> >   #define
> BPF_CGROUP_RUN_PROG_SYSCTL(head,table,write,buf,count,pos) ({ 0; })
> > -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) ({ 0; })
> > +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled) ({ 0; })
> >   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> optval, \
> > -       optlen, max_optlen, retval) ({ retval; })
> > +       optlen, max_optlen, retval, \
> > +       enabled) ({ retval; })
> >   #define BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sock, level, optname,
> optval, \
> >       optlen, retval) ({ retval; })
> >   #define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname,
> optval, optlen, \
> > diff --git a/net/socket.c b/net/socket.c
> > index fcbdd5bc47ac..0b465dc8a789 100644
> > --- a/net/socket.c
> > +++ b/net/socket.c
> > @@ -2363,6 +2363,7 @@ int do_sock_getsockopt(struct socket *sock,
> bool compat, int level,
> >          int optname, sockptr_t optval, sockptr_t optlen)
> >   {
> >   int max_optlen __maybe_unused;
> > +bool enabled __maybe_unused;
> >   const struct proto_ops *ops;
> >   int err;
> >   
> > @@ -2371,7 +2372,7 @@ int do_sock_getsockopt(struct socket *sock,
> bool compat, int level,
> >   return err;
> >   
> >   if (!compat)
> > -max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> > +max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled);
> 
> Here, 'enabled' is actually assigned with a value in the macro. I am
> not sure
> whether this is a common practice or not. At least from macro, it is
> not clear
> about this.
> 
> Maybe we can do
> max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, &enabled);
> 
> The &enabled signals that its value could change. And indeed
> the macro will store the proper value to &enabled properly.
> 
> Just my 2 cents.
> 
Thanks for the suggestion.
Will take the suggestion in v5 if this patch is truely needed,
looks like this patch could possibly lead to regression issue.


> >   
> >   ops = READ_ONCE(sock->ops);
> >   if (level == SOL_SOCKET) {
> > @@ -2390,7 +2391,7 @@ int do_sock_getsockopt(struct socket *sock,
> bool compat, int level,
> >   if (!compat)
> >   err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
> >        optval, optlen, max_optlen,
> > -     err);
> > +     err, enabled);
> >   
> >   return err;
> >   }
Tze-nan Wu Aug. 22, 2024, 7:01 a.m. UTC | #5
On Thu, 2024-08-22 at 11:16 +0800, Tze-nan Wu wrote:
> On Wed, 2024-08-21 at 14:01 -0700, Alexei Starovoitov wrote:
> >  	 
> > External email : Please do not click links or open attachments
> > until
> > you have verified the sender or the content.
> >  On Wed, Aug 21, 2024 at 2:30 AM Tze-nan Wu <
> > Tze-nan.Wu@mediatek.com>
> > wrote:
> > > 
> > > The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can
> > 
> > change
> > > between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
> > > 
> > > If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false"
> > > to
> > > "true" between the invocations of
> > 
> > `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`,
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
> > 
> > will
> > > receive an -EFAULT from
> > 
> > `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> > > due to `get_user()` was not reached in
> > 
> > `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
> > > 
> > > Scenario shown as below:
> > > 
> > >            `process A`                      `process B`
> > >            -----------                      ------------
> > >   BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
> > >                                             enable
> > 
> > CGROUP_GETSOCKOPT
> > >   BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
> > > 
> > > To prevent this, invoke `cgroup_bpf_enabled()` only once and
> > > cache
> > 
> > the
> > > result in a newly added local variable `enabled`.
> > > Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then
> > > check
> > 
> > their
> > > condition using the same `enabled` variable as the condition
> > 
> > variable,
> > > instead of using the return values from `cgroup_bpf_enabled`
> > > called
> > 
> > by
> > > themselves as the condition variable(which could yield different
> > 
> > results).
> > > This ensures that either both `BPF_CGROUP_*` macros pass the
> > 
> > condition
> > > or neither does.
> > > 
> > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt
> > 
> > hooks")
> > > Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
> > > Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
> > > Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > > Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > > ---
> > > 
> > > Chagnes from v1 to v2: 
> > 
> > 
https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@mediatek.com/
> > >   Instead of using cgroup_lock in the fastpath, invoke
> > 
> > cgroup_bpf_enabled
> > >   only once and cache the value in the newly added variable
> > 
> > `enabled`.
> > >   `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check
> > 
> > their
> > >   condition with the new variable `enable`, ensuring that either
> > 
> > they both
> > >   passing the condition or both do not.
> > > 
> > > Chagnes from v2 to v3: 
> > 
> > 
https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@mediatek.com/
> > >   Hide cgroup_bpf_enabled in the macro, and some modifications to
> > 
> > adapt
> > >   the coding style.
> > > 
> > > Chagnes from v3 to v4: 
> > 
> > 
https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@mediatek.com/
> > >   Add bpf tag to subject, and Fixes tag in body.
> > > 
> > > ---
> > >  include/linux/bpf-cgroup.h | 15 ++++++++-------
> > >  net/socket.c               |  5 +++--
> > >  2 files changed, 11 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-
> > 
> > cgroup.h
> > > index fb3c3e7181e6..5afa2ac76aae 100644
> > > --- a/include/linux/bpf-cgroup.h
> > > +++ b/include/linux/bpf-cgroup.h
> > > @@ -390,20 +390,20 @@ static inline bool
> > 
> > cgroup_bpf_sock_enabled(struct sock *sk,
> > >         __ret;                                                   
> > >   
> > 
> >             \
> > >  })
> > > 
> > > -#define
> > 
> > BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)                           
> >   
> >  \
> > > +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> > 
> > enabled)                     \
> > >  ({                                                              
> > >   
> > 
> >             \
> > >         int __ret =
> > 
> > 0;                                                         \
> > > -       if
> > 
> > (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))                            
> >  \
> > > +       enabled =
> > 
> > cgroup_bpf_enabled(CGROUP_GETSOCKOPT);                       \
> > > +       if (enabled)
> > 
> > 
> > I suspect the compiler generates slow code after such a patch.
> > pw-bot: cr
> > 
> > What is the problem with double cgroup_bpf_enabled() check?
> > yes it might return two different values, so?

> Depending on where the -EFAULT occurs, the problem could be
> different.
> In our case, the -EFAULT is returned from getsockopt() during a
> "bootup-critical property setting" flow in Android. As a result, the
> property setting fails due it get -EFAULT from getsockopt(), causing
> the device to fail the boot process.
> 
> Should the userspace caller always anticipate an -EFAULT from
> getsockopt() if there's another process enables CGROUP_GETSOCKOPT
> (possibly through the bpf() syscall) at the same time?
> If that's the case, then I will handle this in userspace.
> 

BTW, If this should be handled in kernel, modification shown below
could fix the issue without breaking the "static_branch" usage in both
macros:


+++ /include/linux/bpf-cgroup.h:
    -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
    +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
     ({
            int __ret = 0;
            if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
                copy_from_sockptr(&__ret, optlen, sizeof(int));
     +      else
     +          *compat = true;
            __ret;
     })

    #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
optval, optlen, max_optlen, retval)
     ({
         int __ret = retval;
    -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
    -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
    +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
             if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
               ...

  +++ /net/socket.c:
    int do_sock_getsockopt(struct socket *sock, bool compat, int level,
     {
        ...
        ...
    +     /* The meaning of `compat` variable could be changed here
    +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS) is
false.
    +      */
        if (!compat)
    -       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
    +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
&compat);

> Thanks,
> --tze-nan
Alexei Starovoitov Aug. 22, 2024, 4 p.m. UTC | #6
On Thu, Aug 22, 2024 at 12:02 AM Tze-nan Wu (吳澤南)
<Tze-nan.Wu@mediatek.com> wrote:
>
>
> BTW, If this should be handled in kernel, modification shown below
> could fix the issue without breaking the "static_branch" usage in both
> macros:
>
>
> +++ /include/linux/bpf-cgroup.h:
>     -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
>     +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
>      ({
>             int __ret = 0;
>             if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
>                 copy_from_sockptr(&__ret, optlen, sizeof(int));
>      +      else
>      +          *compat = true;
>             __ret;
>      })
>
>     #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> optval, optlen, max_optlen, retval)
>      ({
>          int __ret = retval;
>     -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
>     -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
>     +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
>              if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
>                ...
>
>   +++ /net/socket.c:
>     int do_sock_getsockopt(struct socket *sock, bool compat, int level,
>      {
>         ...
>         ...
>     +     /* The meaning of `compat` variable could be changed here
>     +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS) is
> false.
>     +      */
>         if (!compat)
>     -       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
>     +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> &compat);

This is better, but it's still quite a hack. Let's not override it.
We can have another bool, but the question:
do we really need BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN  ?
copy_from_sockptr(&__ret, optlen, sizeof(int));
should be fast enough to do it unconditionally.
What are we saving here?

Stan ?


>
> > Thanks,
> > --tze-nan
>
> *********** MEDIATEK Confidentiality Notice ***********

Pls fix your mailer. Such a footer is not appropriate for the public
mailing list.
Stanislav Fomichev Aug. 24, 2024, 2:04 a.m. UTC | #7
On 08/22, Alexei Starovoitov wrote:
> On Thu, Aug 22, 2024 at 12:02 AM Tze-nan Wu (吳澤南)
> <Tze-nan.Wu@mediatek.com> wrote:
> >
> >
> > BTW, If this should be handled in kernel, modification shown below
> > could fix the issue without breaking the "static_branch" usage in both
> > macros:
> >
> >
> > +++ /include/linux/bpf-cgroup.h:
> >     -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
> >     +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
> >      ({
> >             int __ret = 0;
> >             if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
> >                 copy_from_sockptr(&__ret, optlen, sizeof(int));
> >      +      else
> >      +          *compat = true;
> >             __ret;
> >      })
> >
> >     #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> > optval, optlen, max_optlen, retval)
> >      ({
> >          int __ret = retval;
> >     -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
> >     -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> >     +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> >              if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
> >                ...
> >
> >   +++ /net/socket.c:
> >     int do_sock_getsockopt(struct socket *sock, bool compat, int level,
> >      {
> >         ...
> >         ...
> >     +     /* The meaning of `compat` variable could be changed here
> >     +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS) is
> > false.
> >     +      */
> >         if (!compat)
> >     -       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> >     +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> > &compat);
> 
> This is better, but it's still quite a hack. Let's not override it.
> We can have another bool, but the question:
> do we really need BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN  ?
> copy_from_sockptr(&__ret, optlen, sizeof(int));
> should be fast enough to do it unconditionally.
> What are we saving here?
> 
> Stan ?

Agreed, most likely nobody would notice :-)
Tze-nan Wu Aug. 29, 2024, 12:44 p.m. UTC | #8
On Fri, 2024-08-23 at 19:04 -0700, Stanislav Fomichev wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  On 08/22, Alexei Starovoitov wrote:
> > On Thu, Aug 22, 2024 at 12:02 AM Tze-nan Wu (吳澤南)
> > <Tze-nan.Wu@mediatek.com> wrote:
> > >
> > >
> > > BTW, If this should be handled in kernel, modification shown
> below
> > > could fix the issue without breaking the "static_branch" usage in
> both
> > > macros:
> > >
> > >
> > > +++ /include/linux/bpf-cgroup.h:
> > >     -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
> > >     +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
> > >      ({
> > >             int __ret = 0;
> > >             if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
> > >                 copy_from_sockptr(&__ret, optlen, sizeof(int));
> > >      +      else
> > >      +          *compat = true;
> > >             __ret;
> > >      })
> > >
> > >     #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> > > optval, optlen, max_optlen, retval)
> > >      ({
> > >          int __ret = retval;
> > >     -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
> > >     -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> > >     +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> > >              if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
> > >                ...
> > >
> > >   +++ /net/socket.c:
> > >     int do_sock_getsockopt(struct socket *sock, bool compat, int
> level,
> > >      {
> > >         ...
> > >         ...
> > >     +     /* The meaning of `compat` variable could be changed
> here
> > >     +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS)
> is
> > > false.
> > >     +      */
> > >         if (!compat)
> > >     -       max_optlen =
> BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> > >     +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> > > &compat);
> > 
> > This is better, but it's still quite a hack. Let's not override it.
> > We can have another bool, but the question:
> > do we really need BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN  ?
> > copy_from_sockptr(&__ret, optlen, sizeof(int));
> > should be fast enough to do it unconditionally.
> > What are we saving here?
> > 
> > Stan ?
> 
> Agreed, most likely nobody would notice :-)

Sorry for my late reply, just have the mailer fixed.

If it is feasible to make the `copy_from_sockptr` unconditionally,
should I submit a new patch that resolve the issue by removing
`BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`? Patch A shown as below.

  +++ /net/socket.c:
   int do_sock_getsockopt(...)
   {
  -     int max_optlen __maybe_unused;
  +     int max_optlen __maybe_unused = 0;
        const struct proto_ops *ops;
        int err;
  ...
  ...
        if (!compat) <== wonder if we should keep the condition here?
  -         max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
  +         copy_from_sockptr(&max_optlen, optlen, sizeof(int));

        ops = READ_ONCE(sock->ops);
        if (level == SOL_SOCKET) {
-----------------------------------------

Or perhaps adding another variable "enabled" is the preferable way?
As it keeps the static_branch behavior.
Patch B shown as below:

  +++ /include/linux/bpf-cgroup.h:
  -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
  +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled)
   ({
        int __ret = 0;
        if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
            copy_from_sockptr(&__ret, optlen, sizeof(int));
  +     else
  +         *enabled = false;
        __ret;
   })

  +++ /net/socket.c:
   int do_sock_getsockopt(...)
   {
  +   bool enabled __maybe_unused = !compat;
      int max_optlen __maybe_unused;
      const struct proto_ops *ops;
      int err;
      if (!compat)
  -       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
  +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
&enabled);

      ops = READ_ONCE(sock->ops);
      ...
      ...
  -   if (!compat)
  +   if (enabled)
          err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(...);
-----------------------------------------

Any comments would be appreciated.
--Tze-nan
Alexei Starovoitov Aug. 29, 2024, 4:27 p.m. UTC | #9
On Thu, Aug 29, 2024 at 5:45 AM Tze-nan Wu (吳澤南)
<Tze-nan.Wu@mediatek.com> wrote:
>
> On Fri, 2024-08-23 at 19:04 -0700, Stanislav Fomichev wrote:
> >
> > External email : Please do not click links or open attachments until
> > you have verified the sender or the content.
> >  On 08/22, Alexei Starovoitov wrote:
> > > On Thu, Aug 22, 2024 at 12:02 AM Tze-nan Wu (吳澤南)
> > > <Tze-nan.Wu@mediatek.com> wrote:
> > > >
> > > >
> > > > BTW, If this should be handled in kernel, modification shown
> > below
> > > > could fix the issue without breaking the "static_branch" usage in
> > both
> > > > macros:
> > > >
> > > >
> > > > +++ /include/linux/bpf-cgroup.h:
> > > >     -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
> > > >     +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
> > > >      ({
> > > >             int __ret = 0;
> > > >             if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
> > > >                 copy_from_sockptr(&__ret, optlen, sizeof(int));
> > > >      +      else
> > > >      +          *compat = true;
> > > >             __ret;
> > > >      })
> > > >
> > > >     #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
> > > > optval, optlen, max_optlen, retval)
> > > >      ({
> > > >          int __ret = retval;
> > > >     -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
> > > >     -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> > > >     +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
> > > >              if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
> > > >                ...
> > > >
> > > >   +++ /net/socket.c:
> > > >     int do_sock_getsockopt(struct socket *sock, bool compat, int
> > level,
> > > >      {
> > > >         ...
> > > >         ...
> > > >     +     /* The meaning of `compat` variable could be changed
> > here
> > > >     +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS)
> > is
> > > > false.
> > > >     +      */
> > > >         if (!compat)
> > > >     -       max_optlen =
> > BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
> > > >     +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> > > > &compat);
> > >
> > > This is better, but it's still quite a hack. Let's not override it.
> > > We can have another bool, but the question:
> > > do we really need BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN  ?
> > > copy_from_sockptr(&__ret, optlen, sizeof(int));
> > > should be fast enough to do it unconditionally.
> > > What are we saving here?
> > >
> > > Stan ?
> >
> > Agreed, most likely nobody would notice :-)
>
> Sorry for my late reply, just have the mailer fixed.
>
> If it is feasible to make the `copy_from_sockptr` unconditionally,
> should I submit a new patch that resolve the issue by removing
> `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`? Patch A shown as below.
>
>   +++ /net/socket.c:
>    int do_sock_getsockopt(...)
>    {
>   -     int max_optlen __maybe_unused;
>   +     int max_optlen __maybe_unused = 0;
>         const struct proto_ops *ops;
>         int err;
>   ...
>   ...
>         if (!compat) <== wonder if we should keep the condition here?
>   -         max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
>   +         copy_from_sockptr(&max_optlen, optlen, sizeof(int));

This one.
And delete the macro from bpf-cgroup.h
diff mbox series

Patch

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index fb3c3e7181e6..5afa2ac76aae 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -390,20 +390,20 @@  static inline bool cgroup_bpf_sock_enabled(struct sock *sk,
 	__ret;								       \
 })
 
-#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)			       \
+#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled)		       \
 ({									       \
 	int __ret = 0;							       \
-	if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))			       \
+	enabled = cgroup_bpf_enabled(CGROUP_GETSOCKOPT);		       \
+	if (enabled)							       \
 		copy_from_sockptr(&__ret, optlen, sizeof(int));		       \
 	__ret;								       \
 })
 
 #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, optlen,   \
-				       max_optlen, retval)		       \
+				       max_optlen, retval, enabled)	       \
 ({									       \
 	int __ret = retval;						       \
-	if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&			       \
-	    cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))		       \
+	if (enabled && cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))       \
 		if (!(sock)->sk_prot->bpf_bypass_getsockopt ||		       \
 		    !INDIRECT_CALL_INET_1((sock)->sk_prot->bpf_bypass_getsockopt, \
 					tcp_bpf_bypass_getsockopt,	       \
@@ -518,9 +518,10 @@  static inline int bpf_percpu_cgroup_storage_update(struct bpf_map *map,
 #define BPF_CGROUP_RUN_PROG_SOCK_OPS(sock_ops) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(atype, major, minor, access) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_SYSCTL(head,table,write,buf,count,pos) ({ 0; })
-#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen) ({ 0; })
+#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled) ({ 0; })
 #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname, optval, \
-				       optlen, max_optlen, retval) ({ retval; })
+				       optlen, max_optlen, retval, \
+				       enabled) ({ retval; })
 #define BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sock, level, optname, optval, \
 					    optlen, retval) ({ retval; })
 #define BPF_CGROUP_RUN_PROG_SETSOCKOPT(sock, level, optname, optval, optlen, \
diff --git a/net/socket.c b/net/socket.c
index fcbdd5bc47ac..0b465dc8a789 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2363,6 +2363,7 @@  int do_sock_getsockopt(struct socket *sock, bool compat, int level,
 		       int optname, sockptr_t optval, sockptr_t optlen)
 {
 	int max_optlen __maybe_unused;
+	bool enabled __maybe_unused;
 	const struct proto_ops *ops;
 	int err;
 
@@ -2371,7 +2372,7 @@  int do_sock_getsockopt(struct socket *sock, bool compat, int level,
 		return err;
 
 	if (!compat)
-		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
+		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, enabled);
 
 	ops = READ_ONCE(sock->ops);
 	if (level == SOL_SOCKET) {
@@ -2390,7 +2391,7 @@  int do_sock_getsockopt(struct socket *sock, bool compat, int level,
 	if (!compat)
 		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
 						     optval, optlen, max_optlen,
-						     err);
+						     err, enabled);
 
 	return err;
 }