
[bpf-next,v2,1/4] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen

Message ID 20230427200409.1785263-2-sdf@google.com (mailing list archive)
State Superseded
Delegated to: BPF
Series bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 22 this patch: 22
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 20 this patch: 20
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 22 this patch: 22
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 32 lines checked
netdev/kdoc success Errors and warnings before: 12 this patch: 12
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 fail Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16

Commit Message

Stanislav Fomichev April 27, 2023, 8:04 p.m. UTC
With the way the hooks are implemented right now, we have a special
condition: an optval larger than PAGE_SIZE exposes only the first 4k to
BPF, and any modifications to the optval are ignored. If the BPF program
doesn't handle this condition by resetting optlen to 0,
userspace gets EFAULT.

The intention of the EFAULT was to make it apparent to
developers that the program is doing something wrong.
However, this might inadvertently affect production workloads
running BPF programs that are not careful enough (i.e., it returns EFAULT
for perfectly valid setsockopt/getsockopt calls).

Let's try to minimize the chance of a BPF program screwing up userspace
by ignoring the output of those BPF programs (instead of returning
EFAULT to userspace). Log those cases to dmesg with pr_info_once
to help with figuring out what's going wrong.

Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 kernel/bpf/cgroup.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Martin KaFai Lau May 1, 2023, 5:51 a.m. UTC | #1
On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>   		.optname = optname,
>   		.current_task = current,
>   	};
> +	int orig_optlen;
>   	int ret;
>   
> +	orig_optlen = max_optlen;

For getsockopt, when the kernel's getsockopt finished successfully (the
following 'if (!retval)' case), how about also setting orig_optlen to the
kernel-returned 'optlen'? For example, the user's orig_optlen is 8096 and the
kernel-returned optlen is 1024. If the bpf prog still sets ctx.optlen to
something larger than PAGE_SIZE, -EFAULT will be returned.

>   	ctx.optlen = max_optlen;
>   	max_optlen = sockopt_alloc_buf(&ctx, max_optlen, &buf);
>   	if (max_optlen < 0)
> @@ -1922,6 +1929,11 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>   		goto out;
>   
>   	if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
> +		if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
> +			pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
> +				     ctx.optlen, max_optlen);
> +			goto out;
> +		}
>   		ret = -EFAULT;
>   		goto out;
>   	}
Stanislav Fomichev May 1, 2023, 4:55 p.m. UTC | #2
On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> > @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> >               .optname = optname,
> >               .current_task = current,
> >       };
> > +     int orig_optlen;
> >       int ret;
> >
> > +     orig_optlen = max_optlen;
>
> For getsockopt, when the kernel's getsockopt finished successfully (the
> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
>  > PAGE_SIZE, -EFAULT will be returned.

Wouldn't it defeat the purpose? Or am I missing something?

ctx.optlen would still be 8096, not 1024, right (regardless of what
the kernel returns)?
So it would trigger the EFAULT case that we are trying to avoid.
Martin KaFai Lau May 1, 2023, 6:58 p.m. UTC | #3
On 5/1/23 9:55 AM, Stanislav Fomichev wrote:
> On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
>>> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>>>                .optname = optname,
>>>                .current_task = current,
>>>        };
>>> +     int orig_optlen;
>>>        int ret;
>>>
>>> +     orig_optlen = max_optlen;
>>
>> For getsockopt, when the kernel's getsockopt finished successfully (the
>> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
>> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
>> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
>>   > PAGE_SIZE, -EFAULT will be returned.
> 
> Wouldn't it defeat the purpose? Or am I missing something?
> 
> ctx.optlen would still be 8096, not 1024, right (regardless of what
> the kernel returns)?
> So it would trigger EFAULT case which we try to avoid.

My understanding is that ctx.optlen should be 1024 after the 'if (!retval)'
statement.

The 'int __user *optlen' arg has the kernel returned optlen (1024). The 'int 
max_optlen' arg has the original user's optlen (8096).

int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
				       int optname, char __user *optval,
				       int __user *optlen /* 1024 */,
				       int max_optlen /* 8096 */,
				       int retval)
{
	/* ... */

	orig_optlen = max_optlen; /* orig_optlen == 8096 */
	ctx.optlen = max_optlen;  /* ctx.optlen == 8096 */

	
	if (!retval) {
		/* If kernel getsockopt finished successfully,
		 * copy whatever was returned to the user back
		 * into our temporary buffer. Set optlen to the
		 * one that kernel returned as well to let
		 * BPF programs inspect the value.
		 */

		if (get_user(ctx.optlen, optlen)) {
			ret = -EFAULT;
			goto out;
		}

		/* ctx.optlen == 1024 */

		orig_optlen = ctx.optlen;
	}

	/* ... */
}
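
For illustration, the effect of re-basing orig_optlen on the kernel-returned
length can be modeled in plain userspace C. This is only a sketch, not kernel
code: model_getsockopt(), MODEL_PAGE_SIZE and the use_suggestion flag are
hypothetical names, the page size is assumed to be 4096, max_optlen is assumed
to be capped at one page (as the commit message describes), and the "ignore"
path is assumed to return 0.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/*
 * Hypothetical userspace model of the getsockopt length check.
 *
 * user_optlen:    length the caller passed in (max_optlen argument)
 * kernel_optlen:  length the kernel getsockopt wrote back
 * retval:         kernel getsockopt return value (0 on success)
 * prog_optlen:    ctx.optlen as left by the BPF program
 * use_suggestion: non-zero to re-base orig_optlen on kernel_optlen
 *
 * Returns 0 or -EFAULT; *ignored is set when the program's buffer is
 * dropped instead of failing the call.
 */
static int model_getsockopt(int user_optlen, int kernel_optlen, int retval,
			    int prog_optlen, int use_suggestion, int *ignored)
{
	int orig_optlen = user_optlen;
	/* Assumption: only the first page is exposed to the program. */
	int max_optlen = user_optlen > MODEL_PAGE_SIZE ? MODEL_PAGE_SIZE
						       : user_optlen;

	assert(ignored);
	*ignored = 0;

	/* The suggestion: on kernel success, track the kernel's length. */
	if (!retval && use_suggestion)
		orig_optlen = kernel_optlen;

	if (prog_optlen > max_optlen || prog_optlen < 0) {
		if (orig_optlen > MODEL_PAGE_SIZE && prog_optlen >= 0) {
			*ignored = 1; /* program output dropped, no error */
			return 0;
		}
		return -EFAULT;
	}
	return 0;
}
```

With the 8096/1024 example, a program that sets ctx.optlen back to 8096 is
silently ignored when orig_optlen stays at the user's length, but gets -EFAULT
once orig_optlen follows the kernel-returned 1024.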
Stanislav Fomichev May 1, 2023, 7:33 p.m. UTC | #4
On Mon, May 1, 2023 at 11:58 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 5/1/23 9:55 AM, Stanislav Fomichev wrote:
> > On Sun, Apr 30, 2023 at 10:52 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 4/27/23 1:04 PM, Stanislav Fomichev wrote:
> >>> @@ -1881,8 +1886,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
> >>>                .optname = optname,
> >>>                .current_task = current,
> >>>        };
> >>> +     int orig_optlen;
> >>>        int ret;
> >>>
> >>> +     orig_optlen = max_optlen;
> >>
> >> For getsockopt, when the kernel's getsockopt finished successfully (the
> >> following 'if (!retval)' case), how about also setting orig_optlen to the kernel
> >> returned 'optlen'. For example, the user's orig_optlen is 8096 and the kernel
> >> returned optlen is 1024. If the bpf prog still sets the ctx.optlen to something
> >>   > PAGE_SIZE, -EFAULT will be returned.
> >
> > Wouldn't it defeat the purpose? Or am I missing something?
> >
> > ctx.optlen would still be 8096, not 1024, right (regardless of what
> > the kernel returns)?
> > So it would trigger EFAULT case which we try to avoid.
>
> My understanding is the ctx.optlen should be 1024 after the 'if (!retval)'
> statement.

Ah, you're right, thanks! Will add your suggestion.


> The 'int __user *optlen' arg has the kernel returned optlen (1024). The 'int
> max_optlen' arg has the original user's optlen (8096).
>
> int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
>                                        int optname, char __user *optval,
>                                        int __user *optlen /* 1024 */,
>                                        int max_optlen /* 8096 */,
>                                        int retval)
> {
>         /* ... */
>
>         orig_optlen = max_optlen; /* orig_optlen == 8096 */
>         ctx.optlen = max_optlen;  /* ctx.optlen == 8096 */
>
>
>         if (!retval) {
>                 /* If kernel getsockopt finished successfully,
>                  * copy whatever was returned to the user back
>                  * into our temporary buffer. Set optlen to the
>                  * one that kernel returned as well to let
>                  * BPF programs inspect the value.
>                  */
>
>                 if (get_user(ctx.optlen, optlen)) {
>                         ret = -EFAULT;
>                         goto out;
>                 }
>
>                 /* ctx.optlen == 1024 */
>
>                 orig_optlen = ctx.optlen;
>         }
>
>         /* ... */
> }

Patch

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index a06e118a9be5..e041159a1ce0 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -1826,6 +1826,11 @@  int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
 		ret = 1;
 	} else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
 		/* optlen is out of bounds */
+		if (*optlen > PAGE_SIZE && ctx.optlen >= 0) {
+			pr_info_once("bpf setsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
+				     ctx.optlen, max_optlen);
+			goto out;
+		}
 		ret = -EFAULT;
 	} else {
 		/* optlen within bounds, run kernel handler */
@@ -1881,8 +1886,10 @@  int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		.optname = optname,
 		.current_task = current,
 	};
+	int orig_optlen;
 	int ret;
 
+	orig_optlen = max_optlen;
 	ctx.optlen = max_optlen;
 	max_optlen = sockopt_alloc_buf(&ctx, max_optlen, &buf);
 	if (max_optlen < 0)
@@ -1922,6 +1929,11 @@  int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
 		goto out;
 
 	if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
+		if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
+			pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
+				     ctx.optlen, max_optlen);
+			goto out;
+		}
 		ret = -EFAULT;
 		goto out;
 	}
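
The setsockopt hunk above can be sketched as a small userspace decision
function. This is a hedged model rather than the kernel implementation:
model_setsockopt(), enum model_action and MODEL_PAGE_SIZE are hypothetical
names, the page size is assumed to be 4096, max_optlen is assumed to be the
user length capped at one page, and the branch preceding the hunk is assumed
to test ctx.optlen == -1 (only its "ret = 1" body is visible in the diff
context).

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 4096 /* assumption: 4 KiB pages */

/* Hypothetical outcomes of the setsockopt length check. */
enum model_action {
	MODEL_BYPASS_KERNEL, /* program asked to skip the kernel handler */
	MODEL_IGNORE_PROG,   /* new path: drop the program's buffer */
	MODEL_RUN_KERNEL,    /* optlen within bounds */
	MODEL_EFAULT,        /* optlen out of bounds */
};

/*
 * user_optlen: length passed by userspace (*optlen in the hunk)
 * prog_optlen: ctx.optlen as left by the BPF program
 */
static enum model_action model_setsockopt(int user_optlen, int prog_optlen)
{
	/* Assumption: only the first page is exposed to the program. */
	int max_optlen = user_optlen > MODEL_PAGE_SIZE ? MODEL_PAGE_SIZE
						       : user_optlen;

	/* The model does not cover negative user lengths. */
	assert(user_optlen >= 0);

	if (prog_optlen == -1)
		return MODEL_BYPASS_KERNEL;

	if (prog_optlen > max_optlen || prog_optlen < -1) {
		/* Large user buffer and a non-negative length: ignore. */
		if (user_optlen > MODEL_PAGE_SIZE && prog_optlen >= 0)
			return MODEL_IGNORE_PROG;
		return MODEL_EFAULT;
	}

	return MODEL_RUN_KERNEL;
}
```

For example, model_setsockopt(8096, 8096) yields MODEL_IGNORE_PROG, while
model_setsockopt(64, 128) still yields MODEL_EFAULT because the user buffer
fits in one page. The getsockopt hunk follows the same shape but rejects any
negative ctx.optlen.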