diff mbox series

[RFC,3/8] openat2: explicitly return -E2BIG for (usize > PAGE_SIZE)

Message ID 20240902-extensible-structs-check_fields-v1-3-545e93ede2f2@cyphar.com (mailing list archive)
State New
Headers show
Series extensible syscalls: CHECK_FIELDS to allow for easier feature detection | expand

Commit Message

Aleksa Sarai Sept. 2, 2024, 7:06 a.m. UTC
While we do currently return -EFAULT in this case, it seems prudent to
follow the behaviour of other syscalls like clone3. It seems quite
unlikely that anyone depends on this error code being EFAULT, but we can
always revert this if it turns out to be an issue.

Cc: <stable@vger.kernel.org> # v5.6+
Fixes: fddb5d430ad9 ("open: introduce openat2(2) syscall")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 fs/open.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Arnd Bergmann Sept. 2, 2024, 9:09 a.m. UTC | #1
On Mon, Sep 2, 2024, at 07:06, Aleksa Sarai wrote:
> While we do currently return -EFAULT in this case, it seems prudent to
> follow the behaviour of other syscalls like clone3. It seems quite
> unlikely that anyone depends on this error code being EFAULT, but we can
> always revert this if it turns out to be an issue.

Right, it's probably a good idea to have a limit there rather than
having a busy loop with a user-provided length when the only bound is
the available virtual memory.

>  	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
>  		return -EINVAL;
> +	if (unlikely(usize > PAGE_SIZE))
> +		return -E2BIG;
> 

Is PAGE_SIZE significant here? If there is a need to enforce a limit,
I would expect this to be the same regardless of kernel configuration,
since the structure layout is also independent of the configuration.

Where is the current -EFAULT for users passing more than a page?
I only see it for reads beyond the VMA, but not e.g. when checking
terabytes of zero pages from an anonymous mapping.

    Arnd
Aleksa Sarai Sept. 2, 2024, 4:08 p.m. UTC | #2
On 2024-09-02, Arnd Bergmann <arnd@arndb.de> wrote:
> On Mon, Sep 2, 2024, at 07:06, Aleksa Sarai wrote:
> > While we do currently return -EFAULT in this case, it seems prudent to
> > follow the behaviour of other syscalls like clone3. It seems quite
> > unlikely that anyone depends on this error code being EFAULT, but we can
> > always revert this if it turns out to be an issue.
> 
> Right, it's probably a good idea to have a limit there rather than
> having a busy loop with a user-provided length when the only bound is
> the available virtual memory.
> 
> >  	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
> >  		return -EINVAL;
> > +	if (unlikely(usize > PAGE_SIZE))
> > +		return -E2BIG;
> > 
> 
> Is PAGE_SIZE significant here? If there is a need to enforce a limit,
> I would expect this to be the same regardless of kernel configuration,
> since the structure layout is also independent of the configuration.

PAGE_SIZE is what clone3, perf_event_open, sched_setattr, bpf, etc all
use. The idea was that PAGE_SIZE is the absolute limit of any reasonable
extensible structure size because we are never going to have argument
structures that are larger than a page (I think this was discussed in
the original copy_struct_from_user() patchset thread in late 2019, but I
can't find the reference at the moment.)

I simply forgot to add this when I first submitted openat2, the original
intention was to just match the other syscalls.

> Where is the current -EFAULT for users passing more than a page?
> I only see it for reads beyond the VMA, but not e.g. when checking
> terabytes of zero pages from an anonymous mapping.

I meant that we in practice return -EFAULT if you pass a really large
size (because you end up running off the end of mapped memory). There is
no explicit -EFAULT for large sizes, which is exactly the problem. :P

> 
>     Arnd
Arnd Bergmann Sept. 2, 2024, 7:23 p.m. UTC | #3
On Mon, Sep 2, 2024, at 16:08, Aleksa Sarai wrote:
>> >  	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
>> >  		return -EINVAL;
>> > +	if (unlikely(usize > PAGE_SIZE))
>> > +		return -E2BIG;
>> > 
>> 
>> Is PAGE_SIZE significant here? If there is a need to enforce a limit,
>> I would expect this to be the same regardless of kernel configuration,
>> since the structure layout is also independent of the configuration.
>
> PAGE_SIZE is what clone3, perf_event_open, sched_setattr, bpf, etc all
> use. The idea was that PAGE_SIZE is the absolute limit of any reasonable
> extensible structure size because we are never going to have argument
> structures that are larger than a page (I think this was discussed in
> the original copy_struct_from_user() patchset thread in late 2019, but I
> can't find the reference at the moment.)
>
> I simply forgot to add this when I first submitted openat2, the original
> intention was to just match the other syscalls.

Ok, I see. I guess it makes sense to keep this one consistent with the
other ones, but we may want to revisit this in the future and
come up with something that is independent of CONFIG_PAGE_SIZE.

>> Where is the current -EFAULT for users passing more than a page?
>> I only see it for reads beyond the VMA, but not e.g. when checking
>> terabytes of zero pages from an anonymous mapping.
>
> I meant that we in practice return -EFAULT if you pass a really large
> size (because you end up running off the end of mapped memory). There is
> no explicit -EFAULT for large sizes, which is exactly the problem. :P

Got it, thanks.

     Arnd
diff mbox series

Patch

diff --git a/fs/open.c b/fs/open.c
index 22adbef7ecc2..30bfcddd505d 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1458,6 +1458,8 @@  SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename,
 
 	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
 		return -EINVAL;
+	if (unlikely(usize > PAGE_SIZE))
+		return -E2BIG;
 
 	err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize);
 	if (err)