Message ID | 66c29336.050a0220.395e9a.76bf@mx.google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/mmap.c: make brk() check RLIMIT_AS before page-aligning requested amount | expand |
On Sun, Aug 18, 2024 at 08:35:00PM GMT, Kartavya Vashishtha wrote: > Currently, brk() only checks against RLIMIT_DATA when validating whether > the requested amount of memory is valid with respect to rlimit. > > RLIMIT_AS is checked later in the `may_expand_vm` call in `do_brk_flags`, > but that call occurs after aligning the new brk to a page boundary, making > the following possible: > > 1. Allocate a non-page-sized amount of memory with brk() > > 2. brk() will internally page-align the requested amount, and allocate > the necessary amount of pages. > > 3. Set RLIMIT_AS to 1 byte using setrlimit. > > 4. Calling brk() again with a small increment (such that it does not > overflow to the next page) will succeed. > > This violates setrlimit RLIMIT_AS, since the call succeeds despite a > 1 byte limit. > > The following code snippet reproduces this behavior: > > ``` > > int main() { > void * mem = malloc(4096); > sbrk(32); > > // set RLIMIT_AS for the processe's address space to 1 byte > // This causes all future calls to sbrk to fail > struct rlimit lim; > getrlimit(RLIMIT_AS, &lim); > lim.rlim_cur = 1; > printf("lim.rlim_max: %ld\n", lim.rlim_max); > setrlimit(RLIMIT_AS, &lim); > > printf("Mallocing an additional 8 bytes, which requires more > "memory from sbrk, but sbrk SHOULD fail\n"); > void * ptr = sbrk(8); > printf("sbrk result: %p\n", ptr); > if (ptr != -1) { > printf("sbrk unexpectedly passed\n"); > } else { > printf("sbrk expectedly failed\n"); > } > free(mem); > } > ``` > > Signed-off-by: Kartavya Vashishtha <sendtokartavya@gmail.com> > --- > mm/mmap.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index d0dfc85b209b..5f7fc6591323 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -253,8 +253,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) > * segment grow beyond its set limit the in case where the limit is > * not page aligned -Ram Gupta > */ > - if (check_data_rlimit(rlimit(RLIMIT_DATA), brk, mm->start_brk, > - mm->end_data, mm->start_data)) > + if (check_data_rlimit(min(rlimit(RLIMIT_AS), rlimit(RLIMIT_DATA)), > + brk, mm->start_brk, mm->end_data, mm->start_data)) > goto out; > > newbrk = PAGE_ALIGN(brk); > -- > 2.34.1 > > Thanks for your patch (I think your first? :), I don't think this is correct however. The address space usage is not changing between these calls, it's staying exactly the same (in fact, your efforts to avoid going over a page boundary ensure this). All memory allocations in the kernel are performed at page granularity, as this is the only way to actually allocate and map memory. As per the man page: Lowering the soft limit for a resource below the process's current consumption of that resource will succeed (but will prevent the process from further increasing its consumption of the resource). So the limit will not prevent usage of existing resource (i.e. the address space usage). It seems that we make an effort with RLIMIT_DATA to actually proactively prevent usage beyond the limit, again as per the man page: RLIMIT_DATA This is the maximum size of the process's data segment (initialized data, uninitialized data, and heap). The limit is specified in bytes, and is rounded down to the system page size. This limit affects calls to brk(2), sbrk(2), and (since Linux 4.7) mmap(2), which fail with the error ENOMEM upon encountering the soft limit of this resource. Note 'encountering' the soft limit of this resource. In the case of RLIMIT_AS: RLIMIT_AS ... This limit affects calls to brk(2), mmap(2), and mremap(2), which fail with the error ENOMEM upon exceeding this limit. ... Note 'exceeding' this limit.
diff --git a/mm/mmap.c b/mm/mmap.c index d0dfc85b209b..5f7fc6591323 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -253,8 +253,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) * segment grow beyond its set limit the in case where the limit is * not page aligned -Ram Gupta */ - if (check_data_rlimit(rlimit(RLIMIT_DATA), brk, mm->start_brk, - mm->end_data, mm->start_data)) + if (check_data_rlimit(min(rlimit(RLIMIT_AS), rlimit(RLIMIT_DATA)), + brk, mm->start_brk, mm->end_data, mm->start_data)) goto out; newbrk = PAGE_ALIGN(brk);
Currently, brk() only checks against RLIMIT_DATA when validating whether the requested amount of memory is valid with respect to rlimit. RLIMIT_AS is checked later in the `may_expand_vm` call in `do_brk_flags`, but that call occurs after aligning the new brk to a page boundary, making the following possible: 1. Allocate a non-page-sized amount of memory with brk() 2. brk() will internally page-align the requested amount, and allocate the necessary amount of pages. 3. Set RLIMIT_AS to 1 byte using setrlimit. 4. Calling brk() again with a small increment (such that it does not overflow to the next page) will succeed. This violates setrlimit RLIMIT_AS, since the call succeeds despite a 1 byte limit. The following code snippet reproduces this behavior: ``` int main() { void * mem = malloc(4096); sbrk(32); // set RLIMIT_AS for the processe's address space to 1 byte // This causes all future calls to sbrk to fail struct rlimit lim; getrlimit(RLIMIT_AS, &lim); lim.rlim_cur = 1; printf("lim.rlim_max: %ld\n", lim.rlim_max); setrlimit(RLIMIT_AS, &lim); printf("Mallocing an additional 8 bytes, which requires more "memory from sbrk, but sbrk SHOULD fail\n"); void * ptr = sbrk(8); printf("sbrk result: %p\n", ptr); if (ptr != -1) { printf("sbrk unexpectedly passed\n"); } else { printf("sbrk expectedly failed\n"); } free(mem); } ``` Signed-off-by: Kartavya Vashishtha <sendtokartavya@gmail.com> --- mm/mmap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)