Message ID | 20250107-virtual_address_range-tests-v1-1-3834a2fb47fe@linutronix.de |
---|---|
State | New |
Series | selftests/mm: virtual_address_range: Two bugfixes and a cleanup |
On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
> If not enough physical memory is available the kernel may fail mmap();
> see __vm_enough_memory() and vm_commit_limit().
> In that case the logic in validate_complete_va_space() does not make
> sense and will even incorrectly fail.
> Instead skip the test if no mmap() succeeded.
>
> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
>
> ---
> The logic in __vm_enough_memory() seems weird.
> It describes itself as "Check that a process has enough memory to
> allocate a new virtual mapping", however it never checks the current
> memory usage of the process.
> So it only disallows large mappings. But many small mappings taking the
> same amount of memory are allowed; and then even automatically merged
> into one big mapping.
> ---
>  tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
> index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
> --- a/tools/testing/selftests/mm/virtual_address_range.c
> +++ b/tools/testing/selftests/mm/virtual_address_range.c
> @@ -178,6 +178,12 @@ int main(int argc, char *argv[])
>  		validate_addr(ptr[i], 0);
>  	}
>  	lchunks = i;
> +
> +	if (!lchunks) {
> +		ksft_test_result_skip("Not enough memory for a single chunk\n");
> +		ksft_finished();
> +	}
> +
>  	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
>  	if (hptr == NULL) {
>  		ksft_test_result_skip("Memory constraint not fulfilled\n");
>

I do not know about __vm_enough_memory(), but I am going by your description:
You say that the kernel may fail mmap() when enough physical memory is not
there, but it may happen that we have already done 100 mmap()'s, and then
the kernel fails mmap(), so if (!lchunks) won't be able to handle this case.
Basically, lchunks == 0 is not a complete indicator of kernel failing mmap().

The basic assumption of the test is that any process should be able to exhaust
its virtual address space, and running the test under memory pressure and the
kernel violating this behaviour defeats the point of the test I think?
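
The distinction raised here, between "no chunk could be mapped" and "mapping stopped after some chunks", can be made visible with a small sketch. This is not the selftest's code: `map_chunks()` and its `err` out-parameter are hypothetical names chosen for illustration. The return value plays the role of `lchunks`, and the saved `errno` tells the caller why the loop stopped.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not part of virtual_address_range.c.
 *
 * Map up to `max` anonymous chunks of `size` bytes each into ptr[] and
 * stop at the first failure. Returns the number of chunks mapped (what
 * the test calls lchunks). *err receives the errno of the failing
 * mmap(), or 0 if every mmap() succeeded.
 */
static size_t map_chunks(char **ptr, size_t max, size_t size, int *err)
{
	size_t i;

	*err = 0;
	for (i = 0; i < max; i++) {
		void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED) {
			*err = errno;	/* remember why we stopped */
			break;
		}
		ptr[i] = p;
	}
	return i;
}
```

A caller can then treat a return value of 0 as the "skip" case from the patch, while a non-zero count combined with a non-zero `*err` is the partial case described above.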
On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
>
> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
> > If not enough physical memory is available the kernel may fail mmap();
> > see __vm_enough_memory() and vm_commit_limit().
> > In that case the logic in validate_complete_va_space() does not make
> > sense and will even incorrectly fail.
> > Instead skip the test if no mmap() succeeded.
> >
> > Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
> >
> > ---
> > The logic in __vm_enough_memory() seems weird.
> > It describes itself as "Check that a process has enough memory to
> > allocate a new virtual mapping", however it never checks the current
> > memory usage of the process.
> > So it only disallows large mappings. But many small mappings taking the
> > same amount of memory are allowed; and then even automatically merged
> > into one big mapping.
> > ---
> >  tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
> > index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
> > --- a/tools/testing/selftests/mm/virtual_address_range.c
> > +++ b/tools/testing/selftests/mm/virtual_address_range.c
> > @@ -178,6 +178,12 @@ int main(int argc, char *argv[])
> >  		validate_addr(ptr[i], 0);
> >  	}
> >  	lchunks = i;
> > +
> > +	if (!lchunks) {
> > +		ksft_test_result_skip("Not enough memory for a single chunk\n");
> > +		ksft_finished();
> > +	}
> > +
> >  	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
> >  	if (hptr == NULL) {
> >  		ksft_test_result_skip("Memory constraint not fulfilled\n");
> >
>
> I do not know about __vm_enough_memory(), but I am going by your description:
> You say that the kernel may fail mmap() when enough physical memory is not
> there, but it may happen that we have already done 100 mmap()'s, and then
> the kernel fails mmap(), so if (!lchunks) won't be able to handle this case.
> Basically, lchunks == 0 is not a complete indicator of kernel failing mmap().

__vm_enough_memory() only checks the size of each single mmap() on its
own. It does not actually check the current memory or address space
usage of the process.
This seems a bit weird, as indicated in my after-the-fold explanation.

> The basic assumption of the test is that any process should be able to exhaust
> its virtual address space, and running the test under memory pressure and the
> kernel violating this behaviour defeats the point of the test I think?

The assumption is correct, as soon as one mapping succeeds the others
will also succeed, until the actual address space is exhausted.

Looking at it again, __vm_enough_memory() is only called for writable
mappings, so it would be possible to use only readable mappings in the
test. The test will still fail with OOM, as the many PTEs need more than
1GiB of physical memory anyways, but at least that produces a usable
error message.
However I'm not sure if this would violate other test assumptions.
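
The read-only variant mentioned above could look roughly like the following. It is only a sketch: `map_chunk_readonly()` is a hypothetical name, and the claim that such a mapping bypasses __vm_enough_memory() is taken from the message above, not verified here.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not part of virtual_address_range.c.
 *
 * Map one private anonymous chunk without PROT_WRITE. Per the
 * discussion above, only writable mappings are charged by
 * __vm_enough_memory(), so this request should not be rejected for
 * its size alone. Returns MAP_FAILED on error.
 */
static void *map_chunk_readonly(size_t size)
{
	return mmap(NULL, size, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

Reads from such a mapping return zeroes. As noted above, touching many pages still costs page-table memory, so this does not remove the OOM risk.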
On 08.01.25 09:05, Thomas Weißschuh wrote:
> On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
>>
>> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
>>> If not enough physical memory is available the kernel may fail mmap();
>>> see __vm_enough_memory() and vm_commit_limit().
>>> In that case the logic in validate_complete_va_space() does not make
>>> sense and will even incorrectly fail.
>>> Instead skip the test if no mmap() succeeded.
>>>
>>> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
>>> Cc: stable@vger.kernel.org

CC stable on tests is ... odd.

>>> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
>>>
>>> ---
>>> The logic in __vm_enough_memory() seems weird.
>>> It describes itself as "Check that a process has enough memory to
>>> allocate a new virtual mapping", however it never checks the current
>>> memory usage of the process.
>>> So it only disallows large mappings. But many small mappings taking the
>>> same amount of memory are allowed; and then even automatically merged
>>> into one big mapping.
>>> ---
>>>  tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
>>> index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
>>> --- a/tools/testing/selftests/mm/virtual_address_range.c
>>> +++ b/tools/testing/selftests/mm/virtual_address_range.c
>>> @@ -178,6 +178,12 @@ int main(int argc, char *argv[])
>>>  		validate_addr(ptr[i], 0);
>>>  	}
>>>  	lchunks = i;
>>> +
>>> +	if (!lchunks) {
>>> +		ksft_test_result_skip("Not enough memory for a single chunk\n");
>>> +		ksft_finished();
>>> +	}
>>> +
>>>  	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
>>>  	if (hptr == NULL) {
>>>  		ksft_test_result_skip("Memory constraint not fulfilled\n");
>>>
>>
>> I do not know about __vm_enough_memory(), but I am going by your description:
>> You say that the kernel may fail mmap() when enough physical memory is not
>> there, but it may happen that we have already done 100 mmap()'s, and then
>> the kernel fails mmap(), so if (!lchunks) won't be able to handle this case.
>> Basically, lchunks == 0 is not a complete indicator of kernel failing mmap().
>
> __vm_enough_memory() only checks the size of each single mmap() on its
> own. It does not actually check the current memory or address space
> usage of the process.
> This seems a bit weird, as indicated in my after-the-fold explanation.
>
>> The basic assumption of the test is that any process should be able to exhaust
>> its virtual address space, and running the test under memory pressure and the
>> kernel violating this behaviour defeats the point of the test I think?
>
> The assumption is correct, as soon as one mapping succeeds the others
> will also succeed, until the actual address space is exhausted.
>
> Looking at it again, __vm_enough_memory() is only called for writable
> mappings, so it would be possible to use only readable mappings in the
> test. The test will still fail with OOM, as the many PTEs need more than
> 1GiB of physical memory anyways, but at least that produces a usable
> error message.
> However I'm not sure if this would violate other test assumptions.
>

Note that with MAP_NORESERVE, most setups we care about will allow mapping as
much as you want, but on access OOM will fire.

So one could require that /proc/sys/vm/overcommit_memory is set up properly
and use MAP_NORESERVE.

Reading from anonymous memory will populate the shared zeropage. To mitigate
OOM from "too many page tables", one could simply unmap the pieces as they
are verified (or MAP_FIXED over them, to free page tables).
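
A minimal sketch of the "check overcommit_memory, then map with MAP_NORESERVE" idea follows. The helper names are hypothetical, and the policy in `noreserve_is_effective()` is an assumption about what "set up properly" means: the kernel ignores MAP_NORESERVE in mode 2 (never overcommit), so only modes 0 and 1 are accepted here.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helpers, not part of virtual_address_range.c.
 *
 * Read /proc/sys/vm/overcommit_memory.
 * Returns 0 (heuristic), 1 (always), 2 (never), or -1 if unreadable.
 */
static int read_overcommit_mode(void)
{
	FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
	int mode = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/* Assumed policy: MAP_NORESERVE only has an effect outside mode 2. */
static int noreserve_is_effective(int mode)
{
	return mode == 0 || mode == 1;
}

/* Map one writable chunk whose size is not charged at mmap() time. */
static void *map_chunk_noreserve(size_t size)
{
	return mmap(NULL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}
```

With this split, a test could report SKIP when `noreserve_is_effective(read_overcommit_mode())` is false, and treat a later mmap() failure as a real FAIL.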
On Wed, Jan 08, 2025 at 02:36:57PM +0100, David Hildenbrand wrote:
> On 08.01.25 09:05, Thomas Weißschuh wrote:
> > On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
> > >
> > > On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
> > > > If not enough physical memory is available the kernel may fail mmap();
> > > > see __vm_enough_memory() and vm_commit_limit().
> > > > In that case the logic in validate_complete_va_space() does not make
> > > > sense and will even incorrectly fail.
> > > > Instead skip the test if no mmap() succeeded.
> > > >
> > > > Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
> > > > Cc: stable@vger.kernel.org
>
> CC stable on tests is ... odd.

I thought it was fairly common, but it isn't.
Will drop it.

> > > > Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
> > > >
> > > > ---
> > > > The logic in __vm_enough_memory() seems weird.
> > > > It describes itself as "Check that a process has enough memory to
> > > > allocate a new virtual mapping", however it never checks the current
> > > > memory usage of the process.
> > > > So it only disallows large mappings. But many small mappings taking the
> > > > same amount of memory are allowed; and then even automatically merged
> > > > into one big mapping.
> > > > ---
> > > >  tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
> > > > index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
> > > > --- a/tools/testing/selftests/mm/virtual_address_range.c
> > > > +++ b/tools/testing/selftests/mm/virtual_address_range.c
> > > > @@ -178,6 +178,12 @@ int main(int argc, char *argv[])
> > > >  		validate_addr(ptr[i], 0);
> > > >  	}
> > > >  	lchunks = i;
> > > > +
> > > > +	if (!lchunks) {
> > > > +		ksft_test_result_skip("Not enough memory for a single chunk\n");
> > > > +		ksft_finished();
> > > > +	}
> > > > +
> > > >  	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
> > > >  	if (hptr == NULL) {
> > > >  		ksft_test_result_skip("Memory constraint not fulfilled\n");
> > > >
> > >
> > > I do not know about __vm_enough_memory(), but I am going by your description:
> > > You say that the kernel may fail mmap() when enough physical memory is not
> > > there, but it may happen that we have already done 100 mmap()'s, and then
> > > the kernel fails mmap(), so if (!lchunks) won't be able to handle this case.
> > > Basically, lchunks == 0 is not a complete indicator of kernel failing mmap().
> >
> > __vm_enough_memory() only checks the size of each single mmap() on its
> > own. It does not actually check the current memory or address space
> > usage of the process.
> > This seems a bit weird, as indicated in my after-the-fold explanation.
> >
> > > The basic assumption of the test is that any process should be able to exhaust
> > > its virtual address space, and running the test under memory pressure and the
> > > kernel violating this behaviour defeats the point of the test I think?
> >
> > The assumption is correct, as soon as one mapping succeeds the others
> > will also succeed, until the actual address space is exhausted.
> >
> > Looking at it again, __vm_enough_memory() is only called for writable
> > mappings, so it would be possible to use only readable mappings in the
> > test. The test will still fail with OOM, as the many PTEs need more than
> > 1GiB of physical memory anyways, but at least that produces a usable
> > error message.
> > However I'm not sure if this would violate other test assumptions.
> >
>
> Note that with MAP_NORESERVE, most setups we care about will allow mapping as
> much as you want, but on access OOM will fire.

Thanks for the hint.

> So one could require that /proc/sys/vm/overcommit_memory is set up properly
> and use MAP_NORESERVE.

Isn't the check for lchunks == 0 essentially exactly this?

> Reading from anonymous memory will populate the shared zeropage. To mitigate
> OOM from "too many page tables", one could simply unmap the pieces as they
> are verified (or MAP_FIXED over them, to free page tables).

The code has to figure out if a verified region was created by mmap(),
otherwise an munmap() could crash the process.
As the entries from /proc/self/maps may have been merged and (I assume)
the ordering of mappings is not guaranteed, some bespoke logic to establish
the link will be needed.

Is it fine to rely on CONFIG_ANON_VMA_NAME?
That would make it much easier to implement.

Using MAP_NORESERVE and eager munmap()s, the testcase works nicely even
in very low physical memory conditions.


Thomas
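
The thread does not spell out how CONFIG_ANON_VMA_NAME would be used, so the following is only one guess at the approach: name each test mapping with prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) and later recognise it in /proc/self/maps by its "[anon:<name>]" suffix. `tag_mapping()`, `line_has_tag()` and the tag string used in the tests are hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for older userspace headers; values match <linux/prctl.h>. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/*
 * Hypothetical helpers, not part of virtual_address_range.c.
 *
 * Name an anonymous mapping. Returns 0 on success and -1 with errno set
 * otherwise, for example on kernels built without CONFIG_ANON_VMA_NAME.
 */
static int tag_mapping(void *addr, size_t len, const char *name)
{
	return prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
		     (unsigned long)addr, (unsigned long)len,
		     (unsigned long)name);
}

/* Return 1 if a /proc/self/maps line carries the "[anon:<name>]" tag. */
static int line_has_tag(const char *line, const char *name)
{
	char tag[96];

	snprintf(tag, sizeof(tag), "[anon:%s]", name);
	return strstr(line, tag) != NULL;
}
```

Only lines for which `line_has_tag()` returns 1 would then be candidates for munmap(), which avoids unmapping the stack or the binary by accident.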
On 08.01.25 17:13, Thomas Weißschuh wrote:
> On Wed, Jan 08, 2025 at 02:36:57PM +0100, David Hildenbrand wrote:
>> On 08.01.25 09:05, Thomas Weißschuh wrote:
>>> On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
>>>>
>>>> On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
>>>>> If not enough physical memory is available the kernel may fail mmap();
>>>>> see __vm_enough_memory() and vm_commit_limit().
>>>>> In that case the logic in validate_complete_va_space() does not make
>>>>> sense and will even incorrectly fail.
>>>>> Instead skip the test if no mmap() succeeded.
>>>>>
>>>>> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
>>>>> Cc: stable@vger.kernel.org
>>
>> CC stable on tests is ... odd.
>
> I thought it was fairly common, but it isn't.
> Will drop it.

As it's not really a "kernel BUG", it's rather uncommon.

>>
>> Note that with MAP_NORESERVE, most setups we care about will allow mapping as
>> much as you want, but on access OOM will fire.
>
> Thanks for the hint.
>
>> So one could require that /proc/sys/vm/overcommit_memory is set up properly
>> and use MAP_NORESERVE.
>
> Isn't the check for lchunks == 0 essentially exactly this?

I assume paired with MAP_NORESERVE?

Maybe, but it could be better to have something that says "if
overcommit_memory is not set up properly I will SKIP this test, but
otherwise I expect this to work and will FAIL if it doesn't".

Or would you expect to run into lchunks == 0 even if overcommit_memory is
set up properly and MAP_NORESERVE is used? (very very low memory that we
cannot even create all the VMAs?)

>
>> Reading from anonymous memory will populate the shared zeropage. To mitigate
>> OOM from "too many page tables", one could simply unmap the pieces as they
>> are verified (or MAP_FIXED over them, to free page tables).
>
> The code has to figure out if a verified region was created by mmap(),
> otherwise an munmap() could crash the process.
> As the entries from /proc/self/maps may have been merged and (I assume)

Yes, and partial unmap (in chunk granularity?) would split them again.

> the ordering of mappings is not guaranteed, some bespoke logic to establish
> the link will be needed.

My thinking was that you simply process one /proc/self/maps entry in some
chunks. After processing a chunk, you munmap() it.

So you would process + munmap in chunks.

>
> Is it fine to rely on CONFIG_ANON_VMA_NAME?
> That would make it much easier to implement.

Can you elaborate how you would do it?

>
> Using MAP_NORESERVE and eager munmap()s, the testcase works nicely even
> in very low physical memory conditions.

Cool.
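
The "process + munmap in chunks" idea can be sketched as below. `verify_and_unmap()` is a hypothetical name, and the single read per chunk is only a stand-in for whatever validation the test performs. The caller is assumed to pass a range that really belongs to the test's own mappings, since unmapping anything else could crash the process, as pointed out earlier in the thread.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not part of virtual_address_range.c.
 *
 * Walk [start, end) in steps of `chunk` bytes (a multiple of the page
 * size). Each chunk is read once and then unmapped immediately, so the
 * page tables it needed are freed before the next chunk is touched.
 * Returns the number of chunks processed.
 */
static size_t verify_and_unmap(char *start, char *end, size_t chunk)
{
	size_t done = 0;
	char *p = start;

	while (p < end) {
		size_t left = (size_t)(end - p);
		size_t len = left < chunk ? left : chunk;
		volatile char probe = *(volatile char *)p; /* stand-in check */

		(void)probe;
		if (munmap(p, len) != 0)
			break;
		p += len;
		done++;
	}
	return done;
}
```

Because each chunk is released as soon as it has been checked, the peak page-table footprint is bounded by one chunk instead of the whole /proc/self/maps entry.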
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -178,6 +178,12 @@ int main(int argc, char *argv[])
 		validate_addr(ptr[i], 0);
 	}
 	lchunks = i;
+
+	if (!lchunks) {
+		ksft_test_result_skip("Not enough memory for a single chunk\n");
+		ksft_finished();
+	}
+
 	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
 	if (hptr == NULL) {
 		ksft_test_result_skip("Memory constraint not fulfilled\n");
If not enough physical memory is available the kernel may fail mmap();
see __vm_enough_memory() and vm_commit_limit().
In that case the logic in validate_complete_va_space() does not make
sense and will even incorrectly fail.
Instead skip the test if no mmap() succeeded.

Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>

---
The logic in __vm_enough_memory() seems weird.
It describes itself as "Check that a process has enough memory to
allocate a new virtual mapping", however it never checks the current
memory usage of the process.
So it only disallows large mappings. But many small mappings taking the
same amount of memory are allowed; and then even automatically merged
into one big mapping.
---
 tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
 1 file changed, 6 insertions(+)
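
To illustrate the "only disallows large mappings" observation, here is a userspace approximation of the per-request check. It rests on an assumption not stated in the patch: that in heuristic mode (overcommit_memory = 0) recent kernels compare only the size of the single request against total RAM plus total swap. `passes_guess_check()` is a hypothetical name, and the real kernel check works in pages, not bytes.

```c
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper, not kernel code.
 *
 * Approximate the heuristic-mode check from userspace: one request
 * passes if it is no larger than total RAM plus total swap. The current
 * memory usage of the process is deliberately ignored, mirroring the
 * behaviour described in the note above.
 * Returns 1 if the request passes, 0 if not, -1 if sysinfo() fails.
 */
static int passes_guess_check(unsigned long long request_bytes)
{
	struct sysinfo si;
	unsigned long long total;

	if (sysinfo(&si) != 0)
		return -1;
	total = ((unsigned long long)si.totalram + si.totalswap) * si.mem_unit;
	return request_bytes <= total;
}
```

Under this model, a chunk larger than RAM plus swap fails on its own, while any number of smaller chunks each pass, which matches the behaviour the note calls weird.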