| Message ID | 20240220081327.2678-1-dwagner@suse.de (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [blktests,v0] nvme/029: reserve hugepages for larger allocations |
Thanks for the fix.

On Feb 20, 2024 / 09:13, Daniel Wagner wrote:
> The test is issuing larger IO workload. This depends on being able to
> allocate larger chucks of linear memory. nvme-cli used to use libhugetbl

I guess, s/chuck/chunk/
         s/libhugetbl/libhugetlb/

> to automatically allocate the HugeTBL pool. Though nvme-cli dropped the

s/HugeTBL/HugeTLB/

> dependency on the library, thus the test needs to provision the system
> accordingly.
>
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Tested-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>

I found the problem was reported and discussed in a GitHub issue [1]. How
about adding a "Link:" tag for reference?

[1] https://github.com/linux-nvme/nvme-cli/issues/2218

> ---
>  tests/nvme/029 | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tests/nvme/029 b/tests/nvme/029
> index db6e8b91f707..7833fd29e235 100755
> --- a/tests/nvme/029
> +++ b/tests/nvme/029
> @@ -54,6 +54,7 @@ test() {
>  	_setup_nvmet
>
>  	local nvmedev
> +	local reset_nr_hugepages=false
>
>  	_nvmet_target_setup
>
> @@ -62,6 +63,11 @@ test() {
>  	nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
>  	_check_uuid "${nvmedev}"
>
> +	if [[ "$(cat /proc/sys/vm/nr_hugepages)" -eq 0 ]]; then
> +		echo 20 > /proc/sys/vm/nr_hugepages
> +		reset_nr_hugepages=true
> +	fi

I found this change makes the test case fail when the kernel does not have
CONFIG_HUGETLBFS. Without the config, /proc/sys/vm/nr_hugepages does not
exist.

When CONFIG_HUGETLBFS is not defined, should we skip this test case? If this
is the case, we can add "_have_kernel_option HUGETLBFS" in requires(). If
not, we should check existence of /proc/sys/vm/nr_hugepages before touching
it, like:

	if [[ -r /proc/sys/vm/nr_hugepages &&
	      "$(cat /proc/sys/vm/nr_hugepages)" -eq 0 ]]; then

Also I suggest adding an in-line comment to help understand why the
nr_hugepages sysfs needs the change. Something like,

	# nvme-cli may fail to allocate linear memory for rather large IO buffers.
	# Increase nr_hugepages to allow nvme-cli to try the linear memory allocation
	# from HugeTLB pool.

> +
>  	local dev="/dev/${nvmedev}n1"
>  	test_user_io "$dev" 1 512 > "$FULL" 2>&1 || echo FAIL
>  	test_user_io "$dev" 1 511 > "$FULL" 2>&1 || echo FAIL
> @@ -70,6 +76,10 @@ test() {
>  	test_user_io "$dev" 511 1023 > "$FULL" 2>&1 || echo FAIL
>  	test_user_io "$dev" 511 1025 > "$FULL" 2>&1 || echo FAIL
>
> +	if [[ ${reset_nr_hugepages} = true ]]; then
> +		echo 0 > /proc/sys/vm/nr_hugepages
> +	fi
> +
>  	_nvme_disconnect_subsys "${def_subsysnqn}" >> "$FULL" 2>&1
>
>  	_nvmet_target_cleanup
> --
> 2.43.1
>
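The guarded check proposed in the review can be exercised in isolation. The sketch below wraps it in a helper; the function name `reserve_hugepages` and its path/count parameters are illustrative only (they exist so the logic can be run against a scratch file without root) — the real test would operate on `/proc/sys/vm/nr_hugepages` directly and record the "reset" flag in a local variable instead of on stdout.

```shell
#!/bin/bash
# Sketch of the suggested guard: only touch nr_hugepages when the file
# exists (i.e. the kernel has CONFIG_HUGETLBFS) and the pool is empty.
# The path parameter is purely for illustration/testing.
reserve_hugepages() {
	local path="${1:-/proc/sys/vm/nr_hugepages}"
	local want="${2:-20}"

	# nvme-cli may fail to allocate linear memory for rather large IO
	# buffers. Grow the pool so it can fall back to HugeTLB pages.
	if [[ -r "$path" && "$(cat "$path")" -eq 0 ]]; then
		echo "$want" > "$path"
		echo true	# caller records this to restore the old value
	else
		echo false
	fi
}

# Demonstration against a scratch file instead of the real sysctl:
tmp=$(mktemp)
echo 0 > "$tmp"
reserve_hugepages "$tmp" 20	# prints "true"; the file now holds 20
cat "$tmp"			# prints 20
rm -f "$tmp"
```

Because the check uses `-r` before `cat`, a kernel without CONFIG_HUGETLBFS simply skips the reservation instead of failing on a missing file.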
On Wed, Feb 21, 2024 at 06:22:29AM +0000, Shinichiro Kawasaki wrote:
> I found this change makes the test case fail when the kernel does not
> have CONFIG_HUGETLBFS. Without the config, /proc/sys/vm/nr_hugepages
> does not exist.
>
> When CONFIG_HUGETLBFS is not defined, should we skip this test case?

Obviously, we should aim for really solid test cases. Though it is not
guaranteed that the test will pass even with CONFIG_HUGETLBFS enabled. I
suspect we would need to add some more preparation steps so that the
allocation has a high chance to pass. Though I haven't really looked into
what the necessary steps would be. The sysfs exposes a few more knobs to
play with.

> If this is the case, we can add "_have_kernel_option HUGETLBFS" in
> requires(). If not, we should check existence of
> /proc/sys/vm/nr_hugepages before touching it, like:
>
> 	if [[ -r /proc/sys/vm/nr_hugepages &&
> 	      "$(cat /proc/sys/vm/nr_hugepages)" -eq 0 ]]; then

Sure, I'll add this and also fix the typos in the commit message.

> Also I suggest adding an in-line comment to help understand why the
> nr_hugepages sysfs needs the change. Something like,
>
> 	# nvme-cli may fail to allocate linear memory for rather large IO buffers.
> 	# Increase nr_hugepages to allow nvme-cli to try the linear memory allocation
> 	# from HugeTLB pool.

Ok.
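The skip-versus-guard question above boils down to one observable fact: without CONFIG_HUGETLBFS the `nr_hugepages` sysctl file is simply absent. The sketch below is not the blktests `_have_kernel_option` helper (which inspects the kernel config), just a minimal illustration of the decision the gate has to make.

```shell
#!/bin/bash
# Hedged sketch: detect HugeTLB support via the sysctl's presence.
# A kernel built without CONFIG_HUGETLBFS does not create this file.
have_hugetlbfs() {
	[[ -e /proc/sys/vm/nr_hugepages ]]
}

if have_hugetlbfs; then
	echo "hugetlbfs available, nr_hugepages=$(cat /proc/sys/vm/nr_hugepages)"
else
	echo "skip: kernel built without CONFIG_HUGETLBFS"
fi
```

Whether this belongs in `requires()` (skip the whole test) or inline (guard only the reservation) is exactly the open question in the thread; the file-existence probe serves either choice.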
On 2/21/24 16:37, Daniel Wagner wrote:
> On Wed, Feb 21, 2024 at 06:22:29AM +0000, Shinichiro Kawasaki wrote:
>> I found this change makes the test case fail when the kernel does not
>> have CONFIG_HUGETLBFS. Without the config, /proc/sys/vm/nr_hugepages
>> does not exist.
>>
>> When CONFIG_HUGETLBFS is not defined, should we skip this test case?
>
> Obviously, we should aim for really solid test cases. Though it is not
> guaranteed that the test will pass even with CONFIG_HUGETLBFS enabled.
> I suspect we would need to add some more preparation steps so that the
> allocation has a high chance to pass. Though I haven't really looked
> into what the necessary steps would be. The sysfs exposes a few more
> knobs to play with.

"echo 3 > /proc/sys/vm/drop_caches" before mounting hugetlbfs should allow
for the big pages to fit... Still no guarantees, but likely that will lower
the setup failure frequency.

>> If this is the case, we can add "_have_kernel_option HUGETLBFS" in
>> requires(). If not, we should check existence of
>> /proc/sys/vm/nr_hugepages before touching it, like:
>>
>> 	if [[ -r /proc/sys/vm/nr_hugepages &&
>> 	      "$(cat /proc/sys/vm/nr_hugepages)" -eq 0 ]]; then
>
> Sure, I'll add this and also fix the typos in the commit message.
>
>> Also I suggest adding an in-line comment to help understand why the
>> nr_hugepages sysfs needs the change. Something like,
>>
>> 	# nvme-cli may fail to allocate linear memory for rather large IO buffers.
>> 	# Increase nr_hugepages to allow nvme-cli to try the linear memory allocation
>> 	# from HugeTLB pool.
>
> Ok.
>
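The drop_caches suggestion combines naturally with the guarded reservation. The sketch below shows the two steps in order; the sysctl directory is a parameter only so the flow can be demonstrated (and tested) without root or side effects — the real test would use `/proc/sys/vm` directly, and the function name `prepare_hugetlb_pool` is hypothetical.

```shell
#!/bin/bash
# Sketch of a more careful setup: flush caches first (per the review),
# then reserve a small HugeTLB pool. Parameterized for demonstration.
prepare_hugetlb_pool() {
	local vmdir="${1:-/proc/sys/vm}"

	# Drop page/dentry/inode caches so enough contiguous memory is
	# likely available for the hugepage reservation. No guarantee,
	# but it lowers the setup failure frequency.
	[[ -w "$vmdir/drop_caches" ]] && echo 3 > "$vmdir/drop_caches"

	if [[ -r "$vmdir/nr_hugepages" &&
	      "$(cat "$vmdir/nr_hugepages")" -eq 0 ]]; then
		echo 20 > "$vmdir/nr_hugepages"
	fi
}

# Demonstration against a scratch directory:
d=$(mktemp -d)
echo 0 > "$d/drop_caches"
echo 0 > "$d/nr_hugepages"
prepare_hugetlb_pool "$d"
cat "$d/drop_caches" "$d/nr_hugepages"	# prints 3, then 20
rm -rf "$d"
```

Both writes are guarded by file tests, so on kernels without CONFIG_HUGETLBFS (or when run unprivileged) the function degrades to a no-op rather than failing.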
diff --git a/tests/nvme/029 b/tests/nvme/029
index db6e8b91f707..7833fd29e235 100755
--- a/tests/nvme/029
+++ b/tests/nvme/029
@@ -54,6 +54,7 @@ test() {
 	_setup_nvmet

 	local nvmedev
+	local reset_nr_hugepages=false

 	_nvmet_target_setup

@@ -62,6 +63,11 @@ test() {
 	nvmedev=$(_find_nvme_dev "${def_subsysnqn}")
 	_check_uuid "${nvmedev}"

+	if [[ "$(cat /proc/sys/vm/nr_hugepages)" -eq 0 ]]; then
+		echo 20 > /proc/sys/vm/nr_hugepages
+		reset_nr_hugepages=true
+	fi
+
 	local dev="/dev/${nvmedev}n1"
 	test_user_io "$dev" 1 512 > "$FULL" 2>&1 || echo FAIL
 	test_user_io "$dev" 1 511 > "$FULL" 2>&1 || echo FAIL
@@ -70,6 +76,10 @@ test() {
 	test_user_io "$dev" 511 1023 > "$FULL" 2>&1 || echo FAIL
 	test_user_io "$dev" 511 1025 > "$FULL" 2>&1 || echo FAIL

+	if [[ ${reset_nr_hugepages} = true ]]; then
+		echo 0 > /proc/sys/vm/nr_hugepages
+	fi
+
 	_nvme_disconnect_subsys "${def_subsysnqn}" >> "$FULL" 2>&1

 	_nvmet_target_cleanup