Message ID | 20240814071424.2655666-1-jeffxu@chromium.org (mailing list archive) |
---|---|
Series | mremap refactor: check src address for vma boundaries first. |
* jeffxu@chromium.org <jeffxu@chromium.org> [240814 03:14]:
> From: Jeff Xu <jeffxu@chromium.org>
>
> mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> refactor the code to check src address range before doing anything on
> the destination, i.e. destination won't be unmapped, if src address
> failed the boundaries check.
>
> This also allows us to remove can_modify_mm from mremap.c, since
> the src address must be single VMA, can_modify_vma is used.

I don't think sending out a separate patch to address the same thing as
the patch you said you were testing [1] is the correct approach. You
had already sent suggestions on mremap changes - why send this patch set
instead of making another suggestion?

Maybe send your selftest to be included with the initial patch set would
work? Does this test pass with the other patch set?

[1] https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuRJ-g@mail.mail.com/
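The VMA-boundary rule the cover letter refers to can be observed from user space. The following is a minimal sketch, not taken from the patch or its selftest: it assumes a Linux system, calls libc through `ctypes`, and uses a hypothetical helper name `expand_across_vma_boundary`. It maps two pages, splits them into two VMAs with `mprotect`, and then asks `mremap` to grow the whole range. The mremap(2) man page documents `EFAULT` for a range covered by mappings of different types, so that is the error this sketch expects, but behaviour may differ across kernels.

```python
import ctypes
import errno
import mmap
import sys

MREMAP_MAYMOVE = 1                      # value from <linux/mman.h>
MAP_FAILED = ctypes.c_void_p(-1).value  # (void *)-1 as an unsigned integer


def expand_across_vma_boundary():
    """Try to grow a two-page range that spans two VMAs.

    Returns the errno reported by mremap(2), 0 if the call unexpectedly
    succeeds, or None on platforms without mremap(2).
    """
    if not sys.platform.startswith("linux"):
        return None

    libc = ctypes.CDLL(None, use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.mprotect.restype = ctypes.c_int
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mremap.restype = ctypes.c_void_p
    libc.mremap.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                            ctypes.c_size_t, ctypes.c_int]
    libc.munmap.restype = ctypes.c_int
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    page = mmap.PAGESIZE
    addr = libc.mmap(None, 2 * page, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")

    # Changing the protection of the second page splits the mapping into
    # two VMAs, so [addr, addr + 2 * page) now crosses a VMA boundary.
    if libc.mprotect(addr + page, page, mmap.PROT_READ) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")

    new_addr = libc.mremap(addr, 2 * page, 4 * page, MREMAP_MAYMOVE)
    if new_addr == MAP_FAILED:
        err = ctypes.get_errno()
        libc.munmap(addr, 2 * page)
        return err
    libc.munmap(new_addr, 4 * page)
    return 0


result = expand_across_vma_boundary()
if result is None:
    print("mremap(2) is not available on this platform")
else:
    print("mremap errno:", errno.errorcode.get(result, result))
```

The sketch only exercises the expand case; it says nothing about sealed mappings, which would need the mseal syscall discussed in this thread.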
On Wed, Aug 14, 2024 at 7:40 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * jeffxu@chromium.org <jeffxu@chromium.org> [240814 03:14]:
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > refactor the code to check src address range before doing anything on
> > the destination, i.e. destination won't be unmapped, if src address
> > failed the boundaries check.
> >
> > This also allows us to remove can_modify_mm from mremap.c, since
> > the src address must be single VMA, can_modify_vma is used.
>
> I don't think sending out a separate patch to address the same thing as
> the patch you said you were testing [1] is the correct approach. You
> had already sent suggestions on mremap changes - why send this patch set
> instead of making another suggestion?
>
As indicated in the cover letter, this patch aims to improve mremap
performance while preserving existing mseal's semantics. And this
patch can go independently regardless of in-loop out-loop discussion.

[1] link in your email is broken, but I assume you meant Pedro's V1/V2
of in-loop change. In-loop change has a semantic/regression risk to
mseal, and will take longer time to review/test/prove and bake.

We can leave in-loop discussion in Pedro's thread, I hope the V3 of
Pedro's patch adds more testing coverage and addresses existing
comments in V2.

Thanks
-Jeff
* Jeff Xu <jeffxu@chromium.org> [240814 12:57]:
> On Wed, Aug 14, 2024 at 7:40 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * jeffxu@chromium.org <jeffxu@chromium.org> [240814 03:14]:
> > > From: Jeff Xu <jeffxu@chromium.org>
> > >
> > > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > > refactor the code to check src address range before doing anything on
> > > the destination, i.e. destination won't be unmapped, if src address
> > > failed the boundaries check.
> > >
> > > This also allows us to remove can_modify_mm from mremap.c, since
> > > the src address must be single VMA, can_modify_vma is used.
> >
> > I don't think sending out a separate patch to address the same thing as
> > the patch you said you were testing [1] is the correct approach. You
> > had already sent suggestions on mremap changes - why send this patch set
> > instead of making another suggestion?
> >
> As indicated in the cover letter, this patch aims to improve mremap
> performance while preserving existing mseal's semantics.

They are not worth preserving.

> And this
> patch can go independently regardless of in-loop out-loop discussion.

No, it conflicts with the other mremap patch as it changes the same code
- in a very similar way.

>
> [1] link in your email is broken, but I assume you meant Pedro's V1/V2
> of in-loop change.

Yes, the email where you delayed discussing the fix so that you could
test it. Which brings up the question you didn't answer and deleted:

Does your testing pass on those patches?

> In-loop change has a semantic/regression risk to
> mseal, and will take longer time to review/test/prove and bake.

There are no uses, so the risk is minimal.

> We can leave in-loop discussion in Pedro's thread,

No, it is directly linked to these patches as this should have just been
a comment on a patch in that series.

> I hope the V3 of
> Pedro's patch adds more testing coverage and addresses existing
> comments in V2.

The majority of the comments to V2 are mine, you only told us that
splitting a sealed vma is wrong (after I asked you directly to answer)
and then you made a comment about testing of the patch set. Besides the
direct responses to me, your comment was "wait for me to test".

You are holding us hostage by asking for more testing but not sharing
what is and is not valid for mseal() - or even answering questions on
tests you run.

Splitting a vma doesn't change the memory, but that's not allowed for
some reason.

These patches should be rejected in favour of fixing the feature like it
should have been written in the first place. Anything less is just to
simplify backports and avoiding testing - "avoiding the business logic".

Liam

[1] https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7Cv9A@mail.gmail.com/
On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> The majority of the comments to V2 are mine, you only told us that
> splitting a sealed vma is wrong (after I asked you directly to answer)
> and then you made a comment about testing of the patch set. Besides the
> direct responses to me, your comment was "wait for me to test".
>
Please share this link for " Besides the direct responses to me, your
comment was "wait for me to test".
Or pop up that email by responding to it, to remind me. Thanks.

> You are holding us hostage by asking for more testing but not sharing
> what is and is not valid for mseal() - or even answering questions on
> tests you run.
https://docs.kernel.org/process/submitting-patches.html#don-t-get-discouraged-or-impatient

> These patches should be rejected in favour of fixing the feature like it
> should have been written in the first place.
This is not true.

Without removing arch_unmap, it is impossible to implement in-loop.
And I have mentioned this during initial discussion of mseal patch, as
well as when Pedro expressed the interest on in-loop approach. If you
like reference, I can find the links for you.

I'm glad that arch_unmap is removed now and resulting in much cleaner
code, it has always been a question/mystery to me ever since I read
that code. Thanks to Linus's leadership and Michael Ellerman's quick
response, this is now resolved.

Best regards,
-Jeff
* Jeff Xu <jeffxu@chromium.org> [240814 23:46]:
> On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett
> <Liam.Howlett@oracle.com> wrote:
> > The majority of the comments to V2 are mine, you only told us that
> > splitting a sealed vma is wrong (after I asked you directly to answer)
> > and then you made a comment about testing of the patch set. Besides the
> > direct responses to me, your comment was "wait for me to test".
> >
> Please share this link for " Besides the direct responses to me, your
> comment was "wait for me to test".
> Or pop up that email by responding to it, to remind me. Thanks.

[1].

>
> > You are holding us hostage by asking for more testing but not sharing
> > what is and is not valid for mseal() - or even answering questions on
> > tests you run.
> https://docs.kernel.org/process/submitting-patches.html#don-t-get-discouraged-or-impatient

If you are implying that I'm impatient, I can assure you that is not the
feeling driving these emails. You are just trying to push a patch
through that changes the exact code that you said you would test but
didn't say how, and you said the testing of another patch was
insufficient but didn't say why. Then you send out this fix.

>
> > These patches should be rejected in favour of fixing the feature like it
> > should have been written in the first place.
> This is not true.

Yes, it is.

>
> Without removing arch_unmap, it is impossible to implement in-loop.

arch_unmap() is going away, besides.. arch_unmap() could fail today and
leave the ppc vdso pointing to NULL, mseal() would introduce an even less
likely case of this happening.

I asked you about this in v10 [2]. I elaborated in my response, but I
doubt you got that far in the email.

> And I have mentioned this during initial discussion of mseal patch, as
> well as when Pedro expressed the interest on in-loop approach. If you
> like reference, I can find the links for you.

So the main concern is that ppc is going to mseal the vdso, then fail to
unmap it?

It would have been better to put a check in the arch_unmap() code in
ppc to avoid that - but it will never happen.

>
> I'm glad that arch_unmap is removed now and resulting in much cleaner
> code,

If you care at all about cleaner code, please move the mseal check to
where it should be - or stop getting in the way of others moving it.

> it has always been a question/mystery to me ever since I read
> that code.

You could have also looked into what arch_unmap() did, or why it was
where it is today. If you had, you would have found that arch_unmap()
could be moved lower in the function and allowed in-loop approach - but
you didn't bother to find out what it was about.

Liam

...

[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuRJ-g@mail.gmail.com/
[2]. https://lore.kernel.org/lkml/3rpmzsxiwo5t2uq7xy5inizbtaasotjtzocxbayw5ntgk5a2rx@jkccjg5mbqqh/
On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@chromium.org> [240814 23:46]:
> > On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett
> > <Liam.Howlett@oracle.com> wrote:
> > > The majority of the comments to V2 are mine, you only told us that
> > > splitting a sealed vma is wrong (after I asked you directly to answer)
> > > and then you made a comment about testing of the patch set. Besides the
> > > direct responses to me, your comment was "wait for me to test".
> > >
> > Please share this link for " Besides the direct responses to me, your
> > comment was "wait for me to test".
> > Or pop up that email by responding to it, to remind me. Thanks.
>
> [1].

That is responding to Andrew, to indicate V2 patch has dependency on
arch_munmap in PPC. And I will review/test the code, I will respond to
Andrew directly.

PS Your statement above is entirely false, and out of context.

" You only told us that splitting a sealed vma is wrong (after I asked
you directly to answer) and then you made a comment about testing of
the patch set. Besides the direct responses to me, your comment was
"wait for me to test".

If you will excuse me, I would rather spend time on code/test and
other duties than responding to your false accusation.

Best regards,
-Jeff

>
> Liam
>
> ...
>
> [1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuRJ-g@mail.gmail.com/
> [2]. https://lore.kernel.org/lkml/3rpmzsxiwo5t2uq7xy5inizbtaasotjtzocxbayw5ntgk5a2rx@jkccjg5mbqqh/
On Wed, Aug 14, 2024 at 12:14 AM <jeffxu@chromium.org> wrote:
>
> From: Jeff Xu <jeffxu@chromium.org>
>
> mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> refactor the code to check src address range before doing anything on
> the destination, i.e. destination won't be unmapped, if src address
> failed the boundaries check.
>
> This also allows us to remove can_modify_mm from mremap.c, since
> the src address must be single VMA, can_modify_vma is used.
>
> It is likely this will improve the performance on mremap, previously
> the code does sealing check using can_modify_mm for the src address range,
> and the new code removed the loop (used by can_modify_mm).
>
> In order to verify this patch doesn't regress on mremap, I added tests in
> mseal_test, the test patch can be applied before mremap refactor patch or
> checkin independently.
>
> Also this patch doesn't change mseal's existing semantics: if sealing fail,
> user can expect the src/dst address isn't updated. So this patch can be
> applied regardless if we decided to go with current out-of-loop approach
> or in-loop approach currently in discussion.
>
> Regarding the perf test report by stress-ng [1] title:
> 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
>
> The test is using below for testing:
> stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
>
> I can't repro this using ChromeOS, the pagemove test shows large value
> of stddev and stderr, and can't reasonably reflect the performance impact.
>
> For example: I write a c program [2] to run the above pagemove test 10 times
> and calculate the stddev, stderr, for 3 commits:
>
> 1> before mseal feature is added:
> Ops/sec:
>   Mean    : 3564.40
>   Std Dev : 2737.35 (76.80% of Mean)
>   Std Err : 865.63 (24.29% of Mean)
>
> 2> after mseal feature is added:
> Ops/sec:
>   Mean    : 2703.84
>   Std Dev : 2085.13 (77.12% of Mean)
>   Std Err : 659.38 (24.39% of Mean)
>
> 3> after current patch (mremap refactor)
> Ops/sec:
>   Mean    : 3603.67
>   Std Dev : 2422.22 (67.22% of Mean)
>   Std Err : 765.97 (21.26% of Mean)
>
> The result shows 21%-24% stderr, this means whatever perf improvement/impact
> there might be won't be measured correctly by this test.
>
> This test machine has 32G memory, Intel(R) Celeron(R) 7305, 5 CPU.
> And I reboot the machine before each test, and take the first 10 runs with
> run_stress_ng 10
>
> (I will run longer duration to see if test still shows large stdDev,StdErr)
>
I took more samples (100 run ), the stddev/stderr is smaller, however
still not at a range that can reasonably measure the perf improvement
here.

The tests were taken using the same machine as (10 times run above)
and exact the same steps: i.e. change to certain kernel commit, reboot
test device, take the first test result.

1> Before mseal feature is added:
Statistics:
Ops/sec:
  Mean    : 1733.26
  Std Dev : 842.13 (48.59% of Mean)
  Std Err : 84.21 (4.86% of Mean)

2> After mseal feature is added
Statistics:
Ops/sec:
  Mean    : 1701.53
  Std Dev : 1017.29 (59.79% of Mean)
  Std Err : 101.73 (5.98% of Mean)

3> After mremap refactor (this patch)
Statistics:
Ops/sec:
  Mean    : 1097.04
  Std Dev : 860.67 (78.45% of Mean)
  Std Err : 86.07 (7.85% of Mean)

Summary: even when the stderr is down to the 4%-8% range, the
stddev is still too big.

Hence, there are other unknown, random variables that impact this test.

-Jeff

> [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
>
>
> Jeff Xu (2):
>   mseal:selftest mremap across VMA boundaries.
>   mseal: refactor mremap to remove can_modify_mm
>
>  mm/internal.h                           |  24 ++
>  mm/mremap.c                             |  77 +++----
>  mm/mseal.c                              |  17 --
>  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
>  4 files changed, 353 insertions(+), 58 deletions(-)
>
> --
> 2.46.0.76.ge559c4bf1a-goog
>
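The "Std Err" figures in this message are consistent with the usual definition, standard error = standard deviation / sqrt(number of runs). The sketch below is an illustration only: it is not the `run_stress_ng.c` program linked above, the helper names `summarize` and `stderr_from_stddev` are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the thread does not say which estimator was used.

```python
import math
import statistics


def summarize(samples):
    """Return (mean, stddev, stderr) for a list of ops/sec samples."""
    mean = statistics.fmean(samples)
    stddev = statistics.stdev(samples)         # sample stddev (n - 1), an assumption
    stderr = stddev / math.sqrt(len(samples))  # standard error of the mean
    return mean, stddev, stderr


def stderr_from_stddev(stddev, runs):
    """Standard error implied by a reported stddev and run count."""
    return stddev / math.sqrt(runs)


# (stddev, runs, stderr) exactly as reported in the message above.
REPORTED = [
    (2737.35, 10, 865.63),
    (2085.13, 10, 659.38),
    (2422.22, 10, 765.97),
    (842.13, 100, 84.21),
    (1017.29, 100, 101.73),
    (860.67, 100, 86.07),
]

for stddev, runs, reported_stderr in REPORTED:
    implied = stderr_from_stddev(stddev, runs)
    print(f"runs={runs:3d} stddev={stddev:8.2f} "
          f"implied stderr={implied:7.2f} reported={reported_stderr:7.2f}")
```

Going from 10 to 100 runs divides the standard error by sqrt(10), but it does nothing to the spread of individual runs, which is why the stddev stays at roughly half of the mean or more.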
* Jeff Xu <jeffxu@google.com> [240815 13:23]:
> On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Jeff Xu <jeffxu@chromium.org> [240814 23:46]:
> > > On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett
> > > <Liam.Howlett@oracle.com> wrote:
> > > > The majority of the comments to V2 are mine, you only told us that
> > > > splitting a sealed vma is wrong (after I asked you directly to answer)
> > > > and then you made a comment about testing of the patch set. Besides the
> > > > direct responses to me, your comment was "wait for me to test".
> > > >
> > > Please share this link for " Besides the direct responses to me, your
> > > comment was "wait for me to test".
> > > Or pop up that email by responding to it, to remind me. Thanks.
> >
> > [1].
>
> That is responding to Andrew, to indicate V2 patch has dependency on
> arch_munmap in PPC. And I will review/test the code, I will respond to
> Andrew directly.
>
> PS Your statement above is entirely false, and out of context.
>
> " You only told us that splitting a sealed vma is wrong (after I asked
> you directly to answer) and then you made a comment about testing of
> the patch set. Besides the direct responses to me, your comment was
> "wait for me to test".

[1] has your "wait for me to test" to hold up a patch set, [2] has you
answering my direct question to you and making the untested comment to
someone else.

So, entirely true.

Liam

[1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuRJ-g@mail.gmail.com/
[2]. https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7Cv9A@mail.gmail.com/
Hi Oliver,

On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Wed, Aug 14, 2024 at 12:14 AM <jeffxu@chromium.org> wrote:
> >
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > refactor the code to check src address range before doing anything on
> > the destination, i.e. destination won't be unmapped, if src address
> > failed the boundaries check.
> >
> > This also allows us to remove can_modify_mm from mremap.c, since
> > the src address must be single VMA, can_modify_vma is used.
> >
> > It is likely this will improve the performance on mremap, previously
> > the code does sealing check using can_modify_mm for the src address range,
> > and the new code removed the loop (used by can_modify_mm).
> >
> > In order to verify this patch doesn't regress on mremap, I added tests in
> > mseal_test, the test patch can be applied before mremap refactor patch or
> > checkin independently.
> >
> > Also this patch doesn't change mseal's existing semantics: if sealing fail,
> > user can expect the src/dst address isn't updated. So this patch can be
> > applied regardless if we decided to go with current out-of-loop approach
> > or in-loop approach currently in discussion.
> >
> > Regarding the perf test report by stress-ng [1] title:
> > 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
> >
> > The test is using below for testing:
> > stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
> >
> > I can't repro this using ChromeOS, the pagemove test shows large value
> > of stddev and stderr, and can't reasonably reflect the performance impact.
>
I could not repro the 4% degradation with my test machine
(Chromebook), this can be entirely due to the specific test and this
test machine.

Do you think it is possible to do a few more tests ? This time I like
to have a larger sample size (100 run)

stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64

Please run the test for each commit following the exact steps, e.g.
reboot the machine, run the test, get the first 100 results for
sample. Please don't select or drop any unstable report because then
the data will be biased. If possible, please include stddev and
stderr for the data (or raw data if not possible, and I will do
post-processing)

for 3 commits:
-> this patch.
-> after mseal feature
-> before mseal feature

Thank you for your time and assistance in helping me on understanding
this issue.

Best regards,
-Jeff

> -Jeff
>
> > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> >
> >
> > Jeff Xu (2):
> >   mseal:selftest mremap across VMA boundaries.
> >   mseal: refactor mremap to remove can_modify_mm
> >
> >  mm/internal.h                           |  24 ++
> >  mm/mremap.c                             |  77 +++----
> >  mm/mseal.c                              |  17 --
> >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> >  4 files changed, 353 insertions(+), 58 deletions(-)
> >
> > --
> > 2.46.0.76.ge559c4bf1a-goog
> >
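One way to judge whether a larger sample can resolve a 4.4% change is to compare the difference of two means with the uncertainty of that difference. The sketch below is a rough illustration, not an analysis anyone in the thread performed: it reuses the 100-run means and standard errors reported earlier for "before mseal" and "after mseal", and it assumes independent runs and a normal approximation (z = 1.96). The function name `difference_with_margin` is hypothetical.

```python
import math


def difference_with_margin(mean_a, stderr_a, mean_b, stderr_b, z=1.96):
    """Return (mean_b - mean_a, approximate 95% margin of that difference)."""
    diff = mean_b - mean_a
    stderr_diff = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    return diff, z * stderr_diff


# 100-run figures reported earlier in the thread: (mean, stderr) in ops/sec.
BEFORE_MSEAL = (1733.26, 84.21)
AFTER_MSEAL = (1701.53, 101.73)

diff, margin = difference_with_margin(*BEFORE_MSEAL, *AFTER_MSEAL)
target = 0.044 * BEFORE_MSEAL[0]  # size of a 4.4% change at this mean

print(f"observed difference : {diff:8.2f} ops/sec")
print(f"95% margin          : {margin:8.2f} ops/sec")
print(f"4.4% of the mean    : {target:8.2f} ops/sec")
```

Under these assumptions the margin is several times larger than a 4.4% change, which matches the remark that this setup cannot measure an effect of that size.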
On Thu, Aug 15, 2024 at 1:14 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
>
> * Jeff Xu <jeffxu@google.com> [240815 13:23]:
> > On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > >
> > > * Jeff Xu <jeffxu@chromium.org> [240814 23:46]:
> > > > On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett
> > > > <Liam.Howlett@oracle.com> wrote:
> > > > > The majority of the comments to V2 are mine, you only told us that
> > > > > splitting a sealed vma is wrong (after I asked you directly to answer)
> > > > > and then you made a comment about testing of the patch set. Besides the
> > > > > direct responses to me, your comment was "wait for me to test".
> > > > >
> > > > Please share this link for " Besides the direct responses to me, your
> > > > comment was "wait for me to test".
> > > > Or pop up that email by responding to it, to remind me. Thanks.
> > >
> > > [1].
> >
> > That is responding to Andrew, to indicate V2 patch has dependency on
> > arch_munmap in PPC. And I will review/test the code, I will respond to
> > Andrew directly.
> >
> > PS Your statement above is entirely false, and out of context.
> >
> > " You only told us that splitting a sealed vma is wrong (after I asked
> > you directly to answer) and then you made a comment about testing of
> > the patch set. Besides the direct responses to me, your comment was
> > "wait for me to test".
>
> [1] has your "wait for me to test" to hold up a patch set, [2] has you
> answering my direct question to you and making the untested comment to
> someone else.
>
This is the last time that I'm trying to clarify this.
[1] is my response to Andrew and Pedro.
[2] is my comments about V2 lack of test , i.e. no selftest change, no
extra tests added.

-Jeff

> So, entirely true.
>
> Liam
>
> [1]. https://lore.kernel.org/all/CALmYWFs0v07z5vheDt1h3hD+3--yr6Va0ZuQeaATo+-8MuRJ-g@mail.gmail.com/
> [2]. https://lore.kernel.org/all/CALmYWFvURJBgyFw7x5qrL4CqoZjy92NeFAS750XaLxO7o7Cv9A@mail.gmail.com/
* Jeff Xu <jeffxu@chromium.org> [240815 16:23]:
> On Thu, Aug 15, 2024 at 1:14 PM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> >
> > * Jeff Xu <jeffxu@google.com> [240815 13:23]:
> > > On Thu, Aug 15, 2024 at 9:50 AM Liam R. Howlett <Liam.Howlett@oracle.com> wrote:
> > > >
> > > > * Jeff Xu <jeffxu@chromium.org> [240814 23:46]:
> > > > > On Wed, Aug 14, 2024 at 12:55 PM Liam R. Howlett
> > > > > <Liam.Howlett@oracle.com> wrote:
> > > > > > The majority of the comments to V2 are mine, you only told us that
> > > > > > splitting a sealed vma is wrong (after I asked you directly to answer)
> > > > > > and then you made a comment about testing of the patch set. Besides the
> > > > > > direct responses to me, your comment was "wait for me to test".
> > > > > >
> > > > > Please share this link for " Besides the direct responses to me, your
> > > > > comment was "wait for me to test".
> > > > > Or pop up that email by responding to it, to remind me. Thanks.
> > > >
> > > > [1].
> > >
> > > That is responding to Andrew, to indicate V2 patch has dependency on
> > > arch_munmap in PPC. And I will review/test the code, I will respond to
> > > Andrew directly.
> > >
> > > PS Your statement above is entirely false, and out of context.
> > >
> > > " You only told us that splitting a sealed vma is wrong (after I asked
> > > you directly to answer) and then you made a comment about testing of
> > > the patch set. Besides the direct responses to me, your comment was
> > > "wait for me to test".
> >
> > [1] has your "wait for me to test" to hold up a patch set, [2] has you
> > answering my direct question to you and making the untested comment to
> > someone else.
> >
> This is the last time that I'm trying to clarify this.
> [1] is my response to Andrew and Pedro.

That doesn't change what you said, or what you are doing.

> [2] is my comments about V2 lack of test , i.e. no selftest change, no
> extra tests added.

But they pass the tests that exist.

Maybe you should take a step back, and look at both solutions. There is
a competing set of patches that fixes the same problem in a similar way
that was sent out before these patches, and those patches address the
entire problem with the mseal() approach.

Instead of helping make the complete solution work as you think it
should, you are making the design problem worse and can't seem to verify
your patches actually fix the regression.

Liam
hi, Jeff,

On Thu, Aug 15, 2024 at 01:19:06PM -0700, Jeff Xu wrote:
> Hi Oliver,
>
> On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu <jeffxu@chromium.org> wrote:
> >
> > On Wed, Aug 14, 2024 at 12:14 AM <jeffxu@chromium.org> wrote:
> > >
> > > From: Jeff Xu <jeffxu@chromium.org>
> > >
> > > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > > refactor the code to check src address range before doing anything on
> > > the destination, i.e. destination won't be unmapped, if src address
> > > failed the boundaries check.
> > >
> > > This also allows us to remove can_modify_mm from mremap.c, since
> > > the src address must be single VMA, can_modify_vma is used.
> > >
> > > It is likely this will improve the performance on mremap, previously
> > > the code does sealing check using can_modify_mm for the src address range,
> > > and the new code removed the loop (used by can_modify_mm).
> > >
> > > In order to verify this patch doesn't regress on mremap, I added tests in
> > > mseal_test, the test patch can be applied before mremap refactor patch or
> > > checkin independently.
> > >
> > > Also this patch doesn't change mseal's existing semantics: if sealing fail,
> > > user can expect the src/dst address isn't updated. So this patch can be
> > > applied regardless if we decided to go with current out-of-loop approach
> > > or in-loop approach currently in discussion.
> > >
> > > Regarding the perf test report by stress-ng [1] title:
> > > 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
> > >
> > > The test is using below for testing:
> > > stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
> > >
> > > I can't repro this using ChromeOS, the pagemove test shows large value
> > > of stddev and stderr, and can't reasonably reflect the performance impact.
> >
> > 1> Before mseal feature is added:
> > Statistics:
> > Ops/sec:
> >   Mean    : 1733.26
> >   Std Dev : 842.13 (48.59% of Mean)
> >   Std Err : 84.21 (4.86% of Mean)
> >
> > 2> After mseal feature is added
> > Statistics:
> > Ops/sec:
> >   Mean    : 1701.53
> >   Std Dev : 1017.29 (59.79% of Mean)
> >   Std Err : 101.73 (5.98% of Mean)
> >
> > 3> After mremap refactor (this patch)
> > Statistics:
> > Ops/sec:
> >   Mean    : 1097.04
> >   Std Dev : 860.67 (78.45% of Mean)
> >   Std Err : 86.07 (7.85% of Mean)
> >
> > Summary: even when the stderr is down to the 4%-8% range, the
> > stddev is still too big.
> >
> > Hence, there are other unknown, random variables that impact this test.
> >
> I could not repro the 4% degradation with my test machine
> (Chromebook), this can be entirely due to the specific test and this
> test machine.
>
> Do you think it is possible to do a few more tests ? This time I like
> to have a larger sample size (100 run)
>
> stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
>
> Please run the test for each commit following the exact steps, e.g.
> reboot the machine, run the test, get the first 100 results for
> sample. Please don't select or drop any unstable report because then
> the data will be biased. If possible, please include stddev and
> stderr for the data (or raw data if not possible, and I will do
> post-processing)
>
> for 3 commits:
> -> this patch.

what's the base of it? could I directly apply this patch upon the commit
what you said "after mseal feature" as below?

> -> after mseal feature
> -> before mseal feature

could you explicitly point to two commit-id?

>
> Thank you for your time and assistance in helping me on understanding
> this issue.

due to resource constraint, please expect that we need several days to
finish this test request.
Hi Oliver

On Thu, Aug 15, 2024 at 7:39 PM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Jeff,
>
> On Thu, Aug 15, 2024 at 01:19:06PM -0700, Jeff Xu wrote:
> > Hi Oliver,
> >
> > On Thu, Aug 15, 2024 at 11:16 AM Jeff Xu <jeffxu@chromium.org> wrote:
> > >
> > > On Wed, Aug 14, 2024 at 12:14 AM <jeffxu@chromium.org> wrote:
> > > >
> > > > From: Jeff Xu <jeffxu@chromium.org>
> > > >
> > > > mremap doesn't allow relocate, expand, shrink across VMA boundaries,
> > > > refactor the code to check src address range before doing anything on
> > > > the destination, i.e. destination won't be unmapped, if src address
> > > > failed the boundaries check.
> > > >
> > > > This also allows us to remove can_modify_mm from mremap.c, since
> > > > the src address must be single VMA, can_modify_vma is used.
> > > >
> > > > It is likely this will improve the performance on mremap, previously
> > > > the code does sealing check using can_modify_mm for the src address range,
> > > > and the new code removed the loop (used by can_modify_mm).
> > > >
> > > > In order to verify this patch doesn't regress on mremap, I added tests in
> > > > mseal_test, the test patch can be applied before mremap refactor patch or
> > > > checkin independently.
> > > >
> > > > Also this patch doesn't change mseal's existing semantics: if sealing fail,
> > > > user can expect the src/dst address isn't updated. So this patch can be
> > > > applied regardless if we decided to go with current out-of-loop approach
> > > > or in-loop approach currently in discussion.
> > > >
> > > > Regarding the perf test report by stress-ng [1] title:
> > > > 8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression
> > > >
> > > > The test is using below for testing:
> > > > stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
> > > >
> > > > I can't repro this using ChromeOS, the pagemove test shows large value
> > > > of stddev and stderr, and can't reasonably reflect the performance impact.
> > > >
> > > > For example: I write a C program [2] to run the above pagemove test 10 times
> > > > and calculate the stddev, stderr, for 3 commits:
> > > >
> > > > 1> before mseal feature is added:
> > > > Ops/sec:
> > > >   Mean     : 3564.40
> > > >   Std Dev  : 2737.35 (76.80% of Mean)
> > > >   Std Err  : 865.63 (24.29% of Mean)
> > > >
> > > > 2> after mseal feature is added:
> > > > Ops/sec:
> > > >   Mean     : 2703.84
> > > >   Std Dev  : 2085.13 (77.12% of Mean)
> > > >   Std Err  : 659.38 (24.39% of Mean)
> > > >
> > > > 3> after current patch (mremap refactor)
> > > > Ops/sec:
> > > >   Mean     : 3603.67
> > > >   Std Dev  : 2422.22 (67.22% of Mean)
> > > >   Std Err  : 765.97 (21.26% of Mean)
> > > >
> > > > The result shows 21%-24% stderr, this means whatever perf improvement/impact
> > > > there might be won't be measured correctly by this test.
> > > >
> > > > This test machine has 32G memory, Intel(R) Celeron(R) 7305, 5 CPU.
> > > > And I reboot the machine before each test, and take the first 10 runs with
> > > > run_stress_ng 10
> > > >
> > > > (I will run longer duration to see if test still shows large stdDev,StdErr)
> > > >
> > > I took more samples (100 runs), the stddev/stderr is smaller, however
> > > still not at a range that can reasonably measure the perf improvement
> > > here.
> > >
> > > The tests were taken using the same machine as (10 times run above)
> > > and exactly the same steps: i.e. change to certain kernel commit, reboot
> > > test device, take the first test result.
> > >
> > > 1> Before mseal feature is added:
> > > Statistics:
> > > Ops/sec:
> > >   Mean     : 1733.26
> > >   Std Dev  : 842.13 (48.59% of Mean)
> > >   Std Err  : 84.21 (4.86% of Mean)
> > >
> > > 2> After mseal feature is added
> > > Statistics:
> > > Ops/sec:
> > >   Mean     : 1701.53
> > >   Std Dev  : 1017.29 (59.79% of Mean)
> > >   Std Err  : 101.73 (5.98% of Mean)
> > >
> > > 3> After mremap refactor (this patch)
> > > Statistics:
> > > Ops/sec:
> > >   Mean     : 1097.04
> > >   Std Dev  : 860.67 (78.45% of Mean)
> > >   Std Err  : 86.07 (7.85% of Mean)
> > >
> > > Summary: even when the stderr is down to the 4%-8% range, the
> > > stddev is still too big.
> > >
> > > Hence, there are other unknown, random variables that impact this test.
> > >
> > I could not repro the 4% degradation with my test machine
> > (Chromebook), this can be entirely due to the specific test and this
> > test machine.
> >
> > Do you think it is possible to do a few more tests ? This time I like
> > to have a larger sample size (100 run)
> >
> > stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64
> >
> > Please run the test for each commit following the exact steps, e.g.
> > reboot the machine, run the test, get the first 100 results for
> > sample. Please don't select or drop any unstable report because then
> > the data will be biased. If possible, please include stddev and
> > stderr for the data (or raw data if not possible, and I will do
> > post-processing)
> >
> > for 3 commits:
> > -> this patch.
>
> what's the base of it? could I directly apply this patch upon the commit
> what you said "after mseal feature" as below?
> >
> > -> after mseal feature
> > -> before mseal feature
>
> could you explicitly point to two commit-id?
sure

this patch
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall

> >
> > Thank you for your time and assistance in helping me on understanding
> > this issue.
>
> due to resource constraint, please expect that we need several days to finish
> this test request.
No problem.

Thanks for your help!
-Jeff

> >
> > Best regards,
> > -Jeff
> >
> > > -Jeff
> > >
> > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > >
> > > >
> > > > Jeff Xu (2):
> > > >   mseal:selftest mremap across VMA boundaries.
> > > >   mseal: refactor mremap to remove can_modify_mm
> > > >
> > > >  mm/internal.h                           |  24 ++
> > > >  mm/mremap.c                             |  77 +++----
> > > >  mm/mseal.c                              |  17 --
> > > >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > >  4 files changed, 353 insertions(+), 58 deletions(-)
> > > >
> > > > --
> > > > 2.46.0.76.ge559c4bf1a-goog
> > > >
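For reference, the Std Dev / Std Err figures quoted above are related by stderr = stddev / sqrt(n), where n is the number of runs (10 or 100). Below is a minimal sketch of that arithmetic. It is not the code from run_stress_ng.c [2]; the function names are hypothetical, and the n - 1 divisor (sample standard deviation) is an assumption, since the thread does not say which divisor was used. The square root is done with a small Newton iteration only so the sketch builds without linking libm.

```c
#include <stddef.h>

/* Square root by Newton iteration (hypothetical helper, avoids -lm). */
static double newton_sqrt(double v)
{
    if (v <= 0.0)
        return 0.0;
    double x = v > 1.0 ? v : 1.0;   /* start at or above the root */
    for (int i = 0; i < 64; i++)
        x = 0.5 * (x + v / x);
    return x;
}

/* Arithmetic mean of n samples; n must be greater than 0. */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/*
 * Sample standard deviation using the n - 1 divisor.
 * ASSUMPTION: the thread does not state which divisor was used.
 * Returns 0 when fewer than two samples are given.
 */
double sample_stddev(const double *x, size_t n)
{
    if (n < 2)
        return 0.0;
    double m = sample_mean(x, n);
    double ss = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = x[i] - m;
        ss += d * d;
    }
    return newton_sqrt(ss / (double)(n - 1));
}

/* Standard error of the mean: stddev / sqrt(n). */
double std_error(double stddev, size_t n)
{
    return stddev / newton_sqrt((double)n);
}
```

Feeding the quoted numbers through std_error() reproduces the reported values: 2737.35 with n = 10 gives about 865.63, and 842.13 with n = 100 gives about 84.21. Going from 10 to 100 runs shrinks the standard error by roughly sqrt(10), while the standard deviation itself does not have to shrink.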
hi, Jeff,

On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> Hi Oliver

[...]

> > could you explicitly point to two commit-id?
> sure
>
> this patch
> 8be7258a: mseal: add mseal syscall
> ff388fe5c: mseal: wire up mseal syscall

I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"

to avoid the impact of other changes, better to apply the patch upon 8be7258a
directly.

if you prefer other base for this patch, please let us know. then we will
supply the results for 4 commits in fact:

this patch
the base of this patch
8be7258a: mseal: add mseal syscall
ff388fe5c: mseal: wire up mseal syscall

>
> > >
> > > Thank you for your time and assistance in helping me on understanding
> > > this issue.
> >
> > due to resource constraint, please expect that we need several days to finish
> > this test request.
> No problem.
>
> Thanks for your help!
> -Jeff
>
> > >
> > > Best regards,
> > > -Jeff
> > >
> > > > -Jeff
> > > >
> > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > > >
> > > > >
> > > > > Jeff Xu (2):
> > > > >   mseal:selftest mremap across VMA boundaries.
> > > > >   mseal: refactor mremap to remove can_modify_mm
> > > > >
> > > > >  mm/internal.h                           |  24 ++
> > > > >  mm/mremap.c                             |  77 +++----
> > > > >  mm/mseal.c                              |  17 --
> > > > >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > >  4 files changed, 353 insertions(+), 58 deletions(-)
> > > > >
> > > > > --
> > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > >
hi, Jeff,

On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
> hi, Jeff,
>
> On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> > Hi Oliver
>
> [...]
>
> > > could you explicitly point to two commit-id?
> > sure
> >
> > this patch
> > 8be7258a: mseal: add mseal syscall
> > ff388fe5c: mseal: wire up mseal syscall
>
> I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"

look your patch set again
[PATCH v1 1/2] mseal:selftest mremap across VMA boundaries
just for kselftests

and I can apply
[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm
upon "8be7258a: mseal: add mseal syscall" cleanly

so I will start test for this [PATCH v1 2/2]

BTW, I will firstly use our default setting - "60s testtime; reboot between each
run; run 10 times", since we already have the data for 8be7258a and ff388fe5c
then we could give you an update kind of quickly.

as some private mail discussed, you want some special run method, could you
elaborate them here? thanks

>
> to avoid the impact of other changes, better to apply the patch upon 8be7258a
> directly.
>
> if you prefer other base for this patch, please let us know. then we will
> supply the results for 4 commits in fact:
>
> this patch
> the base of this patch
> 8be7258a: mseal: add mseal syscall
> ff388fe5c: mseal: wire up mseal syscall
>
> >
> > > >
> > > > Thank you for your time and assistance in helping me on understanding
> > > > this issue.
> > >
> > > due to resource constraint, please expect that we need several days to finish
> > > this test request.
> > No problem.
> >
> > Thanks for your help!
> > -Jeff
> >
> > > >
> > > > Best regards,
> > > > -Jeff
> > > >
> > > > > -Jeff
> > > > >
> > > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > > > >
> > > > > >
> > > > > > Jeff Xu (2):
> > > > > >   mseal:selftest mremap across VMA boundaries.
> > > > > >   mseal: refactor mremap to remove can_modify_mm
> > > > > >
> > > > > >  mm/internal.h                           |  24 ++
> > > > > >  mm/mremap.c                             |  77 +++----
> > > > > >  mm/mseal.c                              |  17 --
> > > > > >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > > >  4 files changed, 353 insertions(+), 58 deletions(-)
> > > > > >
> > > > > > --
> > > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > > >
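For readers following the thread: the cover letter's premise is that mremap refuses to operate on a source range that spans more than one VMA. The userspace sketch below illustrates one such case, expanding a two-page range that has been split into two VMAs. It is not taken from mseal_test.c; the function name is hypothetical, and the expected EFAULT is based on the mremap(2) man page and the "can't remap across vm area boundaries" check in the kernel's resize path, so treat it as an assumption about the running kernel rather than a guarantee.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for mremap() and MREMAP_MAYMOVE */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map two anonymous pages, split them into two VMAs by giving the second
 * page a different protection, then try to expand the whole two-page range.
 *
 * Returns 0 if mremap unexpectedly succeeded, the errno reported by mremap
 * if it failed, or -1 if the setup (sysconf/mmap/mprotect) failed.
 */
int mremap_across_vmas_errno(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = 2 * (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Different protection on the second page creates a VMA boundary. */
    if (mprotect(p + page, (size_t)page, PROT_READ) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The source range [p, p + len) now covers two VMAs. */
    void *q = mremap(p, len, 2 * len, MREMAP_MAYMOVE);
    if (q != MAP_FAILED) {
        munmap(q, 2 * len);
        return 0;
    }

    int err = errno;
    munmap(p, len);
    return err;
}
```

On the kernels discussed in this thread the call is expected to fail with EFAULT, and the original two-page mapping stays in place, which is why the sketch can still munmap it afterwards.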
hi, Jeff,

On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
> hi, Jeff,
>
> On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
> > hi, Jeff,
> >
> > On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> > > Hi Oliver
> >
> > [...]
> >
> > > > could you explicitly point to two commit-id?
> > > sure
> > >
> > > this patch
> > > 8be7258a: mseal: add mseal syscall
> > > ff388fe5c: mseal: wire up mseal syscall
> >
> > I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"
>
> look your patch set again
> [PATCH v1 1/2] mseal:selftest mremap across VMA boundaries
> just for kselftests
>
> and I can apply
> [PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm
> upon "8be7258a: mseal: add mseal syscall" cleanly
>
> so I will start test for this [PATCH v1 2/2]
>
> BTW, I will firstly use our default setting - "60s testtime; reboot between each
> run; run 10 times", since we already have the data for 8be7258a and ff388fe5c
> then we could give you an update kind of quickly.
>
> as some private mail discussed, you want some special run method, could you
> elaborate them here? thanks

here is a quick update before you give us more details about special run method.

by our default run method (60s testtime; reboot between each run; run 10 times),
your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" could resolve
the regression partially.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
      4957        +1.3%       5023        +1.0%       5008       time.percent_of_cpu_this_job_got
      2915        +1.5%       2959        +1.2%       2949       time.system_time
     65.96        -7.3%      61.16        -5.5%      62.30       time.user_time
  41535878        -4.0%   39873501        -2.6%   40452264       proc-vmstat.numa_hit
  41466104        -4.0%   39806121        -2.6%   40384854       proc-vmstat.numa_local
  77297398        -4.1%   74165258        -2.6%   75286134       proc-vmstat.pgalloc_normal
  77016866        -4.1%   73886027        -2.6%   75012630       proc-vmstat.pgfree
  18386219        -5.0%   17474214        -2.9%   17850959       stress-ng.pagemove.ops
    306421        -5.0%     291207        -2.9%     297490       stress-ng.pagemove.ops_per_sec
      4957        +1.3%       5023        +1.0%       5008       stress-ng.time.percent_of_cpu_this_job_got
      2915        +1.5%       2959        +1.2%       2949       stress-ng.time.system_time
 3.349e+10 ± 4%   +3.0%  3.447e+10 ± 2%   +4.1%  3.484e+10       perf-stat.i.branch-instructions
      1.13        -2.1%       1.10        -2.2%       1.10       perf-stat.i.cpi
      0.89        +2.2%       0.91        +2.0%       0.91       perf-stat.i.ipc
      1.04        -6.9%       0.97        -4.9%       0.99       perf-stat.overall.MPKI
      1.13        -2.3%       1.10        -2.0%       1.10       perf-stat.overall.cpi
      1081        +5.0%       1136        +3.0%       1114       perf-stat.overall.cycles-between-cache-misses
      0.89        +2.3%       0.91        +2.0%       0.91       perf-stat.overall.ipc
 3.295e+10 ± 3%   +2.9%  3.392e+10 ± 2%   +4.0%  3.427e+10       perf-stat.ps.branch-instructions
 1.674e+11 ± 3%   +1.8%  1.704e+11 ± 2%   +3.3%   1.73e+11       perf-stat.ps.instructions
 1.046e+13        +2.7%  1.074e+13        +1.7%  1.064e+13       perf-stat.total.instructions
     75.05         -2.0      73.02         -0.9      74.18
perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 36.83 -1.6 35.19 -1.2 35.62 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 25.02 -1.4 23.65 -0.9 24.12 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.94 -1.1 18.87 -0.8 19.19 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 14.78 -0.8 14.01 -0.5 14.28 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.48 -0.5 0.99 -0.5 1.00 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 7.88 -0.4 7.47 -0.3 7.62 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 6.73 -0.4 6.37 -0.2 6.51 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.16 -0.3 5.82 -0.3 5.90 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.12 -0.3 5.79 -0.2 5.93 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.79 -0.3 5.48 -0.2 5.59 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.54 -0.3 5.25 -0.2 5.32 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.56 -0.3 5.28 -0.2 5.36 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.19 -0.3 4.92 -0.2 4.98 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.21 -0.3 4.95 -0.2 5.02 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 4.09 -0.2 3.85 -0.2 3.93 
perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 2.23 -0.2 2.05 -0.1 2.12 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.97 ± 3% -0.2 1.81 ± 3% -0.1 1.88 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 3.11 ± 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 2.98 -0.1 2.85 -0.1 2.87 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 1.27 ± 2% -0.1 1.15 ± 4% -0.1 1.19 ± 6% 
perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.76 ± 2% -0.1 1.65 ± 2% -0.1 1.66 ± 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.86 -0.1 1.75 -0.1 1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 0.55 -0.1 0.46 ± 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.54 -0.1 1.47 -0.0 1.49 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 1.15 -0.1 1.09 -0.1 1.10 
perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.60 ± 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.80 ± 2% -0.1 0.74 ± 2% -0.0 0.76 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.69 ± 2% -0.1 0.64 ± 3% -0.0 0.66 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.86 -0.0 0.81 -0.0 0.83 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 0.68 ± 2% -0.0 0.63 -0.0 0.65 
perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.95 ± 2% -0.0 0.90 ± 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.70 -0.0 0.66 -0.0 0.67 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 0.76 ± 2% -0.0 0.72 -0.0 0.72 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 1.01 -0.0 0.97 -0.0 0.99 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.60 
perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.65 -0.0 0.62 ± 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 2.76 +0.0 2.78 ± 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise 0.66 +0.1 0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 83.81 +0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 +0.9 0.90 ± 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.2 1.21 +1.3 1.28 
perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.34 +3.0 8.38 +1.6 6.92 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs 17.55 -0.9 16.67 -0.5 17.02 perf-profile.children.cycles-pp.kmem_cache_free 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc 10.88 -0.6 10.28 -0.4 10.50 
perf-profile.children.cycles-pp.vm_area_dup 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb 2.98 -0.3 2.69 ± 4% -0.2 2.80 ± 6% perf-profile.children.cycles-pp.__memcpy 3.46 ± 2% -0.2 3.25 -0.1 3.36 ± 3% perf-profile.children.cycles-pp.mod_objcg_state 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev 2.92 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range 3.47 -0.2 3.29 -0.2 3.32 ± 2% perf-profile.children.cycles-pp.down_write 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook 4.23 -0.2 4.07 -0.1 4.08 ± 2% perf-profile.children.cycles-pp.anon_vma_clone 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp 2.21 -0.1 2.07 -0.1 2.10 perf-profile.children.cycles-pp.__cond_resched 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load 2.24 -0.1 2.12 ± 2% -0.1 2.13 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func 2.04 ± 2% -0.1 1.93 -0.1 1.96 ± 2% perf-profile.children.cycles-pp.allocate_slab 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.48 -0.1 2.38 -0.1 
2.42 perf-profile.children.cycles-pp.unmap_page_range 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write 0.97 ± 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs 1.40 -0.1 1.32 -0.1 1.34 ± 2% perf-profile.children.cycles-pp.shuffle_freelist 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap 0.62 -0.1 0.56 -0.0 0.60 perf-profile.children.cycles-pp.security_mmap_addr 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock 0.49 ± 2% -0.1 0.43 -0.1 0.43 ± 2% perf-profile.children.cycles-pp.setup_object 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range 1.07 ± 2% -0.1 1.02 ± 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap 0.70 ± 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop 0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv 0.44 ± 2% -0.0 0.40 ± 2% -0.0 0.40 
perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials 0.45 ± 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock 0.48 -0.0 0.45 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 0.31 ± 3% -0.0 0.28 ± 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.mas_put_in_tree 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu 0.46 -0.0 0.44 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented 0.39 -0.0 0.37 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.down_write_killable 0.29 -0.0 0.27 ± 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu 0.26 -0.0 0.24 ± 2% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append 0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.29 ± 2% perf-profile.children.cycles-pp.__vm_enough_memory 0.32 -0.0 0.30 ± 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock 2.83 +0.0 2.85 ± 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.08 ± 5% +0.0 0.10 ± 3% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise 0.63 +0.1 0.70 +0.1 0.72 
perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise 0.00 +0.1 0.09 ± 3% +0.1 0.10 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap 2.12 +1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 1.96 -0.2 1.77 ± 4% -0.1 1.84 ± 6% perf-profile.self.cycles-pp.__memcpy 2.36 -0.1 2.25 ± 2% -0.1 2.25 ± 3% perf-profile.self.cycles-pp.down_write 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.33 -0.1 2.23 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk 1.44 -0.1 1.36 -0.1 1.38 ± 2% perf-profile.self.cycles-pp.__call_rcu_common 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state 0.69 ± 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs 1.14 ± 2% -0.1 1.08 -0.0 1.09 ± 2% perf-profile.self.cycles-pp.shuffle_freelist 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap 0.51 ± 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr 0.62 -0.1 0.56 ± 
2% -0.1 0.56 perf-profile.self.cycles-pp.mremap 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node 0.94 -0.0 0.90 -0.0 0.91 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch 0.94 ± 2% -0.0 0.89 ± 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry 1.22 -0.0 1.18 -0.0 1.18 perf-profile.self.cycles-pp.move_vma 0.43 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region 0.36 ± 4% -0.0 0.33 ± 3% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret 0.55 -0.0 0.52 -0.0 0.53 perf-profile.self.cycles-pp.get_old_pud 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma 0.43 ± 3% -0.0 0.40 -0.0 0.41 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev 0.71 -0.0 0.68 -0.0 0.70 
perf-profile.self.cycles-pp.unmap_page_range 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup 0.47 -0.0 0.45 -0.0 0.46 ± 2% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.37 ± 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.40 -0.0 0.37 -0.0 0.38 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack 0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 3% perf-profile.self.cycles-pp.mas_put_in_tree 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.27 ± 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu 0.24 ± 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes 0.40 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp.__pte_offset_map_lock 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode 0.26 -0.0 0.24 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.flush_tlb_func 0.20 ± 2% -0.0 0.19 -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.06 -0.0 0.05 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy 0.06 ± 6% +0.0 0.07 -0.0 0.06 ± 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags 0.11 ± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 4% perf-profile.self.cycles-pp.free_pgd_range 0.21 +0.0 0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables 0.36 ± 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas 1.07 +0.1 1.19 +0.2 
1.22 perf-profile.self.cycles-pp.mas_next_slot
1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find
0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm
3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk

> >
> > to avoid the impact of other changes, better to apply the patch upon 8be7258a
> > directly.
> >
> > if you prefer other base for this patch, please let us know. then we will
> > supply the results for 4 commits in fact:
> >
> > this patch
> > the base of this patch
> > 8be7258a: mseal: add mseal syscall
> > ff388fe5c: mseal: wire up mseal syscall
> >
> > > > > Thank you for your time and assistance in helping me on understanding
> > > > > this issue.
> > > >
> > > > due to resource constraint, please expect that we need several days to finish
> > > > this test request.
> > > No problem.
> > >
> > > Thanks for your help!
> > > -Jeff
> > >
> > > > > Best regards,
> > > > > -Jeff
> > > > >
> > > > > > -Jeff
> > > > > >
> > > > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > > > > >
> > > > > > > Jeff Xu (2):
> > > > > > >   mseal:selftest mremap across VMA boundaries.
> > > > > > >   mseal: refactor mremap to remove can_modify_mm
> > > > > > >
> > > > > > >  mm/internal.h                           |  24 ++
> > > > > > >  mm/mremap.c                             |  77 +++----
> > > > > > >  mm/mseal.c                              |  17 --
> > > > > > >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > > > >  4 files changed, 353 insertions(+), 58 deletions(-)
> > > > > > >
> > > > > > > --
> > > > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > > > >
hi, Jeff,

here is an update per your test request. we extended the runtime to 600 seconds, and ran 10 times for each commit.

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
  1.886e+08 ± 0%      -5.0%   1.792e+08 ± 0%      -3.4%   1.821e+08 ± 0%  stress-ng.pagemove.ops
     314345 ± 0%      -5.0%      298656 ± 0%      -3.4%      303565 ± 0%  stress-ng.pagemove.ops_per_sec

the stress-ng.pagemove.ops_per_sec score differs somewhat from the 60s run (listed below for comparison), but the trend is similar.

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***60s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
   18386219 ± 0%      -5.0%    17474214 ± 0%      -2.9%    17850959 ± 0%  stress-ng.pagemove.ops
     306421 ± 0%      -5.0%      291207 ± 0%      -2.9%      297490 ± 0%  stress-ng.pagemove.ops_per_sec

since the data is stable, %stddev shows as "± 0%" in both tables above.
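as a quick sanity check of the %change column, the figures can be recomputed from the mean stress-ng.pagemove.ops_per_sec values in the two tables above. this is only a sketch: it assumes %change is measured relative to the first (base) commit ff388fe5c4, which is what the numbers suggest, and the dict layout and helper names are made up for illustration (they are not part of lkp-tests).

```python
# Mean stress-ng.pagemove.ops_per_sec per commit, copied from the two tables above.
# MEANS, BASE, pct_change() and changes() are hypothetical names for this sketch.
MEANS = {
    "600s": {"ff388fe5c4": 314345, "8be7258aad": 298656, "2a78ece39f": 303565},
    "60s": {"ff388fe5c4": 306421, "8be7258aad": 291207, "2a78ece39f": 297490},
}

# Assumption: every %change is relative to the first listed commit.
BASE = "ff388fe5c4"


def pct_change(base: float, value: float) -> float:
    """Relative change of value against base, in percent."""
    return (value - base) / base * 100.0


def changes(testtime: str) -> dict:
    """%change of each non-base commit for one test duration, rounded to 0.1."""
    means = MEANS[testtime]
    return {
        commit: round(pct_change(means[BASE], mean), 1)
        for commit, mean in means.items()
        if commit != BASE
    }


if __name__ == "__main__":
    for testtime in MEANS:
        print(testtime, changes(testtime))
```

with the base commit as the reference, both the 600s and the 60s tables reproduce the reported percentages, so the two runs can be compared column by column.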
here is the detailed data for the 600s runs.

for ff388fe5c4 ("mseal: wire up mseal syscall")
  "stress-ng.pagemove.ops": [
    188545955, 188681834, 188907282, 188345009, 188729465,
    188312187, 188897283, 188209713, 188425965, 189026136
  ],
  "stress-ng.pagemove.ops_per_sec": [
    314242.1, 314467.13, 314841.5, 313907.19, 314548.11,
    313852.5, 314827.84, 313680.74, 314042.14, 315042.79
  ],

for 8be7258aad ("mseal: add mseal syscall")
  "stress-ng.pagemove.ops": [
    179127848, 179401350, 179350278, 179023817, 179106624,
    179535213, 178936504, 178870141, 179462171, 179136065
  ],
  "stress-ng.pagemove.ops_per_sec": [
    298545.54, 299000.95, 298915.62, 298371.45, 298509.15,
    299223.65, 298226.74, 298115.08, 299101.23, 298558.74
  ],

for 2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
  "stress-ng.pagemove.ops": [
    182188207, 182288813, 182483678, 181980233, 182249440,
    181837961, 182155893, 181699445, 182347580, 182174597
  ],
  "stress-ng.pagemove.ops_per_sec": [
    303643.28, 303814.05, 304138.38, 303298.9, 303747.33,
    303060.84, 303592.48, 302831.56, 303909.81, 303622.07
  ],

for the 600s run, below is the full comparison.
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***

commit:
  ff388fe5c4 ("mseal: wire up mseal syscall")
  8be7258aad ("mseal: add mseal syscall")
  2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"

ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
       4667 ± 0%      -2.4%        4553 ± 0%      -1.6%        4593 ± 0%  vmstat.system.cs
  4.192e+08 ± 0%      -4.3%   4.012e+08 ± 0%      -2.8%   4.075e+08 ± 0%  proc-vmstat.numa_hit
  4.192e+08 ± 0%      -4.3%   4.011e+08 ± 0%      -2.8%   4.074e+08 ± 0%  proc-vmstat.numa_local
  7.843e+08 ± 0%      -4.3%   7.504e+08 ± 0%      -2.8%   7.623e+08 ± 0%  proc-vmstat.pgalloc_normal
  7.836e+08 ± 0%      -4.3%   7.498e+08 ± 0%      -2.8%   7.616e+08 ± 0%  proc-vmstat.pgfree
    1174825 ± 0%      -2.6%     1143891 ± 0%      -1.7%     1155336 ± 0%  time.involuntary_context_switches
       5082 ± 0%      +1.3%        5147 ± 0%      +0.9%        5126 ± 0%  time.percent_of_cpu_this_job_got
      29840 ± 0%      +1.4%       30267 ± 0%      +1.0%       30133 ± 0%  time.system_time
     663.58 ± 1%      -5.7%      625.54 ± 1%      -4.3%      635.17 ± 0%  time.user_time
  1.886e+08 ± 0%      -5.0%   1.792e+08 ± 0%      -3.4%   1.821e+08 ± 0%  stress-ng.pagemove.ops
     314345 ± 0%      -5.0%      298656 ± 0%      -3.4%      303565 ± 0%  stress-ng.pagemove.ops_per_sec
     212508 ± 0%      -4.3%      203280 ± 0%      -3.1%      205831 ± 0%  stress-ng.pagemove.page_remaps_per_sec
    1174825 ± 0%      -2.6%     1143891 ± 0%      -1.7%     1155336 ± 0%  stress-ng.time.involuntary_context_switches
       5082 ± 0%      +1.3%        5147 ± 0%      +0.9%        5126 ± 0%  stress-ng.time.percent_of_cpu_this_job_got
      29840 ± 0%      +1.4%       30267 ± 0%      +1.0%       30133 ± 0%  stress-ng.time.system_time
     663.58 ± 1%      -5.7%      625.54 ± 1%      -4.3%      635.17 ± 0%  stress-ng.time.user_time
       1.00 ± 0%      -7.1%        0.93 ± 0%      -4.9%        0.95 ± 0%  perf-stat.i.MPKI
  3.487e+10 ± 0%      +3.5%   3.607e+10 ± 0%      +2.4%    3.57e+10 ± 0%  perf-stat.i.branch-instructions
       0.21 ± 0%       -0.0        0.19 ± 3%       -0.0        0.20 ± 0%  perf-stat.i.branch-miss-rate%
  1.763e+08 ± 0%      -5.0%   1.675e+08 ± 0%      -3.4%   1.704e+08 ± 0%  perf-stat.i.cache-misses
  2.342e+08 ± 0%      -4.9%   2.228e+08 ± 0%      -3.3%   2.264e+08 ± 0%  perf-stat.i.cache-references
       4650 ± 0%      -2.4%        4537 ± 0%      -1.5%        4578 ± 0%  perf-stat.i.context-switches
       1.11 ± 0%      -2.2%        1.09 ± 0%      -1.6%        1.10 ± 0%  perf-stat.i.cpi
     172.66 ± 0%      -2.8%      167.77 ± 0%      -1.8%      169.52 ± 0%  perf-stat.i.cpu-migrations
       1121 ± 0%      +5.2%        1180 ± 0%      +3.5%        1160 ± 0%  perf-stat.i.cycles-between-cache-misses
  1.772e+11 ± 0%      +2.2%   1.812e+11 ± 0%      +1.6%   1.801e+11 ± 0%  perf-stat.i.instructions
       0.90 ± 0%      +2.3%        0.92 ± 0%      +1.6%        0.91 ± 0%  perf-stat.i.ipc
       0.99 ± 0%      -7.1%        0.92 ± 0%      -4.9%        0.95 ± 0%  perf-stat.overall.MPKI
       0.21 ± 0%       -0.0        0.19 ± 3%       -0.0        0.20 ± 0%  perf-stat.overall.branch-miss-rate%
       1.11 ± 0%      -2.2%        1.09 ± 0%      -1.6%        1.10 ± 0%  perf-stat.overall.cpi
       1120 ± 0%      +5.2%        1179 ± 0%      +3.5%        1159 ± 0%  perf-stat.overall.cycles-between-cache-misses
       0.90 ± 0%      +2.3%        0.92 ± 0%      +1.6%        0.91 ± 0%  perf-stat.overall.ipc
   3.48e+10 ± 0%      +3.5%     3.6e+10 ± 0%      +2.4%   3.563e+10 ± 0%  perf-stat.ps.branch-instructions
  1.759e+08 ± 0%      -5.0%   1.672e+08 ± 0%      -3.4%     1.7e+08 ± 0%  perf-stat.ps.cache-misses
  2.338e+08 ± 0%      -4.9%   2.224e+08 ± 0%      -3.3%    2.26e+08 ± 0%  perf-stat.ps.cache-references
       4642 ± 0%      -2.4%        4529 ± 0%      -1.5%        4570 ± 0%  perf-stat.ps.context-switches
     172.30 ± 0%      -2.8%      167.43 ± 0%      -1.8%      169.17 ± 0%  perf-stat.ps.cpu-migrations
  1.769e+11 ± 0%      +2.3%   1.808e+11 ± 0%      +1.6%   1.797e+11 ± 0%  perf-stat.ps.instructions
  1.063e+14 ± 0%      +2.3%   1.087e+14 ± 0%      +1.7%   1.081e+14 ± 0%  perf-stat.total.instructions
      74.86 ± 0%       -2.1       72.76 ± 0%       -0.8       74.06 ± 0%  perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
      36.72 ± 0%       -1.7       35.04 ± 0%       -1.2       35.54 ± 0%  perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
24.93 ± 0% -1.4 23.54 ± 0% -0.8 24.12 ± 0% 
perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 19.91 ± 0% -1.1 18.79 ± 0% -0.7 19.17 ± 0% perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 14.71 ± 0% -0.8 13.90 ± 0% -0.4 14.30 ± 0% perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 10.82 ± 2% -0.6 10.22 ± 2% -0.6 10.25 ± 2% perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork 10.80 ± 2% -0.6 10.21 ± 2% -0.6 10.23 ± 2% perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 10.76 ± 2% -0.6 10.17 ± 2% -0.6 10.20 ± 2% perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn 1.49 ± 1% -0.5 0.98 ± 0% -0.5 1.00 ± 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 7.86 ± 0% -0.4 7.48 ± 0% -0.3 7.59 ± 0% perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 6.72 ± 0% -0.4 6.37 ± 0% -0.2 6.49 ± 0% perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.06 ± 2% -0.3 5.71 ± 2% -0.3 5.73 ± 2% 
perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd 6.11 ± 0% -0.3 5.77 ± 0% -0.2 5.90 ± 0% perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 6.11 ± 0% -0.3 5.78 ± 1% -0.2 5.90 ± 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.50 ± 0% -0.3 5.19 ± 0% -0.2 5.31 ± 0% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap 5.52 ± 0% -0.3 5.22 ± 0% -0.2 5.35 ± 0% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap 5.15 ± 0% -0.3 4.86 ± 0% -0.2 4.97 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap 5.77 ± 0% -0.3 5.48 ± 0% -0.2 5.58 ± 0% perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 5.16 ± 0% -0.3 4.88 ± 0% -0.1 5.01 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma 4.72 ± 2% -0.3 4.44 ± 2% -0.3 4.45 ± 2% perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs 4.64 ± 0% -0.3 4.38 ± 0% -0.1 4.51 ± 1% perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma 4.07 ± 0% -0.2 3.84 ± 0% -0.2 3.92 ± 0% perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 3.96 ± 1% -0.2 3.76 ± 1% -0.1 3.88 ± 1% perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma 3.54 ± 0% -0.2 3.34 ± 0% -0.1 3.41 ± 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap 38.68 ± 0% -0.2 38.49 ± 0% +0.4 39.05 ± 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.55 
± 1% -0.2 0.36 ± 65% -0.0 0.52 ± 1% perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 3.41 ± 0% -0.2 3.22 ± 0% -0.1 3.28 ± 0% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap 1.35 ± 0% -0.2 1.17 ± 0% -0.1 1.23 ± 0% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 2.22 ± 0% -0.2 2.05 ± 0% -0.1 2.12 ± 0% perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.27 ± 0% -0.2 2.10 ± 0% -0.1 2.15 ± 0% perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 3.25 ± 0% -0.2 3.08 ± 0% -0.1 3.14 ± 0% perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.12 ± 2% -0.2 2.97 ± 2% -0.1 3.04 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.96 ± 0% -0.1 0.82 ± 1% -0.1 0.87 ± 1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to 2.98 ± 1% -0.1 2.84 ± 1% -0.1 2.89 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma 8.19 ± 0% -0.1 8.05 ± 0% -0.1 8.04 ± 0% perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 3.13 ± 0% -0.1 3.00 ± 0% -0.1 3.06 ± 0% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.53 ± 1% -0.1 0.41 ± 50% -0.2 0.30 ± 81% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown_vmflags.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap 1.73 ± 2% -0.1 1.61 ± 2% -0.0 1.70 ± 3% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap 2.14 ± 2% -0.1 2.02 ± 2% -0.0 2.09 ± 2% 
perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap 2.46 ± 0% -0.1 2.34 ± 0% -0.1 2.38 ± 0% perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma 2.04 ± 0% -0.1 1.93 ± 0% -0.1 1.96 ± 0% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.85 ± 0% -0.1 1.74 ± 0% -0.1 1.78 ± 0% perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 2.22 ± 0% -0.1 2.12 ± 0% -0.1 2.15 ± 0% perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables 1.40 ± 0% -0.1 1.30 ± 0% -0.1 1.33 ± 0% perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap 0.56 ± 1% -0.1 0.46 ± 33% -0.0 0.54 ± 2% perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma 1.80 ± 2% -0.1 1.70 ± 2% -0.1 1.74 ± 2% perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 2.43 ± 0% -0.1 2.33 ± 0% -0.1 2.37 ± 0% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap 1.25 ± 0% -0.1 1.15 ± 1% -0.1 1.19 ± 0% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap 0.94 ± 1% -0.1 0.86 ± 0% -0.1 0.87 ± 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap 1.38 ± 0% -0.1 1.30 ± 0% -0.1 1.33 ± 1% perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma 1.22 ± 0% -0.1 1.14 ± 0% -0.1 1.17 ± 1% perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma 1.28 ± 0% -0.1 1.21 ± 0% -0.0 1.23 ± 0% perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 1.54 ± 1% 
-0.1 1.46 ± 0% -0.0 1.49 ± 0% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap 1.15 ± 0% -0.1 1.08 ± 1% -0.1 1.09 ± 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap 0.73 ± 1% -0.1 0.67 ± 1% -0.0 0.69 ± 1% perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap 0.72 ± 0% -0.1 0.66 ± 1% -0.0 0.69 ± 1% perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap 1.64 ± 1% -0.1 1.58 ± 0% -0.1 1.58 ± 0% perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.78 ± 1% -0.1 0.72 ± 1% -0.0 0.75 ± 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma 0.63 ± 1% -0.1 0.57 ± 1% -0.0 0.60 ± 1% perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma 0.69 ± 2% -0.1 0.63 ± 4% -0.0 0.66 ± 2% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma 0.60 ± 1% -0.1 0.54 ± 1% -0.0 0.58 ± 1% perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 0.79 ± 2% -0.1 0.74 ± 3% -0.0 0.75 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge 1.12 ± 0% -0.0 1.08 ± 0% -0.0 1.09 ± 1% perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap 0.67 ± 1% -0.0 0.62 ± 1% -0.0 0.63 ± 1% perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.77 ± 1% -0.0 0.72 ± 1% -0.0 0.73 ± 1% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge 1.01 ± 1% -0.0 0.96 ± 0% -0.0 0.98 ± 0% 
perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region 0.86 ± 0% -0.0 0.81 ± 1% -0.0 0.83 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 0.82 ± 1% -0.0 0.78 ± 1% -0.0 0.79 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 1.01 ± 0% -0.0 0.97 ± 0% -0.0 0.98 ± 0% perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.98 ± 1% -0.0 0.94 ± 0% -0.0 0.94 ± 1% perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.78 ± 0% -0.0 0.74 ± 1% -0.0 0.75 ± 1% perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap 0.68 ± 0% -0.0 0.64 ± 1% -0.0 0.65 ± 0% perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma 0.68 ± 1% -0.0 0.64 ± 1% -0.0 0.64 ± 1% perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap 0.89 ± 1% -0.0 0.85 ± 1% -0.0 0.86 ± 1% perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.62 ± 1% -0.0 0.58 ± 2% -0.0 0.59 ± 1% perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 0.62 ± 1% -0.0 0.58 ± 1% -0.0 0.59 ± 1% perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.76 ± 1% -0.0 0.72 ± 1% -0.0 0.73 ± 1% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma 1.01 ± 0% -0.0 0.97 ± 1% -0.0 0.98 ± 1% perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap 0.64 ± 1% -0.0 0.60 ± 1% -0.0 0.61 ± 1% perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma 0.88 ± 1% -0.0 0.85 ± 0% -0.0 0.85 ± 0% 
perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.69 ± 1% -0.0 0.66 ± 1% -0.0 0.67 ± 0% perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap 0.59 ± 1% -0.0 0.56 ± 1% -0.0 0.56 ± 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap 0.82 ± 1% -0.0 0.82 ± 1% -0.0 0.79 ± 1% perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 0.76 ± 1% +0.1 0.83 ± 0% +0.1 0.84 ± 0% perf-profile.calltrace.cycles-pp.__madvise 0.67 ± 1% +0.1 0.73 ± 1% +0.1 0.75 ± 1% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise 0.63 ± 1% +0.1 0.70 ± 1% +0.1 0.71 ± 0% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.62 ± 1% +0.1 0.69 ± 1% +0.1 0.71 ± 0% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 0.66 ± 1% +0.1 0.73 ± 1% +0.1 0.74 ± 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise 87.57 ± 0% +0.6 88.14 ± 0% +0.5 88.09 ± 0% perf-profile.calltrace.cycles-pp.mremap 84.74 ± 0% +0.7 85.47 ± 0% +0.6 85.37 ± 0% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap 84.58 ± 0% +0.7 85.32 ± 0% +0.6 85.22 ± 0% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 83.64 ± 0% +0.8 84.41 ± 0% +0.7 84.30 ± 0% perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 0.00 ± -1% +0.9 0.86 ± 0% +0.9 0.92 ± 0% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap 0.00 ± -1% +0.9 0.87 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap 0.00 ± -1% +0.9 0.91 ± 2% +0.9 0.92 ± 1% 
perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma 0.00 ± -1% +1.1 1.09 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 0.00 ± -1% +1.2 1.21 ± 0% +1.3 1.29 ± 0% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to 2.10 ± 0% +1.5 3.61 ± 0% +1.7 3.79 ± 0% perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 ± -1% +1.5 1.51 ± 1% +1.5 1.52 ± 0% perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap 1.60 ± 0% +1.5 3.13 ± 0% +1.7 3.31 ± 0% perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 0.00 ± -1% +1.6 1.60 ± 0% +0.0 0.00 ± -1% perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 ± -1% +1.7 1.73 ± 0% +1.8 1.84 ± 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap 0.00 ± -1% +2.0 2.00 ± 1% +2.0 2.04 ± 0% perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 5.35 ± 0% +3.0 8.37 ± 0% +1.6 6.92 ± 0% perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap 75.03 ± 0% -2.1 72.92 ± 0% -0.8 74.22 ± 0% perf-profile.children.cycles-pp.move_vma 36.94 ± 0% -1.7 35.25 ± 0% -1.2 35.75 ± 0% perf-profile.children.cycles-pp.do_vmi_align_munmap 25.01 ± 0% -1.4 23.61 ± 0% -0.8 24.19 ± 0% perf-profile.children.cycles-pp.copy_vma 20.00 ± 0% -1.1 18.88 ± 0% -0.7 19.26 ± 0% perf-profile.children.cycles-pp.__split_vma 19.92 ± 0% -1.1 18.84 ± 0% -0.8 19.14 ± 0% perf-profile.children.cycles-pp.handle_softirqs 19.90 ± 0% -1.1 18.82 ± 0% -0.8 19.12 ± 0% perf-profile.children.cycles-pp.rcu_core 19.88 ± 0% -1.1 18.80 ± 0% -0.8 19.10 ± 0% perf-profile.children.cycles-pp.rcu_do_batch 17.57 ± 0% 
-0.9 16.66 ± 0% -0.6 16.94 ± 0% perf-profile.children.cycles-pp.kmem_cache_free 15.29 ± 0% -0.9 14.43 ± 0% -0.5 14.75 ± 0% perf-profile.children.cycles-pp.kmem_cache_alloc_noprof 15.11 ± 0% -0.8 14.27 ± 0% -0.4 14.68 ± 0% perf-profile.children.cycles-pp.vma_merge 12.15 ± 0% -0.7 11.46 ± 0% -0.5 11.65 ± 0% perf-profile.children.cycles-pp.__slab_free 12.11 ± 0% -0.7 11.43 ± 0% -0.4 11.71 ± 0% perf-profile.children.cycles-pp.mas_wr_store_entry 11.90 ± 0% -0.7 11.24 ± 0% -0.4 11.50 ± 0% perf-profile.children.cycles-pp.mas_store_prealloc 10.82 ± 2% -0.6 10.22 ± 2% -0.6 10.25 ± 2% perf-profile.children.cycles-pp.smpboot_thread_fn 10.81 ± 2% -0.6 10.21 ± 2% -0.6 10.24 ± 2% perf-profile.children.cycles-pp.run_ksoftirqd 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.kthread 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.ret_from_fork 10.85 ± 2% -0.6 10.26 ± 2% -0.6 10.28 ± 2% perf-profile.children.cycles-pp.ret_from_fork_asm 10.85 ± 0% -0.6 10.26 ± 0% -0.4 10.47 ± 0% perf-profile.children.cycles-pp.vm_area_dup 9.81 ± 0% -0.5 9.28 ± 0% -0.3 9.52 ± 0% perf-profile.children.cycles-pp.mas_wr_node_store 8.38 ± 1% -0.5 7.90 ± 1% -0.2 8.13 ± 1% perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook 7.98 ± 0% -0.4 7.58 ± 0% -0.3 7.70 ± 0% perf-profile.children.cycles-pp.move_page_tables 6.66 ± 0% -0.4 6.29 ± 0% -0.2 6.43 ± 0% perf-profile.children.cycles-pp.vma_complete 5.12 ± 0% -0.3 4.79 ± 0% -0.2 4.88 ± 0% perf-profile.children.cycles-pp.mas_preallocate 6.05 ± 0% -0.3 5.72 ± 0% -0.2 5.82 ± 0% perf-profile.children.cycles-pp.vm_area_free_rcu_cb 5.85 ± 0% -0.3 5.56 ± 0% -0.2 5.66 ± 0% perf-profile.children.cycles-pp.move_ptes 3.51 ± 1% -0.2 3.28 ± 2% -0.1 3.37 ± 1% perf-profile.children.cycles-pp.mod_objcg_state 3.45 ± 0% -0.2 3.24 ± 0% -0.2 3.30 ± 0% perf-profile.children.cycles-pp.___slab_alloc 2.91 ± 0% -0.2 2.71 ± 0% -0.1 2.78 ± 0% perf-profile.children.cycles-pp.mas_alloc_nodes 3.47 ± 0% -0.2 3.27 ± 0% -0.1 3.34 ± 
0% perf-profile.children.cycles-pp.flush_tlb_mm_range 3.43 ± 1% -0.2 3.24 ± 1% -0.1 3.35 ± 2% perf-profile.children.cycles-pp.down_write 2.44 ± 0% -0.2 2.25 ± 0% -0.1 2.32 ± 0% perf-profile.children.cycles-pp.find_vma_prev 4.24 ± 1% -0.2 4.06 ± 1% -0.1 4.11 ± 1% perf-profile.children.cycles-pp.anon_vma_clone 3.35 ± 0% -0.2 3.18 ± 0% -0.1 3.24 ± 0% perf-profile.children.cycles-pp.mas_store_gfp 2.21 ± 1% -0.2 2.05 ± 0% -0.1 2.10 ± 0% perf-profile.children.cycles-pp.__cond_resched 3.32 ± 0% -0.2 3.17 ± 1% -0.1 3.24 ± 0% perf-profile.children.cycles-pp.__memcg_slab_free_hook 8.26 ± 0% -0.1 8.12 ± 0% -0.1 8.11 ± 0% perf-profile.children.cycles-pp.unmap_region 2.22 ± 1% -0.1 2.08 ± 1% -0.1 2.16 ± 3% perf-profile.children.cycles-pp.vma_prepare 2.67 ± 0% -0.1 2.54 ± 0% -0.1 2.58 ± 0% perf-profile.children.cycles-pp.mtree_load 3.18 ± 0% -0.1 3.05 ± 0% -0.1 3.11 ± 0% perf-profile.children.cycles-pp.unmap_vmas 2.46 ± 0% -0.1 2.34 ± 0% -0.1 2.38 ± 0% perf-profile.children.cycles-pp.rcu_cblist_dequeue 2.50 ± 0% -0.1 2.39 ± 0% -0.1 2.43 ± 0% perf-profile.children.cycles-pp.flush_tlb_func 2.11 ± 1% -0.1 2.00 ± 1% -0.1 2.02 ± 1% perf-profile.children.cycles-pp.__call_rcu_common 2.04 ± 1% -0.1 1.93 ± 1% -0.1 1.95 ± 1% perf-profile.children.cycles-pp.allocate_slab 1.77 ± 1% -0.1 1.66 ± 0% -0.1 1.69 ± 1% perf-profile.children.cycles-pp.mas_wr_walk 1.87 ± 0% -0.1 1.77 ± 0% -0.1 1.80 ± 0% perf-profile.children.cycles-pp.vma_link 2.24 ± 0% -0.1 2.13 ± 0% -0.1 2.17 ± 0% perf-profile.children.cycles-pp.native_flush_tlb_one_user 1.85 ± 1% -0.1 1.74 ± 0% -0.1 1.79 ± 2% perf-profile.children.cycles-pp.up_write 2.48 ± 0% -0.1 2.38 ± 0% -0.1 2.42 ± 0% perf-profile.children.cycles-pp.unmap_page_range 0.97 ± 2% -0.1 0.88 ± 1% -0.1 0.90 ± 1% perf-profile.children.cycles-pp.rcu_all_qs 1.04 ± 0% -0.1 0.95 ± 1% -0.0 0.99 ± 1% perf-profile.children.cycles-pp.mas_prev 1.24 ± 0% -0.1 1.16 ± 0% -0.1 1.19 ± 0% perf-profile.children.cycles-pp.mas_prev_slot 0.93 ± 0% -0.1 0.85 ± 1% -0.0 0.88 ± 1% 
perf-profile.children.cycles-pp.mas_prev_setup 1.39 ± 1% -0.1 1.31 ± 1% -0.1 1.33 ± 1% perf-profile.children.cycles-pp.shuffle_freelist 1.52 ± 0% -0.1 1.45 ± 0% -0.0 1.48 ± 0% perf-profile.children.cycles-pp.mas_update_gap 1.58 ± 1% -0.1 1.50 ± 0% -0.0 1.53 ± 0% perf-profile.children.cycles-pp.zap_pmd_range 0.87 ± 1% -0.1 0.80 ± 0% -0.1 0.82 ± 1% perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.68 ± 1% -0.1 1.62 ± 0% -0.1 1.62 ± 0% perf-profile.children.cycles-pp.__get_unmapped_area 0.90 ± 1% -0.1 0.84 ± 0% -0.0 0.86 ± 1% perf-profile.children.cycles-pp.percpu_counter_add_batch 0.62 ± 1% -0.1 0.56 ± 1% -0.0 0.60 ± 1% perf-profile.children.cycles-pp.security_mmap_addr 0.49 ± 1% -0.1 0.44 ± 1% -0.1 0.44 ± 1% perf-profile.children.cycles-pp.setup_object 1.02 ± 0% -0.1 0.97 ± 1% -0.0 0.99 ± 0% perf-profile.children.cycles-pp.mas_leaf_max_gap 0.98 ± 1% -0.0 0.93 ± 1% -0.0 0.94 ± 1% perf-profile.children.cycles-pp.mas_pop_node 1.22 ± 1% -0.0 1.18 ± 1% -0.0 1.19 ± 1% perf-profile.children.cycles-pp.__pte_offset_map_lock 0.45 ± 2% -0.0 0.40 ± 2% -0.0 0.41 ± 1% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.18 ± 0% -0.0 1.13 ± 0% -0.0 1.15 ± 1% perf-profile.children.cycles-pp.clear_bhb_loop 1.08 ± 1% -0.0 1.03 ± 0% -0.0 1.05 ± 0% perf-profile.children.cycles-pp.zap_pte_range 1.04 ± 0% -0.0 1.00 ± 0% -0.0 1.01 ± 0% perf-profile.children.cycles-pp.vma_to_resize 0.58 ± 1% -0.0 0.53 ± 1% -0.0 0.54 ± 1% perf-profile.children.cycles-pp.mas_wr_end_piv 0.34 ± 2% -0.0 0.30 ± 5% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.get_partial_node 0.64 ± 1% -0.0 0.61 ± 2% -0.0 0.61 ± 1% perf-profile.children.cycles-pp.get_old_pud 0.62 ± 0% -0.0 0.59 ± 0% -0.0 0.59 ± 1% perf-profile.children.cycles-pp.__put_partials 1.14 ± 0% -0.0 1.10 ± 1% -0.0 1.12 ± 1% perf-profile.children.cycles-pp.mt_find 0.90 ± 0% -0.0 0.87 ± 0% -0.0 0.87 ± 0% perf-profile.children.cycles-pp.userfaultfd_unmap_complete 0.61 ± 1% -0.0 0.58 ± 1% -0.0 0.59 ± 0% 
perf-profile.children.cycles-pp.entry_SYSCALL_64 0.32 ± 2% -0.0 0.29 ± 3% -0.0 0.30 ± 4% perf-profile.children.cycles-pp.security_vm_enough_memory_mm 0.54 ± 1% -0.0 0.52 ± 1% -0.0 0.52 ± 1% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.55 ± 1% -0.0 0.52 ± 1% -0.0 0.54 ± 1% perf-profile.children.cycles-pp.refill_obj_stock 0.45 ± 1% -0.0 0.43 ± 2% -0.0 0.43 ± 2% perf-profile.children.cycles-pp.__alloc_pages_noprof 0.43 ± 1% -0.0 0.41 ± 2% -0.0 0.41 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist 0.17 ± 1% -0.0 0.15 ± 3% -0.0 0.16 ± 1% perf-profile.children.cycles-pp.get_any_partial 0.32 ± 1% -0.0 0.30 ± 1% -0.0 0.30 ± 1% perf-profile.children.cycles-pp.pte_offset_map_nolock 0.40 ± 0% -0.0 0.38 ± 1% -0.0 0.39 ± 1% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack 0.28 ± 2% -0.0 0.26 ± 2% -0.0 0.27 ± 1% perf-profile.children.cycles-pp.khugepaged_enter_vma 0.32 ± 1% -0.0 0.30 ± 1% -0.0 0.30 ± 2% perf-profile.children.cycles-pp.mas_wr_store_setup 0.19 ± 4% -0.0 0.17 ± 4% -0.0 0.18 ± 6% perf-profile.children.cycles-pp.cap_vm_enough_memory 0.29 ± 1% -0.0 0.27 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.tlb_gather_mmu 0.09 ± 4% -0.0 0.07 ± 6% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.vma_dup_policy 0.16 ± 3% -0.0 0.14 ± 2% -0.0 0.14 ± 2% perf-profile.children.cycles-pp.mas_wr_append 0.22 ± 2% -0.0 0.20 ± 3% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist 0.20 ± 2% -0.0 0.18 ± 2% -0.0 0.19 ± 3% perf-profile.children.cycles-pp.__thp_vma_allowable_orders 0.24 ± 2% -0.0 0.23 ± 2% -0.0 0.23 ± 2% perf-profile.children.cycles-pp.free_pcppages_bulk 0.44 ± 1% +0.0 0.45 ± 1% +0.0 0.46 ± 1% perf-profile.children.cycles-pp.mremap_userfaultfd_prep 0.85 ± 1% +0.0 0.85 ± 1% -0.0 0.81 ± 1% perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags 0.13 ± 3% +0.0 0.14 ± 3% +0.0 0.15 ± 2% perf-profile.children.cycles-pp.free_pgd_range 0.08 ± 8% +0.0 0.10 ± 3% -0.0 0.08 ± 6% 
perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags 0.78 ± 1% +0.1 0.84 ± 0% +0.1 0.86 ± 0% perf-profile.children.cycles-pp.__madvise 0.63 ± 1% +0.1 0.70 ± 1% +0.1 0.72 ± 0% perf-profile.children.cycles-pp.__x64_sys_madvise 0.63 ± 1% +0.1 0.70 ± 0% +0.1 0.71 ± 0% perf-profile.children.cycles-pp.do_madvise 0.00 ± -1% +0.1 0.09 ± 0% +0.1 0.09 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv 1.32 ± 1% +0.1 1.46 ± 0% +0.2 1.50 ± 0% perf-profile.children.cycles-pp.mas_next_slot 87.96 ± 0% +0.6 88.52 ± 0% +0.5 88.48 ± 0% perf-profile.children.cycles-pp.mremap 85.91 ± 0% +0.8 86.69 ± 0% +0.7 86.61 ± 0% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 83.74 ± 0% +0.8 84.52 ± 0% +0.7 84.40 ± 0% perf-profile.children.cycles-pp.__do_sys_mremap 85.42 ± 0% +0.8 86.23 ± 0% +0.7 86.14 ± 0% perf-profile.children.cycles-pp.do_syscall_64 40.36 ± 0% +1.4 41.74 ± 0% +2.1 42.49 ± 0% perf-profile.children.cycles-pp.do_vmi_munmap 2.12 ± 0% +1.5 3.63 ± 0% +1.7 3.81 ± 0% perf-profile.children.cycles-pp.do_munmap 3.62 ± 0% +2.3 5.97 ± 0% +1.7 5.29 ± 0% perf-profile.children.cycles-pp.mas_walk 5.41 ± 0% +3.0 8.44 ± 0% +1.6 6.98 ± 0% perf-profile.children.cycles-pp.mremap_to 5.28 ± 0% +3.2 8.48 ± 0% +2.3 7.56 ± 0% perf-profile.children.cycles-pp.mas_find 0.00 ± -1% +5.4 5.45 ± 0% +3.9 3.94 ± 0% perf-profile.children.cycles-pp.can_modify_mm 11.51 ± 0% -0.6 10.86 ± 0% -0.5 11.04 ± 0% perf-profile.self.cycles-pp.__slab_free 4.23 ± 2% -0.2 4.00 ± 2% -0.1 4.13 ± 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook 2.34 ± 1% -0.1 2.21 ± 1% -0.0 2.30 ± 3% perf-profile.self.cycles-pp.down_write 2.43 ± 0% -0.1 2.31 ± 0% -0.1 2.34 ± 0% perf-profile.self.cycles-pp.rcu_cblist_dequeue 2.34 ± 0% -0.1 2.24 ± 0% -0.1 2.27 ± 0% perf-profile.self.cycles-pp.mtree_load 2.21 ± 0% -0.1 2.11 ± 0% -0.1 2.14 ± 0% perf-profile.self.cycles-pp.native_flush_tlb_one_user 1.75 ± 0% -0.1 1.67 ± 0% -0.0 1.70 ± 0% perf-profile.self.cycles-pp.mod_objcg_state 1.54 ± 1% -0.1 1.46 ± 0% -0.0 
1.50 ± 1% perf-profile.self.cycles-pp.up_write 1.52 ± 0% -0.1 1.44 ± 0% -0.1 1.46 ± 0% perf-profile.self.cycles-pp.mas_wr_walk 0.70 ± 3% -0.1 0.63 ± 1% -0.1 0.64 ± 1% perf-profile.self.cycles-pp.rcu_all_qs 1.43 ± 1% -0.1 1.36 ± 1% -0.1 1.36 ± 1% perf-profile.self.cycles-pp.__call_rcu_common 1.01 ± 0% -0.1 0.95 ± 0% -0.0 0.96 ± 0% perf-profile.self.cycles-pp.mas_preallocate 1.40 ± 1% -0.1 1.33 ± 1% -0.0 1.35 ± 0% perf-profile.self.cycles-pp.do_vmi_align_munmap 1.00 ± 0% -0.1 0.94 ± 0% -0.0 0.96 ± 0% perf-profile.self.cycles-pp.mas_prev_slot 1.14 ± 1% -0.1 1.08 ± 1% -0.0 1.10 ± 1% perf-profile.self.cycles-pp.shuffle_freelist 1.18 ± 0% -0.1 1.13 ± 0% -0.0 1.16 ± 0% perf-profile.self.cycles-pp.vma_merge 0.94 ± 1% -0.1 0.89 ± 2% -0.0 0.91 ± 1% perf-profile.self.cycles-pp.vm_area_free_rcu_cb 0.88 ± 0% -0.1 0.83 ± 1% -0.0 0.84 ± 0% perf-profile.self.cycles-pp.___slab_alloc 0.50 ± 1% -0.0 0.45 ± 2% -0.0 0.50 ± 1% perf-profile.self.cycles-pp.security_mmap_addr 0.77 ± 1% -0.0 0.72 ± 1% -0.0 0.74 ± 1% perf-profile.self.cycles-pp.percpu_counter_add_batch 0.45 ± 2% -0.0 0.40 ± 2% -0.0 0.41 ± 1% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 1.17 ± 0% -0.0 1.12 ± 0% -0.0 1.14 ± 1% perf-profile.self.cycles-pp.clear_bhb_loop 1.08 ± 1% -0.0 1.04 ± 1% -0.0 1.06 ± 1% perf-profile.self.cycles-pp.__cond_resched 1.50 ± 2% -0.0 1.46 ± 0% -0.0 1.48 ± 0% perf-profile.self.cycles-pp.kmem_cache_free 1.23 ± 0% -0.0 1.18 ± 0% -0.1 1.18 ± 0% perf-profile.self.cycles-pp.move_vma 0.68 ± 1% -0.0 0.64 ± 0% -0.0 0.65 ± 1% perf-profile.self.cycles-pp.__split_vma 0.80 ± 0% -0.0 0.76 ± 1% -0.0 0.77 ± 0% perf-profile.self.cycles-pp.mas_wr_store_entry 0.61 ± 2% -0.0 0.57 ± 2% -0.0 0.57 ± 6% perf-profile.self.cycles-pp.mremap 0.85 ± 1% -0.0 0.80 ± 1% -0.0 0.81 ± 1% perf-profile.self.cycles-pp.mas_pop_node 0.44 ± 0% -0.0 0.40 ± 1% -0.0 0.40 ± 1% perf-profile.self.cycles-pp.do_munmap 0.98 ± 0% -0.0 0.94 ± 1% -0.0 0.95 ± 0% perf-profile.self.cycles-pp.move_ptes 0.89 ± 0% -0.0 0.86 ± 0% -0.0 
0.87 ± 0% perf-profile.self.cycles-pp.mas_leaf_max_gap 0.46 ± 1% -0.0 0.42 ± 1% -0.0 0.43 ± 1% perf-profile.self.cycles-pp.mas_wr_end_piv 0.89 ± 0% -0.0 0.86 ± 0% -0.0 0.87 ± 0% perf-profile.self.cycles-pp.mas_store_gfp 0.79 ± 0% -0.0 0.76 ± 1% -0.0 0.76 ± 0% perf-profile.self.cycles-pp.userfaultfd_unmap_complete 0.99 ± 0% -0.0 0.97 ± 0% -0.0 0.98 ± 0% perf-profile.self.cycles-pp.mt_find 0.87 ± 0% -0.0 0.84 ± 0% -0.0 0.84 ± 0% perf-profile.self.cycles-pp.move_page_tables 0.55 ± 2% -0.0 0.52 ± 1% -0.0 0.52 ± 1% perf-profile.self.cycles-pp.get_old_pud 0.50 ± 0% -0.0 0.47 ± 1% -0.0 0.48 ± 0% perf-profile.self.cycles-pp.find_vma_prev 0.61 ± 0% -0.0 0.58 ± 1% -0.0 0.59 ± 0% perf-profile.self.cycles-pp.unmap_region 0.66 ± 0% -0.0 0.63 ± 1% -0.0 0.64 ± 0% perf-profile.self.cycles-pp.mas_store_prealloc 0.27 ± 1% -0.0 0.25 ± 1% -0.0 0.26 ± 1% perf-profile.self.cycles-pp.mas_prev_setup 0.61 ± 1% -0.0 0.59 ± 1% -0.0 0.60 ± 1% perf-profile.self.cycles-pp.copy_vma 0.48 ± 0% -0.0 0.45 ± 1% -0.0 0.46 ± 1% perf-profile.self.cycles-pp.flush_tlb_mm_range 0.41 ± 1% -0.0 0.39 ± 1% -0.0 0.40 ± 1% perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.48 ± 1% -0.0 0.46 ± 1% -0.0 0.47 ± 0% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.50 ± 1% -0.0 0.48 ± 1% -0.0 0.48 ± 1% perf-profile.self.cycles-pp.refill_obj_stock 0.47 ± 1% -0.0 0.46 ± 1% -0.0 0.45 ± 1% perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags 0.71 ± 0% -0.0 0.69 ± 1% -0.0 0.69 ± 1% perf-profile.self.cycles-pp.unmap_page_range 0.17 ± 4% -0.0 0.15 ± 4% -0.0 0.16 ± 3% perf-profile.self.cycles-pp.get_partial_node 0.24 ± 1% -0.0 0.22 ± 1% -0.0 0.23 ± 0% perf-profile.self.cycles-pp.mas_prev 0.45 ± 1% -0.0 0.43 ± 0% -0.0 0.44 ± 1% perf-profile.self.cycles-pp.mas_update_gap 0.53 ± 1% -0.0 0.51 ± 0% -0.0 0.51 ± 1% perf-profile.self.cycles-pp.mremap_to 0.21 ± 2% -0.0 0.19 ± 2% -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area 0.27 ± 1% -0.0 0.26 ± 1% -0.0 0.25 ± 1% 
perf-profile.self.cycles-pp.tlb_finish_mmu
0.18 ± 2%   -0.0  0.17 ± 2%   -0.0  0.18 ± 2%   perf-profile.self.cycles-pp.rcu_do_batch
0.06 ± 0%   -0.0  0.05 ± 0%   -0.0  0.05 ± 0%   perf-profile.self.cycles-pp.vma_dup_policy
0.12 ± 0%   -0.0  0.11 ± 0%   -0.0  0.11 ± 3%   perf-profile.self.cycles-pp.mas_wr_append
0.14 ± 3%   -0.0  0.13 ± 3%   -0.0  0.12 ± 3%   perf-profile.self.cycles-pp.x64_sys_call
0.11 ± 0%   +0.0  0.12 ± 0%   +0.0  0.12 ± 3%   perf-profile.self.cycles-pp.free_pgd_range
0.06 ± 5%   +0.0  0.07 ± 0%   +0.0  0.06 ± 5%   perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags
0.21 ± 0%   +0.0  0.22 ± 2%   -0.0  0.21 ± 2%   perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags
0.45 ± 1%   +0.0  0.48 ± 2%   +0.0  0.50 ± 1%   perf-profile.self.cycles-pp.do_vmi_munmap
0.27 ± 1%   +0.0  0.32 ± 2%   -0.0  0.26 ± 1%   perf-profile.self.cycles-pp.free_pgtables
0.36 ± 2%   +0.1  0.44 ± 1%   -0.0  0.35 ± 4%   perf-profile.self.cycles-pp.unlink_anon_vmas
1.07 ± 1%   +0.1  1.19 ± 0%   +0.1  1.22 ± 0%   perf-profile.self.cycles-pp.mas_next_slot
1.50 ± 0%   +0.5  2.02 ± 0%   +0.4  1.85 ± 0%   perf-profile.self.cycles-pp.mas_find
0.00 ± -1%  +1.4  1.38 ± 0%   +0.9  0.92 ± 0%   perf-profile.self.cycles-pp.can_modify_mm
3.15 ± 0%   +2.1  5.26 ± 0%   +1.5  4.62 ± 0%   perf-profile.self.cycles-pp.mas_walk

On Mon, Aug 19, 2024 at 02:35:40PM +0800, Oliver Sang wrote:
> hi, Jeff,
>
> On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
> > hi, Jeff,
> >
> > On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
> > > hi, Jeff,
> > >
> > > On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> > > > Hi Oliver
> > >
> > > [...]
> > > >
> > > > could you explicitly point to two commit-id?
> > > > sure
> > > >
> > > > this patch
> > > > 8be7258a: mseal: add mseal syscall
> > > > ff388fe5c: mseal: wire up mseal syscall
> > >
> > > I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"
> >
> > look your patch set again
> > [PATCH v1 1/2] mseal:selftest mremap across VMA boundaries
> > just for kselftests
> >
> > and I can apply
> > [PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm
> > upon "8be7258a: mseal: add mseal syscall" cleanly
> >
> > so I will start test for this [PATCH v1 2/2]
> >
> > BTW, I will firstly use our default setting - "60s testtime; reboot between each
> > run; run 10 times", since we already have the data for 8be7258a and ff388fe5c
> > then we could give you an update kind of quickly.
> >
> > as some private mail discussed, you want some special run method, could you
> > elaborate them here? thanks
>
> here is a quick update before you give us more details about special run method.
>
> by our default run method (60s testtime; reboot between each run; run 10 times),
> your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" could
> resolve regression partially.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>       4957        +1.3%       5023        +1.0%       5008       time.percent_of_cpu_this_job_got
>       2915        +1.5%       2959        +1.2%       2949       time.system_time
>      65.96        -7.3%      61.16        -5.5%      62.30       time.user_time
>   41535878        -4.0%   39873501        -2.6%   40452264       proc-vmstat.numa_hit
>   41466104        -4.0%   39806121        -2.6%   40384854       proc-vmstat.numa_local
>   77297398        -4.1%   74165258        -2.6%   75286134       proc-vmstat.pgalloc_normal
>   77016866        -4.1%   73886027        -2.6%   75012630       proc-vmstat.pgfree
>   18386219        -5.0%   17474214        -2.9%   17850959       stress-ng.pagemove.ops
>     306421        -5.0%     291207        -2.9%     297490       stress-ng.pagemove.ops_per_sec
>       4957        +1.3%       5023        +1.0%       5008       stress-ng.time.percent_of_cpu_this_job_got
>       2915        +1.5%       2959        +1.2%       2949       stress-ng.time.system_time
>  3.349e+10 ± 4%   +3.0%  3.447e+10 ± 2%   +4.1%  3.484e+10       perf-stat.i.branch-instructions
>       1.13        -2.1%       1.10        -2.2%       1.10       perf-stat.i.cpi
>       0.89        +2.2%       0.91        +2.0%       0.91       perf-stat.i.ipc
>       1.04        -6.9%       0.97        -4.9%       0.99       perf-stat.overall.MPKI
>       1.13        -2.3%       1.10        -2.0%       1.10       perf-stat.overall.cpi
>       1081        +5.0%       1136        +3.0%       1114       perf-stat.overall.cycles-between-cache-misses
>       0.89        +2.3%       0.91        +2.0%       0.91       perf-stat.overall.ipc
>  3.295e+10 ± 3%   +2.9%  3.392e+10 ± 2%   +4.0%  3.427e+10       perf-stat.ps.branch-instructions
>  1.674e+11 ± 3%   +1.8%  1.704e+11 ± 2%   +3.3%   1.73e+11       perf-stat.ps.instructions
>  1.046e+13        +2.7%  1.074e+13        +1.7%  1.064e+13       perf-stat.total.instructions
>
75.05 -2.0 73.02 -0.9 74.18 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 36.83 -1.6 35.19 -1.2 35.62 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 25.02 -1.4 23.65 -0.9 24.12 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 19.94 -1.1 18.87 -0.8 19.19 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 14.78 -0.8 14.01 -0.5 14.28 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 1.48 -0.5 0.99 -0.5 1.00 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 7.88 -0.4 7.47 -0.3 7.62 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 6.73 -0.4 6.37 -0.2 6.51 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.16 -0.3 5.82 -0.3 5.90 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 6.12 -0.3 5.79 -0.2 5.93 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 5.79 -0.3 5.48 -0.2 5.59 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 5.54 -0.3 5.25 -0.2 5.32 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 5.56 -0.3 5.28 -0.2 5.36 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap > 5.19 -0.3 4.92 -0.2 4.98 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap > 5.21 -0.3 4.95 -0.2 5.02 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > 4.09 
-0.2 3.85 -0.2 3.93 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma > 2.23 -0.2 2.05 -0.1 2.12 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 1.97 ± 3% -0.2 1.81 ± 3% -0.1 1.88 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > 3.11 ± 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > 2.98 -0.1 2.85 -0.1 2.87 ± 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > 1.27 ± 
2% -0.1 1.15 ± 4% -0.1 1.19 ± 6% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap > 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > 1.76 ± 2% -0.1 1.65 ± 2% -0.1 1.66 ± 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > 1.86 -0.1 1.75 -0.1 1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > 0.55 -0.1 0.46 ± 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > 1.54 -0.1 1.47 -0.0 1.49 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap > 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > 1.15 -0.1 1.09 -0.1 1.10 
perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > 0.60 ± 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.80 ± 2% -0.1 0.74 ± 2% -0.0 0.76 ± 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > 0.69 ± 2% -0.1 0.64 ± 3% -0.0 0.66 ± 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma > 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.86 -0.0 0.81 -0.0 0.83 perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma > 0.68 ± 2% -0.0 0.63 -0.0 
0.65 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap > 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.95 ± 2% -0.0 0.90 ± 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region > 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.70 -0.0 0.66 -0.0 0.67 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > 0.76 ± 2% -0.0 0.72 -0.0 0.72 ± 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > 1.01 -0.0 0.97 -0.0 0.99 perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas > 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.62 ± 2% -0.0 0.59 ± 2% -0.0 0.60 
perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > 0.65 -0.0 0.62 ± 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > 2.76 +0.0 2.78 ± 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap > 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise > 0.66 +0.1 0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > 83.81 +0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 0.00 +0.9 0.90 ± 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.2 1.21 +1.3 1.28 
perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > 5.34 +3.0 8.38 +1.6 6.92 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma > 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap > 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma > 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma > 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core > 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch > 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs > 17.55 -0.9 16.67 -0.5 17.02 perf-profile.children.cycles-pp.kmem_cache_free > 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge > 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free > 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry > 11.99 -0.6 11.36 -0.5 11.53 
perf-profile.children.cycles-pp.mas_store_prealloc > 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup > 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store > 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables > 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete > 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes > 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate > 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > 2.98 -0.3 2.69 ± 4% -0.2 2.80 ± 6% perf-profile.children.cycles-pp.__memcpy > 3.46 ± 2% -0.2 3.25 -0.1 3.36 ± 3% perf-profile.children.cycles-pp.mod_objcg_state > 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc > 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev > 2.92 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes > 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range > 3.47 -0.2 3.29 -0.2 3.32 ± 2% perf-profile.children.cycles-pp.down_write > 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook > 4.23 -0.2 4.07 -0.1 4.08 ± 2% perf-profile.children.cycles-pp.anon_vma_clone > 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region > 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp > 2.21 -0.1 2.07 -0.1 2.10 perf-profile.children.cycles-pp.__cond_resched > 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas > 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common > 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load > 2.24 -0.1 2.12 ± 2% -0.1 2.13 ± 3% perf-profile.children.cycles-pp.vma_prepare > 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func > 2.04 ± 2% -0.1 1.93 -0.1 1.96 ± 2% 
perf-profile.children.cycles-pp.allocate_slab > 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue > 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range > 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user > 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk > 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link > 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write > 0.97 ± 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs > 1.40 -0.1 1.32 -0.1 1.34 ± 2% perf-profile.children.cycles-pp.shuffle_freelist > 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev > 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup > 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range > 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot > 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap > 0.62 -0.1 0.56 -0.0 0.60 perf-profile.children.cycles-pp.security_mmap_addr > 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch > 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area > 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock > 0.49 ± 2% -0.1 0.43 -0.1 0.43 ± 2% perf-profile.children.cycles-pp.setup_object > 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range > 1.07 ± 2% -0.1 1.02 ± 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap > 0.70 ± 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret > 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop > 0.51 ± 3% -0.0 0.47 -0.0 0.49 ± 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert > 1.04 
-0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize > 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv > 0.44 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find > 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials > 0.45 ± 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock > 0.48 -0.0 0.45 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range > 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 > 0.31 ± 3% -0.0 0.28 ± 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm > 0.33 ± 3% -0.0 0.30 ± 2% -0.0 0.31 ± 4% perf-profile.children.cycles-pp.mas_put_in_tree > 0.32 ± 2% -0.0 0.29 ± 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu > 0.46 -0.0 0.44 ± 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue > 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy > 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented > 0.39 -0.0 0.37 -0.0 0.38 ± 2% perf-profile.children.cycles-pp.down_write_killable > 0.29 -0.0 0.27 ± 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu > 0.26 -0.0 0.24 ± 2% -0.0 0.25 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode > 0.16 ± 2% -0.0 0.14 ± 3% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.mas_wr_append > 0.30 ± 2% -0.0 0.28 ± 2% -0.0 0.29 ± 2% perf-profile.children.cycles-pp.__vm_enough_memory > 0.32 -0.0 0.30 ± 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock > 2.83 +0.0 2.85 ± 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas > 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags > 0.08 ± 5% +0.0 0.10 ± 3% -0.0 0.08 ± 6% 
perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags > 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables > 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise > 0.63 +0.1 0.70 +0.1 0.72 perf-profile.children.cycles-pp.__x64_sys_madvise > 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise > 0.00 +0.1 0.09 ± 3% +0.1 0.10 ± 5% perf-profile.children.cycles-pp.can_modify_mm_madv > 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot > 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap > 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap > 2.12 +1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap > 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk > 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to > 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find > 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm > 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free > 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > 1.96 -0.2 1.77 ± 4% -0.1 1.84 ± 6% perf-profile.self.cycles-pp.__memcpy > 2.36 -0.1 2.25 ± 2% -0.1 2.25 ± 3% perf-profile.self.cycles-pp.down_write > 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue > 2.33 -0.1 2.23 -0.1 2.28 perf-profile.self.cycles-pp.mtree_load > 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user > 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook > 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk > 1.44 -0.1 1.36 -0.1 1.38 ± 2% perf-profile.self.cycles-pp.__call_rcu_common > 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write > 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state > 0.69 ± 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs > 1.14 ± 2% -0.1 1.08 -0.0 1.09 ± 
2% perf-profile.self.cycles-pp.shuffle_freelist > 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge > 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap > 0.51 ± 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr > 0.62 -0.1 0.56 ± 2% -0.1 0.56 perf-profile.self.cycles-pp.mremap > 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc > 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot > 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate > 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes > 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node > 0.94 -0.0 0.90 -0.0 0.91 ± 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched > 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch > 0.94 ± 2% -0.0 0.89 ± 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap > 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop > 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma > 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry > 1.22 -0.0 1.18 -0.0 1.18 perf-profile.self.cycles-pp.move_vma > 0.43 ± 2% -0.0 0.40 ± 2% -0.0 0.40 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free > 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap > 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv > 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp > 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete > 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc > 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region > 0.36 ± 4% -0.0 0.33 ± 3% -0.0 0.34 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret > 0.55 -0.0 0.52 -0.0 0.53 
perf-profile.self.cycles-pp.get_old_pud > 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find > 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma > 0.43 ± 3% -0.0 0.40 -0.0 0.41 ± 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert > 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev > 0.71 -0.0 0.68 -0.0 0.70 perf-profile.self.cycles-pp.unmap_page_range > 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup > 0.47 -0.0 0.45 -0.0 0.46 ± 2% perf-profile.self.cycles-pp.flush_tlb_mm_range > 0.37 ± 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock > 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.40 -0.0 0.37 -0.0 0.38 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.27 -0.0 0.25 ± 2% -0.0 0.25 ± 3% perf-profile.self.cycles-pp.mas_put_in_tree > 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock > 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 0.27 ± 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu > 0.24 ± 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev > 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.mas_alloc_nodes > 0.40 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp.__pte_offset_map_lock > 0.14 ± 3% -0.0 0.12 ± 2% -0.0 0.13 ± 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > 0.26 -0.0 0.24 ± 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented > 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud > 0.28 -0.0 0.26 -0.0 0.27 ± 2% perf-profile.self.cycles-pp.flush_tlb_func > 0.20 ± 2% -0.0 0.19 -0.0 0.19 ± 2% perf-profile.self.cycles-pp.__get_unmapped_area > 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags > 0.06 -0.0 0.05 ± 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy > 0.06 ± 6% +0.0 0.07 -0.0 0.06 ± 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags > 0.11 
± 4% +0.0 0.12 ± 4% +0.0 0.12 ± 4% perf-profile.self.cycles-pp.free_pgd_range > 0.21 +0.0 0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags > 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap > 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables > 0.36 ± 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas > 1.07 +0.1 1.19 +0.2 1.22 perf-profile.self.cycles-pp.mas_next_slot > 1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find > 0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm > 3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk > > > > > > > > > > > > to avoid the impact of other changes, better to apply the patch upon 8be7258a > > > directly. > > > > > > if you prefer other base for this patch, please let us know. then we will > > > supply the results for 4 commits in fact: > > > > > > this patch > > > the base of this patch > > > 8be7258a: mseal: add mseal syscall > > > ff388fe5c: mseal: wire up mseal syscall > > > > > > > > > > > > > > > > > > > Thank you for your time and assistance in helping me on understanding > > > > > > this issue. > > > > > > > > > > due to resource constraint, please expect that we need several days to finish > > > > > this test request. > > > > No problem. > > > > > > > > Thanks for your help! > > > > -Jeff > > > > > > > > > > > > > > > > Best regards, > > > > > > -Jeff > > > > > > > > > > > > > -Jeff > > > > > > > > > > > > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/ > > > > > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c > > > > > > > > > > > > > > > > > > > > > > > > Jeff Xu (2): > > > > > > > > mseal:selftest mremap across VMA boundaries. 
> > > > > > > >   mseal: refactor mremap to remove can_modify_mm
> > > > > > > >
> > > > > > > >  mm/internal.h                           |  24 ++
> > > > > > > >  mm/mremap.c                             |  77 +++----
> > > > > > > >  mm/mseal.c                              |  17 --
> > > > > > > >  tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > > > > >  4 files changed, 353 insertions(+), 58 deletions(-)
> > > > > > > >
> > > > > > > > --
> > > > > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > > > > >
Hi Oliver

On Tue, Aug 20, 2024 at 11:19 PM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Jeff,
>
> here is an update per your test request.
>
> we extended the runtime to 600 seconds, and ran 10 times for each commit.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
> 1.886e+08 ±  0%    -5.0%  1.792e+08 ±  0%    -3.4%  1.821e+08 ±  0%  stress-ng.pagemove.ops
>    314345 ±  0%    -5.0%     298656 ±  0%    -3.4%     303565 ±  0%  stress-ng.pagemove.ops_per_sec
>
Thanks for testing with more samples. The result is reasonable and
consistent with the 60-second result. The remaining -3.4% reflects the
impact from munmap, which isn't covered by this patch.

>
> the score of stress-ng.pagemove.ops_per_sec has some difference with 60s
> run (list as below for comparison). but the trend is similar.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***60s***
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>  18386219 ±  0%    -5.0%   17474214 ±  0%    -2.9%   17850959 ±  0%  stress-ng.pagemove.ops
>    306421 ±  0%    -5.0%     291207 ±  0%    -2.9%     297490 ±  0%  stress-ng.pagemove.ops_per_sec
>
>
> since the data is stable, %stddev shows as "± 0%" in both above tables.
> let me give out the detailed data for 600s runs.
>
> for
> ff388fe5c4 ("mseal: wire up mseal syscall")
>
>   "stress-ng.pagemove.ops": [
>     188545955,
>     188681834,
>     188907282,
>     188345009,
>     188729465,
>     188312187,
>     188897283,
>     188209713,
>     188425965,
>     189026136
>   ],
>   "stress-ng.pagemove.ops_per_sec": [
>     314242.1,
>     314467.13,
>     314841.5,
>     313907.19,
>     314548.11,
>     313852.5,
>     314827.84,
>     313680.74,
>     314042.14,
>     315042.79
>   ],
>
> for
> 8be7258aad ("mseal: add mseal syscall")
>
>   "stress-ng.pagemove.ops": [
>     179127848,
>     179401350,
>     179350278,
>     179023817,
>     179106624,
>     179535213,
>     178936504,
>     178870141,
>     179462171,
>     179136065
>   ],
>   "stress-ng.pagemove.ops_per_sec": [
>     298545.54,
>     299000.95,
>     298915.62,
>     298371.45,
>     298509.15,
>     299223.65,
>     298226.74,
>     298115.08,
>     299101.23,
>     298558.74
>   ],
>
> for
> 2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
>   "stress-ng.pagemove.ops": [
>     182188207,
>     182288813,
>     182483678,
>     181980233,
>     182249440,
>     181837961,
>     182155893,
>     181699445,
>     182347580,
>     182174597
>   ],
>   "stress-ng.pagemove.ops_per_sec": [
>     303643.28,
>     303814.05,
>     304138.38,
>     303298.9,
>     303747.33,
>     303060.84,
>     303592.48,
>     302831.56,
>     303909.81,
>     303622.07
>   ],
>
>
> for 600s run, below is the full comparison.
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/***600s***
>
> commit:
>   ff388fe5c4 ("mseal: wire up mseal syscall")
>   8be7258aad ("mseal: add mseal syscall")
>   2a78ece39f  <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
>
> ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>      4667 ±  0%    -2.4%       4553 ±  0%    -1.6%       4593 ±  0%  vmstat.system.cs
> 4.192e+08 ±  0%    -4.3%  4.012e+08 ±  0%    -2.8%  4.075e+08 ±  0%  proc-vmstat.numa_hit
> 4.192e+08 ±  0%    -4.3%  4.011e+08 ±  0%    -2.8%  4.074e+08 ±  0%  proc-vmstat.numa_local
> 7.843e+08 ±  0%    -4.3%  7.504e+08 ±  0%    -2.8%  7.623e+08 ±  0%  proc-vmstat.pgalloc_normal
> 7.836e+08 ±  0%    -4.3%  7.498e+08 ±  0%    -2.8%  7.616e+08 ±  0%  proc-vmstat.pgfree
>   1174825 ±  0%    -2.6%    1143891 ±  0%    -1.7%    1155336 ±  0%  time.involuntary_context_switches
>      5082 ±  0%    +1.3%       5147 ±  0%    +0.9%       5126 ±  0%  time.percent_of_cpu_this_job_got
>     29840 ±  0%    +1.4%      30267 ±  0%    +1.0%      30133 ±  0%  time.system_time
>    663.58 ±  1%    -5.7%     625.54 ±  1%    -4.3%     635.17 ±  0%  time.user_time
> 1.886e+08 ±  0%    -5.0%  1.792e+08 ±  0%    -3.4%  1.821e+08 ±  0%  stress-ng.pagemove.ops
>    314345 ±  0%    -5.0%     298656 ±  0%    -3.4%     303565 ±  0%  stress-ng.pagemove.ops_per_sec
>    212508 ±  0%    -4.3%     203280 ±  0%    -3.1%     205831 ±  0%  stress-ng.pagemove.page_remaps_per_sec
>   1174825 ±  0%    -2.6%    1143891 ±  0%    -1.7%    1155336 ±  0%  stress-ng.time.involuntary_context_switches
>      5082 ±  0%    +1.3%       5147 ±  0%    +0.9%       5126 ±  0%  stress-ng.time.percent_of_cpu_this_job_got
>     29840 ±  0%    +1.4%      30267 ±  0%    +1.0%      30133 ±  0%  stress-ng.time.system_time
>    663.58 ±  1%    -5.7%     625.54 ±  1%    -4.3%     635.17 ±  0%  stress-ng.time.user_time
>      1.00 ±  0%    -7.1%       0.93 ±  0%    -4.9%       0.95 ±  0%  perf-stat.i.MPKI
> 3.487e+10 ±  0%    +3.5%  3.607e+10 ±  0%    +2.4%   3.57e+10 ±  0%  perf-stat.i.branch-instructions
>      0.21 ±  0%     -0.0       0.19 ±  3%     -0.0       0.20 ±  0%  perf-stat.i.branch-miss-rate%
> 1.763e+08 ±  0%    -5.0%  1.675e+08 ±  0%    -3.4%  1.704e+08 ±  0%  perf-stat.i.cache-misses
> 2.342e+08 ±  0%    -4.9%  2.228e+08 ±  0%    -3.3%  2.264e+08 ±  0%  perf-stat.i.cache-references
>      4650 ±  0%    -2.4%       4537 ±  0%    -1.5%       4578 ±  0%  perf-stat.i.context-switches
>      1.11 ±  0%    -2.2%       1.09 ±  0%    -1.6%       1.10 ±  0%  perf-stat.i.cpi
>    172.66 ±  0%    -2.8%     167.77 ±  0%    -1.8%     169.52 ±  0%  perf-stat.i.cpu-migrations
>      1121 ±  0%    +5.2%       1180 ±  0%    +3.5%       1160 ±  0%  perf-stat.i.cycles-between-cache-misses
> 1.772e+11 ±  0%    +2.2%  1.812e+11 ±  0%    +1.6%  1.801e+11 ±  0%  perf-stat.i.instructions
>      0.90 ±  0%    +2.3%       0.92 ±  0%    +1.6%       0.91 ±  0%  perf-stat.i.ipc
>      0.99 ±  0%    -7.1%       0.92 ±  0%    -4.9%       0.95 ±  0%  perf-stat.overall.MPKI
>      0.21 ±  0%     -0.0       0.19 ±  3%     -0.0       0.20 ±  0%  perf-stat.overall.branch-miss-rate%
>      1.11 ±  0%    -2.2%       1.09 ±  0%    -1.6%       1.10 ±  0%  perf-stat.overall.cpi
>      1120 ±  0%    +5.2%       1179 ±  0%    +3.5%       1159 ±  0%  perf-stat.overall.cycles-between-cache-misses
>      0.90 ±  0%    +2.3%       0.92 ±  0%    +1.6%       0.91 ±  0%  perf-stat.overall.ipc
>  3.48e+10 ±  0%    +3.5%    3.6e+10 ±  0%    +2.4%  3.563e+10 ±  0%  perf-stat.ps.branch-instructions
> 1.759e+08 ±  0%    -5.0%  1.672e+08 ±  0%    -3.4%    1.7e+08 ±  0%  perf-stat.ps.cache-misses
> 2.338e+08 ±  0%    -4.9%  2.224e+08 ±  0%    -3.3%   2.26e+08 ±  0%  perf-stat.ps.cache-references
>      4642 ±  0%    -2.4%       4529 ±  0%    -1.5%       4570 ±  0%  perf-stat.ps.context-switches
>    172.30 ±  0%    -2.8%     167.43 ±  0%    -1.8%     169.17 ±  0%  perf-stat.ps.cpu-migrations
> 1.769e+11 ±  0%    +2.3%  1.808e+11 ±  0%    +1.6%  1.797e+11 ±  0%  perf-stat.ps.instructions
> 1.063e+14 ±  0%    +2.3%  1.087e+14 ±  0%    +1.7%  1.081e+14 ±  0%  perf-stat.total.instructions
>     74.86 ±  0%     -2.1      72.76 ±  0%     -0.8      74.06 ±  0%  perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>     36.72 ±  0%     -1.7      35.04 ±  0%     -1.2      35.54 ±  0%  perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
>     24.93 ±  0%     -1.4      23.54 ±  0%     -0.8      24.12 ±  0%  perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>     19.91 ±  0%     -1.1      18.79 ±  0%     -0.7      19.17 ±  0%  perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>     14.71 ±  0%     -0.8      13.90 ±  0%     -0.4      14.30 ±  0%  perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
>     10.82 ±  2%     -0.6      10.22 ±  2%     -0.6      10.25 ±  2%  perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>     10.81 ±  2%     -0.6      10.21 ±  2%     -0.6      10.24 ±  2%  perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>     10.81 ±  2%     -0.6      10.21 ±  2%     -0.6      10.24 ±  2%  perf-profile.calltrace.cycles-pp.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
>     10.80 ±  2%     -0.6      10.21 ±  2%     -0.6      10.23 ±  2%  perf-profile.calltrace.cycles-pp.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn.kthread
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
>     10.76 ±  2%     -0.6      10.17 ±  2%     -0.6      10.20 ±  2%  perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd.smpboot_thread_fn
>      1.49 ±  1%     -0.5       0.98 ±  0%     -0.5       1.00 ±  0%  perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
>      7.86 ±  0%     -0.4       7.48 ±  0%     -0.3       7.59 ±  0%  perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      6.72 ±  0%     -0.4       6.37 ±  0%     -0.2       6.49 ±  0%  perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      6.06 ±  2%     -0.3       5.71 ±  2%     -0.3       5.73 ±  2%  perf-profile.calltrace.cycles-pp.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs.run_ksoftirqd
>      6.11 ±  0%     -0.3       5.77 ±  0%     -0.2       5.90 ±  0%  perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      6.11 ±  0%     -0.3       5.78 ±  1%     -0.2       5.90 ±  0%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap
>      5.50 ±  0%     -0.3       5.19 ±  0%     -0.2       5.31 ±  0%  perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap
>      5.52 ±  0%     -0.3       5.22 ±  0%     -0.2       5.35 ±  0%  perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      5.15 ±  0%     -0.3       4.86 ±  0%     -0.2       4.97 ±  0%  perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap
>      5.77 ±  0%     -0.3       5.48 ±  0%     -0.2       5.58 ±  0%  perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
>      5.16 ±  0%     -0.3       4.88 ±  0%     -0.1       5.01 ±  0%  perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma
>      4.72 ±  2%     -0.3       4.44 ±  2%     -0.3       4.45 ±  2%  perf-profile.calltrace.cycles-pp.__slab_free.kmem_cache_free.rcu_do_batch.rcu_core.handle_softirqs
>      4.64 ±  0%     -0.3       4.38 ±  0%     -0.1       4.51 ±  1%  perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma
>      4.07 ±  0%     -0.2       3.84 ±  0%     -0.2       3.92 ±  0%  perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
>      3.96 ±  1%     -0.2       3.76 ±  1%     -0.1       3.88 ±  1%  perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma
>      3.54 ±  0%     -0.2       3.34 ±  0%     -0.1       3.41 ±  1%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap
>     38.68 ±  0%     -0.2      38.49 ±  0%     +0.4      39.05 ±  0%  perf-profile.calltrace.cycles-pp.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      0.55 ±  1%     -0.2       0.36 ± 65%     -0.0       0.52 ±  1%  perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
>      3.41 ±  0%     -0.2       3.22 ±  0%     -0.1       3.28 ±  0%  perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap
>      1.35 ±  0%     -0.2       1.17 ±  0%     -0.1       1.23 ±  0%  perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
>      2.22 ±  0%     -0.2       2.05 ±  0%     -0.1       2.12 ±  0%  perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
>      2.27 ±  0%     -0.2       2.10 ±  0%     -0.1       2.15 ±  0%  perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      3.25 ±  0%     -0.2       3.08 ±  0%     -0.1       3.14 ±  0%  perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      3.12 ±  2%     -0.2       2.97 ±  2%     -0.1       3.04 ±  2%  perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
>      0.96 ±  0%     -0.1       0.82 ±  1%     -0.1       0.87 ±  1%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to
>      2.98 ±  1%     -0.1       2.84 ±  1%     -0.1       2.89 ±  2%  perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      8.19 ±  0%     -0.1       8.05 ±  0%     -0.1       8.04 ±  0%  perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      3.13 ±  0%     -0.1       3.00 ±  0%     -0.1       3.06 ±  0%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      0.53 ±  1%     -0.1       0.41 ± 50%     -0.2       0.30 ± 81%  perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown_vmflags.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap
>      1.73 ±  2%     -0.1       1.61 ±  2%     -0.0       1.70 ±  3%  perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      2.14 ±  2%     -0.1       2.02 ±  2%     -0.0       2.09 ±  2%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap
>      2.46 ±  0%     -0.1       2.34 ±  0%     -0.1       2.38 ±  0%  perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma
>      2.04 ±  0%     -0.1       1.93 ±  0%     -0.1       1.96 ±  0%  perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      1.85 ±  0%     -0.1       1.74 ±  0%     -0.1       1.78 ±  0%  perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64
>      2.22 ±  0%     -0.1       2.12 ±  0%     -0.1       2.15 ±  0%  perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables
>      1.40 ±  0%     -0.1       1.30 ±  0%     -0.1       1.33 ±  0%  perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap
>      0.56 ±  1%     -0.1       0.46 ± 33%     -0.0       0.54 ±  2%  perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma
>      1.80 ±  2%     -0.1       1.70 ±  2%     -0.1       1.74 ±  2%  perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
>      2.43 ±  0%     -0.1       2.33 ±  0%     -0.1       2.37 ±  0%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
>      1.25 ±  0%     -0.1       1.15 ±  1%     -0.1       1.19 ±  0%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap
>      0.94 ±  1%     -0.1       0.86 ±  0%     -0.1       0.87 ±  0%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap
>      1.38 ±  0%     -0.1       1.30 ±  0%     -0.1       1.33 ±  1%  perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma
>      1.22 ±  0%     -0.1       1.14 ±  0%     -0.1       1.17 ±  1%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma
>      1.28 ±  0%     -0.1       1.21 ±  0%     -0.0       1.23 ±  0%  perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      1.54 ±  1%     -0.1       1.46 ±  0%     -0.0       1.49 ±  0%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
>      1.15 ±  0%     -0.1       1.08 ±  1%     -0.1       1.09 ±  0%  perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap
>      0.73 ±  1%     -0.1       0.67 ±  1%     -0.0       0.69 ±  1%  perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap
>      0.72 ±  0%     -0.1       0.66 ±  1%     -0.0       0.69 ±  1%  perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      1.64 ±  1%     -0.1       1.58 ±  0%     -0.1       1.58 ±  0%  perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      0.78 ±  1%     -0.1       0.72 ±  1%     -0.0       0.75 ±  1%  perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma
>      0.63 ±  1%     -0.1       0.57 ±  1%     -0.0       0.60 ±  1%  perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma
>      0.69 ±  2%     -0.1       0.63 ±  4%     -0.0       0.66 ±  2%  perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma
>      0.60 ±  1%     -0.1       0.54 ±  1%     -0.0       0.58 ±  1%  perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
>      0.79 ±  2%     -0.1       0.74 ±  3%     -0.0       0.75 ±  2%  perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge
>      1.12 ±  0%     -0.0       1.08 ±  0%     -0.0       1.09 ±  1%  perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap
>      0.67 ±  1%     -0.0       0.62 ±  1%     -0.0       0.63 ±  1%  perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      0.77 ±  1%     -0.0       0.72 ±  1%     -0.0       0.73 ±  1%  perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge
>      1.01 ±  1%     -0.0       0.96 ±  0%     -0.0       0.98 ±  0%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
>      0.86 ±  0%     -0.0       0.81 ±  1%     -0.0       0.83 ±  1%  perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64
>      0.82 ±  1%     -0.0       0.78 ±  1%     -0.0       0.79 ±  1%  perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>      1.01 ±  0%     -0.0       0.97 ±  0%     -0.0       0.98 ±  0%  perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      0.98 ±  1%     -0.0       0.94 ±  0%     -0.0       0.94 ±  1%  perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      0.78 ±  0%     -0.0       0.74 ±  1%     -0.0       0.75 ±  1%  perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap
>      0.68 ±  0%     -0.0       0.64 ±  1%     -0.0       0.65 ±  0%  perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma
>      0.68 ±  1%     -0.0       0.64 ±  1%     -0.0       0.64 ±  1%  perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap
>      0.89 ±  1%     -0.0       0.85 ±  1%     -0.0       0.86 ±  1%  perf-profile.calltrace.cycles-pp.mtree_load.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      0.62 ±  1%     -0.0       0.58 ±  2%     -0.0       0.59 ±  1%  perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64
>      0.62 ±  1%     -0.0       0.58 ±  1%     -0.0       0.59 ±  1%  perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      0.76 ±  1%     -0.0       0.72 ±  1%     -0.0       0.73 ±  1%  perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma
>      1.01 ±  0%     -0.0       0.97 ±  1%     -0.0       0.98 ±  1%  perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap
>      0.64 ±  1%     -0.0       0.60 ±  1%     -0.0       0.61 ±  1%  perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma
>      0.88 ±  1%     -0.0       0.85 ±  0%     -0.0       0.85 ±  0%  perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>      0.69 ±  1%     -0.0       0.66 ±  1%     -0.0       0.67 ±  0%  perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap
>      0.59 ±  1%     -0.0       0.56 ±  1%     -0.0       0.56 ±  0%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap
>      0.82 ±  1%     -0.0       0.82 ±  1%     -0.0       0.79 ±  1%  perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64
>      0.76 ±  1%     +0.1       0.83 ±  0%     +0.1       0.84 ±  0%  perf-profile.calltrace.cycles-pp.__madvise
>      0.67 ±  1%     +0.1       0.73 ±  1%     +0.1       0.75 ±  1%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
>      0.63 ±  1%     +0.1       0.70 ±  1%     +0.1       0.71 ±  0%  perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
>      0.62 ±  1%     +0.1       0.69 ±  1%     +0.1       0.71 ±  0%  perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
>      0.66 ±  1%     +0.1       0.73 ±  1%     +0.1       0.74 ±  0%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
>     87.57 ±  0%     +0.6      88.14 ±  0%     +0.5      88.09 ±  0%  perf-profile.calltrace.cycles-pp.mremap
>     84.74 ±  0%     +0.7      85.47 ±  0%     +0.6      85.37 ±  0%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mremap
>     84.58 ±  0%     +0.7      85.32 ±  0%     +0.6      85.22 ±  0%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>     83.64 ±  0%     +0.8      84.41 ±  0%     +0.7      84.30 ±  0%  perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>      0.00 ± -1%     +0.9       0.86 ±  0%     +0.9       0.92 ±  0%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap
>      0.00 ± -1%     +0.9       0.87 ±  0%     +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap
>      0.00 ± -1%     +0.9       0.91 ±  2%     +0.9       0.92 ±  1%  perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma
>      0.00 ± -1%     +1.1       1.09 ±  0%     +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64
>      0.00 ± -1%     +1.2       1.21 ±  0%     +1.3       1.29 ±  0%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to
>      2.10 ±  0%     +1.5       3.61 ±  0%     +1.7       3.79 ±  0%  perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      0.00 ± -1%     +1.5       1.51 ±  1%     +1.5       1.52 ±  0%  perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap
>      1.60 ±  0%     +1.5       3.13 ±  0%     +1.7       3.31 ±  0%  perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64
>      0.00 ± -1%     +1.6       1.60 ±  0%     +0.0       0.00 ± -1%  perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      0.00 ± -1%     +1.7       1.73 ±  0%     +1.8       1.84 ±  0%  perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap
>      0.00 ± -1%     +2.0       2.00 ±  1%     +2.0       2.04 ±  0%  perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64
>      5.35 ±  0%     +3.0       8.37 ±  0%     +1.6       6.92 ±  0%  perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap
>     75.03 ±  0%     -2.1      72.92 ±  0%     -0.8      74.22 ±  0%  perf-profile.children.cycles-pp.move_vma
>     36.94 ±  0%     -1.7      35.25 ±  0%     -1.2      35.75 ±  0%  perf-profile.children.cycles-pp.do_vmi_align_munmap
>     25.01 ±  0%     -1.4      23.61 ±  0%     -0.8      24.19 ±  0%  perf-profile.children.cycles-pp.copy_vma
>     20.00 ±  0%     -1.1      18.88 ±  0%     -0.7      19.26 ±  0%  perf-profile.children.cycles-pp.__split_vma
>     19.92 ±  0%     -1.1      18.84 ±  0%     -0.8      19.14 ±  0%  perf-profile.children.cycles-pp.handle_softirqs
>     19.90 ±  0%     -1.1      18.82 ±  0%     -0.8      19.12 ±  0%  perf-profile.children.cycles-pp.rcu_core
>     19.88 ±  0%     -1.1      18.80 ±  0%     -0.8      19.10 ±  0%  perf-profile.children.cycles-pp.rcu_do_batch
>     17.57 ±  0%     -0.9      16.66 ±  0%     -0.6      16.94 ±  0%  perf-profile.children.cycles-pp.kmem_cache_free
>     15.29 ±  0%     -0.9      14.43 ±  0%     -0.5      14.75 ±  0%  perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
>     15.11 ±  0%     -0.8      14.27 ±  0%     -0.4      14.68 ±  0%  perf-profile.children.cycles-pp.vma_merge
>     12.15 ±  0%     -0.7      11.46 ±  0%     -0.5      11.65 ±  0%  perf-profile.children.cycles-pp.__slab_free
>     12.11 ±  0%     -0.7      11.43 ±  0%     -0.4      11.71 ±  0%  perf-profile.children.cycles-pp.mas_wr_store_entry
>     11.90 ±  0%     -0.7      11.24 ±  0%     -0.4      11.50 ±  0%  perf-profile.children.cycles-pp.mas_store_prealloc
>     10.82 ±  2%     -0.6      10.22 ±  2%     -0.6      10.25 ±  2%  perf-profile.children.cycles-pp.smpboot_thread_fn
>     10.81 ±  2%     -0.6      10.21 ±  2%     -0.6      10.24 ±  2%  perf-profile.children.cycles-pp.run_ksoftirqd
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.children.cycles-pp.kthread
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.children.cycles-pp.ret_from_fork
>     10.85 ±  2%     -0.6      10.26 ±  2%     -0.6      10.28 ±  2%  perf-profile.children.cycles-pp.ret_from_fork_asm
>     10.85 ±  0%     -0.6      10.26 ±  0%     -0.4      10.47 ±  0%  perf-profile.children.cycles-pp.vm_area_dup
>      9.81 ±  0%     -0.5       9.28 ±  0%     -0.3       9.52 ±  0%  perf-profile.children.cycles-pp.mas_wr_node_store
>      8.38 ±  1%     -0.5       7.90 ±  1%     -0.2       8.13 ±  1%  perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
>      7.98 ±  0%     -0.4       7.58 ±  0%     -0.3       7.70 ±  0%  perf-profile.children.cycles-pp.move_page_tables
>      6.66 ±  0%     -0.4       6.29 ±  0%     -0.2       6.43 ±  0%  perf-profile.children.cycles-pp.vma_complete
>      5.12 ±  0%     -0.3       4.79 ±  0%     -0.2       4.88 ±  0%  perf-profile.children.cycles-pp.mas_preallocate
>      6.05 ±  0%     -0.3       5.72 ±  0%     -0.2       5.82 ±  0%  perf-profile.children.cycles-pp.vm_area_free_rcu_cb
>      5.85 ±  0%     -0.3       5.56 ±  0%     -0.2       5.66 ±  0%  perf-profile.children.cycles-pp.move_ptes
>      3.51 ±  1%     -0.2       3.28 ±  2%     -0.1       3.37 ±  1%  perf-profile.children.cycles-pp.mod_objcg_state
>      3.45 ±  0%     -0.2       3.24 ±  0%     -0.2       3.30 ±  0%  perf-profile.children.cycles-pp.___slab_alloc
>      2.91 ±  0%     -0.2       2.71 ±  0%     -0.1       2.78 ±  0%  perf-profile.children.cycles-pp.mas_alloc_nodes
>      3.47 ±  0%     -0.2       3.27 ±  0%     -0.1       3.34 ±  0%  perf-profile.children.cycles-pp.flush_tlb_mm_range
>      3.43 ±  1%     -0.2       3.24 ±  1%     -0.1       3.35 ±  2%  perf-profile.children.cycles-pp.down_write
>      2.44 ±  0%     -0.2       2.25 ±  0%     -0.1       2.32 ±  0%  perf-profile.children.cycles-pp.find_vma_prev
>      4.24 ±  1%     -0.2       4.06 ±  1%     -0.1       4.11 ±  1%  perf-profile.children.cycles-pp.anon_vma_clone
>      3.35 ±  0%     -0.2       3.18 ±  0%     -0.1       3.24 ±  0%  perf-profile.children.cycles-pp.mas_store_gfp
>      2.21 ±  1%     -0.2       2.05 ±  0%     -0.1       2.10 ±  0%  perf-profile.children.cycles-pp.__cond_resched
>      3.32 ±  0%     -0.2       3.17 ±  1%     -0.1       3.24 ±  0%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
>      8.26 ±  0%     -0.1       8.12 ±  0%     -0.1       8.11 ±  0%  perf-profile.children.cycles-pp.unmap_region
>      2.22 ±  1%     -0.1       2.08 ±  1%     -0.1       2.16 ±  3%  perf-profile.children.cycles-pp.vma_prepare
>      2.67 ±  0%     -0.1       2.54 ±  0%     -0.1       2.58 ±  0%  perf-profile.children.cycles-pp.mtree_load
>      3.18 ±  0%     -0.1       3.05 ±  0%     -0.1       3.11 ±  0%  perf-profile.children.cycles-pp.unmap_vmas
>      2.46 ±  0%     -0.1       2.34 ±  0%     -0.1       2.38 ±  0%  perf-profile.children.cycles-pp.rcu_cblist_dequeue
>      2.50 ±  0%     -0.1       2.39 ±  0%     -0.1       2.43 ±  0%  perf-profile.children.cycles-pp.flush_tlb_func
>      2.11 ±  1%     -0.1       2.00 ±  1%     -0.1       2.02 ±  1%  perf-profile.children.cycles-pp.__call_rcu_common
>      2.04 ±  1%     -0.1       1.93 ±  1%     -0.1       1.95 ±  1%  perf-profile.children.cycles-pp.allocate_slab
>      1.77 ±  1%     -0.1       1.66 ±  0%     -0.1       1.69 ±  1%  perf-profile.children.cycles-pp.mas_wr_walk
>      1.87 ±  0%     -0.1       1.77 ±  0%     -0.1       1.80 ±  0%  perf-profile.children.cycles-pp.vma_link
>
2.24 ą 0% -0.1 2.13 ą 0% -0.1 2.17 ą 0% perf-profile.children.cycles-pp.native_flush_tlb_one_user > 1.85 ą 1% -0.1 1.74 ą 0% -0.1 1.79 ą 2% perf-profile.children.cycles-pp.up_write > 2.48 ą 0% -0.1 2.38 ą 0% -0.1 2.42 ą 0% perf-profile.children.cycles-pp.unmap_page_range > 0.97 ą 2% -0.1 0.88 ą 1% -0.1 0.90 ą 1% perf-profile.children.cycles-pp.rcu_all_qs > 1.04 ą 0% -0.1 0.95 ą 1% -0.0 0.99 ą 1% perf-profile.children.cycles-pp.mas_prev > 1.24 ą 0% -0.1 1.16 ą 0% -0.1 1.19 ą 0% perf-profile.children.cycles-pp.mas_prev_slot > 0.93 ą 0% -0.1 0.85 ą 1% -0.0 0.88 ą 1% perf-profile.children.cycles-pp.mas_prev_setup > 1.39 ą 1% -0.1 1.31 ą 1% -0.1 1.33 ą 1% perf-profile.children.cycles-pp.shuffle_freelist > 1.52 ą 0% -0.1 1.45 ą 0% -0.0 1.48 ą 0% perf-profile.children.cycles-pp.mas_update_gap > 1.58 ą 1% -0.1 1.50 ą 0% -0.0 1.53 ą 0% perf-profile.children.cycles-pp.zap_pmd_range > 0.87 ą 1% -0.1 0.80 ą 0% -0.1 0.82 ą 1% perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 1.68 ą 1% -0.1 1.62 ą 0% -0.1 1.62 ą 0% perf-profile.children.cycles-pp.__get_unmapped_area > 0.90 ą 1% -0.1 0.84 ą 0% -0.0 0.86 ą 1% perf-profile.children.cycles-pp.percpu_counter_add_batch > 0.62 ą 1% -0.1 0.56 ą 1% -0.0 0.60 ą 1% perf-profile.children.cycles-pp.security_mmap_addr > 0.49 ą 1% -0.1 0.44 ą 1% -0.1 0.44 ą 1% perf-profile.children.cycles-pp.setup_object > 1.02 ą 0% -0.1 0.97 ą 1% -0.0 0.99 ą 0% perf-profile.children.cycles-pp.mas_leaf_max_gap > 0.98 ą 1% -0.0 0.93 ą 1% -0.0 0.94 ą 1% perf-profile.children.cycles-pp.mas_pop_node > 1.22 ą 1% -0.0 1.18 ą 1% -0.0 1.19 ą 1% perf-profile.children.cycles-pp.__pte_offset_map_lock > 0.45 ą 2% -0.0 0.40 ą 2% -0.0 0.41 ą 1% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > 1.18 ą 0% -0.0 1.13 ą 0% -0.0 1.15 ą 1% perf-profile.children.cycles-pp.clear_bhb_loop > 1.08 ą 1% -0.0 1.03 ą 0% -0.0 1.05 ą 0% perf-profile.children.cycles-pp.zap_pte_range > 1.04 ą 0% -0.0 1.00 ą 0% -0.0 1.01 ą 0% 
perf-profile.children.cycles-pp.vma_to_resize > 0.58 ą 1% -0.0 0.53 ą 1% -0.0 0.54 ą 1% perf-profile.children.cycles-pp.mas_wr_end_piv > 0.34 ą 2% -0.0 0.30 ą 5% -0.0 0.31 ą 4% perf-profile.children.cycles-pp.get_partial_node > 0.64 ą 1% -0.0 0.61 ą 2% -0.0 0.61 ą 1% perf-profile.children.cycles-pp.get_old_pud > 0.62 ą 0% -0.0 0.59 ą 0% -0.0 0.59 ą 1% perf-profile.children.cycles-pp.__put_partials > 1.14 ą 0% -0.0 1.10 ą 1% -0.0 1.12 ą 1% perf-profile.children.cycles-pp.mt_find > 0.90 ą 0% -0.0 0.87 ą 0% -0.0 0.87 ą 0% perf-profile.children.cycles-pp.userfaultfd_unmap_complete > 0.61 ą 1% -0.0 0.58 ą 1% -0.0 0.59 ą 0% perf-profile.children.cycles-pp.entry_SYSCALL_64 > 0.32 ą 2% -0.0 0.29 ą 3% -0.0 0.30 ą 4% perf-profile.children.cycles-pp.security_vm_enough_memory_mm > 0.54 ą 1% -0.0 0.52 ą 1% -0.0 0.52 ą 1% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown_vmflags > 0.55 ą 1% -0.0 0.52 ą 1% -0.0 0.54 ą 1% perf-profile.children.cycles-pp.refill_obj_stock > 0.45 ą 1% -0.0 0.43 ą 2% -0.0 0.43 ą 2% perf-profile.children.cycles-pp.__alloc_pages_noprof > 0.43 ą 1% -0.0 0.41 ą 2% -0.0 0.41 ą 2% perf-profile.children.cycles-pp.get_page_from_freelist > 0.17 ą 1% -0.0 0.15 ą 3% -0.0 0.16 ą 1% perf-profile.children.cycles-pp.get_any_partial > 0.32 ą 1% -0.0 0.30 ą 1% -0.0 0.30 ą 1% perf-profile.children.cycles-pp.pte_offset_map_nolock > 0.40 ą 0% -0.0 0.38 ą 1% -0.0 0.39 ą 1% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 0.28 ą 2% -0.0 0.26 ą 2% -0.0 0.27 ą 1% perf-profile.children.cycles-pp.khugepaged_enter_vma > 0.32 ą 1% -0.0 0.30 ą 1% -0.0 0.30 ą 2% perf-profile.children.cycles-pp.mas_wr_store_setup > 0.19 ą 4% -0.0 0.17 ą 4% -0.0 0.18 ą 6% perf-profile.children.cycles-pp.cap_vm_enough_memory > 0.29 ą 1% -0.0 0.27 ą 2% -0.0 0.28 ą 3% perf-profile.children.cycles-pp.tlb_gather_mmu > 0.09 ą 4% -0.0 0.07 ą 6% -0.0 0.08 ą 5% perf-profile.children.cycles-pp.vma_dup_policy > 0.16 ą 3% -0.0 0.14 ą 2% -0.0 0.14 ą 2% 
perf-profile.children.cycles-pp.mas_wr_append > 0.22 ą 2% -0.0 0.20 ą 3% -0.0 0.20 ą 3% perf-profile.children.cycles-pp.__rmqueue_pcplist > 0.20 ą 2% -0.0 0.18 ą 2% -0.0 0.19 ą 3% perf-profile.children.cycles-pp.__thp_vma_allowable_orders > 0.24 ą 2% -0.0 0.23 ą 2% -0.0 0.23 ą 2% perf-profile.children.cycles-pp.free_pcppages_bulk > 0.44 ą 1% +0.0 0.45 ą 1% +0.0 0.46 ą 1% perf-profile.children.cycles-pp.mremap_userfaultfd_prep > 0.85 ą 1% +0.0 0.85 ą 1% -0.0 0.81 ą 1% perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags > 0.13 ą 3% +0.0 0.14 ą 3% +0.0 0.15 ą 2% perf-profile.children.cycles-pp.free_pgd_range > 0.08 ą 8% +0.0 0.10 ą 3% -0.0 0.08 ą 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags > 0.78 ą 1% +0.1 0.84 ą 0% +0.1 0.86 ą 0% perf-profile.children.cycles-pp.__madvise > 0.63 ą 1% +0.1 0.70 ą 1% +0.1 0.72 ą 0% perf-profile.children.cycles-pp.__x64_sys_madvise > 0.63 ą 1% +0.1 0.70 ą 0% +0.1 0.71 ą 0% perf-profile.children.cycles-pp.do_madvise > 0.00 ą -1% +0.1 0.09 ą 0% +0.1 0.09 ą 5% perf-profile.children.cycles-pp.can_modify_mm_madv > 1.32 ą 1% +0.1 1.46 ą 0% +0.2 1.50 ą 0% perf-profile.children.cycles-pp.mas_next_slot > 87.96 ą 0% +0.6 88.52 ą 0% +0.5 88.48 ą 0% perf-profile.children.cycles-pp.mremap > 85.91 ą 0% +0.8 86.69 ą 0% +0.7 86.61 ą 0% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 83.74 ą 0% +0.8 84.52 ą 0% +0.7 84.40 ą 0% perf-profile.children.cycles-pp.__do_sys_mremap > 85.42 ą 0% +0.8 86.23 ą 0% +0.7 86.14 ą 0% perf-profile.children.cycles-pp.do_syscall_64 > 40.36 ą 0% +1.4 41.74 ą 0% +2.1 42.49 ą 0% perf-profile.children.cycles-pp.do_vmi_munmap > 2.12 ą 0% +1.5 3.63 ą 0% +1.7 3.81 ą 0% perf-profile.children.cycles-pp.do_munmap > 3.62 ą 0% +2.3 5.97 ą 0% +1.7 5.29 ą 0% perf-profile.children.cycles-pp.mas_walk > 5.41 ą 0% +3.0 8.44 ą 0% +1.6 6.98 ą 0% perf-profile.children.cycles-pp.mremap_to > 5.28 ą 0% +3.2 8.48 ą 0% +2.3 7.56 ą 0% perf-profile.children.cycles-pp.mas_find > 0.00 ą -1% +5.4 5.45 ą 0% 
+3.9 3.94 ą 0% perf-profile.children.cycles-pp.can_modify_mm > 11.51 ą 0% -0.6 10.86 ą 0% -0.5 11.04 ą 0% perf-profile.self.cycles-pp.__slab_free > 4.23 ą 2% -0.2 4.00 ą 2% -0.1 4.13 ą 2% perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > 2.34 ą 1% -0.1 2.21 ą 1% -0.0 2.30 ą 3% perf-profile.self.cycles-pp.down_write > 2.43 ą 0% -0.1 2.31 ą 0% -0.1 2.34 ą 0% perf-profile.self.cycles-pp.rcu_cblist_dequeue > 2.34 ą 0% -0.1 2.24 ą 0% -0.1 2.27 ą 0% perf-profile.self.cycles-pp.mtree_load > 2.21 ą 0% -0.1 2.11 ą 0% -0.1 2.14 ą 0% perf-profile.self.cycles-pp.native_flush_tlb_one_user > 1.75 ą 0% -0.1 1.67 ą 0% -0.0 1.70 ą 0% perf-profile.self.cycles-pp.mod_objcg_state > 1.54 ą 1% -0.1 1.46 ą 0% -0.0 1.50 ą 1% perf-profile.self.cycles-pp.up_write > 1.52 ą 0% -0.1 1.44 ą 0% -0.1 1.46 ą 0% perf-profile.self.cycles-pp.mas_wr_walk > 0.70 ą 3% -0.1 0.63 ą 1% -0.1 0.64 ą 1% perf-profile.self.cycles-pp.rcu_all_qs > 1.43 ą 1% -0.1 1.36 ą 1% -0.1 1.36 ą 1% perf-profile.self.cycles-pp.__call_rcu_common > 1.01 ą 0% -0.1 0.95 ą 0% -0.0 0.96 ą 0% perf-profile.self.cycles-pp.mas_preallocate > 1.40 ą 1% -0.1 1.33 ą 1% -0.0 1.35 ą 0% perf-profile.self.cycles-pp.do_vmi_align_munmap > 1.00 ą 0% -0.1 0.94 ą 0% -0.0 0.96 ą 0% perf-profile.self.cycles-pp.mas_prev_slot > 1.14 ą 1% -0.1 1.08 ą 1% -0.0 1.10 ą 1% perf-profile.self.cycles-pp.shuffle_freelist > 1.18 ą 0% -0.1 1.13 ą 0% -0.0 1.16 ą 0% perf-profile.self.cycles-pp.vma_merge > 0.94 ą 1% -0.1 0.89 ą 2% -0.0 0.91 ą 1% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > 0.88 ą 0% -0.1 0.83 ą 1% -0.0 0.84 ą 0% perf-profile.self.cycles-pp.___slab_alloc > 0.50 ą 1% -0.0 0.45 ą 2% -0.0 0.50 ą 1% perf-profile.self.cycles-pp.security_mmap_addr > 0.77 ą 1% -0.0 0.72 ą 1% -0.0 0.74 ą 1% perf-profile.self.cycles-pp.percpu_counter_add_batch > 0.45 ą 2% -0.0 0.40 ą 2% -0.0 0.41 ą 1% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > 1.17 ą 0% -0.0 1.12 ą 0% -0.0 1.14 ą 1% perf-profile.self.cycles-pp.clear_bhb_loop > 1.08 ą 1% -0.0 
1.04 ą 1% -0.0 1.06 ą 1% perf-profile.self.cycles-pp.__cond_resched > 1.50 ą 2% -0.0 1.46 ą 0% -0.0 1.48 ą 0% perf-profile.self.cycles-pp.kmem_cache_free > 1.23 ą 0% -0.0 1.18 ą 0% -0.1 1.18 ą 0% perf-profile.self.cycles-pp.move_vma > 0.68 ą 1% -0.0 0.64 ą 0% -0.0 0.65 ą 1% perf-profile.self.cycles-pp.__split_vma > 0.80 ą 0% -0.0 0.76 ą 1% -0.0 0.77 ą 0% perf-profile.self.cycles-pp.mas_wr_store_entry > 0.61 ą 2% -0.0 0.57 ą 2% -0.0 0.57 ą 6% perf-profile.self.cycles-pp.mremap > 0.85 ą 1% -0.0 0.80 ą 1% -0.0 0.81 ą 1% perf-profile.self.cycles-pp.mas_pop_node > 0.44 ą 0% -0.0 0.40 ą 1% -0.0 0.40 ą 1% perf-profile.self.cycles-pp.do_munmap > 0.98 ą 0% -0.0 0.94 ą 1% -0.0 0.95 ą 0% perf-profile.self.cycles-pp.move_ptes > 0.89 ą 0% -0.0 0.86 ą 0% -0.0 0.87 ą 0% perf-profile.self.cycles-pp.mas_leaf_max_gap > 0.46 ą 1% -0.0 0.42 ą 1% -0.0 0.43 ą 1% perf-profile.self.cycles-pp.mas_wr_end_piv > 0.89 ą 0% -0.0 0.86 ą 0% -0.0 0.87 ą 0% perf-profile.self.cycles-pp.mas_store_gfp > 0.79 ą 0% -0.0 0.76 ą 1% -0.0 0.76 ą 0% perf-profile.self.cycles-pp.userfaultfd_unmap_complete > 0.99 ą 0% -0.0 0.97 ą 0% -0.0 0.98 ą 0% perf-profile.self.cycles-pp.mt_find > 0.87 ą 0% -0.0 0.84 ą 0% -0.0 0.84 ą 0% perf-profile.self.cycles-pp.move_page_tables > 0.55 ą 2% -0.0 0.52 ą 1% -0.0 0.52 ą 1% perf-profile.self.cycles-pp.get_old_pud > 0.50 ą 0% -0.0 0.47 ą 1% -0.0 0.48 ą 0% perf-profile.self.cycles-pp.find_vma_prev > 0.61 ą 0% -0.0 0.58 ą 1% -0.0 0.59 ą 0% perf-profile.self.cycles-pp.unmap_region > 0.66 ą 0% -0.0 0.63 ą 1% -0.0 0.64 ą 0% perf-profile.self.cycles-pp.mas_store_prealloc > 0.27 ą 1% -0.0 0.25 ą 1% -0.0 0.26 ą 1% perf-profile.self.cycles-pp.mas_prev_setup > 0.61 ą 1% -0.0 0.59 ą 1% -0.0 0.60 ą 1% perf-profile.self.cycles-pp.copy_vma > 0.48 ą 0% -0.0 0.45 ą 1% -0.0 0.46 ą 1% perf-profile.self.cycles-pp.flush_tlb_mm_range > 0.41 ą 1% -0.0 0.39 ą 1% -0.0 0.40 ą 1% perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 0.48 ą 1% -0.0 0.46 ą 1% -0.0 0.47 ą 0% 
perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 0.50 ą 1% -0.0 0.48 ą 1% -0.0 0.48 ą 1% perf-profile.self.cycles-pp.refill_obj_stock > 0.47 ą 1% -0.0 0.46 ą 1% -0.0 0.45 ą 1% perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags > 0.71 ą 0% -0.0 0.69 ą 1% -0.0 0.69 ą 1% perf-profile.self.cycles-pp.unmap_page_range > 0.17 ą 4% -0.0 0.15 ą 4% -0.0 0.16 ą 3% perf-profile.self.cycles-pp.get_partial_node > 0.24 ą 1% -0.0 0.22 ą 1% -0.0 0.23 ą 0% perf-profile.self.cycles-pp.mas_prev > 0.45 ą 1% -0.0 0.43 ą 0% -0.0 0.44 ą 1% perf-profile.self.cycles-pp.mas_update_gap > 0.53 ą 1% -0.0 0.51 ą 0% -0.0 0.51 ą 1% perf-profile.self.cycles-pp.mremap_to > 0.21 ą 2% -0.0 0.19 ą 2% -0.0 0.19 ą 2% perf-profile.self.cycles-pp.__get_unmapped_area > 0.27 ą 1% -0.0 0.26 ą 1% -0.0 0.25 ą 1% perf-profile.self.cycles-pp.tlb_finish_mmu > 0.18 ą 2% -0.0 0.17 ą 2% -0.0 0.18 ą 2% perf-profile.self.cycles-pp.rcu_do_batch > 0.06 ą 0% -0.0 0.05 ą 0% -0.0 0.05 ą 0% perf-profile.self.cycles-pp.vma_dup_policy > 0.12 ą 0% -0.0 0.11 ą 0% -0.0 0.11 ą 3% perf-profile.self.cycles-pp.mas_wr_append > 0.14 ą 3% -0.0 0.13 ą 3% -0.0 0.12 ą 3% perf-profile.self.cycles-pp.x64_sys_call > 0.11 ą 0% +0.0 0.12 ą 0% +0.0 0.12 ą 3% perf-profile.self.cycles-pp.free_pgd_range > 0.06 ą 5% +0.0 0.07 ą 0% +0.0 0.06 ą 5% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags > 0.21 ą 0% +0.0 0.22 ą 2% -0.0 0.21 ą 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags > 0.45 ą 1% +0.0 0.48 ą 2% +0.0 0.50 ą 1% perf-profile.self.cycles-pp.do_vmi_munmap > 0.27 ą 1% +0.0 0.32 ą 2% -0.0 0.26 ą 1% perf-profile.self.cycles-pp.free_pgtables > 0.36 ą 2% +0.1 0.44 ą 1% -0.0 0.35 ą 4% perf-profile.self.cycles-pp.unlink_anon_vmas > 1.07 ą 1% +0.1 1.19 ą 0% +0.1 1.22 ą 0% perf-profile.self.cycles-pp.mas_next_slot > 1.50 ą 0% +0.5 2.02 ą 0% +0.4 1.85 ą 0% perf-profile.self.cycles-pp.mas_find > 0.00 ą -1% +1.4 1.38 ą 0% +0.9 0.92 ą 0% perf-profile.self.cycles-pp.can_modify_mm > 3.15 ą 0% +2.1 5.26 ą 0% 
+1.5 4.62 ± 0% perf-profile.self.cycles-pp.mas_walk
>
>
> On Mon, Aug 19, 2024 at 02:35:40PM +0800, Oliver Sang wrote:
> > hi, Jeff,
> >
> > On Mon, Aug 19, 2024 at 09:38:19AM +0800, Oliver Sang wrote:
> > > hi, Jeff,
> > >
> > > On Sun, Aug 18, 2024 at 05:28:41PM +0800, Oliver Sang wrote:
> > > > hi, Jeff,
> > > >
> > > > On Thu, Aug 15, 2024 at 07:58:57PM -0700, Jeff Xu wrote:
> > > > > Hi Oliver
> > > >
> > > > [...]
> > > >
> > > > > > could you explicitly point to two commit-id?
> > > > > sure
> > > > >
> > > > > this patch
> > > > > 8be7258a: mseal: add mseal syscall
> > > > > ff388fe5c: mseal: wire up mseal syscall
> > > >
> > > > I failed to apply this patch set to "8be7258a: mseal: add mseal syscall"
> > >
> > > look your patch set again
> > > [PATCH v1 1/2] mseal:selftest mremap across VMA boundaries
> > > just for kselftests
> > >
> > > and I can apply
> > > [PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm
> > > upon "8be7258a: mseal: add mseal syscall" cleanly
> > >
> > > so I will start test for this [PATCH v1 2/2]
> > >
> > > BTW, I will firstly use our default setting - "60s testtime; reboot between each
> > > run; run 10 times", since we've already have the data for 8be7258a and ff388fe5c
> > > then we could give you an update kind of quickly.
> > >
> > > as some private mail discussed, you want some special run method, could you
> > > elaborate them here? thanks
> >
> > here is a quick update before you give us more details about special run method.
> >
> > by our default run method (60s testtime; reboot between each run; run 10 times),
> > your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm" could
> > resolve regression partially.
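To put a number on "partially", here is a minimal sketch that recomputes the %change columns and the share of the regression that is recovered, using only the stress-ng.pagemove.ops values reported for the three commits in this message. The helper names (`pct_change`, `recovered_fraction`) are made up for illustration and are not part of LKP or stress-ng.

```python
# Illustrative only: recompute the %change values for stress-ng.pagemove.ops
# from the three commits compared in this thread.

# ops counts copied from the comparison table in this message
OPS_BASE = 18386219       # ff388fe5c4 ("mseal: wire up mseal syscall")
OPS_REGRESSED = 17474214  # 8be7258aad ("mseal: add mseal syscall")
OPS_PATCHED = 17850959    # 2a78ece39f ([PATCH v1 2/2])


def pct_change(base: float, value: float) -> float:
    """Percentage change of value relative to base."""
    return (value - base) / base * 100.0


def recovered_fraction(base: float, regressed: float, patched: float) -> float:
    """Share of the drop (base -> regressed) that the patched kernel wins back."""
    return (patched - regressed) / (base - regressed)


if __name__ == "__main__":
    print(f"regressed vs base: {pct_change(OPS_BASE, OPS_REGRESSED):+.1f}%")
    print(f"patched vs base:   {pct_change(OPS_BASE, OPS_PATCHED):+.1f}%")
    share = recovered_fraction(OPS_BASE, OPS_REGRESSED, OPS_PATCHED)
    print(f"recovered share:   {share:.0%}")
```

On these figures the patched kernel wins back roughly 41% of the ops lost between the two mseal commits, which matches the -5.0% and -2.9% columns in the report.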
> >
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> >   gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pagemove/stress-ng/60s
> >
> > commit:
> >   ff388fe5c4 ("mseal: wire up mseal syscall")
> >   8be7258aad ("mseal: add mseal syscall")
> >   2a78ece39f <-- your "[PATCH v1 2/2] mseal: refactor mremap to remove can_modify_mm"
> >
> > ff388fe5c481d39c 8be7258aad44b5e25977a98db13 2a78ece39f13ea6f3f9679a6c66
> > ---------------- --------------------------- ---------------------------
> >          %stddev     %change         %stddev     %change         %stddev
> >              \          |                \          |                \
> >       4957            +1.3%       5023            +1.0%       5008        time.percent_of_cpu_this_job_got
> >       2915            +1.5%       2959            +1.2%       2949        time.system_time
> >      65.96            -7.3%      61.16            -5.5%      62.30        time.user_time
> >   41535878            -4.0%   39873501            -2.6%   40452264        proc-vmstat.numa_hit
> >   41466104            -4.0%   39806121            -2.6%   40384854        proc-vmstat.numa_local
> >   77297398            -4.1%   74165258            -2.6%   75286134        proc-vmstat.pgalloc_normal
> >   77016866            -4.1%   73886027            -2.6%   75012630        proc-vmstat.pgfree
> >   18386219            -5.0%   17474214            -2.9%   17850959        stress-ng.pagemove.ops
> >     306421            -5.0%     291207            -2.9%     297490        stress-ng.pagemove.ops_per_sec
> >       4957            +1.3%       5023            +1.0%       5008        stress-ng.time.percent_of_cpu_this_job_got
> >       2915            +1.5%       2959            +1.2%       2949        stress-ng.time.system_time
> >  3.349e+10 ±  4%      +3.0%  3.447e+10 ±  2%      +4.1%  3.484e+10        perf-stat.i.branch-instructions
> >       1.13            -2.1%       1.10            -2.2%       1.10        perf-stat.i.cpi
> >       0.89            +2.2%       0.91            +2.0%       0.91        perf-stat.i.ipc
> >       1.04            -6.9%       0.97            -4.9%       0.99        perf-stat.overall.MPKI
> >       1.13            -2.3%       1.10            -2.0%       1.10        perf-stat.overall.cpi
> >       1081            +5.0%       1136            +3.0%       1114        perf-stat.overall.cycles-between-cache-misses
> >       0.89            +2.3%       0.91            +2.0%       0.91        perf-stat.overall.ipc
> >  3.295e+10 ±  3%      +2.9%  3.392e+10 ±  2%      +4.0%  3.427e+10        perf-stat.ps.branch-instructions
> >  1.674e+11 ±  3%      +1.8%  1.704e+11 ±  2%      +3.3%   1.73e+11        perf-stat.ps.instructions
> >
1.046e+13 +2.7% 1.074e+13 +1.7% 1.064e+13 perf-stat.total.instructions > > 75.05 -2.0 73.02 -0.9 74.18 perf-profile.calltrace.cycles-pp.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 36.83 -1.6 35.19 -1.2 35.62 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > 25.02 -1.4 23.65 -0.9 24.12 perf-profile.calltrace.cycles-pp.copy_vma.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 19.94 -1.1 18.87 -0.8 19.19 perf-profile.calltrace.cycles-pp.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 14.78 -0.8 14.01 -0.5 14.28 perf-profile.calltrace.cycles-pp.vma_merge.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 1.48 -0.5 0.99 -0.5 1.00 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > 7.88 -0.4 7.47 -0.3 7.62 perf-profile.calltrace.cycles-pp.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 6.73 -0.4 6.37 -0.2 6.51 perf-profile.calltrace.cycles-pp.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 6.16 -0.3 5.82 -0.3 5.90 perf-profile.calltrace.cycles-pp.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 6.12 -0.3 5.79 -0.2 5.93 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap.do_vmi_munmap > > 5.79 -0.3 5.48 -0.2 5.59 perf-profile.calltrace.cycles-pp.move_ptes.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > > 5.54 -0.3 5.25 -0.2 5.32 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap.do_vmi_munmap > > 5.56 -0.3 5.28 -0.2 5.36 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 5.19 -0.3 4.92 -0.2 4.98 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma.do_vmi_align_munmap > > 5.21 -0.3 4.95 -0.2 5.02 
perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma.move_vma > > 4.09 -0.2 3.85 -0.2 3.93 perf-profile.calltrace.cycles-pp.vm_area_dup.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 4.69 -0.2 4.46 -0.2 4.51 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge.copy_vma > > 3.56 -0.2 3.36 -0.1 3.43 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma.__do_sys_mremap > > 3.40 -0.2 3.22 -0.1 3.29 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma.__do_sys_mremap > > 1.35 -0.2 1.16 -0.1 1.24 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > > 4.00 -0.2 3.82 -0.1 3.86 perf-profile.calltrace.cycles-pp.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_complete.__split_vma > > 2.23 -0.2 2.05 -0.1 2.12 perf-profile.calltrace.cycles-pp.find_vma_prev.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 8.26 -0.2 8.10 -0.2 8.06 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.97 ą 3% -0.2 1.81 ą 3% -0.1 1.88 ą 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > > 3.11 ą 2% -0.2 2.96 -0.1 3.05 perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > > 0.97 -0.2 0.81 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.do_munmap.mremap_to > > 2.27 -0.2 2.11 -0.1 2.16 perf-profile.calltrace.cycles-pp.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 3.25 -0.1 3.10 -0.1 3.17 perf-profile.calltrace.cycles-pp.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 3.14 -0.1 3.00 -0.1 3.06 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 2.98 
-0.1 2.85 -0.1 2.87 ą 2% perf-profile.calltrace.cycles-pp.anon_vma_clone.__split_vma.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 1.27 ą 2% -0.1 1.15 ą 4% -0.1 1.19 ą 6% perf-profile.calltrace.cycles-pp.__memcpy.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > > 2.45 -0.1 2.34 -0.1 2.38 perf-profile.calltrace.cycles-pp.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables.move_vma > > 2.05 -0.1 1.94 -0.1 1.97 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 2.44 -0.1 2.33 -0.1 2.38 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap > > 2.22 -0.1 2.11 -0.1 2.15 perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.flush_tlb_mm_range.move_ptes.move_page_tables > > 1.76 ą 2% -0.1 1.65 ą 2% -0.1 1.66 ą 4% perf-profile.calltrace.cycles-pp.vma_prepare.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 1.86 -0.1 1.75 -0.1 1.78 perf-profile.calltrace.cycles-pp.vma_link.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 1.40 -0.1 1.30 -0.1 1.34 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap.do_vmi_munmap > > 1.39 -0.1 1.30 -0.1 1.33 perf-profile.calltrace.cycles-pp.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma.move_vma > > 0.55 -0.1 0.46 ą 30% -0.0 0.52 perf-profile.calltrace.cycles-pp.mas_find.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > > 1.25 -0.1 1.16 -0.1 1.20 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma.do_vmi_align_munmap > > 0.94 -0.1 0.86 -0.1 0.87 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.23 -0.1 1.15 -0.1 1.17 perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge.copy_vma > > 1.54 -0.1 1.47 -0.0 1.49 
perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap > > 0.73 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_walk.find_vma_prev.copy_vma.move_vma.__do_sys_mremap > > 1.15 -0.1 1.09 -0.1 1.10 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.do_vmi_align_munmap > > 0.60 ą 2% -0.1 0.54 -0.0 0.58 perf-profile.calltrace.cycles-pp.security_mmap_addr.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > > 1.27 -0.1 1.21 -0.0 1.24 perf-profile.calltrace.cycles-pp.mas_wr_store_entry.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.80 ą 2% -0.1 0.74 ą 2% -0.0 0.76 ą 2% perf-profile.calltrace.cycles-pp.__call_rcu_common.mas_wr_node_store.mas_wr_store_entry.mas_store_prealloc.vma_merge > > 0.72 -0.1 0.66 -0.0 0.69 perf-profile.calltrace.cycles-pp.mas_prev.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 0.78 -0.1 0.73 -0.0 0.75 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.__split_vma > > 0.69 ą 2% -0.1 0.64 ą 3% -0.0 0.66 ą 4% perf-profile.calltrace.cycles-pp.mod_objcg_state.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.vm_area_dup.copy_vma > > 1.63 -0.1 1.58 -0.1 1.57 perf-profile.calltrace.cycles-pp.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.02 -0.1 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region > > 0.77 -0.0 0.72 -0.0 0.74 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.mas_alloc_nodes.mas_preallocate.vma_merge > > 0.62 -0.0 0.57 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_setup.mas_prev.vma_merge.copy_vma.move_vma > > 0.67 -0.0 0.62 -0.0 0.64 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.86 -0.0 0.81 -0.0 0.83 
perf-profile.calltrace.cycles-pp.mtree_load.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64 > > 1.12 -0.0 1.08 -0.0 1.09 perf-profile.calltrace.cycles-pp.clear_bhb_loop.mremap > > 0.56 -0.0 0.51 -0.0 0.53 perf-profile.calltrace.cycles-pp.mas_walk.mas_prev_setup.mas_prev.vma_merge.copy_vma > > 0.68 ą 2% -0.0 0.63 -0.0 0.65 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.mremap > > 0.81 -0.0 0.77 -0.0 0.80 perf-profile.calltrace.cycles-pp.mtree_load.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 1.02 -0.0 0.97 -0.0 0.98 perf-profile.calltrace.cycles-pp.vma_to_resize.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.95 ą 2% -0.0 0.90 ą 2% -0.0 0.93 perf-profile.calltrace.cycles-pp.__memcg_slab_free_hook.kmem_cache_free.unlink_anon_vmas.free_pgtables.unmap_region > > 0.98 -0.0 0.94 -0.0 0.95 perf-profile.calltrace.cycles-pp.mas_find.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.78 -0.0 0.74 -0.0 0.75 perf-profile.calltrace.cycles-pp.mas_store_prealloc.vma_link.copy_vma.move_vma.__do_sys_mremap > > 0.70 -0.0 0.66 -0.0 0.67 perf-profile.calltrace.cycles-pp.__call_rcu_common.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 0.69 -0.0 0.65 -0.0 0.66 perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.copy_vma.move_vma > > 0.69 -0.0 0.65 -0.0 0.65 perf-profile.calltrace.cycles-pp.mas_preallocate.vma_link.copy_vma.move_vma.__do_sys_mremap > > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.calltrace.cycles-pp.mas_prev_slot.do_vmi_align_munmap.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.16 -0.0 1.12 -0.0 1.13 perf-profile.calltrace.cycles-pp.anon_vma_clone.copy_vma.move_vma.__do_sys_mremap.do_syscall_64 > > 0.76 ą 2% -0.0 0.72 -0.0 0.72 ą 2% perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_noprof.vm_area_dup.__split_vma > > 1.01 -0.0 0.97 -0.0 0.99 
perf-profile.calltrace.cycles-pp.mt_find.vma_merge.copy_vma.move_vma.__do_sys_mremap > > 0.60 -0.0 0.57 -0.0 0.58 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas > > 0.88 -0.0 0.85 -0.0 0.85 perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 0.62 ą 2% -0.0 0.59 ą 2% -0.0 0.60 perf-profile.calltrace.cycles-pp.get_old_pud.move_page_tables.move_vma.__do_sys_mremap.do_syscall_64 > > 0.59 -0.0 0.56 -0.0 0.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.mremap > > 0.65 -0.0 0.62 ą 2% -0.0 0.63 perf-profile.calltrace.cycles-pp.mas_update_gap.mas_store_gfp.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.81 +0.0 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.thp_get_unmapped_area_vmflags.__get_unmapped_area.mremap_to.__do_sys_mremap.do_syscall_64 > > 2.76 +0.0 2.78 ą 2% -0.1 2.67 perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap > > 3.47 +0.0 3.51 -0.1 3.37 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_vmi_align_munmap.do_vmi_munmap.move_vma > > 0.76 +0.1 0.83 +0.1 0.85 perf-profile.calltrace.cycles-pp.__madvise > > 0.66 +0.1 0.73 +0.1 0.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.67 +0.1 0.74 +0.1 0.76 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise > > 0.63 +0.1 0.70 +0.1 0.72 perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.62 +0.1 0.70 +0.1 0.71 perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise > > 0.00 +0.9 0.86 +0.9 0.92 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.do_munmap > > 0.00 +0.9 0.88 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.mremap_to.__do_sys_mremap > > 83.81 
+0.9 84.69 +0.6 84.44 perf-profile.calltrace.cycles-pp.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 0.00 +0.9 0.90 ą 2% +0.9 0.91 perf-profile.calltrace.cycles-pp.mas_walk.mas_find.can_modify_mm.do_vmi_munmap.move_vma > > 0.00 +1.1 1.10 +0.0 0.00 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64 > > 0.00 +1.2 1.21 +1.3 1.28 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to > > 2.10 +1.5 3.60 +1.7 3.79 perf-profile.calltrace.cycles-pp.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +1.5 1.52 +1.5 1.52 perf-profile.calltrace.cycles-pp.mas_find.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap > > 1.59 +1.5 3.12 +1.7 3.31 perf-profile.calltrace.cycles-pp.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap.do_syscall_64 > > 0.00 +1.6 1.61 +0.0 0.00 perf-profile.calltrace.cycles-pp.can_modify_mm.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +1.7 1.73 +1.8 1.83 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.do_munmap.mremap_to.__do_sys_mremap > > 0.00 +2.0 2.01 +2.0 2.04 perf-profile.calltrace.cycles-pp.can_modify_mm.do_vmi_munmap.move_vma.__do_sys_mremap.do_syscall_64 > > 5.34 +3.0 8.38 +1.6 6.92 perf-profile.calltrace.cycles-pp.mremap_to.__do_sys_mremap.do_syscall_64.entry_SYSCALL_64_after_hwframe.mremap > > 75.22 -2.0 73.18 -0.9 74.34 perf-profile.children.cycles-pp.move_vma > > 37.04 -1.6 35.40 -1.2 35.83 perf-profile.children.cycles-pp.do_vmi_align_munmap > > 25.09 -1.4 23.72 -0.9 24.20 perf-profile.children.cycles-pp.copy_vma > > 20.04 -1.1 18.96 -0.8 19.28 perf-profile.children.cycles-pp.__split_vma > > 19.87 -1.0 18.84 -0.6 19.24 perf-profile.children.cycles-pp.rcu_core > > 19.85 -1.0 18.82 -0.6 19.22 perf-profile.children.cycles-pp.rcu_do_batch > > 19.89 -1.0 18.86 -0.6 19.26 perf-profile.children.cycles-pp.handle_softirqs > > 17.55 -0.9 16.67 -0.5 17.02 
perf-profile.children.cycles-pp.kmem_cache_free > > 15.32 -0.8 14.49 -0.5 14.78 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof > > 15.17 -0.8 14.39 -0.5 14.66 perf-profile.children.cycles-pp.vma_merge > > 12.12 -0.6 11.48 -0.4 11.70 perf-profile.children.cycles-pp.__slab_free > > 12.19 -0.6 11.56 -0.5 11.73 perf-profile.children.cycles-pp.mas_wr_store_entry > > 11.99 -0.6 11.36 -0.5 11.53 perf-profile.children.cycles-pp.mas_store_prealloc > > 10.88 -0.6 10.28 -0.4 10.50 perf-profile.children.cycles-pp.vm_area_dup > > 9.90 -0.5 9.41 -0.4 9.53 perf-profile.children.cycles-pp.mas_wr_node_store > > 8.39 -0.5 7.92 -0.3 8.13 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook > > 7.99 -0.4 7.58 -0.3 7.73 perf-profile.children.cycles-pp.move_page_tables > > 6.70 -0.4 6.33 -0.3 6.43 perf-profile.children.cycles-pp.vma_complete > > 5.87 -0.3 5.55 -0.2 5.66 perf-profile.children.cycles-pp.move_ptes > > 5.12 -0.3 4.81 -0.2 4.90 perf-profile.children.cycles-pp.mas_preallocate > > 6.05 -0.3 5.74 -0.2 5.85 perf-profile.children.cycles-pp.vm_area_free_rcu_cb > > 2.98 -0.3 2.69 ą 4% -0.2 2.80 ą 6% perf-profile.children.cycles-pp.__memcpy > > 3.46 ą 2% -0.2 3.25 -0.1 3.36 ą 3% perf-profile.children.cycles-pp.mod_objcg_state > > 3.47 -0.2 3.26 -0.2 3.32 perf-profile.children.cycles-pp.___slab_alloc > > 2.44 -0.2 2.25 -0.1 2.33 perf-profile.children.cycles-pp.find_vma_prev > > 2.92 -0.2 2.73 -0.1 2.79 perf-profile.children.cycles-pp.mas_alloc_nodes > > 3.46 -0.2 3.27 -0.1 3.34 perf-profile.children.cycles-pp.flush_tlb_mm_range > > 3.47 -0.2 3.29 -0.2 3.32 ą 2% perf-profile.children.cycles-pp.down_write > > 3.33 -0.2 3.16 -0.1 3.25 perf-profile.children.cycles-pp.__memcg_slab_free_hook > > 4.23 -0.2 4.07 -0.1 4.08 ą 2% perf-profile.children.cycles-pp.anon_vma_clone > > 8.33 -0.2 8.17 -0.2 8.13 perf-profile.children.cycles-pp.unmap_region > > 3.35 -0.1 3.20 -0.1 3.26 perf-profile.children.cycles-pp.mas_store_gfp > > 2.21 -0.1 2.07 -0.1 2.10 
perf-profile.children.cycles-pp.__cond_resched > > 3.19 -0.1 3.05 -0.1 3.11 perf-profile.children.cycles-pp.unmap_vmas > > 2.12 -0.1 1.99 -0.1 2.04 perf-profile.children.cycles-pp.__call_rcu_common > > 2.66 -0.1 2.54 -0.1 2.60 perf-profile.children.cycles-pp.mtree_load > > 2.24 -0.1 2.12 ą 2% -0.1 2.13 ą 3% perf-profile.children.cycles-pp.vma_prepare > > 2.50 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.flush_tlb_func > > 2.04 ą 2% -0.1 1.93 -0.1 1.96 ą 2% perf-profile.children.cycles-pp.allocate_slab > > 2.46 -0.1 2.35 -0.1 2.41 perf-profile.children.cycles-pp.rcu_cblist_dequeue > > 2.48 -0.1 2.38 -0.1 2.42 perf-profile.children.cycles-pp.unmap_page_range > > 2.23 -0.1 2.12 -0.1 2.16 perf-profile.children.cycles-pp.native_flush_tlb_one_user > > 1.77 -0.1 1.67 -0.1 1.70 perf-profile.children.cycles-pp.mas_wr_walk > > 1.88 -0.1 1.78 -0.1 1.80 perf-profile.children.cycles-pp.vma_link > > 1.84 -0.1 1.75 -0.1 1.77 perf-profile.children.cycles-pp.up_write > > 0.97 ą 2% -0.1 0.88 -0.1 0.89 perf-profile.children.cycles-pp.rcu_all_qs > > 1.40 -0.1 1.32 -0.1 1.34 ą 2% perf-profile.children.cycles-pp.shuffle_freelist > > 1.03 -0.1 0.95 -0.0 0.99 perf-profile.children.cycles-pp.mas_prev > > 0.92 -0.1 0.85 -0.0 0.88 perf-profile.children.cycles-pp.mas_prev_setup > > 1.58 -0.1 1.51 -0.1 1.53 perf-profile.children.cycles-pp.zap_pmd_range > > 1.24 -0.1 1.17 -0.0 1.20 perf-profile.children.cycles-pp.mas_prev_slot > > 1.57 -0.1 1.49 -0.1 1.49 perf-profile.children.cycles-pp.mas_update_gap > > 0.62 -0.1 0.56 -0.0 0.60 perf-profile.children.cycles-pp.security_mmap_addr > > 0.90 -0.1 0.84 -0.0 0.86 perf-profile.children.cycles-pp.percpu_counter_add_batch > > 0.86 -0.1 0.80 -0.0 0.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > > 0.98 -0.1 0.92 -0.0 0.95 perf-profile.children.cycles-pp.mas_pop_node > > 1.68 -0.1 1.62 -0.1 1.62 perf-profile.children.cycles-pp.__get_unmapped_area > > 1.23 -0.1 1.18 -0.0 1.20 perf-profile.children.cycles-pp.__pte_offset_map_lock > > 
0.49 ą 2% -0.1 0.43 -0.1 0.43 ą 2% perf-profile.children.cycles-pp.setup_object > > 1.09 -0.1 1.03 -0.0 1.05 perf-profile.children.cycles-pp.zap_pte_range > > 1.07 ą 2% -0.1 1.02 ą 2% -0.1 1.00 perf-profile.children.cycles-pp.mas_leaf_max_gap > > 0.70 ą 2% -0.0 0.65 -0.0 0.67 perf-profile.children.cycles-pp.syscall_return_via_sysret > > 1.18 -0.0 1.14 -0.0 1.15 perf-profile.children.cycles-pp.clear_bhb_loop > > 0.51 ą 3% -0.0 0.47 -0.0 0.49 ą 3% perf-profile.children.cycles-pp.anon_vma_interval_tree_insert > > 1.04 -0.0 1.00 -0.0 1.01 perf-profile.children.cycles-pp.vma_to_resize > > 0.57 -0.0 0.53 -0.0 0.54 perf-profile.children.cycles-pp.mas_wr_end_piv > > 0.44 ą 2% -0.0 0.40 ą 2% -0.0 0.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath > > 1.14 -0.0 1.10 -0.0 1.12 perf-profile.children.cycles-pp.mt_find > > 0.90 -0.0 0.87 -0.0 0.87 perf-profile.children.cycles-pp.userfaultfd_unmap_complete > > 0.62 -0.0 0.59 -0.0 0.60 perf-profile.children.cycles-pp.__put_partials > > 0.45 ą 6% -0.0 0.42 -0.0 0.43 perf-profile.children.cycles-pp._raw_spin_lock > > 0.48 -0.0 0.45 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.mas_prev_range > > 0.61 -0.0 0.58 -0.0 0.59 perf-profile.children.cycles-pp.entry_SYSCALL_64 > > 0.31 ą 3% -0.0 0.28 ą 3% -0.0 0.31 perf-profile.children.cycles-pp.security_vm_enough_memory_mm > > 0.33 ą 3% -0.0 0.30 ą 2% -0.0 0.31 ą 4% perf-profile.children.cycles-pp.mas_put_in_tree > > 0.32 ą 2% -0.0 0.29 ą 2% -0.0 0.30 perf-profile.children.cycles-pp.tlb_finish_mmu > > 0.46 -0.0 0.44 ą 2% -0.0 0.46 perf-profile.children.cycles-pp.rcu_segcblist_enqueue > > 0.33 -0.0 0.31 -0.0 0.32 perf-profile.children.cycles-pp.mas_destroy > > 0.36 -0.0 0.34 -0.0 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented > > 0.39 -0.0 0.37 -0.0 0.38 ą 2% perf-profile.children.cycles-pp.down_write_killable > > 0.29 -0.0 0.27 ą 2% -0.0 0.28 perf-profile.children.cycles-pp.tlb_gather_mmu > > 0.26 -0.0 0.24 ą 2% -0.0 0.25 ą 2% 
perf-profile.children.cycles-pp.syscall_exit_to_user_mode > > 0.16 ą 2% -0.0 0.14 ą 3% -0.0 0.14 ą 3% perf-profile.children.cycles-pp.mas_wr_append > > 0.30 ą 2% -0.0 0.28 ą 2% -0.0 0.29 ą 2% perf-profile.children.cycles-pp.__vm_enough_memory > > 0.32 -0.0 0.30 ą 2% -0.0 0.31 perf-profile.children.cycles-pp.pte_offset_map_nolock > > 2.83 +0.0 2.85 ą 2% -0.1 2.74 perf-profile.children.cycles-pp.unlink_anon_vmas > > 0.84 +0.0 0.86 -0.0 0.81 perf-profile.children.cycles-pp.thp_get_unmapped_area_vmflags > > 0.08 ą 5% +0.0 0.10 ą 3% -0.0 0.08 ą 6% perf-profile.children.cycles-pp.mm_get_unmapped_area_vmflags > > 3.52 +0.0 3.56 -0.1 3.42 perf-profile.children.cycles-pp.free_pgtables > > 0.78 +0.1 0.85 +0.1 0.86 perf-profile.children.cycles-pp.__madvise > > 0.63 +0.1 0.70 +0.1 0.72 perf-profile.children.cycles-pp.__x64_sys_madvise > > 0.63 +0.1 0.70 +0.1 0.71 perf-profile.children.cycles-pp.do_madvise > > 0.00 +0.1 0.09 ą 3% +0.1 0.10 ą 5% perf-profile.children.cycles-pp.can_modify_mm_madv > > 1.31 +0.2 1.46 +0.2 1.50 perf-profile.children.cycles-pp.mas_next_slot > > 83.90 +0.9 84.79 +0.6 84.53 perf-profile.children.cycles-pp.__do_sys_mremap > > 40.45 +1.4 41.90 +2.1 42.57 perf-profile.children.cycles-pp.do_vmi_munmap > > 2.12 +1.5 3.62 +1.7 3.82 perf-profile.children.cycles-pp.do_munmap > > 3.63 +2.4 5.98 +1.7 5.29 perf-profile.children.cycles-pp.mas_walk > > 5.40 +3.0 8.44 +1.6 6.97 perf-profile.children.cycles-pp.mremap_to > > 5.26 +3.2 8.48 +2.3 7.58 perf-profile.children.cycles-pp.mas_find > > 0.00 +5.5 5.46 +3.9 3.93 perf-profile.children.cycles-pp.can_modify_mm > > 11.49 -0.6 10.89 -0.4 11.10 perf-profile.self.cycles-pp.__slab_free > > 4.32 -0.3 4.06 -0.2 4.16 perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook > > 1.96 -0.2 1.77 ą 4% -0.1 1.84 ą 6% perf-profile.self.cycles-pp.__memcpy > > 2.36 -0.1 2.25 ą 2% -0.1 2.25 ą 3% perf-profile.self.cycles-pp.down_write > > 2.42 -0.1 2.31 -0.0 2.38 perf-profile.self.cycles-pp.rcu_cblist_dequeue > > 2.33 -0.1 2.23 -0.1 
2.28 perf-profile.self.cycles-pp.mtree_load > > 2.21 -0.1 2.10 -0.1 2.14 perf-profile.self.cycles-pp.native_flush_tlb_one_user > > 1.62 -0.1 1.54 -0.0 1.57 perf-profile.self.cycles-pp.__memcg_slab_free_hook > > 1.52 -0.1 1.44 -0.1 1.46 perf-profile.self.cycles-pp.mas_wr_walk > > 1.44 -0.1 1.36 -0.1 1.38 ą 2% perf-profile.self.cycles-pp.__call_rcu_common > > 1.53 -0.1 1.45 -0.0 1.48 perf-profile.self.cycles-pp.up_write > > 1.72 -0.1 1.65 -0.0 1.70 perf-profile.self.cycles-pp.mod_objcg_state > > 0.69 ą 2% -0.1 0.63 -0.1 0.63 perf-profile.self.cycles-pp.rcu_all_qs > > 1.14 ą 2% -0.1 1.08 -0.0 1.09 ą 2% perf-profile.self.cycles-pp.shuffle_freelist > > 1.18 -0.1 1.12 -0.0 1.17 perf-profile.self.cycles-pp.vma_merge > > 1.38 -0.1 1.33 -0.0 1.35 perf-profile.self.cycles-pp.do_vmi_align_munmap > > 0.51 ą 2% -0.1 0.45 -0.0 0.49 perf-profile.self.cycles-pp.security_mmap_addr > > 0.62 -0.1 0.56 ą 2% -0.1 0.56 perf-profile.self.cycles-pp.mremap > > 0.89 -0.1 0.83 -0.0 0.85 perf-profile.self.cycles-pp.___slab_alloc > > 0.99 -0.1 0.94 -0.0 0.96 perf-profile.self.cycles-pp.mas_prev_slot > > 1.00 -0.0 0.95 -0.0 0.96 perf-profile.self.cycles-pp.mas_preallocate > > 0.98 -0.0 0.93 -0.0 0.95 perf-profile.self.cycles-pp.move_ptes > > 0.85 -0.0 0.80 -0.0 0.82 perf-profile.self.cycles-pp.mas_pop_node > > 0.94 -0.0 0.90 -0.0 0.91 ą 2% perf-profile.self.cycles-pp.vm_area_free_rcu_cb > > 1.09 -0.0 1.04 -0.0 1.06 perf-profile.self.cycles-pp.__cond_resched > > 0.77 -0.0 0.72 -0.0 0.74 perf-profile.self.cycles-pp.percpu_counter_add_batch > > 0.94 ą 2% -0.0 0.89 ą 2% -0.1 0.87 perf-profile.self.cycles-pp.mas_leaf_max_gap > > 1.17 -0.0 1.12 -0.0 1.14 perf-profile.self.cycles-pp.clear_bhb_loop > > 0.68 -0.0 0.63 -0.0 0.65 perf-profile.self.cycles-pp.__split_vma > > 0.79 -0.0 0.75 -0.0 0.77 perf-profile.self.cycles-pp.mas_wr_store_entry > > 1.22 -0.0 1.18 -0.0 1.18 perf-profile.self.cycles-pp.move_vma > > 0.43 ą 2% -0.0 0.40 ą 2% -0.0 0.40 
perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath > > 1.49 -0.0 1.45 +0.0 1.49 perf-profile.self.cycles-pp.kmem_cache_free > > 0.44 -0.0 0.40 -0.0 0.40 perf-profile.self.cycles-pp.do_munmap > > 0.45 -0.0 0.42 -0.0 0.43 perf-profile.self.cycles-pp.mas_wr_end_piv > > 0.89 -0.0 0.86 -0.0 0.88 perf-profile.self.cycles-pp.mas_store_gfp > > 0.78 -0.0 0.75 -0.0 0.76 perf-profile.self.cycles-pp.userfaultfd_unmap_complete > > 0.66 -0.0 0.62 -0.0 0.64 perf-profile.self.cycles-pp.mas_store_prealloc > > 0.60 -0.0 0.58 -0.0 0.59 perf-profile.self.cycles-pp.unmap_region > > 0.36 ą 4% -0.0 0.33 ą 3% -0.0 0.34 ą 2% perf-profile.self.cycles-pp.syscall_return_via_sysret > > 0.55 -0.0 0.52 -0.0 0.53 perf-profile.self.cycles-pp.get_old_pud > > 0.99 -0.0 0.97 -0.0 0.98 perf-profile.self.cycles-pp.mt_find > > 0.61 -0.0 0.58 -0.0 0.60 perf-profile.self.cycles-pp.copy_vma > > 0.43 ą 3% -0.0 0.40 -0.0 0.41 ą 4% perf-profile.self.cycles-pp.anon_vma_interval_tree_insert > > 0.49 -0.0 0.47 -0.0 0.48 perf-profile.self.cycles-pp.find_vma_prev > > 0.71 -0.0 0.68 -0.0 0.70 perf-profile.self.cycles-pp.unmap_page_range > > 0.27 -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.mas_prev_setup > > 0.47 -0.0 0.45 -0.0 0.46 ą 2% perf-profile.self.cycles-pp.flush_tlb_mm_range > > 0.37 ą 6% -0.0 0.35 -0.0 0.35 perf-profile.self.cycles-pp._raw_spin_lock > > 0.41 -0.0 0.39 -0.0 0.40 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > > 0.40 -0.0 0.37 -0.0 0.38 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > 0.27 -0.0 0.25 ą 2% -0.0 0.25 ą 3% perf-profile.self.cycles-pp.mas_put_in_tree > > 0.49 -0.0 0.47 -0.0 0.49 perf-profile.self.cycles-pp.refill_obj_stock > > 0.48 -0.0 0.46 -0.0 0.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > > 0.27 ą 2% -0.0 0.25 -0.0 0.26 perf-profile.self.cycles-pp.tlb_finish_mmu > > 0.24 ą 2% -0.0 0.22 -0.0 0.23 perf-profile.self.cycles-pp.mas_prev > > 0.28 -0.0 0.26 -0.0 0.27 ą 2% perf-profile.self.cycles-pp.mas_alloc_nodes > > 0.40 -0.0 0.39 
-0.0 0.40 perf-profile.self.cycles-pp.__pte_offset_map_lock > > 0.14 ą 3% -0.0 0.12 ą 2% -0.0 0.13 ą 3% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > > 0.26 -0.0 0.24 ą 2% -0.0 0.25 perf-profile.self.cycles-pp.__rb_insert_augmented > > 0.28 -0.0 0.26 -0.0 0.27 perf-profile.self.cycles-pp.alloc_new_pud > > 0.28 -0.0 0.26 -0.0 0.27 ą 2% perf-profile.self.cycles-pp.flush_tlb_func > > 0.20 ą 2% -0.0 0.19 -0.0 0.19 ą 2% perf-profile.self.cycles-pp.__get_unmapped_area > > 0.47 -0.0 0.46 -0.0 0.45 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown_vmflags > > 0.06 -0.0 0.05 ą 5% -0.0 0.05 perf-profile.self.cycles-pp.vma_dup_policy > > 0.06 ą 6% +0.0 0.07 -0.0 0.06 ą 8% perf-profile.self.cycles-pp.mm_get_unmapped_area_vmflags > > 0.11 ą 4% +0.0 0.12 ą 4% +0.0 0.12 ą 4% perf-profile.self.cycles-pp.free_pgd_range > > 0.21 +0.0 0.22 ą 2% -0.0 0.20 ą 2% perf-profile.self.cycles-pp.thp_get_unmapped_area_vmflags > > 0.45 +0.0 0.48 +0.0 0.50 perf-profile.self.cycles-pp.do_vmi_munmap > > 0.27 +0.0 0.32 -0.0 0.26 perf-profile.self.cycles-pp.free_pgtables > > 0.36 ą 2% +0.1 0.44 -0.0 0.35 perf-profile.self.cycles-pp.unlink_anon_vmas > > 1.07 +0.1 1.19 +0.2 1.22 perf-profile.self.cycles-pp.mas_next_slot > > 1.49 +0.5 2.01 +0.4 1.86 perf-profile.self.cycles-pp.mas_find > > 0.00 +1.4 1.37 +0.9 0.93 perf-profile.self.cycles-pp.can_modify_mm > > 3.14 +2.1 5.23 +1.5 4.60 perf-profile.self.cycles-pp.mas_walk > > > > > > > > > > > > > > > > > > to avoid the impact of other changes, better to apply the patch upon 8be7258a > > > > directly. > > > > > > > > if you prefer other base for this patch, please let us know. then we will > > > > supply the results for 4 commits in fact: > > > > > > > > this patch > > > > the base of this patch > > > > 8be7258a: mseal: add mseal syscall > > > > ff388fe5c: mseal: wire up mseal syscall > > > > > > > > > > > > > > > > > > > > > > > Thank you for your time and assistance in helping me on understanding > > > > > > > this issue. 
> > > > > >
> > > > > > due to resource constraint, please expect that we need several days to finish
> > > > > > this test request.
> > > > > No problem.
> > > > >
> > > > > Thanks for your help!
> > > > > -Jeff
> > > > >
> > > > > > >
> > > > > > > Best regards,
> > > > > > > -Jeff
> > > > > > >
> > > > > > > > -Jeff
> > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
> > > > > > > > > [2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Jeff Xu (2):
> > > > > > > > > mseal:selftest mremap across VMA boundaries.
> > > > > > > > > mseal: refactor mremap to remove can_modify_mm
> > > > > > > > >
> > > > > > > > > mm/internal.h | 24 ++
> > > > > > > > > mm/mremap.c | 77 +++----
> > > > > > > > > mm/mseal.c | 17 --
> > > > > > > > > tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
> > > > > > > > > 4 files changed, 353 insertions(+), 58 deletions(-)
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.46.0.76.ge559c4bf1a-goog
> > > > > > > > >
From: Jeff Xu <jeffxu@chromium.org>

mremap doesn't allow relocating, expanding, or shrinking across VMA
boundaries. Refactor the code to check the src address range before doing
anything to the destination, i.e. the destination won't be unmapped if the
src address range fails the boundary check.

This also allows us to remove can_modify_mm from mremap.c: since the src
address range must lie within a single VMA, can_modify_vma is used instead.

This is likely to improve mremap performance. Previously the code did the
sealing check on the src address range with can_modify_mm; the new code
removes the loop that can_modify_mm used.

To verify that this patch doesn't regress mremap, I added tests to
mseal_test. The test patch can be applied before the mremap refactor patch,
or checked in independently.

Also, this patch doesn't change mseal's existing semantics: if the sealing
check fails, the user can expect that the src/dst addresses aren't updated.
So this patch can be applied regardless of whether we decide to go with the
current out-of-loop approach or the in-loop approach currently under
discussion.

Regarding the perf test report by stress-ng [1], titled:
8be7258aad: stress-ng.pagemove.page_remaps_per_sec -4.4% regression

The test uses the following command:
stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --pagemove 64

I can't reproduce this on ChromeOS. The pagemove test shows large stddev
and stderr values, and can't reasonably reflect the performance impact.
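As a reference for what stddev and stderr mean for a set of repeated runs, here is a minimal sketch. It is not the program linked as [2]; the names summarize_runs, run_stats and sqrt_newton are hypothetical, the use of the sample (n - 1) standard deviation is an assumption, and the Newton square root is there only so the sketch builds without -lm.

```c
#include <stddef.h>

struct run_stats {
	double mean;
	double std_dev;	/* sample standard deviation (n - 1 divisor, assumed) */
	double std_err;	/* std_dev / sqrt(n) */
};

/* Newton's method square root; a stand-in for libm's sqrt(). */
double sqrt_newton(double x)
{
	double r = x > 1.0 ? x : 1.0;
	int i;

	if (x <= 0.0)
		return 0.0;
	for (i = 0; i < 64; i++)
		r = 0.5 * (r + x / r);
	return r;
}

/* Returns 0 on success, -1 if fewer than two samples were given. */
int summarize_runs(const double *ops, size_t n, struct run_stats *out)
{
	double sum = 0.0, sq = 0.0;
	size_t i;

	if (n < 2)
		return -1;

	for (i = 0; i < n; i++)
		sum += ops[i];
	out->mean = sum / (double)n;

	/* Sum of squared deviations from the mean. */
	for (i = 0; i < n; i++) {
		double d = ops[i] - out->mean;

		sq += d * d;
	}

	out->std_dev = sqrt_newton(sq / (double)(n - 1));
	out->std_err = out->std_dev / sqrt_newton((double)n);
	return 0;
}
```

With n = 10 runs, std_err is std_dev / sqrt(10), roughly 0.316 * std_dev. That is the ratio between the Std Dev and Std Err figures reported in this letter (for example 2737.35 / sqrt(10) ≈ 865.63).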
For example: I wrote a C program [2] to run the above pagemove test 10
times and calculate the stddev and stderr for 3 commits:

1> before mseal feature is added:
Ops/sec:
  Mean     : 3564.40
  Std Dev  : 2737.35 (76.80% of Mean)
  Std Err  : 865.63 (24.29% of Mean)

2> after mseal feature is added:
Ops/sec:
  Mean     : 2703.84
  Std Dev  : 2085.13 (77.12% of Mean)
  Std Err  : 659.38 (24.39% of Mean)

3> after current patch (mremap refactor)
Ops/sec:
  Mean     : 3603.67
  Std Dev  : 2422.22 (67.22% of Mean)
  Std Err  : 765.97 (21.26% of Mean)

The results show a stderr of 21%-24% of the mean, which means whatever perf
improvement/impact there might be won't be measured correctly by this test.

The test machine has 32G memory and an Intel(R) Celeron(R) 7305 with 5
CPUs. I rebooted the machine before each test and took the first 10 runs
with run_stress_ng 10. (I will run for a longer duration to see whether the
test still shows large StdDev/StdErr.)

[1] https://lore.kernel.org/lkml/202408041602.caa0372-oliver.sang@intel.com/
[2] https://github.com/peaktocreek/mmperf/blob/main/run_stress_ng.c

Jeff Xu (2):
  mseal:selftest mremap across VMA boundaries.
  mseal: refactor mremap to remove can_modify_mm

 mm/internal.h                           |  24 ++
 mm/mremap.c                             |  77 +++----
 mm/mseal.c                              |  17 --
 tools/testing/selftests/mm/mseal_test.c | 293 +++++++++++++++++++++++-
 4 files changed, 353 insertions(+), 58 deletions(-)
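The ordering this series describes (validate the src range first, touch the destination only afterwards) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in mm/mremap.c: toy_vma, toy_find_vma, toy_check_src and toy_mremap_to are hypothetical names, the VMA list is a plain array, and the relative order of the boundary check and the seal check is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a VMA: [start, end) plus a sealed flag. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	int sealed;
};

/* Find the toy VMA containing addr, or NULL if addr is unmapped. */
const struct toy_vma *toy_find_vma(const struct toy_vma *vmas, size_t n,
				   unsigned long addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= vmas[i].start && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/*
 * Validate the source range. Because the range must fit inside one
 * VMA, a single per-VMA seal check is enough; no loop over the range.
 */
int toy_check_src(const struct toy_vma *vmas, size_t n,
		  unsigned long addr, unsigned long len)
{
	const struct toy_vma *vma = toy_find_vma(vmas, n, addr);

	if (!vma)
		return -EFAULT;
	if (len > vma->end - addr)
		return -EFAULT;	/* range crosses a VMA boundary */
	if (vma->sealed)
		return -EPERM;
	return 0;
}

/* *dst_unmapped is set to 1 only if the source check passed. */
int toy_mremap_to(const struct toy_vma *vmas, size_t n,
		  unsigned long addr, unsigned long len, int *dst_unmapped)
{
	int ret = toy_check_src(vmas, n, addr, len);

	*dst_unmapped = 0;
	if (ret)
		return ret;	/* destination left intact */

	*dst_unmapped = 1;	/* stand-in for unmapping the destination */
	return 0;
}
```

In this model a request whose src range spans two VMAs returns -EFAULT and a request on a sealed VMA returns -EPERM, in both cases with *dst_unmapped still 0, which is the property the cover letter describes.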