diff mbox series

[v1] mm: Fix khugepaged activation policy

Message ID 20240702144617.2291480-1-ryan.roberts@arm.com (mailing list archive)
State New
Headers show
Series [v1] mm: Fix khugepaged activation policy | expand

Commit Message

Ryan Roberts July 2, 2024, 2:46 p.m. UTC
Since the introduction of mTHP, the docuementation has stated that
khugepaged would be enabled when any mTHP size is enabled, and disabled
when all mTHP sizes are disabled. There are 2 problems with this; 1.
this is not what was implemented by the code and 2. this is not the
desirable behavior.

Desirable behavior is for khugepaged to be enabled when any PMD-sized
THP is enabled, anon or file. (Note that file THP is still controlled by
the top-level control so we must always consider that, as well as the
PMD-size mTHP control for anon). khugepaged only supports collapsing to
PMD-sized THP so there is no value in enabling it when PMD-sized THP is
disabled. So let's change the code and documentation to reflect this
policy.

Further, per-size enabled control modification events were not
previously forwarded to khugepaged to give it an opportunity to start or
stop. Consequently the following was resulting in khugepaged eroneously
not being activated:

  echo never > /sys/kernel/mm/transparent_hugepage/enabled
  echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
Cc: stable@vger.kernel.org
---

Hi All,

Applies on top of today's mm-unstable (9bb8753acdd8). No regressions observed in
mm selftests.

When fixing this I also noticed that khugepaged doesn't get (and never has been)
activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
to collapse shmem - perhaps it should be activated in this case?

Thanks,
Ryan

 Documentation/admin-guide/mm/transhuge.rst | 11 +++++------
 include/linux/huge_mm.h                    | 13 +++++++------
 mm/huge_memory.c                           |  7 +++++++
 mm/khugepaged.c                            | 13 ++++++-------
 4 files changed, 25 insertions(+), 19 deletions(-)

--
2.43.0

Comments

David Hildenbrand July 2, 2024, 2:57 p.m. UTC | #1
On 02.07.24 16:46, Ryan Roberts wrote:
> Since the introduction of mTHP, the docuementation has stated that
> khugepaged would be enabled when any mTHP size is enabled, and disabled
> when all mTHP sizes are disabled. There are 2 problems with this; 1.
> this is not what was implemented by the code and 2. this is not the
> desirable behavior.
> 
> Desirable behavior is for khugepaged to be enabled when any PMD-sized
> THP is enabled, anon or file. (Note that file THP is still controlled by
> the top-level control so we must always consider that, as well as the
> PMD-size mTHP control for anon). khugepaged only supports collapsing to
> PMD-sized THP so there is no value in enabling it when PMD-sized THP is
> disabled. So let's change the code and documentation to reflect this
> policy.
> 
> Further, per-size enabled control modification events were not
> previously forwarded to khugepaged to give it an opportunity to start or
> stop. Consequently the following was resulting in khugepaged eroneously
> not being activated:
> 
>    echo never > /sys/kernel/mm/transparent_hugepage/enabled
>    echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
> Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
> Cc: stable@vger.kernel.org
> ---
> 
> Hi All,
> 
> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions observed in
> mm selftests.
> 
> When fixing this I also noticed that khugepaged doesn't get (and never has been)
> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
> to collapse shmem - perhaps it should be activated in this case?
> 

Call me confused.

khugepaged_scan_mm_slot() and madvise_collapse() only all 
hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ?

collapse_file() is only called by hpage_collapse_scan_file() ... and 
there we check "shmem_file(file)".

So why is the IS_ENABLED(CONFIG_SHMEM) check in there if collapse_file() 
seems to "collapse filemap/tmpfs/shmem pages into huge one".

Anyhow, we certainly can collapse shmem (that's how it all started IIUC).

Besides that, khugepaged only seems to collapse !shmem with
   VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);

The thp_vma_allowable_order() check tests if we are allowed to collapse 
a PMD_ORDER in that VMA.
Ryan Roberts July 2, 2024, 3:29 p.m. UTC | #2
On 02/07/2024 15:57, David Hildenbrand wrote:
> On 02.07.24 16:46, Ryan Roberts wrote:
>> Since the introduction of mTHP, the docuementation has stated that
>> khugepaged would be enabled when any mTHP size is enabled, and disabled
>> when all mTHP sizes are disabled. There are 2 problems with this; 1.
>> this is not what was implemented by the code and 2. this is not the
>> desirable behavior.
>>
>> Desirable behavior is for khugepaged to be enabled when any PMD-sized
>> THP is enabled, anon or file. (Note that file THP is still controlled by
>> the top-level control so we must always consider that, as well as the
>> PMD-size mTHP control for anon). khugepaged only supports collapsing to
>> PMD-sized THP so there is no value in enabling it when PMD-sized THP is
>> disabled. So let's change the code and documentation to reflect this
>> policy.
>>
>> Further, per-size enabled control modification events were not
>> previously forwarded to khugepaged to give it an opportunity to start or
>> stop. Consequently the following was resulting in khugepaged eroneously
>> not being activated:
>>
>>    echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>    echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
>> Closes:
>> https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
>> Cc: stable@vger.kernel.org
>> ---
>>
>> Hi All,
>>
>> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions observed in
>> mm selftests.
>>
>> When fixing this I also noticed that khugepaged doesn't get (and never has been)
>> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
>> to collapse shmem - perhaps it should be activated in this case?
>>
> 
> Call me confused.
> 
> khugepaged_scan_mm_slot() and madvise_collapse() only all
> hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ?

Looks like khugepaged_scan_mm_slot() was converted from:

  if (shmem_file(vma->vm_file)) {

to:

  if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {

By 99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0 which adds THP collapse support for
non-shmem files. Clearly that looks wrong, but I guess never spotted in practice
because noone disables shemem?

I guess madvise_collapse() was a copy/paste?

> 
> collapse_file() is only called by hpage_collapse_scan_file() ... and there we
> check "shmem_file(file)".
> 
> So why is the IS_ENABLED(CONFIG_SHMEM) check in there if collapse_file() seems
> to "collapse filemap/tmpfs/shmem pages into huge one".
> 
> Anyhow, we certainly can collapse shmem (that's how it all started IIUC).

Yes, thanks for pointing me at it. Should have just searched "shmem" in
khugepaged.c :-/

> 
> Besides that, khugepaged only seems to collapse !shmem with
>   VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);

That makes sense. I guess I could use IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) to
tighen the (non-shmem) file THP check in hugepage_pmd_enabled() (currently I'm
unconditionally using the top-level enabled setting as a "is THP enabled for
files" check).

But back to my original question, I think hugepage_pmd_enabled() should also be
explicitly checking the appropriate shmem_enabled controls and ORing in the
result? Otherwise in a situation where only shmem is THP enabled (and file/anon
THP is disabled) khugepaged won't run.

> 
> The thp_vma_allowable_order() check tests if we are allowed to collapse a
> PMD_ORDER in that VMA.

I don't follow the relevance of this statement.
David Hildenbrand July 2, 2024, 3:38 p.m. UTC | #3
On 02.07.24 17:29, Ryan Roberts wrote:
> On 02/07/2024 15:57, David Hildenbrand wrote:
>> On 02.07.24 16:46, Ryan Roberts wrote:
>>> Since the introduction of mTHP, the docuementation has stated that
>>> khugepaged would be enabled when any mTHP size is enabled, and disabled
>>> when all mTHP sizes are disabled. There are 2 problems with this; 1.
>>> this is not what was implemented by the code and 2. this is not the
>>> desirable behavior.
>>>
>>> Desirable behavior is for khugepaged to be enabled when any PMD-sized
>>> THP is enabled, anon or file. (Note that file THP is still controlled by
>>> the top-level control so we must always consider that, as well as the
>>> PMD-size mTHP control for anon). khugepaged only supports collapsing to
>>> PMD-sized THP so there is no value in enabling it when PMD-sized THP is
>>> disabled. So let's change the code and documentation to reflect this
>>> policy.
>>>
>>> Further, per-size enabled control modification events were not
>>> previously forwarded to khugepaged to give it an opportunity to start or
>>> stop. Consequently the following was resulting in khugepaged eroneously
>>> not being activated:
>>>
>>>     echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>>     echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>>
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
>>> Closes:
>>> https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
>>> Cc: stable@vger.kernel.org
>>> ---
>>>
>>> Hi All,
>>>
>>> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions observed in
>>> mm selftests.
>>>
>>> When fixing this I also noticed that khugepaged doesn't get (and never has been)
>>> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
>>> to collapse shmem - perhaps it should be activated in this case?
>>>
>>
>> Call me confused.
>>
>> khugepaged_scan_mm_slot() and madvise_collapse() only all
>> hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ?
> 
> Looks like khugepaged_scan_mm_slot() was converted from:
> 
>    if (shmem_file(vma->vm_file)) {
> 
> to:
> 
>    if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
> 
> By 99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0 which adds THP collapse support for
> non-shmem files. Clearly that looks wrong, but I guess never spotted in practice
> because noone disables shemem?
> 
> I guess madvise_collapse() was a copy/paste?
> 

Likely.

>>
>> collapse_file() is only called by hpage_collapse_scan_file() ... and there we
>> check "shmem_file(file)".
>>
>> So why is the IS_ENABLED(CONFIG_SHMEM) check in there if collapse_file() seems
>> to "collapse filemap/tmpfs/shmem pages into huge one".
>>
>> Anyhow, we certainly can collapse shmem (that's how it all started IIUC).
> 
> Yes, thanks for pointing me at it. Should have just searched "shmem" in
> khugepaged.c :-/
> 
>>
>> Besides that, khugepaged only seems to collapse !shmem with
>>    VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
> 
> That makes sense. I guess I could use IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) to
> tighen the (non-shmem) file THP check in hugepage_pmd_enabled() (currently I'm
> unconditionally using the top-level enabled setting as a "is THP enabled for
> files" check).
> 
> But back to my original question, I think hugepage_pmd_enabled() should also be
> explicitly checking the appropriate shmem_enabled controls and ORing in the
> result? Otherwise in a situation where only shmem is THP enabled (and file/anon
> THP is disabled) khugepaged won't run.

I think so.

> 
>>
>> The thp_vma_allowable_order() check tests if we are allowed to collapse a
>> PMD_ORDER in that VMA.
> 
> I don't follow the relevance of this statement.

On whatever VMA we indicate true, we will try to collapse in khugepaged. 
Regarding the question, what khugepaged will try to collapse.
Ryan Roberts July 2, 2024, 4:33 p.m. UTC | #4
On 02/07/2024 16:38, David Hildenbrand wrote:
> On 02.07.24 17:29, Ryan Roberts wrote:
>> On 02/07/2024 15:57, David Hildenbrand wrote:
>>> On 02.07.24 16:46, Ryan Roberts wrote:
>>>> Since the introduction of mTHP, the docuementation has stated that
>>>> khugepaged would be enabled when any mTHP size is enabled, and disabled
>>>> when all mTHP sizes are disabled. There are 2 problems with this; 1.
>>>> this is not what was implemented by the code and 2. this is not the
>>>> desirable behavior.
>>>>
>>>> Desirable behavior is for khugepaged to be enabled when any PMD-sized
>>>> THP is enabled, anon or file. (Note that file THP is still controlled by
>>>> the top-level control so we must always consider that, as well as the
>>>> PMD-size mTHP control for anon). khugepaged only supports collapsing to
>>>> PMD-sized THP so there is no value in enabling it when PMD-sized THP is
>>>> disabled. So let's change the code and documentation to reflect this
>>>> policy.
>>>>
>>>> Further, per-size enabled control modification events were not
>>>> previously forwarded to khugepaged to give it an opportunity to start or
>>>> stop. Consequently the following was resulting in khugepaged eroneously
>>>> not being activated:
>>>>
>>>>     echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>>>     echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
>>>> Closes:
>>>> https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
>>>> Cc: stable@vger.kernel.org
>>>> ---
>>>>
>>>> Hi All,
>>>>
>>>> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions
>>>> observed in
>>>> mm selftests.
>>>>
>>>> When fixing this I also noticed that khugepaged doesn't get (and never has
>>>> been)
>>>> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
>>>> to collapse shmem - perhaps it should be activated in this case?
>>>>
>>>
>>> Call me confused.
>>>
>>> khugepaged_scan_mm_slot() and madvise_collapse() only all
>>> hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ?
>>
>> Looks like khugepaged_scan_mm_slot() was converted from:
>>
>>    if (shmem_file(vma->vm_file)) {
>>
>> to:
>>
>>    if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {

CONFIG_READ_ONLY_THP_FOR_FS depends on CONFIG_SHMEM in Kconfig, so I think this is all correct/safe. Although I'm not really sure what the need for the dependency is.

>>
>> By 99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0 which adds THP collapse support for
>> non-shmem files. Clearly that looks wrong, but I guess never spotted in practice
>> because noone disables shemem?
>>
>> I guess madvise_collapse() was a copy/paste?
>>
> 
> Likely.
> 
>>>
>>> collapse_file() is only called by hpage_collapse_scan_file() ... and there we
>>> check "shmem_file(file)".
>>>
>>> So why is the IS_ENABLED(CONFIG_SHMEM) check in there if collapse_file() seems
>>> to "collapse filemap/tmpfs/shmem pages into huge one".
>>>
>>> Anyhow, we certainly can collapse shmem (that's how it all started IIUC).
>>
>> Yes, thanks for pointing me at it. Should have just searched "shmem" in
>> khugepaged.c :-/
>>
>>>
>>> Besides that, khugepaged only seems to collapse !shmem with
>>>    VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
>>
>> That makes sense. I guess I could use IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) to
>> tighen the (non-shmem) file THP check in hugepage_pmd_enabled() (currently I'm
>> unconditionally using the top-level enabled setting as a "is THP enabled for
>> files" check).

I'll do a v2 with this addition if you agree?

static inline bool hugepage_pmd_enabled(void)
{
	/*
	 * We cover both the anon and the file-backed case here; for
	 * file-backed, we must return true if globally enabled, regardless of
	 * the anon pmd size control status.
	 */
	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && hugepage_global_enabled()) ||
	       test_bit(PMD_ORDER, &huge_anon_orders_always) ||
	       test_bit(PMD_ORDER, &huge_anon_orders_madvise) ||
	       (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && hugepage_global_enabled());
}

>>
>> But back to my original question, I think hugepage_pmd_enabled() should also be
>> explicitly checking the appropriate shmem_enabled controls and ORing in the
>> result? Otherwise in a situation where only shmem is THP enabled (and file/anon
>> THP is disabled) khugepaged won't run.
> 
> I think so.

I'll do this part as a separate change since it's fixing a separate bug.

> 
>>
>>>
>>> The thp_vma_allowable_order() check tests if we are allowed to collapse a
>>> PMD_ORDER in that VMA.
>>
>> I don't follow the relevance of this statement.
> 
> On whatever VMA we indicate true, we will try to collapse in khugepaged.
> Regarding the question, what khugepaged will try to collapse.

Oh I see. Yes agreed. But we don't have a VMA when enabling/disabling khugepaged. We just want to know "are there vmas for which khugepaged would attempt to collapse to PMD size?"
David Hildenbrand July 2, 2024, 9:29 p.m. UTC | #5
On 02.07.24 18:33, Ryan Roberts wrote:
> On 02/07/2024 16:38, David Hildenbrand wrote:
>> On 02.07.24 17:29, Ryan Roberts wrote:
>>> On 02/07/2024 15:57, David Hildenbrand wrote:
>>>> On 02.07.24 16:46, Ryan Roberts wrote:
>>>>> Since the introduction of mTHP, the docuementation has stated that
>>>>> khugepaged would be enabled when any mTHP size is enabled, and disabled
>>>>> when all mTHP sizes are disabled. There are 2 problems with this; 1.
>>>>> this is not what was implemented by the code and 2. this is not the
>>>>> desirable behavior.
>>>>>
>>>>> Desirable behavior is for khugepaged to be enabled when any PMD-sized
>>>>> THP is enabled, anon or file. (Note that file THP is still controlled by
>>>>> the top-level control so we must always consider that, as well as the
>>>>> PMD-size mTHP control for anon). khugepaged only supports collapsing to
>>>>> PMD-sized THP so there is no value in enabling it when PMD-sized THP is
>>>>> disabled. So let's change the code and documentation to reflect this
>>>>> policy.
>>>>>
>>>>> Further, per-size enabled control modification events were not
>>>>> previously forwarded to khugepaged to give it an opportunity to start or
>>>>> stop. Consequently the following was resulting in khugepaged eroneously
>>>>> not being activated:
>>>>>
>>>>>      echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>>>>      echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>>>>
>>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>>> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")
>>>>> Closes:
>>>>> https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redhat.com/
>>>>> Cc: stable@vger.kernel.org
>>>>> ---
>>>>>
>>>>> Hi All,
>>>>>
>>>>> Applies on top of today's mm-unstable (9bb8753acdd8). No regressions
>>>>> observed in
>>>>> mm selftests.
>>>>>
>>>>> When fixing this I also noticed that khugepaged doesn't get (and never has
>>>>> been)
>>>>> activated/deactivated by `shmem_enabled=`. I'm not sure if khugepaged knows how
>>>>> to collapse shmem - perhaps it should be activated in this case?
>>>>>
>>>>
>>>> Call me confused.
>>>>
>>>> khugepaged_scan_mm_slot() and madvise_collapse() only all
>>>> hpage_collapse_scan_file() with ... IS_ENABLED(CONFIG_SHMEM) ?
>>>
>>> Looks like khugepaged_scan_mm_slot() was converted from:
>>>
>>>     if (shmem_file(vma->vm_file)) {
>>>
>>> to:
>>>
>>>     if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
> 
> CONFIG_READ_ONLY_THP_FOR_FS depends on CONFIG_SHMEM in Kconfig, so I think this is all correct/safe. Although I'm not really sure what the need for the dependency is.


Right. It's likely some historical artifact ...

[...]

> 
> I'll do a v2 with this addition if you agree?
> 
> static inline bool hugepage_pmd_enabled(void)
> {
> 	/*
> 	 * We cover both the anon and the file-backed case here; for
> 	 * file-backed, we must return true if globally enabled, regardless of
> 	 * the anon pmd size control status.
> 	 */
> 	return (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && hugepage_global_enabled()) ||
> 	       test_bit(PMD_ORDER, &huge_anon_orders_always) ||
> 	       test_bit(PMD_ORDER, &huge_anon_orders_madvise) ||
> 	       (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && hugepage_global_enabled());
> }
> 

I think this looks good :)

>>>
>>> But back to my original question, I think hugepage_pmd_enabled() should also be
>>> explicitly checking the appropriate shmem_enabled controls and ORing in the
>>> result? Otherwise in a situation where only shmem is THP enabled (and file/anon
>>> THP is disabled) khugepaged won't run.
>>
>> I think so.
> 
> I'll do this part as a separate change since it's fixing a separate bug.

Makes sense.

> 
>>
>>>
>>>>
>>>> The thp_vma_allowable_order() check tests if we are allowed to collapse a
>>>> PMD_ORDER in that VMA.
>>>
>>> I don't follow the relevance of this statement.
>>
>> On whatever VMA we indicate true, we will try to collapse in khugepaged.
>> Regarding the question, what khugepaged will try to collapse.
> 
> Oh I see. Yes agreed. But we don't have a VMA when enabling/disabling khugepaged. We just want to know "are there vmas for which khugepaged would attempt to collapse to PMD size?"

Yes, only to give you a pointer on which VMAs khugepaged will be willing 
to work with.
diff mbox series

Patch

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 709fe10b60f4..fc321d40b8ac 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -202,12 +202,11 @@  PMD-mappable transparent hugepage::

 	cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size

-khugepaged will be automatically started when one or more hugepage
-sizes are enabled (either by directly setting "always" or "madvise",
-or by setting "inherit" while the top-level enabled is set to "always"
-or "madvise"), and it'll be automatically shutdown when the last
-hugepage size is disabled (either by directly setting "never", or by
-setting "inherit" while the top-level enabled is set to "never").
+khugepaged will be automatically started when PMD-sized THP is enabled
+(either of the per-size anon control or the top-level control are set
+to "always" or "madvise"), and it'll be automatically shutdown when
+PMD-sized THP is disabled (when both the per-size anon control and the
+top-level control are "never")

 Khugepaged controls
 -------------------
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4d155c7a4792..ce1b47b49cc3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -128,16 +128,17 @@  static inline bool hugepage_global_always(void)
 			(1<<TRANSPARENT_HUGEPAGE_FLAG);
 }

-static inline bool hugepage_flags_enabled(void)
+static inline bool hugepage_pmd_enabled(void)
 {
 	/*
-	 * We cover both the anon and the file-backed case here; we must return
-	 * true if globally enabled, even when all anon sizes are set to never.
-	 * So we don't need to look at huge_anon_orders_inherit.
+	 * We cover both the anon and the file-backed case here; for
+	 * file-backed, we must return true if globally enabled, regardless of
+	 * the anon pmd size control status. So we don't need to look at
+	 * huge_anon_orders_inherit.
 	 */
 	return hugepage_global_enabled() ||
-	       READ_ONCE(huge_anon_orders_always) ||
-	       READ_ONCE(huge_anon_orders_madvise);
+	       test_bit(PMD_ORDER, &huge_anon_orders_always) ||
+	       test_bit(PMD_ORDER, &huge_anon_orders_madvise);
 }

 static inline int highest_order(unsigned long orders)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 251d6932130f..085f5e973231 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -502,6 +502,13 @@  static ssize_t thpsize_enabled_store(struct kobject *kobj,
 	} else
 		ret = -EINVAL;

+	if (ret > 0) {
+		int err;
+
+		err = start_stop_khugepaged();
+		if (err)
+			ret = err;
+	}
 	return ret;
 }

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 409f67a817f1..708d0e74b61f 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -449,7 +449,7 @@  void khugepaged_enter_vma(struct vm_area_struct *vma,
 			  unsigned long vm_flags)
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
-	    hugepage_flags_enabled()) {
+	    hugepage_pmd_enabled()) {
 		if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
 					    PMD_ORDER))
 			__khugepaged_enter(vma->vm_mm);
@@ -2462,8 +2462,7 @@  static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,

 static int khugepaged_has_work(void)
 {
-	return !list_empty(&khugepaged_scan.mm_head) &&
-		hugepage_flags_enabled();
+	return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled();
 }

 static int khugepaged_wait_event(void)
@@ -2536,7 +2535,7 @@  static void khugepaged_wait_work(void)
 		return;
 	}

-	if (hugepage_flags_enabled())
+	if (hugepage_pmd_enabled())
 		wait_event_freezable(khugepaged_wait, khugepaged_wait_event());
 }

@@ -2567,7 +2566,7 @@  static void set_recommended_min_free_kbytes(void)
 	int nr_zones = 0;
 	unsigned long recommended_min;

-	if (!hugepage_flags_enabled()) {
+	if (!hugepage_pmd_enabled()) {
 		calculate_min_free_kbytes();
 		goto update_wmarks;
 	}
@@ -2617,7 +2616,7 @@  int start_stop_khugepaged(void)
 	int err = 0;

 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_flags_enabled()) {
+	if (hugepage_pmd_enabled()) {
 		if (!khugepaged_thread)
 			khugepaged_thread = kthread_run(khugepaged, NULL,
 							"khugepaged");
@@ -2643,7 +2642,7 @@  int start_stop_khugepaged(void)
 void khugepaged_min_free_kbytes_update(void)
 {
 	mutex_lock(&khugepaged_mutex);
-	if (hugepage_flags_enabled() && khugepaged_thread)
+	if (hugepage_pmd_enabled() && khugepaged_thread)
 		set_recommended_min_free_kbytes();
 	mutex_unlock(&khugepaged_mutex);
 }