[RFC,0/4] arm64/mm: Clean up pte_dirty() state management

Message ID	20230707053331.510041-1-anshuman.khandual@arm.com (mailing list archive)
Headers	show Return-Path: <linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org> From: Anshuman Khandual <anshuman.khandual@arm.com> To: linux-arm-kernel@lists.infradead.org Cc: Anshuman Khandual <anshuman.khandual@arm.com>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Ryan Roberts <ryan.roberts@arm.com>, Mark Rutland <mark.rutland@arm.com>, Andrew Morton <akpm@linux-foundation.org>, David Hildenbrand <david@redhat.com>, Jonathan Corbet <corbet@lwn.net>, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org Subject: [RFC 0/4] arm64/mm: Clean up pte_dirty() state management Date: Fri, 7 Jul 2023 11:03:27 +0530 Message-Id: <20230707053331.510041-1-anshuman.khandual@arm.com> MIME-Version: 1.0 Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org
Series	arm64/mm: Clean up pte_dirty() state management \| expand [RFC,0/4] arm64/mm: Clean up pte_dirty() state management [RFC,1/4] arm64/mm: Add SW and HW dirty state helpers [RFC,2/4] arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state [RFC,3/4] arm64/mm: Add pte_preserve_hw_dirty() [RFC,4/4] docs: arm64: Add help document for pte dirty state management

Anshuman Khandual July 7, 2023, 5:33 a.m. UTC

These pte_dirty() changes make things explicitly clear, while improving the
code readability. This optimizes HW dirty state transfer into SW dirty bit.
This also adds a new arm64 documentation explaining overall pte dirty state
management in detail. This series applies on the latest mainline kernel.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-doc@vger.kernel.org

Anshuman Khandual (4):
  arm64/mm: Add SW and HW dirty state helpers
  arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state
  arm64/mm: Add pte_preserve_hw_dirty()
  docs: arm64: Add help document for pte dirty state management

 Documentation/arch/arm64/index.rst     |  1 +
 Documentation/arch/arm64/pte-dirty.rst | 95 ++++++++++++++++++++++++++
 arch/arm64/include/asm/pgtable.h       | 66 ++++++++++++++----
 3 files changed, 147 insertions(+), 15 deletions(-)
 create mode 100644 Documentation/arch/arm64/pte-dirty.rst

David Hildenbrand July 7, 2023, 12:11 p.m. UTC | #1

On 07.07.23 07:33, Anshuman Khandual wrote:
> These pte_dirty() changes make things explicitly clear, while improving the
> code readability. This optimizes HW dirty state transfer into SW dirty bit.
> This also adds a new arm64 documentation explaining overall pte dirty state
> management in detail. This series applies on the latest mainline kernel.
> 
>

I skimmed over most of the series, and I am not convinced that this is 
actually a cleanup. If we cannot really always differentiate between 
sw/hw clearing, why have separate primitives that give one the illusion 
that it could be done and that they are two different concepts?

Maybe there are other opinions, but at least for me this made the code 
harder to grasp.

I do appreciate a doc update, though :)

Anshuman Khandual July 10, 2023, 2:20 a.m. UTC | #2

On 7/7/23 17:41, David Hildenbrand wrote:
> On 07.07.23 07:33, Anshuman Khandual wrote:
>> These pte_dirty() changes make things explicitly clear, while improving the
>> code readability. This optimizes HW dirty state transfer into SW dirty bit.
>> This also adds a new arm64 documentation explaining overall pte dirty state
>> management in detail. This series applies on the latest mainline kernel.
>>
>>
> 
> I skimmed over most of the series, and I am not convinced that this is actually a cleanup. If we cannot really always differentiate between sw/hw clearing, why have separate primitives that give one the illusion that it could be done and that they are two different concepts?

These are indeed two different concepts working together, the current code just
obscures that. Without these primitives it's even hard to follow how the SW and
HW dirty parts are intertwined in implementing the generic pte_dirty() state.

The current code acknowledges these two different concepts in identifying them
i.e via pte_hw_dirty() and pte_sw_dirty().

#define pte_hw_dirty(pte)       (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
#define pte_sw_dirty(pte)       (!!(pte_val(pte) & PTE_DIRTY))

But then falls short in demonstrating how these two states are being managed i.e
created and cleared. The first patch tries to clear the situation to some extent.

> 
> Maybe there are other opinions, but at least for me this made the code harder to grasp.
> 
> I do appreciate a doc update, though :)
These two changes also streamline HW dirty bit saving in SW dirty bit efficiently,
skipping the re-doing of HW dirty bits setting which might be cleared off. That is
the primary reason for saving it off in SW dirty bit in the first place.

  arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state
  arm64/mm: Add pte_preserve_hw_dirty(

Regardless, I am still trying to understand how the first patch does not improve
the clarity in modifying the PTE dirty state.

David Hildenbrand July 10, 2023, 9:17 a.m. UTC | #3

On 10.07.23 04:20, Anshuman Khandual wrote:
> 
> 
> On 7/7/23 17:41, David Hildenbrand wrote:
>> On 07.07.23 07:33, Anshuman Khandual wrote:
>>> These pte_dirty() changes make things explicitly clear, while improving the
>>> code readability. This optimizes HW dirty state transfer into SW dirty bit.
>>> This also adds a new arm64 documentation explaining overall pte dirty state
>>> management in detail. This series applies on the latest mainline kernel.
>>>
>>>
>>
>> I skimmed over most of the series, and I am not convinced that this is actually a cleanup. If we cannot really always differentiate between sw/hw clearing, why have separate primitives that give one the illusion that it could be done and that they are two different concepts?
> 
> These are indeed two different concepts working together, the current code just
> obscures that. Without these primitives it's even hard to follow how the SW and
> HW dirty parts are intertwined in implementing the generic pte_dirty() state.
> 
> The current code acknowledges these two different concepts in identifying them
> i.e via pte_hw_dirty() and pte_sw_dirty().
> 
> #define pte_hw_dirty(pte)       (pte_write(pte) && !(pte_val(pte) & PTE_RDONLY))
> #define pte_sw_dirty(pte)       (!!(pte_val(pte) & PTE_DIRTY))
> 

^ these primitives make sense to me, but not the clearing part.

If there is only one way to clear both, then only have one primitive to 
clear both and state there, that separate clearing is impossible because 
both are intertwined.

Mark Rutland July 10, 2023, 11:25 a.m. UTC | #4

On Fri, Jul 07, 2023 at 11:03:27AM +0530, Anshuman Khandual wrote:
> These pte_dirty() changes make things explicitly clear, while improving the
> code readability. This optimizes HW dirty state transfer into SW dirty bit.
> This also adds a new arm64 documentation explaining overall pte dirty state
> management in detail. This series applies on the latest mainline kernel.

TBH, I think this is all swings and roundabouts, and I'm not sure this is
worthwhile. I appreciate that as-is some people find this confusing, but I
don't think the end result of this series is actually better, and it adds more
code/documentation to maintain.

In particular, I don't think that we should add Documentation/ files for this,
as it's very likely that won't be updated together with the code, and I think
it's more of a maintenance burden than a help. If we want some introductory
text to explain how the HW/SW dirty bits work, I think that should be a comment
block in <asm/pgtable.h>, clearly associated with the code.

Overall, I'd prefer to leave the code as-is.

Thanks,
Mark.

> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> 
> Anshuman Khandual (4):
>   arm64/mm: Add SW and HW dirty state helpers
>   arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state
>   arm64/mm: Add pte_preserve_hw_dirty()
>   docs: arm64: Add help document for pte dirty state management
> 
>  Documentation/arch/arm64/index.rst     |  1 +
>  Documentation/arch/arm64/pte-dirty.rst | 95 ++++++++++++++++++++++++++
>  arch/arm64/include/asm/pgtable.h       | 66 ++++++++++++++----
>  3 files changed, 147 insertions(+), 15 deletions(-)
>  create mode 100644 Documentation/arch/arm64/pte-dirty.rst
> 
> -- 
> 2.30.2
>

Anshuman Khandual July 12, 2023, 4:01 a.m. UTC | #5

On 7/10/23 16:55, Mark Rutland wrote:
> On Fri, Jul 07, 2023 at 11:03:27AM +0530, Anshuman Khandual wrote:
>> These pte_dirty() changes make things explicitly clear, while improving the
>> code readability. This optimizes HW dirty state transfer into SW dirty bit.
>> This also adds a new arm64 documentation explaining overall pte dirty state
>> management in detail. This series applies on the latest mainline kernel.
> TBH, I think this is all swings and roundabouts, and I'm not sure this is
> worthwhile. I appreciate that as-is some people find this confusing, but I

Current situation for pte_dirty() management is confusing when there are two
distinct mechanisms to track PTE dirty states, but both are forced to work
together because

- HW DBM cannot track non-writable dirty state (PTE_DBM == PTE_WRITE)
- Runtime check for HW DBM is avoided

> don't think the end result of this series is actually better, and it adds more
> code/documentation to maintain.

Agreed, it does add more code and documentation but still trying to understand
why it is not worthwhile. Regardless, following patch does optimize a situation
where we dont need to call pte_mkdirty() knowing it will be cleared afterwards.

[RFC 2/4] arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state

> 
> In particular, I don't think that we should add Documentation/ files for this,
> as it's very likely that won't be updated together with the code, and I think
> it's more of a maintenance burden than a help. If we want some introductory

There are many documentation files such as Documentation/arm64/memory.rst which
needs to be updated when kernel virtual address layout changes again. I am just
wondering - should not there be any documentation for internal implementation
details, just because they need updating with code changes.

> text to explain how the HW/SW dirty bits work, I think that should be a comment
> block in <asm/pgtable.h>, clearly associated with the code.

Sure, will add that.

> 
> Overall, I'd prefer to leave the code as-is.
Even if we discount individual dirty clearing functions, why should not HW dirty
bit transfer to SW dirty be optimized and wrapped around in a helper.

Catalin Marinas July 16, 2023, 3:10 p.m. UTC | #6

On Wed, Jul 12, 2023 at 09:31:39AM +0530, Anshuman Khandual wrote:
> On 7/10/23 16:55, Mark Rutland wrote:
> > On Fri, Jul 07, 2023 at 11:03:27AM +0530, Anshuman Khandual wrote:
> >> These pte_dirty() changes make things explicitly clear, while improving the
> >> code readability. This optimizes HW dirty state transfer into SW dirty bit.
> >> This also adds a new arm64 documentation explaining overall pte dirty state
> >> management in detail. This series applies on the latest mainline kernel.
> > 
> > TBH, I think this is all swings and roundabouts, and I'm not sure this is
> > worthwhile. I appreciate that as-is some people find this confusing, but I

I'm pretty much on the same lines, though maybe I looked too much at
this code that I don't like any further changes to it ;).

> Current situation for pte_dirty() management is confusing when there are two
> distinct mechanisms to track PTE dirty states, but both are forced to work
> together because
> 
> - HW DBM cannot track non-writable dirty state (PTE_DBM == PTE_WRITE)
> - Runtime check for HW DBM is avoided

Depending on how you look at it, we can say that any writeable PTE (as
in page table permission, PTE_RDONLY cleared) is dirty and we only have
a software mechanism for tracking the dirty state. The DBM feature is
not actually giving us a dirty bit but an automated way to make a PTE
writeable on access (for some historical reasons like the SMMU not
having such mechanism in place).

Maybe we can clean the code a bit based on the above perspective. E.g.
instead of pte_hw_dirty() just have a !pte_hw_rdonly() macro. It may
help with the confusion of having two mechanisms.

OTOH, with PIE, we can have a true dirty bit but at that point we can
eliminate the pte_sw_dirty() use entirely and allow soft-dirty using the
current PTE_DIRTY (with some static labels based on the feature).

> > don't think the end result of this series is actually better, and it adds more
> > code/documentation to maintain.
> 
> Agreed, it does add more code and documentation but still trying to understand
> why it is not worthwhile. Regardless, following patch does optimize a situation
> where we dont need to call pte_mkdirty() knowing it will be cleared afterwards.
> 
> [RFC 2/4] arm64/mm: Call pte_sw_mkdirty() while preserving the HW dirty state

I wonder whether the compiler eliminates much of this duplication since
there are some checks for pte_write() before. We may be able to remove
some checks. For example, does pte_hw_dirty() actually need to check
pte_write()? A !PTE_RDONLY entry is dirty automatically since we can't
trap any write access to it (prior to PIE; I need to check Joey's
patches on how it treats writeable+clean PTEs; still on holiday).

As for the fourth patch, I'd rather add documentation in the header
file, it's more likely to be looked at and updated.

[RFC,0/4] arm64/mm: Clean up pte_dirty() state management

Message

Comments