diff mbox series

[3/5] KVM: arm64: Only return attributes from stage2_update_leaf_attrs()

Message ID 20230111000300.2034799-4-oliver.upton@linux.dev (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Handle unaligned memslots in kvm_(test_)_age_gfn() | expand

Commit Message

Oliver Upton Jan. 11, 2023, 12:02 a.m. UTC
Returning a single PTE from stage2_update_leaf_attrs() doesn't make a
great deal of sense given that the function could be used to apply a
change to a range of PTEs. Instead, return a bitwise OR of attributes
from all the visited PTEs.

As the walker is no longer returning the full PTE, drop the check for a
valid PTE in kvm_age_gfn().

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/kvm/hyp/pgtable.c | 32 +++++++++++++++++---------------
 arch/arm64/kvm/mmu.c         |  2 +-
 2 files changed, 18 insertions(+), 16 deletions(-)

Comments

Marc Zyngier Jan. 11, 2023, 8:52 a.m. UTC | #1
[+ Suzuki and Zenghui who are missing from the Cc list]

On Wed, 11 Jan 2023 00:02:58 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> Returning a single PTE from stage2_update_leaf_attrs() doesn't make a
> great deal of sense given that the function could be used to apply a
> change to a range of PTEs. Instead, return a bitwise OR of attributes
> from all the visited PTEs.

I find this amalgamation of attributes quite confusing, and I have a
hard time attaching semantics to the resulting collection of bits.

It also means that you cannot reason about a particular attribute
being 0 if any of the neighbour PTEs has this bit set.

> 
> As the walker is no longer returning the full PTE, drop the check for a
> valid PTE in kvm_age_gfn().

But then what does it mean to check for a potentially invalid PTE?

The helpers explicitly say:

/*
 * The following only work if pte_present(). Undefined behaviour otherwise.
 */
#define pte_present(pte)	(!!(pte_val(pte) & (PTE_VALID | PTE_PROT_NONE)))
#define pte_young(pte)		(!!(pte_val(pte) & PTE_AF))

and you seem to be violating this requirement.

Thanks,

	M.
Oliver Upton Jan. 11, 2023, 5:21 p.m. UTC | #2
Hey Marc,

On Wed, Jan 11, 2023 at 08:52:25AM +0000, Marc Zyngier wrote:
> [+ Suzuki and Zenghui who are missing from the Cc list]

Doh! Just switched over to working out of a new git tree and didn't move
over my cc-cmd. Apologies to you two.

> On Wed, 11 Jan 2023 00:02:58 +0000,
> Oliver Upton <oliver.upton@linux.dev> wrote:
> > 
> > Returning a single PTE from stage2_update_leaf_attrs() doesn't make a
> > great deal of sense given that the function could be used to apply a
> > change to a range of PTEs. Instead, return a bitwise OR of attributes
> > from all the visited PTEs.
> 
> I find this amalgamation of attributes quite confusing, and I have a
> hard time attaching semantics to the resulting collection of bits.
> 
> It also means that you cannot reason about a particular attribute
> being 0 if any of the neighbour PTEs has this bit set.

Very true. What I had really wanted to do was make a walker that allows
software to check the state of specific attribute bit(s) within a range
of memory instead of returning all of them. I decided against it because
it would put more churn on other callers or require a new walker
entirely.

Anyway, I can go about that change to make this a bit easier to reason
about. Thoughts?

> > 
> > As the walker is no longer returning the full PTE, drop the check for a
> > valid PTE in kvm_age_gfn().
> 
> But then what does it mean to check for a potentially invalid PTE?
> 
> The helpers explicitly say:
> 
> /*
>  * The following only work if pte_present(). Undefined behaviour otherwise.
>  */
> #define pte_present(pte)	(!!(pte_val(pte) & (PTE_VALID | PTE_PROT_NONE)))
> #define pte_young(pte)		(!!(pte_val(pte) & PTE_AF))
> 
> and you seem to be violating this requirement.

Indeed I have. It all still works because the call returns 0 if no valid
PTE was found in the walk but that's nowhere near as obvious as what we
had before.

--
Thanks,
Oliver
Oliver Upton Feb. 2, 2023, 10:08 p.m. UTC | #3
On Wed, Jan 11, 2023 at 05:21:13PM +0000, Oliver Upton wrote:
> Hey Marc,
> 
> On Wed, Jan 11, 2023 at 08:52:25AM +0000, Marc Zyngier wrote:
> > [+ Suzuki and Zenghui who are missing from the Cc list]
> 
> Doh! Just switched over to working out of a new git tree and didn't move
> over my cc-cmd. Apologies to you two.
> 
> > On Wed, 11 Jan 2023 00:02:58 +0000,
> > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > 
> > > Returning a single PTE from stage2_update_leaf_attrs() doesn't make a
> > > great deal of sense given that the function could be used to apply a
> > > change to a range of PTEs. Instead, return a bitwise OR of attributes
> > > from all the visited PTEs.
> > 
> > I find this amalgamation of attributes quite confusing, and I have a
> > hard time attaching semantics to the resulting collection of bits.
> > 
> > It also means that you cannot reason about a particular attribute
> > being 0 if any of the neighbour PTEs has this bit set.
> 
> Very true. What I had really wanted to do was make a walker that allows
> software to check the state of specific attribute bit(s) within a range
> of memory instead of returning all of them. I decided against it because
> it would put more churn on other callers or require a new walker
> entirely.
> 
> Anyway, I can go about that change to make this a bit easier to reason
> about. Thoughts?

LOL, I thought I hadn't replied which is why I didn't hear anything
back. Ball is in your court, Marc, any thoughts?

--
Thanks,
Oliver
Marc Zyngier Feb. 7, 2023, 2:56 p.m. UTC | #4
On Thu, 02 Feb 2023 22:08:37 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Wed, Jan 11, 2023 at 05:21:13PM +0000, Oliver Upton wrote:
> > Hey Marc,
> > 
> > On Wed, Jan 11, 2023 at 08:52:25AM +0000, Marc Zyngier wrote:
> > > [+ Suzuki and Zenghui who are missing from the Cc list]
> > 
> > Doh! Just switched over to working out of a new git tree and didn't move
> > over my cc-cmd. Apologies to you two.
> > 
> > > On Wed, 11 Jan 2023 00:02:58 +0000,
> > > Oliver Upton <oliver.upton@linux.dev> wrote:
> > > > 
> > > > Returning a single PTE from stage2_update_leaf_attrs() doesn't make a
> > > > great deal of sense given that the function could be used to apply a
> > > > change to a range of PTEs. Instead, return a bitwise OR of attributes
> > > > from all the visited PTEs.
> > > 
> > > I find this amalgamation of attributes quite confusing, and I have a
> > > hard time attaching semantics to the resulting collection of bits.
> > > 
> > > It also means that you cannot reason about a particular attribute
> > > being 0 if any of the neighbour PTEs has this bit set.
> > 
> > Very true. What I had really wanted to do was make a walker that allows
> > software to check the state of specific attribute bit(s) within a range
> > of memory instead of returning all of them. I decided against it because
> > it would put more churn on other callers or require a new walker
> > entirely.
> > 
> > Anyway, I can go about that change to make this a bit easier to reason
> > about. Thoughts?
> 
> LOL, I thought I hadn't replied which is why I didn't hear anything
> back. Ball is in your court, Marc, any thoughts?

Huh, I clearly missed that email. Sorry about the delay.

Having a separate walker doesn't strike me as unreasonable. This
clearly is something different from the existing walkers, and it is
bound to be very little code anyway.

Thanks,

	M.
diff mbox series

Patch

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 26f61462527d..a3d599e3af60 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -980,7 +980,7 @@  int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
 struct stage2_attr_data {
 	kvm_pte_t			attr_set;
 	kvm_pte_t			attr_clr;
-	kvm_pte_t			pte;
+	kvm_pte_t			attr_old;
 	u32				level;
 };
 
@@ -995,7 +995,7 @@  static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
 		return 0;
 
 	data->level = ctx->level;
-	data->pte = pte;
+	data->attr_old |= pte & KVM_PTE_LEAF_ATTRS;
 	pte &= ~data->attr_clr;
 	pte |= data->attr_set;
 
@@ -1004,7 +1004,7 @@  static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	 * but worst-case the access flag update gets lost and will be
 	 * set on the next access instead.
 	 */
-	if (data->pte != pte) {
+	if (ctx->old != pte) {
 		/*
 		 * Invalidate instruction cache before updating the guest
 		 * stage-2 PTE if we are going to add executable permission.
@@ -1023,7 +1023,7 @@  static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
 
 static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
 				    u64 size, kvm_pte_t attr_set,
-				    kvm_pte_t attr_clr, kvm_pte_t *orig_pte,
+				    kvm_pte_t attr_clr, kvm_pte_t *attr_old,
 				    u32 *level, enum kvm_pgtable_walk_flags flags)
 {
 	int ret;
@@ -1041,11 +1041,12 @@  static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
 	if (ret)
 		return ret;
 
-	if (orig_pte)
-		*orig_pte = data.pte;
+	if (attr_old)
+		*attr_old = data.attr_old;
 
 	if (level)
 		*level = data.level;
+
 	return 0;
 }
 
@@ -1058,32 +1059,33 @@  int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size)
 
 kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr)
 {
-	kvm_pte_t pte = 0;
+	kvm_pte_t attr_old = 0;
+
 	stage2_update_leaf_attrs(pgt, addr, 1, KVM_PTE_LEAF_ATTR_LO_S2_AF, 0,
-				 &pte, NULL, 0);
+				 &attr_old, NULL, 0);
 	dsb(ishst);
-	return pte;
+	return attr_old;
 }
 
 kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr)
 {
-	kvm_pte_t pte = 0;
+	kvm_pte_t attr_old = 0;
 	stage2_update_leaf_attrs(pgt, addr, 1, 0, KVM_PTE_LEAF_ATTR_LO_S2_AF,
-				 &pte, NULL, 0);
+				 &attr_old, NULL, 0);
 	/*
 	 * "But where's the TLBI?!", you scream.
 	 * "Over in the core code", I sigh.
 	 *
 	 * See the '->clear_flush_young()' callback on the KVM mmu notifier.
 	 */
-	return pte;
+	return attr_old;
 }
 
 bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr)
 {
-	kvm_pte_t pte = 0;
-	stage2_update_leaf_attrs(pgt, addr, 1, 0, 0, &pte, NULL, 0);
-	return pte & KVM_PTE_LEAF_ATTR_LO_S2_AF;
+	kvm_pte_t attr_old = 0;
+	stage2_update_leaf_attrs(pgt, addr, 1, 0, 0, &attr_old, NULL, 0);
+	return attr_old & KVM_PTE_LEAF_ATTR_LO_S2_AF;
 }
 
 int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr,
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 31d7fa4c7c14..0741f3a8ddca 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1618,7 +1618,7 @@  bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
 					range->start << PAGE_SHIFT);
 	pte = __pte(kpte);
-	return pte_valid(pte) && pte_young(pte);
+	return pte_young(pte);
 }
 
 bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)