[next] KVM: x86: Fix allocation sizeof argument

Message ID	20211001110106.15056-1-colin.king@canonical.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Colin King <colin.king@canonical.com> To: Paolo Bonzini <pbonzini@redhat.com>, Sean Christopherson <seanjc@google.com>, Vitaly Kuznetsov <vkuznets@redhat.com>, Wanpeng Li <wanpengli@tencent.com>, Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>, David Stevens <stevensd@chromium.org>, kvm@vger.kernel.org Cc: kernel-janitors@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH][next] KVM: x86: Fix allocation sizeof argument Date: Fri, 1 Oct 2021 12:01:06 +0100 Message-Id: <20211001110106.15056-1-colin.king@canonical.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[next] KVM: x86: Fix allocation sizeof argument \| expand [next] KVM: x86: Fix allocation sizeof argument

Colin King Oct. 1, 2021, 11:01 a.m. UTC

From: Colin Ian King <colin.king@canonical.com>

The allocation for *gfn_track should be for a slot->npages lot of
short integers, however the current allocation is using sizeof(*gfn_track)
and that is the size of a pointer, which is too large. Fix this by
using sizeof(**gfn_track) instead.

Addresses-Coverity: ("Wrong sizeof argument")
Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 arch/x86/kvm/mmu/page_track.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Sean Christopherson Oct. 5, 2021, 3:41 p.m. UTC | #1

On Fri, Oct 01, 2021, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> The allocation for *gfn_track should be for a slot->npages lot of
> short integers, however the current allocation is using sizeof(*gfn_track)
> and that is the size of a pointer, which is too large. Fix this by
> using sizeof(**gfn_track) instead.
> 
> Addresses-Coverity: ("Wrong sizeof argument")
> Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  arch/x86/kvm/mmu/page_track.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> index bb5d60bd4dbf..5b785a5f7dc9 100644
> --- a/arch/x86/kvm/mmu/page_track.c
> +++ b/arch/x86/kvm/mmu/page_track.c
> @@ -92,7 +92,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
>  		slots = __kvm_memslots(kvm, i);
>  		kvm_for_each_memslot(slot, slots) {
>  			gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> -			*gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> +			*gfn_track = kvcalloc(slot->npages, sizeof(**gfn_track),
>  					      GFP_KERNEL_ACCOUNT);

Eww (not your patch, the original code).  IMO the double indirection is completely
unnecessary, e.g. I find this far easier to follow

diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
index bb5d60bd4dbf..8cae41b831dd 100644
--- a/arch/x86/kvm/mmu/page_track.c
+++ b/arch/x86/kvm/mmu/page_track.c
@@ -75,7 +75,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
 {
        struct kvm_memslots *slots;
        struct kvm_memory_slot *slot;
-       unsigned short **gfn_track;
+       unsigned short *gfn_track;
        int i;
 
        if (write_tracking_enabled(kvm))
@@ -91,13 +91,13 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
        for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
                slots = __kvm_memslots(kvm, i);
                kvm_for_each_memslot(slot, slots) {
-                       gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
-                       *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
-                                             GFP_KERNEL_ACCOUNT);
-                       if (*gfn_track == NULL) {
+                       gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
+                                            GFP_KERNEL_ACCOUNT);
+                       if (gfn_track == NULL) {
                                mutex_unlock(&kvm->slots_arch_lock);
                                return -ENOMEM;
                        }
+                       slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE] = gfn_track;
                }
        }
 


>  			if (*gfn_track == NULL) {
>  				mutex_unlock(&kvm->slots_arch_lock);

Hrm, this fails to free the gfn_track allocations for previous memslots.  The
on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
in the _current_ slot, but does not free previous slots).

And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
and means we have to maintain two near-identical copies of non-obvious code.

Paolo, is it too late to just drop the original deae4a10f166 ("KVM: x86: only
allocate gfn_track when necessary")?

> -- 
> 2.32.0
>

Paolo Bonzini Oct. 5, 2021, 5:27 p.m. UTC | #2

On 05/10/21 17:41, Sean Christopherson wrote:
>>   			if (*gfn_track == NULL) {
>>   				mutex_unlock(&kvm->slots_arch_lock);
> Hrm, this fails to free the gfn_track allocations for previous memslots.  The
> on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> in the_current_  slot, but does not free previous slots).

That's not a huge deal because the syscall is failing.  So as long as 
it's not leaked forever, it's okay.  The problem is the 
WARN_ON(slot->arch.rmap[i]), or the missing check in 
kvm_page_track_enable_mmu_write_tracking, but that's easily fixed.  I'd 
even remove the call to memslot_rmaps_free.

> And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> and means we have to maintain two near-identical copies of non-obvious code.

I was thinking the separate flow (not so much the flag) is needed 
because, if KVMGT is enabled, gfn_track is allocated unconditionally. 
rmaps are added on top of that if shadow paging is enabled; but 
kvm_page_track_create_memslot will have already created the counter, 
including the one for KVM_PAGE_TRACK_WRITE.

But looking at the code again, I guess you could call 
kvm_page_track_enable_mmu_write_tracking inside alloc_all_memslots_rmaps 
(with a little bit of renaming), and with that the flag would go away.

I'll take a look tomorrow, but I'd rather avoid reverting the patch.

Thanks,

Paolo

> Paolo, is it too late to just drop the original deae4a10f166 ("KVM: x86: only
> allocate gfn_track when necessary")?
>

Sean Christopherson Oct. 5, 2021, 5:55 p.m. UTC | #3

On Tue, Oct 05, 2021, Paolo Bonzini wrote:
> On 05/10/21 17:41, Sean Christopherson wrote:
> > >   			if (*gfn_track == NULL) {
> > >   				mutex_unlock(&kvm->slots_arch_lock);
> > Hrm, this fails to free the gfn_track allocations for previous memslots.  The
> > on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> > in the_current_  slot, but does not free previous slots).
> 
> That's not a huge deal because the syscall is failing.  So as long as it's
> not leaked forever, it's okay.  The problem is the
> WARN_ON(slot->arch.rmap[i]), or the missing check in
> kvm_page_track_enable_mmu_write_tracking, but that's easily fixed.  I'd even
> remove the call to memslot_rmaps_free.

It can be leaked forever though, e.g. if userspace invokes KVM_RUN over and over
on -ENOMEM.  That would trigger the WARN_ON(slot->arch.rmap[i]) and leak the
previous allocation.  I think it would be safe to change that WARN_ON to a
check-and-continue, i.e. to preserve the previous allocation

> > And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> > and means we have to maintain two near-identical copies of non-obvious code.
> 
> I was thinking the separate flow (not so much the flag) is needed because,
> if KVMGT is enabled, gfn_track is allocated unconditionally. rmaps are added
> on top of that if shadow paging is enabled; but
> kvm_page_track_create_memslot will have already created the counter,
> including the one for KVM_PAGE_TRACK_WRITE.
> 
> But looking at the code again, I guess you could call
> kvm_page_track_enable_mmu_write_tracking inside alloc_all_memslots_rmaps
> (with a little bit of renaming), and with that the flag would go away.

Yes, and reuse the control flow, which is what I really care about since that's
the part that both features get wrong.
 
> I'll take a look tomorrow, but I'd rather avoid reverting the patch.

I can poke at it too if you don't have time.  I wasn't suggesting a full revert,
rather a "drop and pretend it never got applied", with a plan to apply a new
version instead of fixing up the current code.

Paolo Bonzini Oct. 5, 2021, 8:52 p.m. UTC | #4

On 05/10/21 19:55, Sean Christopherson wrote:
>   I wasn't suggesting a full revert,
> rather a "drop and pretend it never got applied", with a plan to apply a new
> version instead of fixing up the current code.

Considering that there are issues in the rmaps as well, I'd rather fix 
both the right way.

Paolo

David Stevens Oct. 6, 2021, 12:22 a.m. UTC | #5

On Wed, Oct 6, 2021 at 12:41 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Oct 01, 2021, Colin King wrote:
> > From: Colin Ian King <colin.king@canonical.com>
> >
> > The allocation for *gfn_track should be for a slot->npages lot of
> > short integers, however the current allocation is using sizeof(*gfn_track)
> > and that is the size of a pointer, which is too large. Fix this by
> > using sizeof(**gfn_track) instead.
> >
> > Addresses-Coverity: ("Wrong sizeof argument")
> > Fixes: 35b330bba6a7 ("KVM: x86: only allocate gfn_track when necessary")
> > Signed-off-by: Colin Ian King <colin.king@canonical.com>
> > ---
> >  arch/x86/kvm/mmu/page_track.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> > index bb5d60bd4dbf..5b785a5f7dc9 100644
> > --- a/arch/x86/kvm/mmu/page_track.c
> > +++ b/arch/x86/kvm/mmu/page_track.c
> > @@ -92,7 +92,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
> >               slots = __kvm_memslots(kvm, i);
> >               kvm_for_each_memslot(slot, slots) {
> >                       gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> > -                     *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> > +                     *gfn_track = kvcalloc(slot->npages, sizeof(**gfn_track),
> >                                             GFP_KERNEL_ACCOUNT);
>
> Eww (not your patch, the original code).  IMO the double indirection is completely
> unnecessary, e.g. I find this far easier to follow
>
> diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c
> index bb5d60bd4dbf..8cae41b831dd 100644
> --- a/arch/x86/kvm/mmu/page_track.c
> +++ b/arch/x86/kvm/mmu/page_track.c
> @@ -75,7 +75,7 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
>  {
>         struct kvm_memslots *slots;
>         struct kvm_memory_slot *slot;
> -       unsigned short **gfn_track;
> +       unsigned short *gfn_track;
>         int i;
>
>         if (write_tracking_enabled(kvm))
> @@ -91,13 +91,13 @@ int kvm_page_track_enable_mmu_write_tracking(struct kvm *kvm)
>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>                 slots = __kvm_memslots(kvm, i);
>                 kvm_for_each_memslot(slot, slots) {
> -                       gfn_track = slot->arch.gfn_track + KVM_PAGE_TRACK_WRITE;
> -                       *gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> -                                             GFP_KERNEL_ACCOUNT);
> -                       if (*gfn_track == NULL) {
> +                       gfn_track = kvcalloc(slot->npages, sizeof(*gfn_track),
> +                                            GFP_KERNEL_ACCOUNT);
> +                       if (gfn_track == NULL) {
>                                 mutex_unlock(&kvm->slots_arch_lock);
>                                 return -ENOMEM;
>                         }
> +                       slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE] = gfn_track;
>                 }
>         }
>
>
>
> >                       if (*gfn_track == NULL) {
> >                               mutex_unlock(&kvm->slots_arch_lock);
>
> Hrm, this fails to free the gfn_track allocations for previous memslots.  The
> on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> in the _current_ slot, but does not free previous slots).
>
> And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> and means we have to maintain two near-identical copies of non-obvious code.

I agree that's better than my patch. I can put together a new patch
once it's decided whether or not my patch should be dropped.

-David

Sean Christopherson Oct. 6, 2021, 12:41 a.m. UTC | #6

On Wed, Oct 06, 2021, David Stevens wrote:
> On Wed, Oct 6, 2021 at 12:41 AM Sean Christopherson <seanjc@google.com> wrote:
> > Hrm, this fails to free the gfn_track allocations for previous memslots.  The
> > on-demand rmaps code has the exact same bug (it frees rmaps for previous lpages
> > in the _current_ slot, but does not free previous slots).
> >
> > And having two separate flows (and flags) for rmaps vs. gfn_track is pointless,
> > and means we have to maintain two near-identical copies of non-obvious code.
> 
> I agree that's better than my patch. I can put together a new patch
> once it's decided whether or not my patch should be dropped.

All yours, unless Paolo wants to fight you for it :-)  I'm totally ok doing
cleanup/fixes on top.

[next] KVM: x86: Fix allocation sizeof argument

Commit Message

Comments

Patch