diff mbox

[v1,1/1] KVM: X86: NULL pointer dereference [null-pointer-deref] (CWE 476) problem

Message ID 20180213182640.10646-2-joe.moriarty@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Joe Moriarty Feb. 13, 2018, 6:26 p.m. UTC
The Parfait (version 2.1.0) static code analysis tool found the
following NULL pointer dereference problem.

- arch/x86/kvm/mmu.c
There is a possibility that the call to __gfn_to_rmap() can happen
with a NULL pointer given for the slot argument.  This can happen
if the slot information cannot be determined from a previous call
to __gfn_to_memslot().  The base_gfn will be passed in as 0 to
gfn_to_index if no valid slot information can be obtained from a
call to __gfn_to_memslot().

Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
---
 arch/x86/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Bonzini Feb. 14, 2018, 4:28 p.m. UTC | #1
On 13/02/2018 19:26, Joe Moriarty wrote:
> The Parfait (version 2.1.0) static code analysis tool found the
> following NULL pointer dereference problem.
> 
> - arch/x86/kvm/mmu.c
> There is a possibility that the call to __gfn_to_rmap() can happen
> with a NULL pointer given for the slot argument.  This can happen
> if the slot information cannot be determined from a previous call
> to __gfn_to_memslot().  The base_gfn will be passed in as 0 to
> gfn_to_index if no valid slot information can be obtained from a
> call to __gfn_to_memslot().

How does it justify the possible NULL pointer dereference?

The fix below is incorrect anyway, since it would just get a random page
out of guest memory; please report the issue without providing a patch
if you do not understand the code well.

I can't exclude there is a race condition here that can lead to a NULL
pointer dereference.  However, it should be handled by passing the
struct kvm_memory_slot all the way down to __gfn_to_rmap, i.e. adding a
new argument to gfn_to_rmap, rmap_add, mmu_set_spte, __direct_map,
try_async_pf and many more functions.  That's not as easy as the patch
you posted. :)

Thanks,

Paolo

> Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
> ---
>  arch/x86/kvm/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 8eca1d04aeb8..69d41b5d0948 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1239,7 +1239,7 @@ static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
>  {
>  	unsigned long idx;
>  
> -	idx = gfn_to_index(gfn, slot->base_gfn, level);
> +	idx = gfn_to_index(gfn, slot ? slot->base_gfn : 0, level);
>  	return &slot->arch.rmap[level - PT_PAGE_TABLE_LEVEL][idx];
>  }
>  
>
Joe Moriarty Feb. 14, 2018, 4:55 p.m. UTC | #2
On 2/14/2018 11:28 AM, Paolo Bonzini wrote:
> On 13/02/2018 19:26, Joe Moriarty wrote:
>> The Parfait (version 2.1.0) static code analysis tool found the
>> following NULL pointer dereference problem.
>>
>> - arch/x86/kvm/mmu.c
>> There is a possibility that the call to __gfn_to_rmap() can happen
>> with a NULL pointer given for the slot argument.  This can happen
>> if the slot information cannot be determined from a previous call
>> to __gfn_to_memslot().  The base_gfn will be passed in as 0 to
>> gfn_to_index if no valid slot information can be obtained from a
>> call to __gfn_to_memslot().
> 
> How does it justify the possible NULL pointer dereference?
> 
> The fix below is incorrect anyway, since it would just get a random page
> out of guest memory; please report the issue without providing a patch
> if you do not understand the code well.
> 
> I can't exclude there is a race condition here that can lead to a NULL
> pointer dereference.  However, it should be handled by passing the
> struct kvm_memory_slot all the way down to __gfn_to_rmap, i.e. adding a
> new argument to gfn_to_rmap, rmap_add, mmu_set_spte, __direct_map,
> try_async_pf and many more functions.  That's not as easy as the patch
> you posted. :)
> 
> Thanks,
> 
> Paolo
> 
>> Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
>> ---
>>   arch/x86/kvm/mmu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 8eca1d04aeb8..69d41b5d0948 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -1239,7 +1239,7 @@ static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
>>   {
>>   	unsigned long idx;
>>   
>> -	idx = gfn_to_index(gfn, slot->base_gfn, level);
>> +	idx = gfn_to_index(gfn, slot ? slot->base_gfn : 0, level);
>>   	return &slot->arch.rmap[level - PT_PAGE_TABLE_LEVEL][idx];
>>   }
>>   
>>
> 
Hi Paolo,

Thank you for the review.  I wasn't sure if the change I proposed was 
correct or not.  I will take your suggestion of posting to the mailing 
list instead of as a patch the next time I encounter a situation like 
this again.  In the meantime, I will look at doing your suggestion of 
passing kvm_memory_slot down to gfn_to_rmap() and the other functions 
you pointed out above for the next version of the patch.

Joe
Paolo Bonzini Feb. 14, 2018, 5:33 p.m. UTC | #3
On 14/02/2018 17:55, Joe Moriarty wrote:
>>
> Hi Paolo,
> 
> Thank you for the review.  I wasn't sure if the change I proposed was
> correct or not.  I will take your suggestion of posting to the mailing
> list instead of as a patch the next time I encounter a situation like
> this again.  In the meantime, I will look at doing your suggestion of
> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
> you pointed out above for the next version of the patch.

It's not easy, but I can send you a box of beers if you get round to it.
Note that I'm still not sure how the NULL pointer dereference can
happen, and you didn't include more output from your tool, so you might
be wasting your time after all...

Anyway, I would start basically by mapping the paths from try_async_pf's
callers to mmu_set_spte and from there to rmap_add.

On the other hand, in the rmap_remove path, you probably should just
exit immediately if slot is NULL.  (Guangrong, do you have any idea why
we don't zap SPTEs in kvm_arch_free_memslot?)

Paolo
Joe Moriarty Feb. 14, 2018, 6:14 p.m. UTC | #4
On 2/14/2018 12:33 PM, Paolo Bonzini wrote:
> On 14/02/2018 17:55, Joe Moriarty wrote:
>>>
>> Hi Paolo,
>>
>> Thank you for the review.  I wasn't sure if the change I proposed was
>> correct or not.  I will take your suggestion of posting to the mailing
>> list instead of as a patch the next time I encounter a situation like
>> this again.  In the meantime, I will look at doing your suggestion of
>> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
>> you pointed out above for the next version of the patch.
> 
> It's not easy, but I can send you a box of beers if you get round to it.
>   Note that I'm still not sure how the NULL pointer dereference can
> happen, and you didn't include more output from your tool, so you might
> be wasting your time after all...
> 
> Anyway, I would start basically by mapping the paths from try_async_pf's
> callers to mmu_set_spte and from there to rmap_add.
> 
> On the other hand, in the rmap_remove path, you probably should just
> exit immediately if slot is NULL.  (Guangrong, do you have any idea why
> we don't zap SPTEs in kvm_arch_free_memslot?)
> 
> Paolo
> 
The crux of the problem is the possible return of NULL from 
search_memslots(), located at line 950 in include/linux/kvm_host.h.  This 
is called by __gfn_to_memslot().  I would assume this is a case of "this 
is never going to happen"; I say this because the kernel would panic 
today with a NULL pointer dereference at line 1254 of arch/x86/kvm/mmu.c 
if it did happen.

I was thinking a BUG_ON() would be more appropriate at the end of 
search_memslots(), before the "return NULL"; that is, if my 
aforementioned assumption about the code path is correct.  If not, then 
I will continue on with the suggested fix of passing kvm_memory_slot 
down to gfn_to_rmap().

Let me know which solution you would like.
Joe
Paolo Bonzini Feb. 14, 2018, 9:42 p.m. UTC | #5
----- Original Message -----
> From: "Joe Moriarty" <joe.moriarty@oracle.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, rkrcmar@redhat.com, kvm@vger.kernel.org, "Xiao Guangrong"
> <xiaoguangrong@tencent.com>
> Sent: Wednesday, February 14, 2018 7:14:09 PM
> Subject: Re: [PATCH v1 1/1] KVM: X86: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
> 
> On 2/14/2018 12:33 PM, Paolo Bonzini wrote:
> > On 14/02/2018 17:55, Joe Moriarty wrote:
> >>>
> >> Hi Paolo,
> >>
> >> Thank you for the review.  I wasn't sure if the change I proposed was
> >> correct or not.  I will take your suggestion of posting to the mailing
> >> list instead of as a patch the next time I encounter a situation like
> >> this again.  In the meantime, I will look at doing your suggestion of
> >> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
> >> you pointed out above for the next version of the patch.
> > 
> > It's not easy, but I can send you a box of beers if you get round to it.
> >   Note that I'm still not sure how the NULL pointer dereference can
> > happen, and you didn't include more output from your tool, so you might
> > be wasting your time after all...
> > 
> > Anyway, I would start basically by mapping the paths from try_async_pf's
> > callers to mmu_set_spte and from there to rmap_add.
> > 
> > On the other hand, in the rmap_remove path, you probably should just
> > exit immediately if slot is NULL.  (Guangrong, do you have any idea why
> > we don't zap SPTEs in kvm_arch_free_memslot?)
> 
> The crux of the problem is the possible return of NULL from
> search_memslots() located at line 950 in include/linux/kvm_host.h.  This
> is called by __gfn_to_memslot().  I would assume this is a case of this
> is never going to happen.  I say this because the kernel would panic
> today with a NULL pointer dereference at line 1254 of arch/x86/kvm/mmu.c
> if it did happen.

I cannot really rule out that it will happen.  I didn't have in mind
that NULL-pointer dereference, but it has always looked racy to me; I've
never gotten round to fixing it because I wasn't sure about the right
thing to do on the rmap_remove path.

> I was thinking a BUG_ON() would be more appropriate at the end of
> search_memslots() before the "return NULL".

No, this is probably too much.  The right fix is to pass down the memslot,
thus avoiding gfn_to_rmap in favor of __gfn_to_rmap.

Paolo
Xiao Guangrong Feb. 24, 2018, 3:19 a.m. UTC | #6
On 02/15/2018 01:33 AM, Paolo Bonzini wrote:
> On 14/02/2018 17:55, Joe Moriarty wrote:
>>>
>> Hi Paolo,
>>
>> Thank you for the review.  I wasn't sure if the change I proposed was
>> correct or not.  I will take your suggestion of posting to the mailing
>> list instead of as a patch the next time I encounter a situation like
>> this again.  In the meantime, I will look at doing your suggestion of
>> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
>> you pointed out above for the next version of the patch.
> 
> It's not easy, but I can send you a box of beers if you get round to it.
>   Note that I'm still not sure how the NULL pointer dereference can
> happen, and you didn't include more output from your tool, so you might
> be wasting your time after all...
> 
> Anyway, I would start basically by mapping the paths from try_async_pf's
> callers to mmu_set_spte and from there to rmap_add.
> 
> On the other hand, in the rmap_remove path, you probably should just
> exit immediately if slot is NULL.  (Guangrong, do you have any idea why
> we don't zap SPTEs in kvm_arch_free_memslot?)

Sorry for the delayed reply due to Chinese New Year.

We do zap sptes in kvm_arch_flush_shadow_memslot() if the old memslot is being
deleted or moved:
    kvm_arch_flush_shadow_memslot() -> kvm_page_track_flush_slot()
      ->kvm_mmu_invalidate_zap_pages_in_memslot()


And I do not think the case can happen, as we have called gfn_to_pfn()
to check if the gfn is MMIO (i.e. whether it exists in the memslots)
before mapping it into the rmap. During that window, SRCU prevents any
memslot from going away.
Paolo Bonzini Feb. 24, 2018, 8:48 p.m. UTC | #7
----- Original Message -----
> From: "Xiao Guangrong" <guangrong.xiao@gmail.com>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, "Joe Moriarty" <joe.moriarty@oracle.com>, rkrcmar@redhat.com,
> kvm@vger.kernel.org, "Xiao Guangrong" <xiaoguangrong@tencent.com>
> Sent: Saturday, February 24, 2018 4:19:59 AM
> Subject: Re: [PATCH v1 1/1] KVM: X86: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
> 
> 
> 
> On 02/15/2018 01:33 AM, Paolo Bonzini wrote:
> > On 14/02/2018 17:55, Joe Moriarty wrote:
> >>>
> >> Hi Paolo,
> >>
> >> Thank you for the review.  I wasn't sure if the change I proposed was
> >> correct or not.  I will take your suggestion of posting to the mailing
> >> list instead of as a patch the next time I encounter a situation like
> >> this again.  In the meantime, I will look at doing your suggestion of
> >> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
> >> you pointed out above for the next version of the patch.
> > 
> > It's not easy, but I can send you a box of beers if you get round to it.
> >   Note that I'm still not sure how the NULL pointer dereference can
> > happen, and you didn't include more output from your tool, so you might
> > be wasting your time after all...
> > 
> > Anyway, I would start basically by mapping the paths from try_async_pf's
> > callers to mmu_set_spte and from there to rmap_add.
> > 
> > On the other hand, in the rmap_remove path, you probably should just
> > exit immediately if slot is NULL.  (Guangrong, do you have any idea why
> > we don't zap SPTEs in kvm_arch_free_memslot?)
> 
> Sorry for the delay replay due to Chinese New Year.
> 
> We do zap sptes in kvm_arch_flush_shadow_memslot() if the old memslot is
> being
> deleted or moved:
>     kvm_arch_flush_shadow_memslot() -> kvm_page_track_flush_slot()
>       ->kvm_mmu_invalidate_zap_pages_in_memslot()

Right, that is used even if ept=1.

> And i do not think the case can happen as we have called gfn_to_pfn()
> to check if the gfn is MMIO (see if it exists in memeslots) before
> mapping it into rmap. During it, the srcu prevents any memslot
> to go away.

It cannot be freed, but it can return NULL the second time.  RCU does not
guarantee that no change happens; if you want consistent reads you need to
read each pointer at most once.

Paolo
Xiao Guangrong Feb. 26, 2018, 7:38 a.m. UTC | #8
On 02/25/2018 04:48 AM, Paolo Bonzini wrote:
> 
> 
> ----- Original Message -----
>> From: "Xiao Guangrong" <guangrong.xiao@gmail.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>, "Joe Moriarty" <joe.moriarty@oracle.com>, rkrcmar@redhat.com,
>> kvm@vger.kernel.org, "Xiao Guangrong" <xiaoguangrong@tencent.com>
>> Sent: Saturday, February 24, 2018 4:19:59 AM
>> Subject: Re: [PATCH v1 1/1] KVM: X86: NULL pointer dereference [null-pointer-deref] (CWE 476) problem
>>
>>
>>
>> On 02/15/2018 01:33 AM, Paolo Bonzini wrote:
>>> On 14/02/2018 17:55, Joe Moriarty wrote:
>>>>>
>>>> Hi Paolo,
>>>>
>>>> Thank you for the review.  I wasn't sure if the change I proposed was
>>>> correct or not.  I will take your suggestion of posting to the mailing
>>>> list instead of as a patch the next time I encounter a situation like
>>>> this again.  In the meantime, I will look at doing your suggestion of
>>>> passing kvm_memory_slot down to gfn_to_rmap() and the other functions
>>>> you pointed out above for the next version of the patch.
>>>
>>> It's not easy, but I can send you a box of beers if you get round to it.
>>>    Note that I'm still not sure how the NULL pointer dereference can
>>> happen, and you didn't include more output from your tool, so you might
>>> be wasting your time after all...
>>>
>>> Anyway, I would start basically by mapping the paths from try_async_pf's
>>> callers to mmu_set_spte and from there to rmap_add.
>>>
>>> On the other hand, in the rmap_remove path, you probably should just
>>> exit immediately if slot is NULL.  (Guangrong, do you have any idea why
>>> we don't zap SPTEs in kvm_arch_free_memslot?)
>>
>> Sorry for the delay replay due to Chinese New Year.
>>
>> We do zap sptes in kvm_arch_flush_shadow_memslot() if the old memslot is
>> being
>> deleted or moved:
>>      kvm_arch_flush_shadow_memslot() -> kvm_page_track_flush_slot()
>>        ->kvm_mmu_invalidate_zap_pages_in_memslot()
> 
> Right, that is used even if ept=1.
> 
>> And i do not think the case can happen as we have called gfn_to_pfn()
>> to check if the gfn is MMIO (see if it exists in memeslots) before
>> mapping it into rmap. During it, the srcu prevents any memslot
>> to go away.
> 
> It cannot be freed, but it can return NULL the second time.  RCU does not
> guarantee that no change happen; if you want consistent reads you need to
> read each pointer at most once.

That is the reason the invalid memslot comes in.

KVM marks the memslot as invalid first, so the MMU will see the invalid
slot rather than NULL. A new invocation of the #PF handler will see that
the 'invalid' bit is set, which makes the handler exit directly.
Paolo Bonzini Feb. 26, 2018, 8:59 a.m. UTC | #9
On 26/02/2018 08:38, Xiao Guangrong wrote:
>>
>>
>>> And i do not think the case can happen as we have called gfn_to_pfn()
>>> to check if the gfn is MMIO (see if it exists in memeslots) before
>>> mapping it into rmap. During it, the srcu prevents any memslot
>>> to go away.
>>
>> It cannot be freed, but it can return NULL the second time.  RCU does not
>> guarantee that no change happen; if you want consistent reads you need to
>> read each pointer at most once.
> 
> That is the reason why invalid memslot comes in.
> 
> KVM sets the memslot as invalid first then mmu will see the invalid slot
> rather than NULL. A new call of #PF handler will see 'invalid' bit is set
> that make the handler directly exit.

How can this be guaranteed?  First, KVM_MEMSLOT_INVALID is not even set
under any lock.  Second, that doesn't fix the problem that KVM accesses
the memslots array twice: the first time before KVM_MEMSLOT_INVALID is
set, and the second time after the new array has been installed.

Paolo
Xiao Guangrong Feb. 26, 2018, 11:53 a.m. UTC | #10
On 02/26/2018 04:59 PM, Paolo Bonzini wrote:
> On 26/02/2018 08:38, Xiao Guangrong wrote:
>>>
>>>
>>>> And i do not think the case can happen as we have called gfn_to_pfn()
>>>> to check if the gfn is MMIO (see if it exists in memeslots) before
>>>> mapping it into rmap. During it, the srcu prevents any memslot
>>>> to go away.
>>>
>>> It cannot be freed, but it can return NULL the second time.  RCU does not
>>> guarantee that no change happen; if you want consistent reads you need to
>>> read each pointer at most once.
>>
>> That is the reason why invalid memslot comes in.
>>
>> KVM sets the memslot as invalid first then mmu will see the invalid slot
>> rather than NULL. A new call of #PF handler will see 'invalid' bit is set
>> that make the handler directly exit.
> 
> How can this be guaranteed?  First, KVM_MEMSLOT_INVALID is not even set
> under any lock.

Yes, so the user can see either the valid memslot or the invalid memslot;
in both cases the memslot cannot be NULL, so that's okay.

>  Second, that doesn't fix the problem that KVM accesses
> the memslots array twice---the first before KVM_MEMSLOT_INVALID is set,
> and the second time after the new array has been installed.

The user cannot cross over from the invalid memslot to the newly installed
memslot, as the user must hold the SRCU lock during its operation and
__kvm_set_memory_region() always waits out an SRCU grace period for each
status update of the memslot, i.e.:

__kvm_set_memory_region()                    vCPU 0

     set memslot invalid
                                            srcu-lock
                                            gfn_to_pfn
                                            add gfn to rmap
                                            srcu-unlock

     synchronize_srcu_expedited()

                                           srcu-lock
                                           case 1: sees that the memslot is invalid, then exits directly

     delete the memslot or add new memslot

                                           case 2: sees that the memslot is gone and treats it as
                                                   MMIO, OR sees the newly installed memslot as a
                                                   normal memslot
     synchronize_srcu_expedited()

Patch

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 8eca1d04aeb8..69d41b5d0948 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1239,7 +1239,7 @@  static struct kvm_rmap_head *__gfn_to_rmap(gfn_t gfn, int level,
 {
 	unsigned long idx;
 
-	idx = gfn_to_index(gfn, slot->base_gfn, level);
+	idx = gfn_to_index(gfn, slot ? slot->base_gfn : 0, level);
 	return &slot->arch.rmap[level - PT_PAGE_TABLE_LEVEL][idx];
 }