KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
diff mbox series

Message ID 20190912090050.20295-1-thuth@redhat.com
State New
Headers show
Series
  • KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
Related show

Commit Message

Thomas Huth Sept. 12, 2019, 9 a.m. UTC
When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
an interrupt, we convert them from the legacy struct kvm_s390_interrupt
to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
However, this function does not take care of all types of interrupts
that we can inject into the guest later (see do_inject_vcpu()). Since we
do not clear out the s390irq values before calling s390int_to_s390irq(),
there is a chance that we copy unwanted data from the kernel stack
into the guest memory later if the interrupt data has not been properly
initialized by s390int_to_s390irq().

Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
interrupt: s390int_to_s390irq() does not handle it, but the function
__deliver_pfault_init() will later copy the uninitialized stack data
from the ext.ext_params2 into the guest memory.

Fix it by handling that interrupt type in s390int_to_s390irq(), too.
And while we're at it, make sure that s390int_to_s390irq() now
directly returns -EINVAL for unknown interrupt types, so that we
do not run into this problem again in case we add more interrupt
types to do_inject_vcpu() sometime in the future.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 arch/s390/kvm/interrupt.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

David Hildenbrand Sept. 12, 2019, 9:14 a.m. UTC | #1
On 12.09.19 11:00, Thomas Huth wrote:
> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
> However, this function does not take care of all types of interrupts
> that we can inject into the guest later (see do_inject_vcpu()). Since we
> do not clear out the s390irq values before calling s390int_to_s390irq(),
> there is a chance that we copy unwanted data from the kernel stack
> into the guest memory later if the interrupt data has not been properly
> initialized by s390int_to_s390irq().
> 
> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
> interrupt: s390int_to_s390irq() does not handle it, but the function
> __deliver_pfault_init() will later copy the uninitialized stack data
> from the ext.ext_params2 into the guest memory.
> 
> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
> And while we're at it, make sure that s390int_to_s390irq() now
> directly returns -EINVAL for unknown interrupt types, so that we
> do not run into this problem again in case we add more interrupt
> types to do_inject_vcpu() sometime in the future.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 3e7efdd9228a..165dea4c7f19 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>  	case KVM_S390_MCHK:
>  		irq->u.mchk.mcic = s390int->parm64;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +		irq->u.ext.ext_params = s390int->parm;
> +		irq->u.ext.ext_params2 = s390int->parm64;
> +		break;
> +	case KVM_S390_RESTART:
> +	case KVM_S390_INT_CLOCK_COMP:
> +	case KVM_S390_INT_CPU_TIMER:
> +		break;
> +	default:
> +		return -EINVAL;
>  	}
>  	return 0;
>  }
> 

Wouldn't a safe fix be to initialize the struct to zero in the caller?
Thomas Huth Sept. 12, 2019, 9:20 a.m. UTC | #2
On 12/09/2019 11.14, David Hildenbrand wrote:
> On 12.09.19 11:00, Thomas Huth wrote:
>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>> However, this function does not take care of all types of interrupts
>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>> there is a chance that we copy unwanted data from the kernel stack
>> into the guest memory later if the interrupt data has not been properly
>> initialized by s390int_to_s390irq().
>>
>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>> interrupt: s390int_to_s390irq() does not handle it, but the function
>> __deliver_pfault_init() will later copy the uninitialized stack data
>> from the ext.ext_params2 into the guest memory.
>>
>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>> And while we're at it, make sure that s390int_to_s390irq() now
>> directly returns -EINVAL for unknown interrupt types, so that we
>> do not run into this problem again in case we add more interrupt
>> types to do_inject_vcpu() sometime in the future.
>>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 3e7efdd9228a..165dea4c7f19 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>  	case KVM_S390_MCHK:
>>  		irq->u.mchk.mcic = s390int->parm64;
>>  		break;
>> +	case KVM_S390_INT_PFAULT_INIT:
>> +		irq->u.ext.ext_params = s390int->parm;
>> +		irq->u.ext.ext_params2 = s390int->parm64;
>> +		break;
>> +	case KVM_S390_RESTART:
>> +	case KVM_S390_INT_CLOCK_COMP:
>> +	case KVM_S390_INT_CPU_TIMER:
>> +		break;
>> +	default:
>> +		return -EINVAL;
>>  	}
>>  	return 0;
>>  }
>>
> 
> Wouldn't a safe fix be to initialize the struct to zero in the caller?

That's of course possible, too. But that means that we always have to
zero out the whole structure, so that's a little bit more of overhead
(well, it likely doesn't matter for such a legacy ioctl).

But the more important question: Do we then still care of fixing the
PFAULT_INIT interrupt here? Since it requires a parameter, the "case
KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.

 Thomas
Janosch Frank Sept. 12, 2019, 9:28 a.m. UTC | #3
On 9/12/19 11:20 AM, Thomas Huth wrote:
> On 12/09/2019 11.14, David Hildenbrand wrote:
>> On 12.09.19 11:00, Thomas Huth wrote:
>>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>>> However, this function does not take care of all types of interrupts
>>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>>> there is a chance that we copy unwanted data from the kernel stack
>>> into the guest memory later if the interrupt data has not been properly
>>> initialized by s390int_to_s390irq().
>>>
>>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>>> interrupt: s390int_to_s390irq() does not handle it, but the function
>>> __deliver_pfault_init() will later copy the uninitialized stack data
>>> from the ext.ext_params2 into the guest memory.
>>>
>>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>>> And while we're at it, make sure that s390int_to_s390irq() now
>>> directly returns -EINVAL for unknown interrupt types, so that we
>>> do not run into this problem again in case we add more interrupt
>>> types to do_inject_vcpu() sometime in the future.
>>>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>> ---
>>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>> index 3e7efdd9228a..165dea4c7f19 100644
>>> --- a/arch/s390/kvm/interrupt.c
>>> +++ b/arch/s390/kvm/interrupt.c
>>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>>  	case KVM_S390_MCHK:
>>>  		irq->u.mchk.mcic = s390int->parm64;
>>>  		break;
>>> +	case KVM_S390_INT_PFAULT_INIT:
>>> +		irq->u.ext.ext_params = s390int->parm;
>>> +		irq->u.ext.ext_params2 = s390int->parm64;
>>> +		break;
>>> +	case KVM_S390_RESTART:
>>> +	case KVM_S390_INT_CLOCK_COMP:
>>> +	case KVM_S390_INT_CPU_TIMER:
>>> +		break;
>>> +	default:
>>> +		return -EINVAL;
>>>  	}
>>>  	return 0;
>>>  }
>>>
>>
>> Wouldn't a safe fix be to initialize the struct to zero in the caller?
> 
> That's of course possible, too. But that means that we always have to
> zero out the whole structure, so that's a little bit more of overhead
> (well, it likely doesn't matter for such a legacy ioctl).

Or memset it in s390int_to_s390irq so the caller doesn't need to know.
That should be fast enough for such a small struct.

> 
> But the more important question: Do we then still care of fixing the
> PFAULT_INIT interrupt here? Since it requires a parameter, the "case
> KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.

Let's wait for Christian's opinion on that.
Christian Borntraeger Sept. 12, 2019, 10:47 a.m. UTC | #4
On 12.09.19 11:00, Thomas Huth wrote:
> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
> However, this function does not take care of all types of interrupts
> that we can inject into the guest later (see do_inject_vcpu()). Since we
> do not clear out the s390irq values before calling s390int_to_s390irq(),
> there is a chance that we copy unwanted data from the kernel stack
> into the guest memory later if the interrupt data has not been properly
> initialized by s390int_to_s390irq().

You mean by using the migration callbacks to get all interrupts back to 
userspace?



> 
> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
> interrupt: s390int_to_s390irq() does not handle it, but the function
> __deliver_pfault_init() will later copy the uninitialized stack data
> from the ext.ext_params2 into the guest memory.


Shouldnt we add some more detailed description how this can happen?
Something like
"By using the KVM_S390_INTERRUPT ioctl with a KVM_S390_INT_PFAULT_INIT
interrupt followed by the KVM_S390_GET_IRQ_STATE ioctl the user can
extract a value from the kernel stack."


> 
> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
> And while we're at it, make sure that s390int_to_s390irq() now
> directly returns -EINVAL for unknown interrupt types, so that we
> do not run into this problem again in case we add more interrupt
> types to do_inject_vcpu() sometime in the future.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Cc: stable

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 3e7efdd9228a..165dea4c7f19 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>  	case KVM_S390_MCHK:
>  		irq->u.mchk.mcic = s390int->parm64;
>  		break;
> +	case KVM_S390_INT_PFAULT_INIT:
> +		irq->u.ext.ext_params = s390int->parm;
> +		irq->u.ext.ext_params2 = s390int->parm64;
> +		break;
> +	case KVM_S390_RESTART:
> +	case KVM_S390_INT_CLOCK_COMP:
> +	case KVM_S390_INT_CPU_TIMER:
> +		break;
> +	default:
> +		return -EINVAL;
>  	}
>  	return 0;
>  }
>
Christian Borntraeger Sept. 12, 2019, 10:52 a.m. UTC | #5
On 12.09.19 11:20, Thomas Huth wrote:
> On 12/09/2019 11.14, David Hildenbrand wrote:
>> On 12.09.19 11:00, Thomas Huth wrote:
>>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>>> However, this function does not take care of all types of interrupts
>>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>>> there is a chance that we copy unwanted data from the kernel stack
>>> into the guest memory later if the interrupt data has not been properly
>>> initialized by s390int_to_s390irq().
>>>
>>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>>> interrupt: s390int_to_s390irq() does not handle it, but the function
>>> __deliver_pfault_init() will later copy the uninitialized stack data
>>> from the ext.ext_params2 into the guest memory.
>>>
>>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>>> And while we're at it, make sure that s390int_to_s390irq() now
>>> directly returns -EINVAL for unknown interrupt types, so that we
>>> do not run into this problem again in case we add more interrupt
>>> types to do_inject_vcpu() sometime in the future.
>>>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>> ---
>>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>> index 3e7efdd9228a..165dea4c7f19 100644
>>> --- a/arch/s390/kvm/interrupt.c
>>> +++ b/arch/s390/kvm/interrupt.c
>>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>>  	case KVM_S390_MCHK:
>>>  		irq->u.mchk.mcic = s390int->parm64;
>>>  		break;
>>> +	case KVM_S390_INT_PFAULT_INIT:
>>> +		irq->u.ext.ext_params = s390int->parm;
>>> +		irq->u.ext.ext_params2 = s390int->parm64;
>>> +		break;
>>> +	case KVM_S390_RESTART:
>>> +	case KVM_S390_INT_CLOCK_COMP:
>>> +	case KVM_S390_INT_CPU_TIMER:
>>> +		break;
>>> +	default:
>>> +		return -EINVAL;
>>>  	}
>>>  	return 0;
>>>  }
>>>
>>
>> Wouldn't a safe fix be to initialize the struct to zero in the caller?
> 
> That's of course possible, too. But that means that we always have to
> zero out the whole structure, so that's a little bit more of overhead
> (well, it likely doesn't matter for such a legacy ioctl).

Yes doing something like

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c19a24e940a1..b1f6f434af5d 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4332,7 +4332,7 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
        }
        case KVM_S390_INTERRUPT: {
                struct kvm_s390_interrupt s390int;
-               struct kvm_s390_irq s390irq;
+               struct kvm_s390_irq s390irq = {};
 
                if (copy_from_user(&s390int, argp, sizeof(s390int)))
                        return -EFAULT;

would certainly be ok as well, but

> But the more important question: Do we then still care of fixing the
> PFAULT_INIT interrupt here? Since it requires a parameter, the "case
> KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.

as long as we we this interface we should fix it and we should do the
pfault thing correctly. 
Maybe we should start to deprecate this interface and remove it.
For the time being Thomas fix is certainly good enough. We might want to
add the designated initializer as an additional safety barrier.
David Hildenbrand Sept. 12, 2019, 10:58 a.m. UTC | #6
On 12.09.19 11:20, Thomas Huth wrote:
> On 12/09/2019 11.14, David Hildenbrand wrote:
>> On 12.09.19 11:00, Thomas Huth wrote:
>>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>>> However, this function does not take care of all types of interrupts
>>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>>> there is a chance that we copy unwanted data from the kernel stack
>>> into the guest memory later if the interrupt data has not been properly
>>> initialized by s390int_to_s390irq().
>>>
>>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>>> interrupt: s390int_to_s390irq() does not handle it, but the function
>>> __deliver_pfault_init() will later copy the uninitialized stack data
>>> from the ext.ext_params2 into the guest memory.
>>>
>>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>>> And while we're at it, make sure that s390int_to_s390irq() now
>>> directly returns -EINVAL for unknown interrupt types, so that we
>>> do not run into this problem again in case we add more interrupt
>>> types to do_inject_vcpu() sometime in the future.
>>>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>> ---
>>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>> index 3e7efdd9228a..165dea4c7f19 100644
>>> --- a/arch/s390/kvm/interrupt.c
>>> +++ b/arch/s390/kvm/interrupt.c
>>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>>  	case KVM_S390_MCHK:
>>>  		irq->u.mchk.mcic = s390int->parm64;
>>>  		break;
>>> +	case KVM_S390_INT_PFAULT_INIT:
>>> +		irq->u.ext.ext_params = s390int->parm;
>>> +		irq->u.ext.ext_params2 = s390int->parm64;
>>> +		break;
>>> +	case KVM_S390_RESTART:
>>> +	case KVM_S390_INT_CLOCK_COMP:
>>> +	case KVM_S390_INT_CPU_TIMER:
>>> +		break;
>>> +	default:
>>> +		return -EINVAL;
>>>  	}
>>>  	return 0;
>>>  }
>>>
>>
>> Wouldn't a safe fix be to initialize the struct to zero in the caller?
> 
> That's of course possible, too. But that means that we always have to
> zero out the whole structure, so that's a little bit more of overhead
> (well, it likely doesn't matter for such a legacy ioctl).

I would vote for doing this as well.

> 
> But the more important question: Do we then still care of fixing the
> PFAULT_INIT interrupt here? Since it requires a parameter, the "case
> KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.
> 

That's indeed true.

Reviewed-by: David Hildenbrand <david@redhat.com>

>  Thomas
>
Christian Borntraeger Sept. 12, 2019, 11 a.m. UTC | #7
On 12.09.19 12:58, David Hildenbrand wrote:
> On 12.09.19 11:20, Thomas Huth wrote:
>> On 12/09/2019 11.14, David Hildenbrand wrote:
>>> On 12.09.19 11:00, Thomas Huth wrote:
>>>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>>>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>>>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>>>> However, this function does not take care of all types of interrupts
>>>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>>>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>>>> there is a chance that we copy unwanted data from the kernel stack
>>>> into the guest memory later if the interrupt data has not been properly
>>>> initialized by s390int_to_s390irq().
>>>>
>>>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>>>> interrupt: s390int_to_s390irq() does not handle it, but the function
>>>> __deliver_pfault_init() will later copy the uninitialized stack data
>>>> from the ext.ext_params2 into the guest memory.
>>>>
>>>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>>>> And while we're at it, make sure that s390int_to_s390irq() now
>>>> directly returns -EINVAL for unknown interrupt types, so that we
>>>> do not run into this problem again in case we add more interrupt
>>>> types to do_inject_vcpu() sometime in the future.
>>>>
>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>> ---
>>>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>>> index 3e7efdd9228a..165dea4c7f19 100644
>>>> --- a/arch/s390/kvm/interrupt.c
>>>> +++ b/arch/s390/kvm/interrupt.c
>>>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>>>  	case KVM_S390_MCHK:
>>>>  		irq->u.mchk.mcic = s390int->parm64;
>>>>  		break;
>>>> +	case KVM_S390_INT_PFAULT_INIT:
>>>> +		irq->u.ext.ext_params = s390int->parm;
>>>> +		irq->u.ext.ext_params2 = s390int->parm64;
>>>> +		break;
>>>> +	case KVM_S390_RESTART:
>>>> +	case KVM_S390_INT_CLOCK_COMP:
>>>> +	case KVM_S390_INT_CPU_TIMER:
>>>> +		break;
>>>> +	default:
>>>> +		return -EINVAL;
>>>>  	}
>>>>  	return 0;
>>>>  }
>>>>
>>>
>>> Wouldn't a safe fix be to initialize the struct to zero in the caller?
>>
>> That's of course possible, too. But that means that we always have to
>> zero out the whole structure, so that's a little bit more of overhead
>> (well, it likely doesn't matter for such a legacy ioctl).
> 
> I would vote for doing this as well.

Yes, lets also do the designated initializer, add more text to the patch
description (or should we not?) add cc stable and I will pick a v2. 
> 
>>
>> But the more important question: Do we then still care of fixing the
>> PFAULT_INIT interrupt here? Since it requires a parameter, the "case
>> KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.
>>
> 
> That's indeed true.
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
>>  Thomas
>>
> 
>
Thomas Huth Sept. 12, 2019, 11:08 a.m. UTC | #8
On 12/09/2019 12.47, Christian Borntraeger wrote:
> 
> 
> On 12.09.19 11:00, Thomas Huth wrote:
>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>> However, this function does not take care of all types of interrupts
>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>> there is a chance that we copy unwanted data from the kernel stack
>> into the guest memory later if the interrupt data has not been properly
>> initialized by s390int_to_s390irq().
> 
> You mean by using the migration callbacks to get all interrupts back to 
> userspace?

Oh, I was not thinking about GET_IRQ_STATE yet, I was thinking about
__deliver_pfault_init() which would deliver the value into the guest
memory (from where the userspace program could extract it again).

>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>> interrupt: s390int_to_s390irq() does not handle it, but the function
>> __deliver_pfault_init() will later copy the uninitialized stack data
>> from the ext.ext_params2 into the guest memory.
> 
> Shouldnt we add some more detailed description how this can happen?
> Something like
> "By using the KVM_S390_INTERRUPT ioctl with a KVM_S390_INT_PFAULT_INIT
> interrupt followed by the KVM_S390_GET_IRQ_STATE ioctl the user can
> extract a value from the kernel stack."

GET_IRQ_STATE certainly deserves to be mentioned here, I'll add it to
the patch description and will send a v2.

 Thomas
Thomas Huth Sept. 12, 2019, 11:23 a.m. UTC | #9
On 12/09/2019 12.52, Christian Borntraeger wrote:
> 
> 
> On 12.09.19 11:20, Thomas Huth wrote:
>> On 12/09/2019 11.14, David Hildenbrand wrote:
>>> On 12.09.19 11:00, Thomas Huth wrote:
>>>> When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
>>>> an interrupt, we convert them from the legacy struct kvm_s390_interrupt
>>>> to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
>>>> However, this function does not take care of all types of interrupts
>>>> that we can inject into the guest later (see do_inject_vcpu()). Since we
>>>> do not clear out the s390irq values before calling s390int_to_s390irq(),
>>>> there is a chance that we copy unwanted data from the kernel stack
>>>> into the guest memory later if the interrupt data has not been properly
>>>> initialized by s390int_to_s390irq().
>>>>
>>>> Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
>>>> interrupt: s390int_to_s390irq() does not handle it, but the function
>>>> __deliver_pfault_init() will later copy the uninitialized stack data
>>>> from the ext.ext_params2 into the guest memory.
>>>>
>>>> Fix it by handling that interrupt type in s390int_to_s390irq(), too.
>>>> And while we're at it, make sure that s390int_to_s390irq() now
>>>> directly returns -EINVAL for unknown interrupt types, so that we
>>>> do not run into this problem again in case we add more interrupt
>>>> types to do_inject_vcpu() sometime in the future.
>>>>
>>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>>> ---
>>>>  arch/s390/kvm/interrupt.c | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>>>> index 3e7efdd9228a..165dea4c7f19 100644
>>>> --- a/arch/s390/kvm/interrupt.c
>>>> +++ b/arch/s390/kvm/interrupt.c
>>>> @@ -1960,6 +1960,16 @@ int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
>>>>  	case KVM_S390_MCHK:
>>>>  		irq->u.mchk.mcic = s390int->parm64;
>>>>  		break;
>>>> +	case KVM_S390_INT_PFAULT_INIT:
>>>> +		irq->u.ext.ext_params = s390int->parm;
>>>> +		irq->u.ext.ext_params2 = s390int->parm64;
>>>> +		break;
>>>> +	case KVM_S390_RESTART:
>>>> +	case KVM_S390_INT_CLOCK_COMP:
>>>> +	case KVM_S390_INT_CPU_TIMER:
>>>> +		break;
>>>> +	default:
>>>> +		return -EINVAL;
>>>>  	}
>>>>  	return 0;
>>>>  }
>>>>
>>>
>>> Wouldn't a safe fix be to initialize the struct to zero in the caller?
>>
>> That's of course possible, too. But that means that we always have to
>> zero out the whole structure, so that's a little bit more of overhead
>> (well, it likely doesn't matter for such a legacy ioctl).
> 
> Yes doing something like
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c19a24e940a1..b1f6f434af5d 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4332,7 +4332,7 @@ long kvm_arch_vcpu_async_ioctl(struct file *filp,
>         }
>         case KVM_S390_INTERRUPT: {
>                 struct kvm_s390_interrupt s390int;
> -               struct kvm_s390_irq s390irq;
> +               struct kvm_s390_irq s390irq = {};
>  
>                 if (copy_from_user(&s390int, argp, sizeof(s390int)))
>                         return -EFAULT;
> 
> would certainly be ok as well, but

I don't think that it's urgently necessary, but ok, if you all prefer to
have it, too, I can add it to my patch.

>> But the more important question: Do we then still care of fixing the
>> PFAULT_INIT interrupt here? Since it requires a parameter, the "case
>> KVM_S390_INT_PFAULT_INIT:" part would be required here anyway.
> 
> as long as we we this interface we should fix it and we should do the
> pfault thing correctly. 
> Maybe we should start to deprecate this interface and remove it.

Hmm, we already talked about deprecating support for pre-3.15 kernel
stuff in the past (see
https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
example), since this has been broken in QEMU since quite a while, but
the new KVM_S390_IRQ replacement has just been introduced with kernel
4.1 ... so removing this KVM_S390_INTERRUPT ioctl any time soon sounds
wrong to me, we might break some userspace programs that are still there
in the wild...

 Thomas
Cornelia Huck Sept. 13, 2019, 7:20 a.m. UTC | #10
On Thu, 12 Sep 2019 13:23:38 +0200
Thomas Huth <thuth@redhat.com> wrote:

> Hmm, we already talked about deprecating support for pre-3.15 kernel
> stuff in the past (see
> https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
> example),

Btw: did we ever do that? I don't quite recall what code we were
talking about...
Thomas Huth Sept. 13, 2019, 7:34 a.m. UTC | #11
On 13/09/2019 09.20, Cornelia Huck wrote:
> On Thu, 12 Sep 2019 13:23:38 +0200
> Thomas Huth <thuth@redhat.com> wrote:
> 
>> Hmm, we already talked about deprecating support for pre-3.15 kernel
>> stuff in the past (see
>> https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
>> example),
> 
> Btw: did we ever do that? I don't quite recall what code we were
> talking about...

We never really did - but we also never fixed the issue: If you run the
current QEMU on a kernel before 3.15, it refuses to work due to the
missing in-kernel FLIC device:

Initialization of device s390-flic-kvm failed: KVM is missing capability
KVM_CAP_DEVICE_CTR

Since nobody really complained so far that running QEMU with KVM is
still required on a kernel < 3.15, I think we could make this also
"official" now and improve the error message a little bit, pointing the
user to a kernel >= 3.15.

 Thomas
David Hildenbrand Sept. 13, 2019, 7:37 a.m. UTC | #12
On 13.09.19 09:34, Thomas Huth wrote:
> On 13/09/2019 09.20, Cornelia Huck wrote:
>> On Thu, 12 Sep 2019 13:23:38 +0200
>> Thomas Huth <thuth@redhat.com> wrote:
>>
>>> Hmm, we already talked about deprecating support for pre-3.15 kernel
>>> stuff in the past (see
>>> https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
>>> example),
>>
>> Btw: did we ever do that? I don't quite recall what code we were
>> talking about...
> 
> We never really did - but we also never fixed the issue: If you run the
> current QEMU on a kernel before 3.15, it refuses to work due to the
> missing in-kernel FLIC device:
> 
> Initialization of device s390-flic-kvm failed: KVM is missing capability
> KVM_CAP_DEVICE_CTR
> 
> Since nobody really complained so far that running QEMU with KVM is
> still required on a kernel < 3.15, I think we could make this also
> "official" now and improve the error message a little bit, pointing the
> user to a kernel >= 3.15.

Didn't we discuss back then to clean up *QEMU* and not the *kernel*?
Especially, to wait with cleanups until somebody requests to fix
instead. I mean you could have any user space in the wild that still
makes use of these interfaces ...
Thomas Huth Sept. 13, 2019, 7:43 a.m. UTC | #13
On 13/09/2019 09.37, David Hildenbrand wrote:
> On 13.09.19 09:34, Thomas Huth wrote:
>> On 13/09/2019 09.20, Cornelia Huck wrote:
>>> On Thu, 12 Sep 2019 13:23:38 +0200
>>> Thomas Huth <thuth@redhat.com> wrote:
>>>
>>>> Hmm, we already talked about deprecating support for pre-3.15 kernel
>>>> stuff in the past (see
>>>> https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
>>>> example),
>>>
>>> Btw: did we ever do that? I don't quite recall what code we were
>>> talking about...
>>
>> We never really did - but we also never fixed the issue: If you run the
>> current QEMU on a kernel before 3.15, it refuses to work due to the
>> missing in-kernel FLIC device:
>>
>> Initialization of device s390-flic-kvm failed: KVM is missing capability
>> KVM_CAP_DEVICE_CTR
>>
>> Since nobody really complained so far that running QEMU with KVM is
>> still required on a kernel < 3.15, I think we could make this also
>> "official" now and improve the error message a little bit, pointing the
>> user to a kernel >= 3.15.
> 
> Didn't we discuss back then to clean up *QEMU* and not the *kernel*?
> Especially, to wait with cleanups until somebody requests to fix
> instead. I mean you could have any user space in the wild that still
> makes use of these interfaces ...

Yes, that error message is part of QEMU, so I was referring to that one.
Sorry for mixing this into a mail thread on the kernel mailing list :-/

 Thomas
David Hildenbrand Sept. 13, 2019, 7:47 a.m. UTC | #14
On 13.09.19 09:43, Thomas Huth wrote:
> On 13/09/2019 09.37, David Hildenbrand wrote:
>> On 13.09.19 09:34, Thomas Huth wrote:
>>> On 13/09/2019 09.20, Cornelia Huck wrote:
>>>> On Thu, 12 Sep 2019 13:23:38 +0200
>>>> Thomas Huth <thuth@redhat.com> wrote:
>>>>
>>>>> Hmm, we already talked about deprecating support for pre-3.15 kernel
>>>>> stuff in the past (see
>>>>> https://wiki.qemu.org/ChangeLog/2.12#Future_incompatible_changes for
>>>>> example),
>>>>
>>>> Btw: did we ever do that? I don't quite recall what code we were
>>>> talking about...
>>>
>>> We never really did - but we also never fixed the issue: If you run the
>>> current QEMU on a kernel before 3.15, it refuses to work due to the
>>> missing in-kernel FLIC device:
>>>
>>> Initialization of device s390-flic-kvm failed: KVM is missing capability
>>> KVM_CAP_DEVICE_CTR
>>>
>>> Since nobody really complained so far that running QEMU with KVM is
>>> still required on a kernel < 3.15, I think we could make this also
>>> "official" now and improve the error message a little bit, pointing the
>>> user to a kernel >= 3.15.
>>
>> Didn't we discuss back then to clean up *QEMU* and not the *kernel*?
>> Especially, to wait with cleanups until somebody requests to fix
>> instead. I mean you could have any user space in the wild that still
>> makes use of these interfaces ...
> 
> Yes, that error message is part of QEMU, so I was referring to that one.
> Sorry for mixing this into a mail thread on the kernel mailing list :-/

Ah okay, I messed up then :)

Patch
diff mbox series

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 3e7efdd9228a..165dea4c7f19 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1960,6 +1960,16 @@  int s390int_to_s390irq(struct kvm_s390_interrupt *s390int,
 	case KVM_S390_MCHK:
 		irq->u.mchk.mcic = s390int->parm64;
 		break;
+	case KVM_S390_INT_PFAULT_INIT:
+		irq->u.ext.ext_params = s390int->parm;
+		irq->u.ext.ext_params2 = s390int->parm64;
+		break;
+	case KVM_S390_RESTART:
+	case KVM_S390_INT_CLOCK_COMP:
+	case KVM_S390_INT_CPU_TIMER:
+		break;
+	default:
+		return -EINVAL;
 	}
 	return 0;
 }