diff mbox

[1/3] kvm: fix irq 0 assignment

Message ID 1236153269-8825-2-git-send-email-sheng@linux.intel.com (mailing list archive)
State Superseded
Headers show

Commit Message

Sheng Yang March 4, 2009, 7:54 a.m. UTC
Shouldn't update assigned irq if host irq is 0, which means uninitialized
or don't support INTx.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 qemu/hw/device-assignment.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

Chris Wedgwood March 4, 2009, 9:53 a.m. UTC | #1
On Wed, Mar 04, 2009 at 03:54:27PM +0800, Sheng Yang wrote:

> Shouldn't update assigned irq if host irq is 0, which means
> uninitialized or don't support INTx.

Is that generally true?
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sheng Yang March 4, 2009, 9:58 a.m. UTC | #2
On Wednesday 04 March 2009 17:53:52 Chris Wedgwood wrote:
> On Wed, Mar 04, 2009 at 03:54:27PM +0800, Sheng Yang wrote:
> > Shouldn't update assigned irq if host irq is 0, which means
> > uninitialized or don't support INTx.
>
> Is that generally true?

Host irq 0 is reserved for PIT timer, at least for x86.
Sheng Yang March 4, 2009, 10:02 a.m. UTC | #3
On Wednesday 04 March 2009 17:58:31 Sheng Yang wrote:
> On Wednesday 04 March 2009 17:53:52 Chris Wedgwood wrote:
> > On Wed, Mar 04, 2009 at 03:54:27PM +0800, Sheng Yang wrote:
> > > Shouldn't update assigned irq if host irq is 0, which means
> > > uninitialized or don't support INTx.
> >
> > Is that generally true?
>
> Host irq 0 is reserved for PIT timer, at least for x86.

Or in some other condition it would be used for pci/pcie device? And please 
notice that currently we only support assign with x86 and IA64. (I really feel 
there is some reason for you ask. :) )
Marcelo Tosatti March 4, 2009, 7:30 p.m. UTC | #4
On Wed, Mar 04, 2009 at 03:54:27PM +0800, Sheng Yang wrote:
> Shouldn't update assigned irq if host irq is 0, which means uninitialized
> or don't support INTx.
> 
> Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> ---
>  qemu/hw/device-assignment.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
> index 32fba2a..7f9c789 100644
> --- a/qemu/hw/device-assignment.c
> +++ b/qemu/hw/device-assignment.c
> @@ -574,6 +574,10 @@ static int assign_irq(AssignedDevInfo *adev)
>      AssignedDevice *dev = adev->assigned_dev;
>      int irq, r = 0;
>  
> +    /* irq 0 means uninitialized */
> +    if (dev->real_device.irq == 0)
> +        return 0;
> +
>      irq = pci_map_irq(&dev->dev, dev->intpin);
>      irq = piix_get_irq(irq);

Sheng,

Please do not special case irq 0. The fact that only x86/IA64 are
supported at the moment does not mean other architectures can't 
use it.

Also, the kernel code to handle dev/irq assignment is quite messy. It
should be only a mechanism, with userspace controlling the policy.

For example:

        if (match->irq_requested_type & KVM_ASSIGNED_DEV_MSIX)
                current_flags |= KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX;
        else if ((match->irq_requested_type & KVM_ASSIGNED_DEV_HOST_MSI) &&
                 (match->irq_requested_type & KVM_ASSIGNED_DEV_GUEST_MSI))
                current_flags |= KVM_DEV_IRQ_ASSIGN_ENABLE_MSI;

And then in userspace you have

+    if (*ctrl_word & PCI_MSIX_ENABLE) {
+        assigned_irq_data.flags = KVM_DEV_IRQ_ASSIGN_ENABLE_MSIX;
+        if (assigned_dev_update_msix_mmio(pci_dev) < 0) {
+            perror("assigned_dev_update_msix_mmio");
+        }
+    }

We really need to fix this before merging anything else in this area.

So ideally the kernel only provides ioctl commands that do one thing 
at a time:

- assign host device
- unassign guest device

- assign host irq (INTx or MSI)
- assign dev irq (INTx or MSI)

- unassign host irq
- unassign dev irq

So you don't have to make policy decisions like

        } else {
                /*
                 * Guest require to disable device MSI, we disable MSI
                 * and
                 * re-enable INTx by default again. Notice it's only for
                 * non-msi2intx.
                 */
                assigned_device_update_intx(kvm, adev, airq);
                return 0;
        }

Because you do in userspace. However, I do not think it is necessary to
create new ioctl commands and deprecate the old ones (it seems possible
to do this using the flags passed in "struct kvm_assigned_irq")

However, the current userspace code probably relies on implicit
behaviour by the kernel part. IMHO it is fair to break older 
userspace. Better change it sooner than later.

For example, don't do this

        if ((changed_flags & KVM_DEV_IRQ_ASSIGN_MSI_ACTION) ||
            (msi2intx && match->dev->msi_enabled)) {

But just enable MSI in either host or guest (one at a time). Return an
error if you can't. And don't make such assumptions:

        } else if (assigned_irq->host_irq == 0 && match->dev->irq == 0)
                /* Host device IRQ 0 means don't support INTx */

To use the current ioctl commands, maybe enforce the "one thing at a
time" nature and fail otherwise.

Can you please investigate about this possibility? Because talk is
cheap and I'm not sure whether you can always separate host/guest IRQ
assignment, but probably as long as done carefully.

Am I missing any reason why policy can't live outside the kernel?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Wright March 4, 2009, 7:41 p.m. UTC | #5
* Marcelo Tosatti (mtosatti@redhat.com) wrote:
> Please do not special case irq 0. The fact that only x86/IA64 are
> supported at the moment does not mean other architectures can't 
> use it.

Seems logical to use a flag instead of overloading the irq value.

> Also, the kernel code to handle dev/irq assignment is quite messy. It
> should be only a mechanism, with userspace controlling the policy.

I really agree here.  I think some work to simplify/clarify would be
time well-spent!
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sheng Yang March 5, 2009, 2:47 a.m. UTC | #6
On Thursday 05 March 2009 03:41:41 Chris Wright wrote:
> * Marcelo Tosatti (mtosatti@redhat.com) wrote:
> > Please do not special case irq 0. The fact that only x86/IA64 are
> > supported at the moment does not mean other architectures can't
> > use it.
>
> Seems logical to use a flag instead of overloading the irq value.
>
> > Also, the kernel code to handle dev/irq assignment is quite messy. It
> > should be only a mechanism, with userspace controlling the policy.
>
> I really agree here.  I think some work to simplify/clarify would be
> time well-spent!

So do I... After discussion with Marcelo, I would work with him on this clean 
up. :)
diff mbox

Patch

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 32fba2a..7f9c789 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -574,6 +574,10 @@  static int assign_irq(AssignedDevInfo *adev)
     AssignedDevice *dev = adev->assigned_dev;
     int irq, r = 0;
 
+    /* irq 0 means uninitialized */
+    if (dev->real_device.irq == 0)
+        return 0;
+
     irq = pci_map_irq(&dev->dev, dev->intpin);
     irq = piix_get_irq(irq);