Message ID: 1344351168-2568-4-git-send-email-cornelia.huck@de.ibm.com (mailing list archive)
State: New, archived
On Tue, 7 Aug 2012 16:52:47 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> Add a driver for kvm guests that matches virtual ccw devices provided
> by the host as virtio bridge devices.

Hi Cornelia,

OK, this is a good opportunity to fix some limitations, just as
we did for virtio_mmio (drivers/virtio/virtio_mmio.c).

1) Please don't limit yourself to 32 feature bits! If you look at how
   virtio_mmio does it, they use a selector to index into a
   theoretically-infinite array of feature bits:

 * 0x010  R  HostFeatures     Features supported by the host
 * 0x014  W  HostFeaturesSel  Set of host features to access via HostFeatures
 *
 * 0x020  W  GuestFeatures    Features activated by the guest
 * 0x024  W  GuestFeaturesSel Set of activated features to set via GuestFeatures

2) Please also allow the guest to set the alignment for virtio ring
   layout (it controls the spacing between the rings), eg:

 * 0x03c  W  QueueAlign       Used Ring alignment for the current queue

3) Finally, make sure the guest can set the size of the queue, in case
   it can't allocate the size the host suggests, eg:

 * 0x034  R  QueueNumMax      Maximum size of the currently selected queue
 * 0x038  W  QueueNum         Queue size for the currently selected queue

This means the host can suggest huge queues, knowing the guest won't
simply fail if it does so.

Note that we're also speculating a move to a new vring format, which
will probably be little-endian. But you probably want a completely new
ccw code for that anyway.

Cheers,
Rusty.
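For reference, this is how the selector scheme works on the guest side in virtio_mmio terms; a minimal sketch assuming a 64-bit feature space and an already ioremapped register window "base" (offsets are the ones quoted above, error handling omitted):

#include <linux/io.h>
#include <linux/types.h>

/* Minimal sketch of selector-based feature reading as described for
 * virtio_mmio: write the word index to HostFeaturesSel, then read that
 * word from HostFeatures.  Only the first two 32-bit words are read here;
 * the register pair itself places no upper bound on the feature space. */
static u64 read_host_features(void __iomem *base)
{
	u64 features;

	writel(0, base + 0x014);                     /* HostFeaturesSel = 0 */
	features = readl(base + 0x010);              /* feature bits 0..31 */
	writel(1, base + 0x014);                     /* HostFeaturesSel = 1 */
	features |= (u64)readl(base + 0x010) << 32;  /* feature bits 32..63 */
	return features;
}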
On Wed, 08 Aug 2012 13:52:57 +0930
Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Tue, 7 Aug 2012 16:52:47 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > Add a driver for kvm guests that matches virtual ccw devices provided
> > by the host as virtio bridge devices.
>
> Hi Cornelia,
>
> OK, this is a good opportunity to fix some limitations, just as
> we did for virtio_mmio (drivers/virtio/virtio_mmio.c).
>
> 1) Please don't limit yourself to 32 feature bits! If you look at how
>    virtio_mmio does it, they use a selector to index into a
>    theoretically-infinite array of feature bits:
>
>  * 0x010  R  HostFeatures     Features supported by the host
>  * 0x014  W  HostFeaturesSel  Set of host features to access via HostFeatures
>  *
>  * 0x020  W  GuestFeatures    Features activated by the guest
>  * 0x024  W  GuestFeaturesSel Set of activated features to set via GuestFeatures

It should be easy to extend the data processed by the feature ccws to a
feature/index combination. Would it be practical to limit the index to
an 8 bit value?

> 2) Please also allow the guest to set the alignment for virtio ring
>    layout (it controls the spacing between the rings), eg:
>
>  * 0x03c  W  QueueAlign       Used Ring alignment for the current queue

I think the set_vq ccw could be extended with that info.

> 3) Finally, make sure the guest can set the size of the queue, in case
>    it can't allocate the size the host suggests, eg:
>
>  * 0x034  R  QueueNumMax      Maximum size of the currently selected queue
>  * 0x038  W  QueueNum         Queue size for the currently selected queue
>
> This means the host can suggest huge queues, knowing the guest won't
> simply fail if it does so.

Makes sense, I didn't like just failing to allocate either. The actual
size could probably go into the set_vq ccw as well.

> Note that we're also speculating a move to a new vring format, which
> will probably be little-endian. But you probably want a completely new
> ccw code for that anyway.

Do you have a pointer to that discussion handy?

If the host may support different vring formats, I'll probably want to
add some kind of discovery mechanism for that as well (what discovery
mechanism depends on whether this would be per-device or per-machine).
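To illustrate what that extension could look like, here is a sketch starting from the vq_info_block defined in the patch, with hypothetical align/index/num fields added; the exact layout is an assumption of this sketch, not something specified anywhere in the thread:

/* Hypothetical SET_VQ payload: the patch's vq_info_block (queue address
 * plus one 16-bit field) extended so the guest can also report the ring
 * alignment and the queue size it actually allocated.  Illustrative only. */
struct vq_info_block {
	__u64 queue;	/* guest address of the ring */
	__u32 align;	/* ring alignment chosen by the guest */
	__u16 index;	/* queue index */
	__u16 num;	/* queue size actually used by the guest */
} __attribute__ ((packed));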
On Mon, 13 Aug 2012 10:56:38 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> On Wed, 08 Aug 2012 13:52:57 +0930
> Rusty Russell <rusty@rustcorp.com.au> wrote:
>
> > On Tue, 7 Aug 2012 16:52:47 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > 1) Please don't limit yourself to 32 feature bits! If you look at how
> >    virtio_mmio does it, they use a selector to index into a
> >    theoretically-infinite array of feature bits:
>
> It should be easy to extend the data processed by the feature ccws to a
> feature/index combination. Would it be practical to limit the index to
> an 8 bit value?

256 feature bits? That seems like it could one day be limiting. Or an
8 bit accessor into feature words? 8192 seems enough for anyone sane.

> > Note that we're also speculating a move to a new vring format, which
> > will probably be little-endian. But you probably want a completely new
> > ccw code for that anyway.
>
> Do you have a pointer to that discussion handy?
>
> If the host may support different vring formats, I'll probably want to
> add some kind of discovery mechanism for that as well (what discovery
> mechanism depends on whether this would be per-device or per-machine).

It would be per-machine; per-device would be a bit crazy. We'd
deprecate the old ring format.

There's been no consistent thread on the ideas for a ring change,
unfortunately, but you can find interesting parts here, off this thread:

Message-ID: <8762gj6q5r.fsf@rustcorp.com.au>
Subject: Re: [RFC 7/11] virtio_pci: new, capability-aware driver.

Cheers,
Rusty.
On Tue, 14 Aug 2012 09:40:01 +0930
Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Mon, 13 Aug 2012 10:56:38 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > On Wed, 08 Aug 2012 13:52:57 +0930
> > Rusty Russell <rusty@rustcorp.com.au> wrote:
> >
> > > On Tue, 7 Aug 2012 16:52:47 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > > 1) Please don't limit yourself to 32 feature bits! If you look at how
> > >    virtio_mmio does it, they use a selector to index into a
> > >    theoretically-infinite array of feature bits:
> >
> > It should be easy to extend the data processed by the feature ccws to a
> > feature/index combination. Would it be practical to limit the index to
> > an 8 bit value?
>
> 256 feature bits? That seems like it could one day be limiting. Or an
> 8 bit accessor into feature words? 8192 seems enough for anyone sane.

An 8 bit accessor. I hope everybody stays on the sane side :)

> > > Note that we're also speculating a move to a new vring format, which
> > > will probably be little-endian. But you probably want a completely new
> > > ccw code for that anyway.
> >
> > Do you have a pointer to that discussion handy?
> >
> > If the host may support different vring formats, I'll probably want to
> > add some kind of discovery mechanism for that as well (what discovery
> > mechanism depends on whether this would be per-device or per-machine).
>
> It would be per-machine; per-device would be a bit crazy. We'd
> deprecate the old ring format.
>
> There's been no consistent thread on the ideas for a ring change,
> unfortunately, but you can find interesting parts here, off this thread:
>
> Message-ID: <8762gj6q5r.fsf@rustcorp.com.au>
> Subject: Re: [RFC 7/11] virtio_pci: new, capability-aware driver.

I've read a bit through this and it looks like this is really virtio-2
or so. How about discoverability by the guest? Guests will likely have
to support both formats, and forcing them to look at the feature bits
for each device in order to figure out the queue format feels wrong if
it is going to be the same format for the whole machine anyway.
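For illustration, the feature/index combination agreed on here could look like the structure below, where the 8 bit index selects one of 256 32-bit feature words (256 * 32 = 8192 bits, matching the number quoted above); the concrete layout would be defined by the virtio-ccw spec, so treat this as a sketch:

/* Sketch of a READ_FEAT/WRITE_FEAT ccw payload with an 8 bit accessor:
 * "index" selects which 32-bit word of the (up to 8192-bit) feature space
 * the "features" field refers to.  Illustrative layout only. */
struct virtio_feature_desc {
	__u32 features;	/* one 32-bit word of feature bits */
	__u8 index;	/* feature word selector, 0..255 */
} __attribute__ ((packed));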
Cornelia Huck <cornelia.huck@de.ibm.com> writes:

> Add a driver for kvm guests that matches virtual ccw devices provided
> by the host as virtio bridge devices.
>
> These virtio-ccw devices use a special set of channel commands in order
> to perform virtio functions.
>
> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

Hi,

Have you written an appendix for the virtio specification for
virtio-ccw? I think it would be good to include in this series for the
purposes of review.

Regards,

Anthony Liguori

> [...]
On Tue, 14 Aug 2012 13:03:34 +0200, Cornelia Huck <cornelia.huck@de.ibm.com> wrote:
> > It would be per-machine; per-device would be a bit crazy. We'd
> > deprecate the old ring format.
> >
> > There's been no consistent thread on the ideas for a ring change,
> > unfortunately, but you can find interesting parts here, off this thread:
> >
> > Message-ID: <8762gj6q5r.fsf@rustcorp.com.au>
> > Subject: Re: [RFC 7/11] virtio_pci: new, capability-aware driver.
>
> I've read a bit through this and it looks like this is really virtio-2
> or so. How about discoverability by the guest? Guests will likely have
> to support both formats, and forcing them to look at the feature bits
> for each device in order to figure out the queue format feels wrong if
> it is going to be the same format for the whole machine anyway.

Yes, it needs some out-of-band acknowledgement mechanism by the guest.
Might be worth putting a max version number somewhere, which the guest
writes to acknowledge (ie. currently it would be 1, and the guest would
always write a 1).

Cheers,
Rusty.
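To make that idea concrete, the acknowledgement could be as small as a single value the host publishes and the guest writes back; the structure below is purely a sketch of such a handshake (no such command or layout exists in the posted patch, and the eventual mechanism was still undecided in this thread):

/* Sketch only: possible payload for a "revision" handshake as discussed
 * above.  The host would report the highest transport revision it supports;
 * the guest writes back the revision it will use (1 for the current layout). */
struct virtio_rev_info {
	__u16 revision;
} __attribute__ ((packed));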
On Tue, 14 Aug 2012 14:56:07 -0500, Anthony Liguori <aliguori@us.ibm.com> wrote:
> Cornelia Huck <cornelia.huck@de.ibm.com> writes:
>
> > Add a driver for kvm guests that matches virtual ccw devices provided
> > by the host as virtio bridge devices.
> >
> > These virtio-ccw devices use a special set of channel commands in order
> > to perform virtio functions.
> >
> > Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
>
> Hi,
>
> Have you written an appendix for the virtio specification for
> virtio-ccw? I think it would be good to include in this series for the
> purposes of review.

Might be nice, but don't get fancy about it. Text will be fine and I
can cut & paste it in once it's finalized.

Cheers,
Rusty.
>> Have you written an appendix for the virtio specification for
>> virtio-ccw? I think it would be good to include in this series for the
>> purposes of review.
>
> Might be nice, but don't get fancy about it. Text will be fine and I
> can cut & paste it in once it's finalized.

There was a patch against the virtio spec in the patch series. Did it not
make it to you?

But Anthony is right, the spec is important, because the two main things
that we need to get right are the two interfaces guest<->host and
userspace(qemu)<->kvm. Everything else is not an ABI and can be fixed
later on if necessary.

Regarding the guest<->host interface, a lot of things are mandated by the
channel-I/O architecture. We have also reserved some IDs for virtio
(control unit type 0x3832, channel path type 0x32 and channel subsystem
id 0xfe), so we just have to focus on the virtio part (thanks Rusty for
the good feedback already).

Regarding the qemu<->kvm interface, it is important to have an interface
that
- allows us to be architecturally compliant
- is fast
- does not prevent features like vhost
- allows us to implement live migration
- ...?

Christian
On Wed, 15 Aug 2012 09:48:44 +0200, Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >> Have you written an appendix for the virtio specification for
> >> virtio-ccw? I think it would be good to include in this series for the
> >> purposes of review.
> >
> > Might be nice, but don't get fancy about it. Text will be fine and I
> > can cut & paste it in once it's finalized.
>
> There was a patch against the virtio spec in the patch series. Did it not
> make it to you?

Yes, I got it, thanks (not in this thread, but that's OK). Will hold
off until finalized.

Cheers,
Rusty.
diff --git a/arch/s390/include/asm/irq.h b/arch/s390/include/asm/irq.h index 2b9d418..b4bea53 100644 --- a/arch/s390/include/asm/irq.h +++ b/arch/s390/include/asm/irq.h @@ -31,6 +31,7 @@ enum interruption_class { IOINT_CTC, IOINT_APB, IOINT_CSC, + IOINT_VIR, NMI_NMI, NR_IRQS, }; diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c index dd7630d..2cc7eed 100644 --- a/arch/s390/kernel/irq.c +++ b/arch/s390/kernel/irq.c @@ -56,6 +56,7 @@ static const struct irq_class intrclass_names[] = { {.name = "CTC", .desc = "[I/O] CTC" }, {.name = "APB", .desc = "[I/O] AP Bus" }, {.name = "CSC", .desc = "[I/O] CHSC Subchannel" }, + {.name = "VIR", .desc = "[I/O] Virtual I/O Devices" }, {.name = "NMI", .desc = "[NMI] Machine Check" }, }; diff --git a/drivers/s390/kvm/Makefile b/drivers/s390/kvm/Makefile index 0815690..241891a 100644 --- a/drivers/s390/kvm/Makefile +++ b/drivers/s390/kvm/Makefile @@ -6,4 +6,4 @@ # it under the terms of the GNU General Public License (version 2 only) # as published by the Free Software Foundation. -obj-$(CONFIG_S390_GUEST) += kvm_virtio.o +obj-$(CONFIG_S390_GUEST) += kvm_virtio.o virtio_ccw.o diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c new file mode 100644 index 0000000..df0f994 --- /dev/null +++ b/drivers/s390/kvm/virtio_ccw.c @@ -0,0 +1,761 @@ +/* + * ccw based virtio transport + * + * Copyright IBM Corp. 2012 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Cornelia Huck <cornelia.huck@de.ibm.com> + */ + +#include <linux/kernel_stat.h> +#include <linux/init.h> +#include <linux/bootmem.h> +#include <linux/err.h> +#include <linux/virtio.h> +#include <linux/virtio_config.h> +#include <linux/slab.h> +#include <linux/virtio_console.h> +#include <linux/interrupt.h> +#include <linux/virtio_ring.h> +#include <linux/pfn.h> +#include <linux/async.h> +#include <linux/wait.h> +#include <linux/list.h> +#include <linux/bitops.h> +#include <linux/module.h> +#include <asm/io.h> +#include <asm/kvm_para.h> +#include <asm/setup.h> +#include <asm/irq.h> +#include <asm/cio.h> +#include <asm/ccwdev.h> + +/* + * virtio related functions + */ + +struct vq_config_block { + __u16 index; + __u16 num; +} __attribute__ ((packed)); + +#define VIRTIO_CCW_CONFIG_SIZE 0x100 +/* same as PCI config space size, should be enough for all drivers */ + +struct virtio_ccw_device { + struct virtio_device vdev; + __u8 status; + __u8 config[VIRTIO_CCW_CONFIG_SIZE]; + struct ccw_device *cdev; + struct ccw1 ccw; + __u32 area; + __u32 curr_io; + int err; + wait_queue_head_t wait_q; + spinlock_t lock; + struct list_head virtqueues; + unsigned long indicators; /* XXX - works because we're under 64 bit */ + struct vq_config_block *config_block; +}; + +struct vq_info_block { + __u64 queue; + __u16 num; +} __attribute__ ((packed)); + +struct virtio_ccw_vq_info { + struct virtqueue *vq; + int num; + int queue_index; + void *queue; + struct vq_info_block *info_block; + struct list_head node; +}; + +#define KVM_VIRTIO_CCW_RING_ALIGN 4096 + +#define CCW_CMD_SET_VQ 0x13 +#define CCW_CMD_VDEV_RESET 0x33 +#define CCW_CMD_SET_IND 0x43 +#define CCW_CMD_READ_FEAT 0x12 +#define CCW_CMD_WRITE_FEAT 0x11 +#define CCW_CMD_READ_CONF 0x22 +#define CCW_CMD_WRITE_CONF 0x21 +#define CCW_CMD_WRITE_STATUS 0x31 +#define CCW_CMD_READ_VQ_CONF 0x32 + +#define VIRTIO_CCW_DOING_SET_VQ 0x00010000 +#define VIRTIO_CCW_DOING_RESET 0x00040000 
+#define VIRTIO_CCW_DOING_READ_FEAT 0x00080000 +#define VIRTIO_CCW_DOING_WRITE_FEAT 0x00100000 +#define VIRTIO_CCW_DOING_READ_CONFIG 0x00200000 +#define VIRTIO_CCW_DOING_WRITE_CONFIG 0x00400000 +#define VIRTIO_CCW_DOING_WRITE_STATUS 0x00800000 +#define VIRTIO_CCW_DOING_SET_IND 0x01000000 +#define VIRTIO_CCW_DOING_READ_VQ_CONF 0x02000000 +#define VIRTIO_CCW_INTPARM_MASK 0xffff0000 + +static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev) +{ + return container_of(vdev, struct virtio_ccw_device, vdev); +} + +static int doing_io(struct virtio_ccw_device *vcdev, __u32 flag) +{ + unsigned long flags; + __u32 ret; + + spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags); + if (vcdev->err) + ret = vcdev->err; + else + ret = vcdev->curr_io & flag; + spin_unlock_irqrestore(get_ccwdev_lock(vcdev->cdev), flags); + return ret; +} + +static int ccw_io_helper(struct virtio_ccw_device *vcdev, __u32 intparm) +{ + int ret; + unsigned long flags; + int flag = intparm & VIRTIO_CCW_INTPARM_MASK; + + spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags); + ret = ccw_device_start(vcdev->cdev, &vcdev->ccw, intparm, 0, 0); + if (!ret) + vcdev->curr_io |= flag; + spin_unlock_irqrestore(get_ccwdev_lock(vcdev->cdev), flags); + wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0); + return ret ? ret : vcdev->err; +} + +static void virtio_ccw_kvm_notify(struct virtqueue *vq) +{ + struct virtio_ccw_vq_info *info = vq->priv; + struct virtio_ccw_device *vcdev; + struct subchannel_id schid; + __u32 reg2; + + vcdev = to_vc_device(info->vq->vdev); + ccw_device_get_schid(vcdev->cdev, &schid); + reg2 = *(__u32 *)&schid; + kvm_hypercall2(3 /* CCW_NOTIFY */, reg2, info->queue_index); +} + +static int virtio_ccw_read_vq_conf(struct virtio_ccw_device *vcdev, int index) +{ + vcdev->config_block->index = index; + vcdev->ccw.cmd_code = CCW_CMD_READ_VQ_CONF; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(struct vq_config_block); + vcdev->ccw.cda = (__u32)(unsigned long)(vcdev->config_block); + ccw_io_helper(vcdev, VIRTIO_CCW_DOING_READ_VQ_CONF); + return vcdev->config_block->num; +} + +static void virtio_ccw_del_vq(struct virtqueue *vq) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vq->vdev); + struct virtio_ccw_vq_info *info = vq->priv; + unsigned long flags; + unsigned long size; + int ret; + + /* Remove from our list. */ + spin_lock_irqsave(&vcdev->lock, flags); + list_del(&info->node); + spin_unlock_irqrestore(&vcdev->lock, flags); + + /* Release from host. 
*/ + info->info_block->queue = 0; + info->info_block->num = info->queue_index; + vcdev->ccw.cmd_code = CCW_CMD_SET_VQ; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(*info->info_block); + vcdev->ccw.cda = (__u32)(unsigned long)(info->info_block); + ret = ccw_io_helper(vcdev, VIRTIO_CCW_DOING_SET_VQ | info->queue_index); + if (ret) + dev_warn(&vq->vdev->dev, "Error %x while deleting queue %d", + ret, info->queue_index); + + vring_del_virtqueue(vq); + size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN)); + free_pages_exact(info->queue, size); + kfree(info->info_block); + kfree(info); +} + +static void virtio_ccw_del_vqs(struct virtio_device *vdev) +{ + struct virtqueue *vq, *n; + + list_for_each_entry_safe(vq, n, &vdev->vqs, list) + virtio_ccw_del_vq(vq); +} + +static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev, + int i, vq_callback_t *callback, + const char *name) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + int err; + struct virtqueue *vq; + struct virtio_ccw_vq_info *info; + unsigned long size; + unsigned long flags; + + /* Allocate queue. */ + info = kzalloc(sizeof(struct virtio_ccw_vq_info), GFP_KERNEL); + if (!info) { + dev_warn(&vcdev->cdev->dev, "no info\n"); + err = -ENOMEM; + goto out_err; + } + info->info_block = kzalloc(sizeof(*info->info_block), + GFP_DMA | GFP_KERNEL); + if (!info->info_block) { + dev_warn(&vcdev->cdev->dev, "no info block\n"); + err = -ENOMEM; + goto out_err; + } + info->queue_index = i; + info->num = virtio_ccw_read_vq_conf(vcdev, i); + size = PAGE_ALIGN(vring_size(info->num, KVM_VIRTIO_CCW_RING_ALIGN)); + info->queue = alloc_pages_exact(size, GFP_KERNEL | __GFP_ZERO); + if (info->queue == NULL) { + dev_warn(&vcdev->cdev->dev, "no queue\n"); + err = -ENOMEM; + goto out_err; + } + vq = vring_new_virtqueue(info->num, KVM_VIRTIO_CCW_RING_ALIGN, vdev, + true, info->queue, virtio_ccw_kvm_notify, + callback, name); + if (!vq) { + dev_warn(&vcdev->cdev->dev, "no vq\n"); + err = -ENOMEM; + free_pages_exact(info->queue, size); + goto out_err; + } + info->vq = vq; + vq->priv = info; + + /* Register it with the host. */ + info->info_block->queue = (__u64)info->queue; + info->info_block->num = info->queue_index; + vcdev->ccw.cmd_code = CCW_CMD_SET_VQ; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(*info->info_block); + vcdev->ccw.cda = (__u32)(unsigned long)(info->info_block); + err = ccw_io_helper(vcdev, VIRTIO_CCW_DOING_SET_VQ | info->queue_index); + if (err) { + dev_warn(&vcdev->cdev->dev, "SET_VQ failed\n"); + free_pages_exact(info->queue, size); + info->vq = NULL; + vq->priv = NULL; + goto out_err; + } + + /* Save it to our list. */ + spin_lock_irqsave(&vcdev->lock, flags); + list_add(&info->node, &vcdev->virtqueues); + spin_unlock_irqrestore(&vcdev->lock, flags); + + return vq; + +out_err: + if (info) + kfree(info->info_block); + kfree(info); + return ERR_PTR(err); +} + +static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, + struct virtqueue *vqs[], + vq_callback_t *callbacks[], + const char *names[]) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + int ret, i; + + for (i = 0; i < nvqs; ++i) { + vqs[i] = virtio_ccw_setup_vq(vdev, i, callbacks[i], names[i]); + if (IS_ERR(vqs[i])) { + ret = PTR_ERR(vqs[i]); + vqs[i] = NULL; + goto out; + } + } + /* Register queue indicators with host. 
*/ + vcdev->indicators = 0; + vcdev->ccw.cmd_code = CCW_CMD_SET_IND; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(vcdev->indicators); + vcdev->ccw.cda = (__u32)(unsigned long)(&vcdev->indicators); + ret = ccw_io_helper(vcdev, VIRTIO_CCW_DOING_SET_IND); + if (ret) + goto out; + return 0; +out: + virtio_ccw_del_vqs(vdev); + return ret; +} + +static void virtio_ccw_reset(struct virtio_device *vdev) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + + /* Send a reset ccw on device. */ + vcdev->ccw.cmd_code = CCW_CMD_VDEV_RESET; + vcdev->ccw.flags = 0; + vcdev->ccw.count = 0; + vcdev->ccw.cda = 0; + ccw_io_helper(vcdev, VIRTIO_CCW_DOING_RESET); +} + +static u32 virtio_ccw_get_features(struct virtio_device *vdev) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + u32 features; + int ret; + + /* Read the feature bits from the host. */ + vcdev->ccw.cmd_code = CCW_CMD_READ_FEAT; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(features); + vcdev->ccw.cda = vcdev->area; + ret = ccw_io_helper(vcdev, VIRTIO_CCW_DOING_READ_FEAT); + if (ret) + return 0; + + memcpy(&features, (void *)(unsigned long)vcdev->area, + sizeof(features)); + return le32_to_cpu(features); +} + +static void virtio_ccw_finalize_features(struct virtio_device *vdev) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + + /* Give virtio_ring a chance to accept features. */ + vring_transport_features(vdev); + + memcpy((void *)(unsigned long)vcdev->area, vdev->features, + sizeof(*vdev->features)); + /* Write the feature bits to the host. */ + vcdev->ccw.cmd_code = CCW_CMD_WRITE_FEAT; + /* Sigh. The kernel's features may be longer than the host's. */ + vcdev->ccw.flags = CCW_FLAG_SLI; + vcdev->ccw.count = sizeof(*vdev->features); + vcdev->ccw.cda = vcdev->area; + ccw_io_helper(vcdev, VIRTIO_CCW_DOING_WRITE_FEAT); +} + +static void virtio_ccw_get_config(struct virtio_device *vdev, + unsigned int offset, void *buf, unsigned len) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + int ret; + + /* Read the config area from the host. */ + vcdev->ccw.cmd_code = CCW_CMD_READ_CONF; + vcdev->ccw.flags = 0; + vcdev->ccw.count = offset + len; + vcdev->ccw.cda = vcdev->area; + ret = ccw_io_helper(vcdev, VIRTIO_CCW_DOING_READ_CONFIG); + if (ret) + return; + + memcpy(vcdev->config, (void *)(unsigned long)vcdev->area, + sizeof(vcdev->config)); + memcpy(buf, &vcdev->config[offset], len); +} + +static void virtio_ccw_set_config(struct virtio_device *vdev, + unsigned int offset, const void *buf, + unsigned len) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + + memcpy(&vcdev->config[offset], buf, len); + /* Write the config area to the host. */ + memcpy((void *)(unsigned long)vcdev->area, vcdev->config, + sizeof(vcdev->config)); + vcdev->ccw.cmd_code = CCW_CMD_WRITE_CONF; + vcdev->ccw.flags = 0; + vcdev->ccw.count = offset + len; + vcdev->ccw.cda = vcdev->area; + ccw_io_helper(vcdev, VIRTIO_CCW_DOING_WRITE_CONFIG); +} + +static u8 virtio_ccw_get_status(struct virtio_device *vdev) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + + return vcdev->status; +} + +static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status) +{ + struct virtio_ccw_device *vcdev = to_vc_device(vdev); + + /* Write the status to the host. 
*/ + vcdev->status = status; + memcpy((void *)(unsigned long)vcdev->area, &status, sizeof(status)); + vcdev->ccw.cmd_code = CCW_CMD_WRITE_STATUS; + vcdev->ccw.flags = 0; + vcdev->ccw.count = sizeof(status); + vcdev->ccw.cda = vcdev->area; + ccw_io_helper(vcdev, VIRTIO_CCW_DOING_WRITE_STATUS); +} + +static struct virtio_config_ops virtio_ccw_config_ops = { + .get_features = virtio_ccw_get_features, + .finalize_features = virtio_ccw_finalize_features, + .get = virtio_ccw_get_config, + .set = virtio_ccw_set_config, + .get_status = virtio_ccw_get_status, + .set_status = virtio_ccw_set_status, + .reset = virtio_ccw_reset, + .find_vqs = virtio_ccw_find_vqs, + .del_vqs = virtio_ccw_del_vqs, +}; + + +/* + * ccw bus driver related functions + */ + +static void virtio_ccw_release_dev(struct device *_d) +{ + struct virtio_device *dev = container_of(_d, struct virtio_device, + dev); + struct virtio_ccw_device *vcdev = to_vc_device(dev); + + kfree((void *)(unsigned long)vcdev->area); + kfree(vcdev->config_block); + kfree(vcdev); +} + +static int irb_is_error(struct irb *irb) +{ + if (scsw_cstat(&irb->scsw) != 0) + return 1; + if (scsw_dstat(&irb->scsw) & ~(DEV_STAT_CHN_END | DEV_STAT_DEV_END)) + return 1; + if (scsw_cc(&irb->scsw) != 0) + return 1; + return 0; +} + +static struct virtqueue *virtio_ccw_vq_by_ind(struct virtio_ccw_device *vcdev, + int index) +{ + struct virtio_ccw_vq_info *info; + unsigned long flags; + struct virtqueue *vq; + + vq = NULL; + spin_lock_irqsave(&vcdev->lock, flags); + list_for_each_entry(info, &vcdev->virtqueues, node) { + if (info->queue_index == index) { + vq = info->vq; + break; + } + } + spin_unlock_irqrestore(&vcdev->lock, flags); + return vq; +} + +static void virtio_ccw_int_handler(struct ccw_device *cdev, + unsigned long intparm, + struct irb *irb) +{ + __u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK; + struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev); + int i; + struct virtqueue *vq; + + /* Check if it's a notification from the host. */ + if ((intparm == 0) && + (scsw_stctl(&irb->scsw) == + (SCSW_STCTL_ALERT_STATUS | SCSW_STCTL_STATUS_PEND))) { + /* OK */ + } + if (irb_is_error(irb)) + vcdev->err = -EIO; /* XXX - use real error */ + if (vcdev->curr_io & activity) { + switch (activity) { + case VIRTIO_CCW_DOING_READ_FEAT: + case VIRTIO_CCW_DOING_WRITE_FEAT: + case VIRTIO_CCW_DOING_READ_CONFIG: + case VIRTIO_CCW_DOING_WRITE_CONFIG: + case VIRTIO_CCW_DOING_WRITE_STATUS: + case VIRTIO_CCW_DOING_SET_VQ: + case VIRTIO_CCW_DOING_SET_IND: + case VIRTIO_CCW_DOING_RESET: + case VIRTIO_CCW_DOING_READ_VQ_CONF: + vcdev->curr_io &= ~activity; + wake_up(&vcdev->wait_q); + break; + default: + /* don't know what to do... */ + dev_warn(&cdev->dev, "Suspicious activity '%08x'\n", + activity); + WARN_ON(1); + break; + } + } + for_each_set_bit(i, &vcdev->indicators, + sizeof(vcdev->indicators)) { + vq = virtio_ccw_vq_by_ind(vcdev, i); + vring_interrupt(0, vq); + clear_bit(i, &vcdev->indicators); + } +} + +/* + * We usually want to autoonline all devices, but give the admin + * a way to exempt devices from this. 
+ */ +#define __DEV_WORDS ((__MAX_SUBCHANNEL + (8*sizeof(long) - 1)) / \ + (8*sizeof(long))) +static unsigned long devs_no_auto[__MAX_SSID + 1][__DEV_WORDS]; + +static char *no_auto = ""; + +module_param(no_auto, charp, 0444); +MODULE_PARM_DESC(no_auto, "list of ccw bus id ranges not to be auto-onlined"); + +static int virtio_ccw_check_autoonline(struct ccw_device *cdev) +{ + struct ccw_dev_id id; + + ccw_device_get_id(cdev, &id); + if (test_bit(id.devno, devs_no_auto[id.ssid])) + return 0; + return 1; +} + +static void virtio_ccw_auto_online(void *data, async_cookie_t cookie) +{ + struct ccw_device *cdev = data; + int ret; + + ret = ccw_device_set_online(cdev); + if (ret) + dev_warn(&cdev->dev, "Failed to set online: %d\n", ret); +} + +static int virtio_ccw_probe(struct ccw_device *cdev) +{ + cdev->handler = virtio_ccw_int_handler; + + if (virtio_ccw_check_autoonline(cdev)) + async_schedule(virtio_ccw_auto_online, cdev); + return 0; +} + +static void virtio_ccw_remove(struct ccw_device *cdev) +{ + cdev->handler = NULL; +} + +static int virtio_ccw_offline(struct ccw_device *cdev) +{ + struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev); + + unregister_virtio_device(&vcdev->vdev); + dev_set_drvdata(&cdev->dev, NULL); + return 0; +} + + +/* Area needs to be big enough to fit status, features or configuration. */ +#define VIRTIO_AREA_SIZE VIRTIO_CCW_CONFIG_SIZE /* biggest possible use */ + +static int virtio_ccw_online(struct ccw_device *cdev) +{ + int ret; + struct virtio_ccw_device *vcdev; + + vcdev = kzalloc(sizeof(*vcdev), GFP_KERNEL); + if (!vcdev) { + dev_warn(&cdev->dev, "Could not get memory for virtio\n"); + ret = -ENOMEM; + goto out_free; + } + vcdev->area = (__u32)(unsigned long)kzalloc(VIRTIO_AREA_SIZE, + GFP_DMA | GFP_KERNEL); + if (!vcdev->area) { + dev_warn(&cdev->dev, "Cound not get memory for virtio\n"); + ret = -ENOMEM; + goto out_free; + } + vcdev->config_block = kzalloc(sizeof(*vcdev->config_block), + GFP_DMA | GFP_KERNEL); + if (!vcdev->config_block) { + ret = -ENOMEM; + goto out_free; + } + vcdev->vdev.dev.parent = &cdev->dev; + vcdev->vdev.dev.release = virtio_ccw_release_dev; + vcdev->vdev.config = &virtio_ccw_config_ops; + vcdev->cdev = cdev; + init_waitqueue_head(&vcdev->wait_q); + INIT_LIST_HEAD(&vcdev->virtqueues); + + dev_set_drvdata(&cdev->dev, vcdev); + vcdev->vdev.id.vendor = cdev->id.cu_type; + vcdev->vdev.id.device = cdev->id.cu_model; + ret = register_virtio_device(&vcdev->vdev); + if (ret) { + dev_warn(&cdev->dev, "Failed to register virtio device: %d\n", + ret); + goto out_put; + } + return 0; +out_put: + dev_set_drvdata(&cdev->dev, NULL); + put_device(&vcdev->vdev.dev); + return ret; +out_free: + if (vcdev) { + kfree((void *)(unsigned long)vcdev->area); + kfree(vcdev->config_block); + } + kfree(vcdev); + return ret; +} + +static int virtio_ccw_cio_notify(struct ccw_device *cdev, int event) +{ + /* TODO: Check whether we need special handling here. 
*/ + return 0; +} + +static struct ccw_device_id virtio_ids[] = { + { CCW_DEVICE(0x3832, 0) }, + {}, +}; +MODULE_DEVICE_TABLE(ccw, virtio_ids); + +static struct ccw_driver virtio_ccw_driver = { + .driver = { + .owner = THIS_MODULE, + .name = "virtio_ccw", + }, + .ids = virtio_ids, + .probe = virtio_ccw_probe, + .remove = virtio_ccw_remove, + .set_offline = virtio_ccw_offline, + .set_online = virtio_ccw_online, + .notify = virtio_ccw_cio_notify, + .int_class = IOINT_VIR, +}; + +static int __init pure_hex(char **cp, unsigned int *val, int min_digit, + int max_digit, int max_val) +{ + int diff; + + diff = 0; + *val = 0; + + while (diff <= max_digit) { + int value = hex_to_bin(**cp); + + if (value < 0) + break; + *val = *val * 16 + value; + (*cp)++; + diff++; + } + + if ((diff < min_digit) || (diff > max_digit) || (*val > max_val)) + return 1; + + return 0; +} + +static int __init parse_busid(char *str, unsigned int *cssid, + unsigned int *ssid, unsigned int *devno) +{ + char *str_work; + int rc, ret; + + rc = 1; + + if (*str == '\0') + goto out; + + str_work = str; + ret = pure_hex(&str_work, cssid, 1, 2, __MAX_CSSID); + if (ret || (str_work[0] != '.')) + goto out; + str_work++; + ret = pure_hex(&str_work, ssid, 1, 1, __MAX_SSID); + if (ret || (str_work[0] != '.')) + goto out; + str_work++; + ret = pure_hex(&str_work, devno, 4, 4, __MAX_SUBCHANNEL); + if (ret || (str_work[0] != '\0')) + goto out; + + rc = 0; +out: + return rc; +} + +static void __init no_auto_parse(void) +{ + unsigned int from_cssid, to_cssid, from_ssid, to_ssid, from, to; + char *parm, *str; + int rc; + + str = no_auto; + while ((parm = strsep(&str, ","))) { + rc = parse_busid(strsep(&parm, "-"), &from_cssid, + &from_ssid, &from); + if (rc) + continue; + if (parm != NULL) { + rc = parse_busid(parm, &to_cssid, + &to_ssid, &to); + if ((from_ssid > to_ssid) || + ((from_ssid == to_ssid) && (from > to))) + rc = -EINVAL; + } else { + to_cssid = from_cssid; + to_ssid = from_ssid; + to = from; + } + if (rc) + continue; + while ((from_ssid < to_ssid) || + ((from_ssid == to_ssid) && (from <= to))) { + set_bit(from, devs_no_auto[from_ssid]); + from++; + if (from > __MAX_SUBCHANNEL) { + from_ssid++; + from = 0; + } + } + } +} + +static int __init virtio_ccw_init(void) +{ + /* parse no_auto string before we do anything further */ + no_auto_parse(); + return ccw_driver_register(&virtio_ccw_driver); +} +module_init(virtio_ccw_init); + +static void __exit virtio_ccw_exit(void) +{ + ccw_driver_unregister(&virtio_ccw_driver); +} +module_exit(virtio_ccw_exit);
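As a usage note on the no_auto parameter handled above: parse_busid() expects ccw bus ids of the form cssid.ssid.devno (one or two hex digits, one hex digit, and exactly four hex digits respectively), either as single ids or as '-'-separated ranges, with entries separated by commas. The device numbers below are made up for illustration, and the "virtio_ccw." prefix assumes the usual per-module prefix for built-in parameters:

	virtio_ccw.no_auto=0.0.0815,0.0.1000-0.0.10ff     (kernel command line, driver built in)
	modprobe virtio_ccw no_auto=0.0.0815,0.0.1000-0.0.10ff     (driver built as a module)

Devices matching any listed bus id keep their handler but are not set online automatically by virtio_ccw_probe().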
Add a driver for kvm guests that matches virtual ccw devices provided
by the host as virtio bridge devices.

These virtio-ccw devices use a special set of channel commands in order
to perform virtio functions.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
---
 arch/s390/include/asm/irq.h   |   1 +
 arch/s390/kernel/irq.c        |   1 +
 drivers/s390/kvm/Makefile     |   2 +-
 drivers/s390/kvm/virtio_ccw.c | 761 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 764 insertions(+), 1 deletion(-)
 create mode 100644 drivers/s390/kvm/virtio_ccw.c