
[V5,4/6,net-next] vhost: vhost TX zero-copy support

Message ID 1305574484.3456.30.camel@localhost.localdomain (mailing list archive)
State New, archived

Commit Message

Shirley Ma May 16, 2011, 7:34 p.m. UTC
This patch maintains the outstanding userspace buffers in the
sequence they are delivered to vhost. An outstanding userspace buffer
is marked as done once the lower device has finished DMA on it, which
is detected through the last-reference kfree_skb callback. Two buffer
indices are used for this purpose.

vhost passes the userspace buffer info to the lower device's skb
through message control. There may already be completed DMAs when
entering vhost handle_tx; in the worst case all buffers in the vq are
in pending/done status, so we need to notify the guest to release the
DMA-done buffers first before getting any new buffers from the vq.

Signed-off-by: Shirley <xma@us.ibm.com>
---

 drivers/vhost/net.c   |   39 +++++++++++++++++++++++++++++++-
 drivers/vhost/vhost.c |   58 +++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/vhost/vhost.h |   12 ++++++++++
 3 files changed, 107 insertions(+), 2 deletions(-)
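
The bookkeeping described above can be pictured as a ring of per-buffer
records walked by two cursors. A minimal standalone sketch of the idea
(hypothetical names; the patch itself keeps the records in vq->heads[],
sized UIO_MAXIOV, and reports completions with vhost_add_used_n()):

#define ZC_RING_SIZE	1024	/* stands in for UIO_MAXIOV */
#define ZC_DMA_DONE	1	/* stands in for VHOST_DMA_DONE_LEN */

struct zc_record {
	unsigned int id;	/* head returned by vhost_get_vq_desc() */
	unsigned int len;	/* set to ZC_DMA_DONE by the skb callback */
};

struct zc_ring {
	struct zc_record rec[ZC_RING_SIZE];
	int upend_idx;		/* where the next submitted buffer goes */
	int done_idx;		/* first buffer not yet reported as used */
};

/* Submission path: remember each head in avail order. */
static void zc_submit(struct zc_ring *r, unsigned int head, unsigned int len)
{
	r->rec[r->upend_idx].id  = head;
	r->rec[r->upend_idx].len = len;
	r->upend_idx = (r->upend_idx + 1) % ZC_RING_SIZE;
}

/* Completion path: walk forward from done_idx and stop at the first
 * record whose DMA has not finished, so used entries are reported in
 * exactly the order the buffers were submitted. */
static int zc_flush(struct zc_ring *r)
{
	int n = 0;

	while (r->done_idx != r->upend_idx &&
	       r->rec[r->done_idx].len == ZC_DMA_DONE) {
		r->rec[r->done_idx].len = 0;	/* real code: add to used ring */
		r->done_idx = (r->done_idx + 1) % ZC_RING_SIZE;
		n++;
	}
	return n;	/* real code: vhost_signal() and refcnt -= n if n > 0 */
}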




Comments

Michael S. Tsirkin May 16, 2011, 8:45 p.m. UTC | #1
> +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> + * DMA done not in order in lower device driver for some reason. To prevent
> + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> + * can move to done_idx.

Could you clarify this please? virtio explicitly allows out of order
completion of requests. Does it simplify code that we try to keep
used index updates in-order? Because if not, this is not
really a requirement.
Shirley Ma May 16, 2011, 8:56 p.m. UTC | #2
On Mon, 2011-05-16 at 23:45 +0300, Michael S. Tsirkin wrote:
> > +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> > + * DMA done not in order in lower device driver for some reason. To prevent
> > + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> > + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> > + * can move to done_idx.
> 
> Could you clarify this please? virtio explicitly allows out of order
> completion of requests. Does it simplify code that we try to keep
> used index updates in-order? Because if not, this is not
> really a requirement.

Hello Mike,

Based on my testing, vhost_add_used() must be in order from
vhost_get_vq_desc(). Otherwise, the virtio_net ring seems to get
double freed. I didn't spend time on debugging this.

in virtqueue_get_buf

        if (unlikely(!vq->data[i])) {
                BAD_RING(vq, "id %u is not a head!\n", i);
                return NULL;
        }

That's the reason I created the upend_idx and done_idx.

Thanks
Shirley

Michael S. Tsirkin May 16, 2011, 9:24 p.m. UTC | #3
On Mon, May 16, 2011 at 01:56:54PM -0700, Shirley Ma wrote:
> On Mon, 2011-05-16 at 23:45 +0300, Michael S. Tsirkin wrote:
> > > +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> > > + * DMA done not in order in lower device driver for some reason. To prevent
> > > + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> > > + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> > > + * can move to done_idx.
> > 
> > Could you clarify this please? virtio explicitly allows out of order
> > completion of requests. Does it simplify code that we try to keep
> > used index updates in-order? Because if not, this is not
> > really a requirement.
> 
> Hello Mike,
> 
> Based on my testing, vhost_add_used() must be in order from
> vhost_get_vq_desc(). Otherwise, virtio_net ring seems get double
> freed.

Double-freed or you get NULL below?

> I didn't spend time on debugging this.
> 
> in virtqueue_get_buf
> 
>         if (unlikely(!vq->data[i])) {
>                 BAD_RING(vq, "id %u is not a head!\n", i);
>                 return NULL;
>         }

Yes but i used here is the head that we read from the
ring, not the ring index itself.
	i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id
we must complete any id only once, but in any order.

> That's the reason I created the upend_idx and done_idx.
> 
> Thanks
> Shirley

Very strange, it sounds like a bug, but I can't tell where: in
host or in guest. If it's in the guest, we must fix it.
If in host, we should only fix it if it makes life simpler for us.
Could you try to nail it down pls?  Another question: will code get
simpler or more complex if that restriction's removed?
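
The guest-side check quoted earlier boils down to: read the id from the
used ring, look it up in vq->data[], and clear that slot on the first
completion, so ids may complete in any order but each id only once. A
tiny standalone model of that rule (made-up names, not the kernel code):

#include <stdio.h>

#define RING_NUM 8

static void *data[RING_NUM];		/* stands in for vq->data[] */

/* The used-ring entry names the completed chain by id; any order is
 * fine, but the slot is cleared on first use, so a repeated id fails
 * the same way virtqueue_get_buf()'s "is not a head" check does. */
static int consume_used(unsigned int id)
{
	if (id >= RING_NUM || !data[id]) {
		printf("BAD_RING: id %u is not a head!\n", id);
		return -1;
	}
	data[id] = NULL;		/* detach_buf() clears the slot */
	return 0;
}

int main(void)
{
	static int bufs[3];

	data[0] = &bufs[0];
	data[1] = &bufs[1];
	data[2] = &bufs[2];

	consume_used(2);		/* out of order: fine */
	consume_used(0);		/* fine */
	consume_used(2);		/* same id twice: trips the check */
	return 0;
}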
Michael S. Tsirkin May 16, 2011, 9:27 p.m. UTC | #4
On Mon, May 16, 2011 at 01:56:54PM -0700, Shirley Ma wrote:
> On Mon, 2011-05-16 at 23:45 +0300, Michael S. Tsirkin wrote:
> > > +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> > > + * DMA done not in order in lower device driver for some reason. To prevent
> > > + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> > > + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> > > + * can move to done_idx.
> > 
> > Could you clarify this please? virtio explicitly allows out of order
> > completion of requests. Does it simplify code that we try to keep
> > used index updates in-order? Because if not, this is not
> > really a requirement.
> 
> Hello Mike,
> 
> Based on my testing, vhost_add_used() must be in order from
> vhost_get_vq_desc(). Otherwise, virtio_net ring seems get double
> freed. I didn't spend time on debugging this.
> 
> in virtqueue_get_buf
> 
>         if (unlikely(!vq->data[i])) {
>                 BAD_RING(vq, "id %u is not a head!\n", i);
>                 return NULL;
>         }
> 
> That's the reason I created the upend_idx and done_idx.
> 
> Thanks
> Shirley

One thing of note: it's possible that this actually works
better than trying to complete out of order, as the
ring just keeps going which should be good for cache
utilization. OTOH, this might explain why
you are over-running the TX ring much more with this patch.

So I'm not saying this should block merging the patch,
but I would very much like to understand the issue,
and it would be interesting to experiment with fixing it
and seeing what it does to performance and to code size.
Shirley Ma May 16, 2011, 9:30 p.m. UTC | #5
On Tue, 2011-05-17 at 00:24 +0300, Michael S. Tsirkin wrote:
> Very strange, it sounds like a bug, but I can't tell where: in
> host or in guest. If it's in the guest, we must fix it.
> If in host, we should only fix it if it makes life simpler for us.
> Could you try to nail it down pls?  Another question: will code get
> simpler or more complex if that restriction's removed? 

It should be similar. We still need to maintain the pending list and
mark the DMA-done ids.

I can try to narrow this issue down.

Thanks
Shirley

Shirley Ma May 17, 2011, 4:31 a.m. UTC | #6
On Tue, 2011-05-17 at 00:24 +0300, Michael S. Tsirkin wrote:
> On Mon, May 16, 2011 at 01:56:54PM -0700, Shirley Ma wrote:
> > On Mon, 2011-05-16 at 23:45 +0300, Michael S. Tsirkin wrote:
> > > > +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> > > > + * DMA done not in order in lower device driver for some reason. To prevent
> > > > + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> > > > + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> > > > + * can move to done_idx.
> > > 
> > > Could you clarify this please? virtio explicitly allows out of order
> > > completion of requests. Does it simplify code that we try to keep
> > > used index updates in-order? Because if not, this is not
> > > really a requirement.
> > 
> > Hello Mike,
> > 
> > Based on my testing, vhost_add_used() must be in order from
> > vhost_get_vq_desc(). Otherwise, virtio_net ring seems get double
> > freed.
> 
> Double-freed or you get NULL below?

More likely it's NULL.

> > I didn't spend time on debugging this.
> > 
> > in virtqueue_get_buf
> > 
> >         if (unlikely(!vq->data[i])) {
> >                 BAD_RING(vq, "id %u is not a head!\n", i);
> >                 return NULL;
> >         }
> 
> Yes but i used here is the head that we read from the
> ring, not the ring index itself.
>         i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id
> we must complete any id only once, but in any order.

It is in any order of ring id, but it must be in the order in which the
heads return from vhost_get_vq_desc(). Any clue?


Thanks
Shirley

Michael S. Tsirkin May 17, 2011, 5:55 a.m. UTC | #7
On Mon, May 16, 2011 at 09:31:23PM -0700, Shirley Ma wrote:
> On Tue, 2011-05-17 at 00:24 +0300, Michael S. Tsirkin wrote:
> > On Mon, May 16, 2011 at 01:56:54PM -0700, Shirley Ma wrote:
> > > On Mon, 2011-05-16 at 23:45 +0300, Michael S. Tsirkin wrote:
> > > > > +/* Since we need to keep the order of used_idx as avail_idx, it's possible that
> > > > > + * DMA done not in order in lower device driver for some reason. To prevent
> > > > > + * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
> > > > > + * is used to track used_idx order. Once lower device DMA done, then upend_idx
> > > > > + * can move to done_idx.
> > > > 
> > > > Could you clarify this please? virtio explicitly allows out of order
> > > > completion of requests. Does it simplify code that we try to keep
> > > > used index updates in-order? Because if not, this is not
> > > > really a requirement.
> > > 
> > > Hello Mike,
> > > 
> > > Based on my testing, vhost_add_used() must be in order from
> > > vhost_get_vq_desc(). Otherwise, virtio_net ring seems get double
> > > freed.
> > 
> > Double-freed or you get NULL below?
> 
> More likely it's NULL.
> 
> > > I didn't spend time on debugging this.
> > > 
> > > in virtqueue_get_buf
> > > 
> > >         if (unlikely(!vq->data[i])) {
> > >                 BAD_RING(vq, "id %u is not a head!\n", i);
> > >                 return NULL;
> > >         }
> > 
> > Yes but i used here is the head that we read from the
> > ring, not the ring index itself.
> >         i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id
> > we must complete any id only once, but in any order.
> 
> It is in any order of ring id, but it must be in the order in which the
> heads return from vhost_get_vq_desc(). Any clue?
> 
> 
> Thanks
> Shirley


Something in your patch that overwrites the id in vhost
and makes it put the wrong id in the used ring?

By the way, need to keep in mind that a guest can
give us the same head twice, need to make sure this
at least does not corrupt host memory.
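
For reference, a guest-supplied head is already range-checked before
vhost trusts it as an index; vhost_get_vq_desc() does roughly the
following (paraphrased, not a verbatim copy of drivers/vhost/vhost.c).
A duplicated head in this patch only ends up stored in
vq->heads[upend_idx].id, indexed by the vhost-controlled upend_idx, so
by itself it should not be able to index outside host memory:

	/* head was read from guest memory, validate before use */
	if (unlikely(head >= vq->num)) {
		vq_err(vq, "Guest says index %u > %u is available",
		       head, vq->num);
		return -EINVAL;
	}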
Shirley Ma May 17, 2011, 3:22 p.m. UTC | #8
On Tue, 2011-05-17 at 08:55 +0300, Michael S. Tsirkin wrote:
> Something in your patch that overwrites the id in vhost
> and makes it put the wrong id in the used ring?
> 
> By the way, need to keep in mind that a guest can
> give us the same head twice, need to make sure this
> at least does not corrupt host memory.

I think I didn't explain the problem very well here.

This patch doesn't overwrite the id. It just passes the heads to
vhost_add_used() in the same sequence they are returned from
vhost_get_vq_desc().

The same ids can be used many times once they are passed back to the
guest by vhost_add_used(). There is no problem there; the zero-copy
patch doesn't have any issue with that.

The problem is that the heads returned from vhost_get_vq_desc() must be
passed to vhost_add_used() in the same sequence.

The original code has no problem because it gets one head at a time and
passes that head to vhost_add_used() as soon as the copy is done, so it
is always in sequence.

This issue can easily be recreated without the zerocopy patch by simply
changing the order of the heads returned from vhost_get_vq_desc() when
passing them to vhost_add_used().

Thanks
Shirley
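
One way to picture the experiment is sketched below. This is
illustrative only, not the reproducer patch Shirley goes on to post: in
handle_tx(), hold back every other head and complete each pair of heads
in reverse order, so the used order no longer matches the avail order.

		/* Illustrative only: in place of the single in-order call
		 *     vhost_add_used_and_signal(&net->dev, vq, head, 0);
		 * buffer one head and complete each pair swapped. */
		static int saved_head = -1;

		if (saved_head < 0) {
			saved_head = head;
		} else {
			vhost_add_used_and_signal(&net->dev, vq, head, 0);
			vhost_add_used_and_signal(&net->dev, vq,
						  saved_head, 0);
			saved_head = -1;
		}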

Michael S. Tsirkin May 17, 2011, 3:28 p.m. UTC | #9
On Tue, May 17, 2011 at 08:22:14AM -0700, Shirley Ma wrote:
> On Tue, 2011-05-17 at 08:55 +0300, Michael S. Tsirkin wrote:
> > Something in your patch that overwrites the id in vhost
> > and makes it put the wrong id in the used ring?
> > 
> > By the way, need to keep in mind that a guest can
> > give us the same head twice, need to make sure this
> > at least does not corrupt host memory.
> 
> I think I didn't explain the problem very well here.
> 
> This patch doesn't overwrite the id. It just passes the heads to
> vhost_add_used() in the same sequence they are returned from
> vhost_get_vq_desc().
> 
> The same ids can be used many times once they are passed back to the
> guest by vhost_add_used(). There is no problem there; the zero-copy
> patch doesn't have any issue with that.
> 
> The problem is that the heads returned from vhost_get_vq_desc() must be
> passed to vhost_add_used() in the same sequence.

Which is the order the descriptors are put on avail ring.
By design, guest should not depend on used ring entries
being in order with avail ring (and btw with virtio block,
they aren't). If it does, it's a bug I think.

> The original code has no problem because it gets one head at a time and
> passes that head to vhost_add_used() as soon as the copy is done, so it
> is always in sequence.
> 
> This issue can easily be recreated without the zerocopy patch by simply
> changing the order of the heads returned from vhost_get_vq_desc() when
> passing them to vhost_add_used().
> 
> Thanks
> Shirley

Ah, did you try that? Could you post this patch pls?
This seems to imply a bug in guest virtio.
Shirley Ma May 17, 2011, 3:34 p.m. UTC | #10
On Tue, 2011-05-17 at 18:28 +0300, Michael S. Tsirkin wrote:
> Which is the order the descriptors are put on avail ring.
> By design, guest should not depend on used ring entries
> being in order with avail ring (and btw with virtio block,
> they aren't). If it does, it's a bug I think.

OK, I thought the order had to be maintained.

> > The original code has no problem because it gets one head at a time
> > and passes that head to vhost_add_used() as soon as the copy is done,
> > so it is always in sequence.
> > 
> > This issue can easily be recreated without the zerocopy patch by
> > simply changing the order of the heads returned from
> > vhost_get_vq_desc() when passing them to vhost_add_used().
> > 
> > Thanks
> > Shirley
> 
> Ah, did you try that? Could you post this patch pls?
> This seems to imply a bug in guest virtio. 

I am creating the patch against net-next for you to test today.

Thanks
Shirley


Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2f7c76a..c964000 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -32,6 +32,10 @@ 
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_NET_WEIGHT 0x80000
 
+/* MAX number of TX used buffers for outstanding zerocopy */
+#define VHOST_MAX_PEND 128
+#define VHOST_GOODCOPY_LEN 256
+
 enum {
 	VHOST_NET_VQ_RX = 0,
 	VHOST_NET_VQ_TX = 1,
@@ -129,6 +133,7 @@  static void handle_tx(struct vhost_net *net)
 	int err, wmem;
 	size_t hdr_size;
 	struct socket *sock;
+	struct skb_ubuf_info pend;
 
 	/* TODO: check that we are running from vhost_worker? */
 	sock = rcu_dereference_check(vq->private_data, 1);
@@ -151,6 +156,10 @@  static void handle_tx(struct vhost_net *net)
 	hdr_size = vq->vhost_hlen;
 
 	for (;;) {
+		/* Release DMAs done buffers first */
+		if (atomic_read(&vq->refcnt) > VHOST_MAX_PEND)
+			vhost_zerocopy_signal_used(vq, false);
+
 		head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
 					 ARRAY_SIZE(vq->iov),
 					 &out, &in,
@@ -166,6 +175,12 @@  static void handle_tx(struct vhost_net *net)
 				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
 				break;
 			}
+			/* If more outstanding DMAs, queue the work */
+			if (atomic_read(&vq->refcnt) > VHOST_MAX_PEND) {
+				tx_poll_start(net, sock);
+				set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
+				break;
+			}
 			if (unlikely(vhost_enable_notify(vq))) {
 				vhost_disable_notify(vq);
 				continue;
@@ -188,17 +203,37 @@  static void handle_tx(struct vhost_net *net)
 			       iov_length(vq->hdr, s), hdr_size);
 			break;
 		}
+		/* use msg_control to pass vhost zerocopy ubuf info to skb */
+		if (sock_flag(sock->sk, SOCK_ZEROCOPY)) {
+			vq->heads[vq->upend_idx].id = head;
+			if (len <= VHOST_GOODCOPY_LEN)
+				/* short copies need not wait for DMA done */
+				vq->heads[vq->upend_idx].len =
+							VHOST_DMA_DONE_LEN;
+			else {
+				vq->heads[vq->upend_idx].len = len;
+				pend.callback = vhost_zerocopy_callback;
+				pend.arg = vq;
+				pend.desc = vq->upend_idx;
+				msg.msg_control = &pend;
+				msg.msg_controllen = sizeof(pend);
+			}
+			atomic_inc(&vq->refcnt);
+			vq->upend_idx = (vq->upend_idx + 1) % UIO_MAXIOV;
+		}
 		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(NULL, sock, &msg, len);
 		if (unlikely(err < 0)) {
-			vhost_discard_vq_desc(vq, 1);
+			if (!sock_flag(sock->sk, SOCK_ZEROCOPY))
+				vhost_discard_vq_desc(vq, 1);
 			tx_poll_start(net, sock);
 			break;
 		}
 		if (err != len)
 			pr_debug("Truncated TX packet: "
 				 " len %d != %zd\n", err, len);
-		vhost_add_used_and_signal(&net->dev, vq, head, 0);
+		if (!sock_flag(sock->sk, SOCK_ZEROCOPY))
+			vhost_add_used_and_signal(&net->dev, vq, head, 0);
 		total_len += len;
 		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
 			vhost_poll_queue(&vq->poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2ab2912..1a40310 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -174,6 +174,9 @@  static void vhost_vq_reset(struct vhost_dev *dev,
 	vq->call_ctx = NULL;
 	vq->call = NULL;
 	vq->log_ctx = NULL;
+	vq->upend_idx = 0;
+	vq->done_idx = 0;
+	atomic_set(&vq->refcnt, 0);
 }
 
 static int vhost_worker(void *data)
@@ -385,16 +388,62 @@  long vhost_dev_reset_owner(struct vhost_dev *dev)
 	return 0;
 }
 
+/* Since we need to keep the order of used_idx as avail_idx, it's possible that
+ * DMA done not in order in lower device driver for some reason. To prevent
+ * used_idx out of order, upend_idx is used to track avail_idx order, done_idx
+ * is used to track used_idx order. Once lower device DMA done, then upend_idx
+ * can move to done_idx.
+ */
+void vhost_zerocopy_signal_used(struct vhost_virtqueue *vq, bool shutdown)
+{
+	int i, j = 0;
+
+	i = vq->done_idx;
+	while (i != vq->upend_idx) {
+		if ((vq->heads[i].len == VHOST_DMA_DONE_LEN) || shutdown) {
+			/* reset len = 0 */
+			vq->heads[i].len = 0;
+			i = (i + 1) % UIO_MAXIOV;
+			++j;
+		} else
+			break;
+	}
+	if (j) {
+		/* heads is a circular array; i is the new done_idx. If i
+		 * wrapped below the old done_idx, the completed range is not
+		 * contiguous and needs two vhost_add_used_n() calls */
+		if (i > vq->done_idx)
+			vhost_add_used_n(vq, &vq->heads[vq->done_idx], j);
+		else {
+			vhost_add_used_n(vq, &vq->heads[vq->done_idx],
+					 UIO_MAXIOV - vq->done_idx);
+			vhost_add_used_n(vq, vq->heads, i);
+		}
+		vq->done_idx = i;
+		vhost_signal(vq->dev, vq);
+		atomic_sub(j, &vq->refcnt);
+	}
+}
+
 /* Caller should have device mutex */
 void vhost_dev_cleanup(struct vhost_dev *dev)
 {
 	int i;
+	unsigned long begin = jiffies;
 
 	for (i = 0; i < dev->nvqs; ++i) {
 		if (dev->vqs[i].kick && dev->vqs[i].handle_kick) {
 			vhost_poll_stop(&dev->vqs[i].poll);
 			vhost_poll_flush(&dev->vqs[i].poll);
 		}
+		/* Wait for all lower device DMAs to finish, then notify
+		 * virtio_net. Or should we notify without waiting here, in
+		 * case some DMAs never complete for some reason? */
+		if (atomic_read(&dev->vqs[i].refcnt)) {
+			/* how long should we wait ? */
+			msleep(1000);
+			vhost_zerocopy_signal_used(&dev->vqs[i], true);
+		}
 		if (dev->vqs[i].error_ctx)
 			eventfd_ctx_put(dev->vqs[i].error_ctx);
 		if (dev->vqs[i].error)
@@ -1416,3 +1465,12 @@  void vhost_disable_notify(struct vhost_virtqueue *vq)
 		vq_err(vq, "Failed to enable notification at %p: %d\n",
 		       &vq->used->flags, r);
 }
+
+void vhost_zerocopy_callback(struct sk_buff *skb)
+{
+	int idx = skb_shinfo(skb)->ubuf.desc;
+	struct vhost_virtqueue *vq = skb_shinfo(skb)->ubuf.arg;
+
+	/* mark this desc's buffers as DMA done (len = VHOST_DMA_DONE_LEN) */
+	vq->heads[idx].len = VHOST_DMA_DONE_LEN;
+}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index b3363ae..8e3ecc7 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -13,6 +13,10 @@ 
 #include <linux/virtio_ring.h>
 #include <asm/atomic.h>
 
+/* This is for zerocopy: the used buffer len is set to this value once the
+ * lower device DMA is done */
+#define VHOST_DMA_DONE_LEN	1
+
 struct vhost_device;
 
 struct vhost_work;
@@ -108,6 +112,12 @@  struct vhost_virtqueue {
 	/* Log write descriptors */
 	void __user *log_base;
 	struct vhost_log *log;
+	/* vhost zerocopy support */
+	atomic_t refcnt; /* num of outstanding zerocopy DMAs */
+	/* copy of avail idx to monitor outstanding DMA zerocopy buffers */
+	int upend_idx;
+	/* copy of used idx to monitor DMA-done zerocopy buffers */
+	int done_idx;
 };
 
 struct vhost_dev {
@@ -154,6 +164,8 @@  bool vhost_enable_notify(struct vhost_virtqueue *);
 
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 		    unsigned int log_num, u64 len);
+void vhost_zerocopy_callback(struct sk_buff *skb);
+void vhost_zerocopy_signal_used(struct vhost_virtqueue *vq, bool shutdown);
 
 #define vq_err(vq, fmt, ...) do {                                  \
 		pr_debug(pr_fmt(fmt), ##__VA_ARGS__);       \
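
For orientation, the device side that the msg_control plumbing above
relies on would look roughly as follows in the lower device's sendmsg
path. This is a sketch built only from the skb_ubuf_info fields the
patch itself uses (callback, arg, desc); the actual macvtap/tun changes
belong to other patches in the series and may differ.

	struct skb_ubuf_info *ubuf = msg->msg_control;

	if (ubuf && msg->msg_controllen == sizeof(*ubuf)) {
		/* remember whom to notify once the guest pages backing
		 * this skb are no longer referenced by the device */
		skb_shinfo(skb)->ubuf.callback = ubuf->callback;
		skb_shinfo(skb)->ubuf.arg      = ubuf->arg;
		skb_shinfo(skb)->ubuf.desc     = ubuf->desc;
	}

	/* Later, when the last reference to the skb is dropped, the
	 * callback fires; for vhost that is vhost_zerocopy_callback(),
	 * which sets vq->heads[desc].len = VHOST_DMA_DONE_LEN so a later
	 * handle_tx() pass can report the buffer as used. */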