From patchwork Thu Jun 25 13:57:50 2020
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 11625405
From: Stefan Hajnoczi
To: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org, "Michael S. Tsirkin", Jason Wang
Subject: [RFC 1/3] virtio-pci: use NUMA-aware memory allocation in probe
Date: Thu, 25 Jun 2020 14:57:50 +0100
Message-Id: <20200625135752.227293-2-stefanha@redhat.com>
In-Reply-To: <20200625135752.227293-1-stefanha@redhat.com>
References: <20200625135752.227293-1-stefanha@redhat.com>

Allocate frequently-accessed data structures from the NUMA node
associated with this virtio-pci device. This avoids slow cross-NUMA
node memory accesses.

Only memory allocations that meet both of the following criteria are
made NUMA-aware (a sketch of the resulting allocation pattern follows
the list):

1. The allocation is made during probe. If it happened in the data path
   then hopefully we would be executing on a CPU in the same NUMA node
   as the device. If the CPU were not in the right NUMA node, it is
   unclear whether forcing memory allocations to use the device's NUMA
   node would increase or decrease performance.

2. The memory will be frequently accessed from the data path. There is
   no need to worry about data that is not accessed from
   performance-critical code paths.
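
As an illustration, the pattern applied throughout this series boils
down to the following sketch (placeholder names such as struct foo and
foo_alloc(); not code taken verbatim from the patch below):

	/*
	 * Look up the device's NUMA node once during probe and allocate
	 * frequently-accessed per-device state on that node.  When
	 * dev_to_node() returns NUMA_NO_NODE, kmalloc_node()/kzalloc_node()
	 * behave like plain kmalloc()/kzalloc() with no node preference.
	 */
	static struct foo *foo_alloc(struct virtio_device *vdev)
	{
		int node = dev_to_node(&vdev->dev);

		return kzalloc_node(sizeof(struct foo), GFP_KERNEL, node);
	}
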
Signed-off-by: Stefan Hajnoczi
---
 drivers/virtio/virtio_pci_common.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 222d630c41fc..cc6e49f9c698 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -178,11 +178,13 @@ static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
 				     u16 msix_vec)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-	struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL);
+	int node = dev_to_node(&vdev->dev);
+	struct virtio_pci_vq_info *info;
 	struct virtqueue *vq;
 	unsigned long flags;
 
 	/* fill out our structure that represents an active queue */
+	info = kmalloc_node(sizeof *info, GFP_KERNEL, node);
 	if (!info)
 		return ERR_PTR(-ENOMEM);
 
@@ -283,10 +285,12 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
 		struct irq_affinity *desc)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	int node = dev_to_node(&vdev->dev);
 	u16 msix_vec;
 	int i, err, nvectors, allocated_vectors, queue_idx = 0;
 
-	vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
+	vp_dev->vqs = kcalloc_node(nvqs, sizeof(*vp_dev->vqs),
+				   GFP_KERNEL, node);
 	if (!vp_dev->vqs)
 		return -ENOMEM;
 
@@ -355,9 +359,11 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
 		const char * const names[], const bool *ctx)
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	int node = dev_to_node(&vdev->dev);
 	int i, err, queue_idx = 0;
 
-	vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
+	vp_dev->vqs = kcalloc_node(nvqs, sizeof(*vp_dev->vqs),
+				   GFP_KERNEL, node);
 	if (!vp_dev->vqs)
 		return -ENOMEM;
 
@@ -513,10 +519,12 @@ static int virtio_pci_probe(struct pci_dev *pci_dev,
 			    const struct pci_device_id *id)
 {
 	struct virtio_pci_device *vp_dev, *reg_dev = NULL;
+	int node = dev_to_node(&pci_dev->dev);
 	int rc;
 
 	/* allocate our structure and fill it out */
-	vp_dev = kzalloc(sizeof(struct virtio_pci_device), GFP_KERNEL);
+	vp_dev = kzalloc_node(sizeof(struct virtio_pci_device),
+			      GFP_KERNEL, node);
 	if (!vp_dev)
 		return -ENOMEM;

From patchwork Thu Jun 25 13:57:51 2020
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 11625409
From: Stefan Hajnoczi
To: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org, "Michael S. Tsirkin", Jason Wang
Subject: [RFC 2/3] virtio_ring: use NUMA-aware memory allocation in probe
Date: Thu, 25 Jun 2020 14:57:51 +0100
Message-Id: <20200625135752.227293-3-stefanha@redhat.com>
In-Reply-To: <20200625135752.227293-1-stefanha@redhat.com>
References: <20200625135752.227293-1-stefanha@redhat.com>

Allocate frequently-accessed data structures from the NUMA node
associated with this device to avoid slow cross-NUMA node memory
accesses.

Only memory allocations that meet both of the following criteria are
made NUMA-aware:

1. The allocation is made during probe. If it happened in the data path
   then hopefully we would be executing on a CPU in the same NUMA node
   as the device. If the CPU were not in the right NUMA node, it is
   unclear whether forcing memory allocations to use the device's NUMA
   node would increase or decrease performance.

2. The memory will be frequently accessed from the data path. There is
   no need to worry about data that is not accessed from
   performance-critical code paths.

This patch adds an alloc_pages_exact_nid() caller that is not
__meminit, so I've removed the __meminit annotation added by commit
e19318116048 ("mm/page_alloc.c: add __meminit to
alloc_pages_exact_nid()").

Cc: Fabian Frederick
Cc: Andrew Morton
Cc: Mel Gorman
Signed-off-by: Stefan Hajnoczi
---
I have included the alloc_pages_exact_nid() __meminit removal in this
patch to provide context for reviewers.
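
To make the ring change easier to review, the non-DMA-API branch that
this patch touches reduces to the following condensed sketch (variable
names as in vring_alloc_queue(); not a literal copy of the diff below):

	int node = dev_to_node(&vdev->dev);
	void *queue = alloc_pages_exact_nid(node, PAGE_ALIGN(size), flag);

alloc_pages_exact_nid() prefers the requested node (it may fall back to
other nodes unless the gfp flags forbid that) and, like
alloc_pages_exact(), the memory it returns must be freed with
free_pages_exact().
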
---
 include/linux/gfp.h          |  2 +-
 drivers/virtio/virtio_ring.c | 26 +++++++++++++++++---------
 mm/page_alloc.c              |  2 +-
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4aba4c86c626..9b69df707c7a 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -563,7 +563,7 @@ extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
 void free_pages_exact(void *virt, size_t size);
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
+void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
 
 #define __get_free_page(gfp_mask) \
 		__get_free_pages((gfp_mask), 0)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 58b96baa8d48..d06b42309bed 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -276,7 +276,9 @@ static void *vring_alloc_queue(struct virtio_device *vdev, size_t size,
 		return dma_alloc_coherent(vdev->dev.parent, size,
 					  dma_handle, flag);
 	} else {
-		void *queue = alloc_pages_exact(PAGE_ALIGN(size), flag);
+		int node = dev_to_node(&vdev->dev);
+		void *queue = alloc_pages_exact_nid(node, PAGE_ALIGN(size),
+						    flag);
 
 		if (queue) {
 			phys_addr_t phys_addr = virt_to_phys(queue);
@@ -1567,6 +1569,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	struct vring_packed_desc_event *driver, *device;
 	dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
 	size_t ring_size_in_bytes, event_size_in_bytes;
+	int node = dev_to_node(&vdev->dev);
 	unsigned int i;
 
 	ring_size_in_bytes = num * sizeof(struct vring_packed_desc);
@@ -1591,7 +1594,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	if (!device)
 		goto err_device;
 
-	vq = kmalloc(sizeof(*vq), GFP_KERNEL);
+	vq = kmalloc_node(sizeof(*vq), GFP_KERNEL, node);
 	if (!vq)
 		goto err_vq;
 
@@ -1639,9 +1642,10 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->packed.event_flags_shadow = 0;
 	vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
 
-	vq->packed.desc_state = kmalloc_array(num,
+	vq->packed.desc_state = kmalloc_array_node(num,
 			sizeof(struct vring_desc_state_packed),
-			GFP_KERNEL);
+			GFP_KERNEL,
+			node);
 	if (!vq->packed.desc_state)
 		goto err_desc_state;
 
@@ -1653,9 +1657,10 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	for (i = 0; i < num-1; i++)
 		vq->packed.desc_state[i].next = i + 1;
 
-	vq->packed.desc_extra = kmalloc_array(num,
+	vq->packed.desc_extra = kmalloc_array_node(num,
 			sizeof(struct vring_desc_extra_packed),
-			GFP_KERNEL);
+			GFP_KERNEL,
+			node);
 	if (!vq->packed.desc_extra)
 		goto err_desc_extra;
 
@@ -2059,13 +2064,14 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 					void (*callback)(struct virtqueue *),
 					const char *name)
 {
+	int node = dev_to_node(&vdev->dev);
 	unsigned int i;
 	struct vring_virtqueue *vq;
 
 	if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED))
 		return NULL;
 
-	vq = kmalloc(sizeof(*vq), GFP_KERNEL);
+	vq = kmalloc_node(sizeof(*vq), GFP_KERNEL, node);
 	if (!vq)
 		return NULL;
 
@@ -2110,8 +2116,10 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 					vq->split.avail_flags_shadow);
 	}
 
-	vq->split.desc_state = kmalloc_array(vring.num,
-			sizeof(struct vring_desc_state_split), GFP_KERNEL);
+	vq->split.desc_state = kmalloc_array_node(vring.num,
+			sizeof(struct vring_desc_state_split),
+			GFP_KERNEL,
+			node);
 	if (!vq->split.desc_state) {
 		kfree(vq);
 		return NULL;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 13cc653122b7..2216022d8987 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5053,7 +5053,7 @@ EXPORT_SYMBOL(alloc_pages_exact);
  *
  * Return: pointer to the allocated area or %NULL in case of error.
  */
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
+void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask)
 {
 	unsigned int order = get_order(size);
 	struct page *p;

From patchwork Thu Jun 25 13:57:52 2020
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 11625407
From: Stefan Hajnoczi
To: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org, "Michael S. Tsirkin", Jason Wang
Subject: [RFC 3/3] virtio-blk: use NUMA-aware memory allocation in probe
Date: Thu, 25 Jun 2020 14:57:52 +0100
Message-Id: <20200625135752.227293-4-stefanha@redhat.com>
In-Reply-To: <20200625135752.227293-1-stefanha@redhat.com>
References: <20200625135752.227293-1-stefanha@redhat.com>

Allocate frequently-accessed data structures from the NUMA node
associated with this device to avoid slow cross-NUMA node memory
accesses.

Only memory allocations that meet both of the following criteria are
made NUMA-aware:

1. The allocation is made during probe. If it happened in the data path
   then hopefully we would be executing on a CPU in the same NUMA node
   as the device.
   If the CPU were not in the right NUMA node, it is unclear whether
   forcing memory allocations to use the device's NUMA node would
   increase or decrease performance.

2. The memory will be frequently accessed from the data path. There is
   no need to worry about data that is not accessed from
   performance-critical code paths.

Signed-off-by: Stefan Hajnoczi
---
 drivers/block/virtio_blk.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 9d21bf0f155e..40845e9ad3b1 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -482,6 +482,7 @@ static int init_vq(struct virtio_blk *vblk)
 	unsigned short num_vqs;
 	struct virtio_device *vdev = vblk->vdev;
 	struct irq_affinity desc = { 0, };
+	int node = dev_to_node(&vdev->dev);
 
 	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
 				   struct virtio_blk_config, num_queues,
@@ -491,7 +492,8 @@ static int init_vq(struct virtio_blk *vblk)
 
 	num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
 
-	vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
+	vblk->vqs = kmalloc_array_node(num_vqs, sizeof(*vblk->vqs),
+				       GFP_KERNEL, node);
 	if (!vblk->vqs)
 		return -ENOMEM;
 
@@ -683,6 +685,7 @@ module_param_named(queue_depth, virtblk_queue_depth, uint, 0444);
 
 static int virtblk_probe(struct virtio_device *vdev)
 {
+	int node = dev_to_node(&vdev->dev);
 	struct virtio_blk *vblk;
 	struct request_queue *q;
 	int err, index;
@@ -714,7 +717,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 
 	/* We need an extra sg elements at head and tail. */
 	sg_elems += 2;
-	vdev->priv = vblk = kmalloc(sizeof(*vblk), GFP_KERNEL);
+	vdev->priv = vblk = kmalloc_node(sizeof(*vblk), GFP_KERNEL, node);
 	if (!vblk) {
 		err = -ENOMEM;
 		goto out_free_index;
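
A usage note on the probe-time node lookup (assuming a virtio-blk
device behind a PCI transport; the bus address below is made up): the
node seen by dev_to_node() can be cross-checked from userspace via the
parent PCI device's numa_node attribute, e.g.

	# cat /sys/bus/pci/devices/0000:00:04.0/numa_node
	0

A value of -1 (NUMA_NO_NODE) means firmware reported no node, in which
case kmalloc_node()/kmalloc_array_node() behave like their
node-agnostic counterparts.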