From patchwork Thu Aug 22 18:37:12 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774081
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
To: kbusch@kernel.org, axboe@kernel.dk, sagi@grimberg.me,
	James.Bottomley@HansenPartnership.com, martin.petersen@oracle.com,
	kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, robin.murphy@arm.com, hch@lst.de,
	m.szyprowski@samsung.com, petr@tesarici.cz, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-coco@lists.linux.dev
Subject: [RFC 1/7] swiotlb: Introduce swiotlb throttling
Date: Thu, 22 Aug 2024 11:37:12 -0700
Message-Id: <20240822183718.1234-2-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>

From: Michael Kelley <mhklinux@outlook.com>

Implement throttling of swiotlb map requests. Because throttling
requires temporarily pending some requests, throttling can only be
used by map requests made in contexts that can block. Detecting such
contexts at runtime is infeasible, so device driver code must be
updated to add DMA_ATTR_MAY_BLOCK on map requests done in a context
that can block. Even if a map request is throttled, the corresponding
unmap request will never block, so unmap has no context restrictions,
just like current code. If a swiotlb map request does *not* have
DMA_ATTR_MAY_BLOCK, no throttling is done and there is no functional
change.

The goal of throttling is to reduce peak usage of swiotlb memory,
particularly in environments like CoCo VMs which must use bounce
buffering for all DMA I/O. These VMs currently allocate up to 1 GiB
for swiotlb memory to ensure that it isn't exhausted. But for many
workloads, this memory is effectively wasted because it can't be used
for other purposes. Throttling can lower the swiotlb memory
requirements without unduly raising the risk of exhaustion, thus
making several hundred MiBs of additional memory available for
general usage.

The high-level implementation is as follows:

1. Each struct io_tlb_mem has a semaphore that is initialized to 1. A
semaphore is used instead of a mutex because the semaphore likely
won't be released by the same thread that obtained it.

2. Each struct io_tlb_mem has a swiotlb space usage level above which
throttling is done. This usage level is initialized to 70% of the
total size of that io_tlb_mem, and is tunable at runtime via /sys if
CONFIG_DEBUG_FS is set.
3. When swiotlb_tbl_map_single() is invoked with throttling allowed,
if the current usage of that io_tlb_mem is above the throttle level,
the semaphore must be obtained before proceeding. The semaphore is
then released by the corresponding swiotlb unmap call. If the
semaphore is already held when swiotlb_tbl_map_single() must obtain
it, the calling thread blocks until the semaphore is available. Once
the thread obtains the semaphore, it proceeds to allocate swiotlb
space in the usual way. The swiotlb map call saves throttling
information in the io_tlb_slot, and then swiotlb unmap uses that
information to determine if the semaphore is held. If so, it releases
the semaphore, potentially allowing a queued request to proceed.
Overall, the semaphore queues multiple waiters and wakes them up in
the order in which they waited. Effectively, the semaphore
single-threads map/unmap pairs to reduce peak usage.

4. A "low throttle" level is also implemented and initialized to 65%
of the total size of the io_tlb_mem. If the current usage is between
the throttle level and the low throttle level, AND the semaphore is
held, the requestor must obtain the semaphore. Consider if throttling
occurs, so that one map request holds the semaphore, and three others
are queued waiting for the semaphore. If swiotlb usage then drops
because of unrelated unmaps, a new incoming map request may not get
throttled, and bypass the three requests waiting in the semaphore
queue. There's not a forward progress issue because the requests in
the queue will complete as long as the underlying I/Os make forward
progress. But to somewhat address the fairness issue, the low
throttle level provides hysteresis in that new incoming requests
continue to queue on the semaphore as long as used swiotlb memory is
above that lower limit.

5. SGLs are handled in a subsequent patch.

In #3 above, the check for being above the throttle level is an
instantaneous check with no locking and no reservation of space, to
avoid atomic operations. Consequently, multiple threads could all
make the check and decide they are under the throttle level. They can
all proceed without obtaining the semaphore, and potentially generate
a peak in usage. Furthermore, other DMA map requests that don't have
throttling enabled proceed without even checking, and hence can also
push usage toward a peak. So throttling can blunt and reduce peaks in
swiotlb memory usage, but it does not guarantee to prevent
exhaustion.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
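To make the two-level scheme in #3 and #4 concrete, here is a minimal
standalone sketch of the map-side decision. The demo_* names are
invented purely for illustration; the real logic is in
swiotlb_tbl_map_single() in the diff below.

#include <stdbool.h>
#include <stdio.h>

struct demo_tlb {
	unsigned long nslabs;		/* total slabs in the io_tlb_mem */
	unsigned long used;		/* instantaneous slab usage */
	unsigned long high_throttle;	/* (nslabs * 70) / 100 */
	unsigned long low_throttle;	/* (nslabs * 65) / 100 */
	int sem_count;			/* semaphore count; <= 0 means held */
};

/* Returns true if the map request must obtain the semaphore. */
static bool demo_must_throttle(const struct demo_tlb *m, bool may_block)
{
	if (!may_block)			/* no DMA_ATTR_MAY_BLOCK: never */
		return false;
	if (m->used > m->high_throttle)	/* above the high mark: always */
		return true;
	/* between the marks: only if the semaphore is already held */
	return m->used > m->low_throttle && m->sem_count <= 0;
}

int main(void)
{
	/* e.g. a 1 GiB swiotlb with 2 KiB slots: 524288 slabs */
	struct demo_tlb m = { .nslabs = 524288, .sem_count = 1 };

	m.high_throttle = (m.nslabs * 70) / 100;	/* 367001 */
	m.low_throttle = (m.nslabs * 65) / 100;		/* 340787 */
	m.used = 400000;
	printf("throttle: %d\n", demo_must_throttle(&m, true));  /* 1 */
	return 0;
}

The ~5% gap between the two levels is the hysteresis described in #4:
once the semaphore is held, new requests keep queuing on it until
usage falls below the lower mark.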
 include/linux/dma-mapping.h |   8 +++
 include/linux/swiotlb.h     |  15 ++++-
 kernel/dma/Kconfig          |  13 ++++
 kernel/dma/swiotlb.c        | 114 ++++++++++++++++++++++++++++++----
 4 files changed, 136 insertions(+), 14 deletions(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f693aafe221f..7b78294813be 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -62,6 +62,14 @@
  */
 #define DMA_ATTR_PRIVILEGED		(1UL << 9)
 
+/*
+ * DMA_ATTR_MAY_BLOCK: Indication by a driver that the DMA map request is
+ * allowed to block. This flag must only be used on DMA map requests made in
+ * contexts that allow blocking. The corresponding unmap request will not
+ * block.
+ */
+#define DMA_ATTR_MAY_BLOCK	(1UL << 10)
+
 /*
  * A dma_addr_t can hold any valid DMA or bus address for the platform. It can
  * be given to a device to use as a DMA source or target. It is specific to a
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 3dae0f592063..10d07d0ee00c 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -89,6 +89,10 @@ struct io_tlb_pool {
 * @defpool:	Default (initial) IO TLB memory pool descriptor.
 * @pool:	IO TLB memory pool descriptor (if not dynamic).
 * @nslabs:	Total number of IO TLB slabs in all pools.
+ * @high_throttle: Slab count above which requests are throttled.
+ * @low_throttle: Slab count above which requests are throttled when
+ *		throttle_sem is already held.
+ * @throttle_sem: Semaphore that throttled requests must obtain.
 * @debugfs:	The dentry to debugfs.
 * @force_bounce: %true if swiotlb bouncing is forced
 * @for_alloc:	%true if the pool is used for memory allocation
@@ -104,10 +108,17 @@ struct io_tlb_pool {
 *		in debugfs.
 * @transient_nslabs: The total number of slots in all transient pools that
 *		are currently used across all areas.
+ * @high_throttle_count: Count of requests throttled because high_throttle
+ *		was exceeded.
+ * @low_throttle_count: Count of requests throttled because low_throttle was
+ *		exceeded and throttle_sem was already held.
 */
 struct io_tlb_mem {
 	struct io_tlb_pool defpool;
 	unsigned long nslabs;
+	unsigned long high_throttle;
+	unsigned long low_throttle;
+	struct semaphore throttle_sem;
 	struct dentry *debugfs;
 	bool force_bounce;
 	bool for_alloc;
@@ -118,11 +129,11 @@ struct io_tlb_mem {
 	struct list_head pools;
 	struct work_struct dyn_alloc;
 #endif
-#ifdef CONFIG_DEBUG_FS
 	atomic_long_t total_used;
 	atomic_long_t used_hiwater;
 	atomic_long_t transient_nslabs;
-#endif
+	unsigned long high_throttle_count;
+	unsigned long low_throttle_count;
 };
 
 struct io_tlb_pool *__swiotlb_find_pool(struct device *dev, phys_addr_t paddr);
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index c06e56be0ca1..d45ba62f58c8 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -103,6 +103,19 @@ config SWIOTLB_DYNAMIC
 
 	  If unsure, say N.
 
+config SWIOTLB_THROTTLE
+	bool "Throttle DMA map requests from enabled drivers"
+	default n
+	depends on SWIOTLB
+	help
+	  Enable throttling of DMA map requests to help avoid exhausting
+	  bounce buffer space, causing request failures. Throttling
+	  applies only where the calling driver has enabled blocking in
+	  DMA map requests. This option is most useful in CoCo VMs where
+	  all DMA operations must go through bounce buffers.
+
+	  If unsure, say N.
+
 config DMA_BOUNCE_UNALIGNED_KMALLOC
 	bool
 	depends on SWIOTLB
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index df68d29740a0..940b95cf02b7 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -34,6 +34,7 @@
 #include <linux/pfn.h>
 #include <linux/rculist.h>
 #include <linux/scatterlist.h>
+#include <linux/semaphore.h>
 #include <linux/set_memory.h>
 #include <linux/spinlock.h>
 #include <linux/string.h>
@@ -71,12 +72,15 @@
 *		from each index.
 * @pad_slots:	Number of preceding padding slots. Valid only in the first
 *		allocated non-padding slot.
+ * @throttled:	Boolean indicating the slot is used by a request that was
+ *		throttled. Valid only in the first allocated non-padding slot.
 */
 struct io_tlb_slot {
 	phys_addr_t orig_addr;
 	size_t alloc_size;
 	unsigned short list;
-	unsigned short pad_slots;
+	u8 pad_slots;
+	u8 throttled;
 };
 
 static bool swiotlb_force_bounce;
@@ -249,6 +253,31 @@ static inline unsigned long nr_slots(u64 val)
 	return DIV_ROUND_UP(val, IO_TLB_SIZE);
 }
 
+#ifdef CONFIG_SWIOTLB_THROTTLE
+static void init_throttling(struct io_tlb_mem *mem)
+{
+	sema_init(&mem->throttle_sem, 1);
+
+	/*
+	 * The default thresholds are somewhat arbitrary. They are
+	 * conservative to allow space for devices that can't throttle and
+	 * because the determination of whether to throttle is done without
+	 * any atomicity. The low throttle exists to provide a modest amount
+	 * of hysteresis so that the system doesn't flip rapidly between
+	 * throttling and not throttling when usage fluctuates near the high
+	 * throttle level.
+	 */
+	mem->high_throttle = (mem->nslabs * 70) / 100;
+	mem->low_throttle = (mem->nslabs * 65) / 100;
+}
+#else
+static void init_throttling(struct io_tlb_mem *mem)
+{
+	mem->high_throttle = 0;
+	mem->low_throttle = 0;
+}
+#endif
+
 /*
  * Early SWIOTLB allocation may be too early to allow an architecture to
  * perform the desired operations. This function allows the architecture to
@@ -415,6 +444,8 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 
 	if (flags & SWIOTLB_VERBOSE)
 		swiotlb_print_info();
+
+	init_throttling(&io_tlb_default_mem);
 }
 
 void __init swiotlb_init(bool addressing_limit, unsigned int flags)
@@ -511,6 +542,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
 	swiotlb_init_io_tlb_pool(mem, virt_to_phys(vstart), nslabs, true,
 				 nareas);
 	add_mem_pool(&io_tlb_default_mem, mem);
+	init_throttling(&io_tlb_default_mem);
 
 	swiotlb_print_info();
 	return 0;
@@ -947,7 +979,7 @@ static unsigned int wrap_area_index(struct io_tlb_pool *mem, unsigned int index)
 * function gives imprecise results because there's no locking across
 * multiple areas.
 */
-#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_SWIOTLB_THROTTLE)
 static void inc_used_and_hiwater(struct io_tlb_mem *mem, unsigned int nslots)
 {
 	unsigned long old_hiwater, new_used;
@@ -966,14 +998,14 @@ static void dec_used(struct io_tlb_mem *mem, unsigned int nslots)
 	atomic_long_sub(nslots, &mem->total_used);
 }
 
-#else /* !CONFIG_DEBUG_FS */
+#else /* !CONFIG_DEBUG_FS && !CONFIG_SWIOTLB_THROTTLE */
 static void inc_used_and_hiwater(struct io_tlb_mem *mem, unsigned int nslots)
 {
 }
 static void dec_used(struct io_tlb_mem *mem, unsigned int nslots)
 {
 }
 
-#endif /* CONFIG_DEBUG_FS */
+#endif /* CONFIG_DEBUG_FS || CONFIG_SWIOTLB_THROTTLE */
 
 #ifdef CONFIG_SWIOTLB_DYNAMIC
 #ifdef CONFIG_DEBUG_FS
@@ -1277,7 +1309,7 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
 
 #endif /* CONFIG_SWIOTLB_DYNAMIC */
 
-#ifdef CONFIG_DEBUG_FS
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_SWIOTLB_THROTTLE)
 
 /**
 * mem_used() - get number of used slots in an allocator
@@ -1293,7 +1325,7 @@ static unsigned long mem_used(struct io_tlb_mem *mem)
 	return atomic_long_read(&mem->total_used);
 }
 
-#else /* !CONFIG_DEBUG_FS */
+#else /* !CONFIG_DEBUG_FS && !CONFIG_SWIOTLB_THROTTLE */
 
 /**
 * mem_pool_used() - get number of used slots in a memory pool
@@ -1373,6 +1405,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 	unsigned int offset;
 	struct io_tlb_pool *pool;
+	bool throttle = false;
 	unsigned int i;
 	size_t size;
 	int index;
@@ -1398,6 +1431,32 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	dev_WARN_ONCE(dev, alloc_align_mask > ~PAGE_MASK,
 		"Alloc alignment may prevent fulfilling requests with max mapping_size\n");
 
+	if (IS_ENABLED(CONFIG_SWIOTLB_THROTTLE) && attrs & DMA_ATTR_MAY_BLOCK) {
+		unsigned long used = atomic_long_read(&mem->total_used);
+
+		/*
+		 * Determining whether to throttle is intentionally done without
+		 * atomicity. For example, multiple requests could proceed in
+		 * parallel when usage is just under the threshold, putting
+		 * usage above the threshold by the aggregate size of the
+		 * parallel requests. The thresholds must already be set
+		 * conservatively because of drivers that can't enable
+		 * throttling, so this slop in the accounting shouldn't be a
+		 * problem. It's better than the potential bottleneck of a
+		 * globally synchronized reservation mechanism.
+		 */
+		if (used > mem->high_throttle) {
+			throttle = true;
+			mem->high_throttle_count++;
+		} else if ((used > mem->low_throttle) &&
+					(mem->throttle_sem.count <= 0)) {
+			throttle = true;
+			mem->low_throttle_count++;
+		}
+		if (throttle)
+			down(&mem->throttle_sem);
+	}
+
 	offset = swiotlb_align_offset(dev, alloc_align_mask, orig_addr);
 	size = ALIGN(mapping_size + offset, alloc_align_mask + 1);
 	index = swiotlb_find_slots(dev, orig_addr, size, alloc_align_mask, &pool);
@@ -1406,6 +1465,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 		dev_warn_ratelimited(dev,
 	"swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
 				 size, mem->nslabs, mem_used(mem));
+		if (throttle)
+			up(&mem->throttle_sem);
 		return (phys_addr_t)DMA_MAPPING_ERROR;
 	}
 
@@ -1424,6 +1485,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	offset &= (IO_TLB_SIZE - 1);
 	index += pad_slots;
 	pool->slots[index].pad_slots = pad_slots;
+	pool->slots[index].throttled = throttle;
 	for (i = 0; i < (nr_slots(size) - pad_slots); i++)
 		pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
 	tlb_addr = slot_addr(pool->start, index) + offset;
@@ -1440,7 +1502,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 	return tlb_addr;
 }
 
-static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
+static bool swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 		struct io_tlb_pool *mem)
 {
 	unsigned long flags;
@@ -1448,8 +1510,10 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 	int index, nslots, aindex;
 	struct io_tlb_area *area;
 	int count, i;
+	bool throttled;
 
 	index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
+	throttled = mem->slots[index].throttled;
 	index -= mem->slots[index].pad_slots;
 	nslots = nr_slots(mem->slots[index].alloc_size + offset);
 	aindex = index / mem->area_nslabs;
@@ -1478,6 +1542,7 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 		mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
 		mem->slots[i].alloc_size = 0;
 		mem->slots[i].pad_slots = 0;
+		mem->slots[i].throttled = 0;
 	}
 
 	/*
@@ -1492,6 +1557,8 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 	spin_unlock_irqrestore(&area->lock, flags);
 
 	dec_used(dev->dma_io_tlb_mem, nslots);
+
+	return throttled;
 }
 
 #ifdef CONFIG_SWIOTLB_DYNAMIC
@@ -1501,6 +1568,9 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 * @dev:	Device which mapped the buffer.
 * @tlb_addr:	Physical address within a bounce buffer.
 * @pool:	Pointer to the transient memory pool to be checked and deleted.
+ * @throttled:	If the function returns %true, set to %true if the transient
+ *		allocation was throttled. Not set if the function returns
+ *		%false.
 *
 * Check whether the address belongs to a transient SWIOTLB memory pool.
 * If yes, then delete the pool.
@@ -1508,11 +1578,18 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr,
 * Return: %true if @tlb_addr belonged to a transient pool that was released.
 */
 static bool swiotlb_del_transient(struct device *dev, phys_addr_t tlb_addr,
-		struct io_tlb_pool *pool)
+		struct io_tlb_pool *pool, bool *throttled)
 {
+	unsigned int offset;
+	int index;
+
 	if (!pool->transient)
 		return false;
 
+	offset = swiotlb_align_offset(dev, 0, tlb_addr);
+	index = (tlb_addr - offset - pool->start) >> IO_TLB_SHIFT;
+	*throttled = pool->slots[index].throttled;
+
 	dec_used(dev->dma_io_tlb_mem, pool->nslabs);
 	swiotlb_del_pool(dev, pool);
 	dec_transient_used(dev->dma_io_tlb_mem, pool->nslabs);
@@ -1522,7 +1599,7 @@ static bool swiotlb_del_transient(struct device *dev, phys_addr_t tlb_addr,
 #else  /* !CONFIG_SWIOTLB_DYNAMIC */
 
 static inline bool swiotlb_del_transient(struct device *dev,
-		phys_addr_t tlb_addr, struct io_tlb_pool *pool)
+		phys_addr_t tlb_addr, struct io_tlb_pool *pool, bool *throttled)
 {
 	return false;
 }
@@ -1536,6 +1613,8 @@ void __swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
 		size_t mapping_size, enum dma_data_direction dir,
 		unsigned long attrs, struct io_tlb_pool *pool)
 {
+	bool throttled;
+
 	/*
 	 * First, sync the memory before unmapping the entry
 	 */
@@ -1544,9 +1623,11 @@ void __swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
 		swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_FROM_DEVICE, pool);
 
-	if (swiotlb_del_transient(dev, tlb_addr, pool))
-		return;
-	swiotlb_release_slots(dev, tlb_addr, pool);
+	if (!swiotlb_del_transient(dev, tlb_addr, pool, &throttled))
+		throttled = swiotlb_release_slots(dev, tlb_addr, pool);
+
+	if (throttled)
+		up(&dev->dma_io_tlb_mem->throttle_sem);
 }
 
 void __swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
@@ -1719,6 +1800,14 @@ static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem,
 		return;
 
 	debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs);
+	debugfs_create_ulong("high_throttle", 0600, mem->debugfs,
+			&mem->high_throttle);
+	debugfs_create_ulong("low_throttle", 0600, mem->debugfs,
+			&mem->low_throttle);
+	debugfs_create_ulong("high_throttle_count", 0600, mem->debugfs,
+			&mem->high_throttle_count);
+	debugfs_create_ulong("low_throttle_count", 0600, mem->debugfs,
+			&mem->low_throttle_count);
 	debugfs_create_file("io_tlb_used", 0400, mem->debugfs, mem,
 			&fops_io_tlb_used);
 	debugfs_create_file("io_tlb_used_hiwater", 0600, mem->debugfs, mem,
@@ -1841,6 +1930,7 @@ static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
 	INIT_LIST_HEAD_RCU(&mem->pools);
 #endif
 	add_mem_pool(mem, pool);
+	init_throttling(mem);
 
 	rmem->priv = mem;

From patchwork Thu Aug 22 18:37:13 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774082
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
Subject: [RFC 2/7] dma: Handle swiotlb throttling for SGLs
Date: Thu, 22 Aug 2024 11:37:13 -0700
Message-Id: <20240822183718.1234-3-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>

From: Michael Kelley <mhklinux@outlook.com>

When a DMA map request is for an SGL, each SGL entry results in an
independent mapping operation. If the mapping requires a bounce
buffer due to running in a CoCo VM or due to swiotlb=force on the
boot line, swiotlb is invoked. If swiotlb throttling is enabled for
the request, each SGL entry results in a separate throttling
operation. This is problematic because a thread may be holding
swiotlb memory while waiting for memory to become free.

Resolve this problem by only allowing throttling on the 0th SGL
entry. When unmapping the SGL, unmap entries 1 thru N-1 first, then
unmap entry 0 so that the throttle isn't released until all swiotlb
memory has been freed.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
This approach to SGLs muddies the line between DMA direct and swiotlb
throttling functionality. To keep the MAY_BLOCK attr fully generic,
it should propagate to the mapping of all SGL entries.

An alternate approach is to define an additional DMA attribute that
is internal to the DMA layer. Instead of clearing MAY_BLOCK, this
attr is added by dma_direct_map_sg() when mapping SGL entries other
than the 0th entry, as in the sketch below. swiotlb would do
throttling only when MAY_BLOCK is set and this new attr is not set.

This approach has a modest amount of additional complexity. Given
that we currently have no other users of the MAY_BLOCK attr, the
conceptual cleanliness may not be warranted until we do.

Thoughts?
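A sketch of that alternate approach, in code fragments rather than a
complete patch. The attribute name and bit position are hypothetical,
chosen only for illustration:

/* Hypothetical DMA-layer-internal attribute; not part of this patch. */
#define DMA_ATTR_SGL_NOT_FIRST	(1UL << 31)

/* In dma_direct_map_sg(): tag every entry after the 0th instead of
 * clearing MAY_BLOCK, so MAY_BLOCK itself stays fully generic. */
	if (i)
		attrs |= DMA_ATTR_SGL_NOT_FIRST;

/* In swiotlb_tbl_map_single(): throttle only on the first (or only)
 * mapping of a request. */
	bool may_throttle = (attrs & DMA_ATTR_MAY_BLOCK) &&
			    !(attrs & DMA_ATTR_SGL_NOT_FIRST);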
 kernel/dma/direct.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 4480a3cd92e0..80e03c0838d4 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -438,6 +438,18 @@ void dma_direct_sync_sg_for_cpu(struct device *dev,
 		arch_sync_dma_for_cpu_all();
 }
 
+static void dma_direct_unmap_sgl_entry(struct device *dev,
+		struct scatterlist *sgl, enum dma_data_direction dir,
+		unsigned long attrs)
+
+{
+	if (sg_dma_is_bus_address(sgl))
+		sg_dma_unmark_bus_address(sgl);
+	else
+		dma_direct_unmap_page(dev, sgl->dma_address,
+				sg_dma_len(sgl), dir, attrs);
+}
+
 /*
  * Unmaps segments, except for ones marked as pci_p2pdma which do not
  * require any further action as they contain a bus address.
@@ -449,12 +461,20 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
 	struct scatterlist *sg;
 	int i;
 
 	for_each_sg(sgl, sg, nents, i) {
-		if (sg_dma_is_bus_address(sg))
-			sg_dma_unmark_bus_address(sg);
-		else
-			dma_direct_unmap_page(dev, sg->dma_address,
-					sg_dma_len(sg), dir, attrs);
+		/*
+		 * Skip the 0th SGL entry in case this SGL consists of
+		 * throttled swiotlb mappings. In such a case, any other
+		 * entries should be unmapped first since unmapping the
+		 * 0th entry will release the throttle semaphore.
+		 */
+		if (!i)
+			continue;
+		dma_direct_unmap_sgl_entry(dev, sg, dir, attrs);
 	}
+
+	/* Now do the 0th SGL entry */
+	if (nents)
+		dma_direct_unmap_sgl_entry(dev, sgl, dir, attrs);
 }
 #endif
 
@@ -492,6 +512,11 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 			ret = -EIO;
 			goto out_unmap;
 		}
+
+		/* Allow only the 0th SGL entry to block */
+		if (!i)
+			attrs &= ~DMA_ATTR_MAY_BLOCK;
+
 		sg_dma_len(sg) = sg->length;
 	}
 

From patchwork Thu Aug 22 18:37:14 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774083
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
Subject: [RFC 3/7] dma: Add function for drivers to know if allowing blocking is useful
Date: Thu, 22 Aug 2024 11:37:14 -0700
Message-Id: <20240822183718.1234-4-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>

From: Michael Kelley <mhklinux@outlook.com>

With the addition of swiotlb throttling functionality, storage device
drivers may want to know whether using the DMA_ATTR_MAY_BLOCK
attribute is useful. In a CoCo VM or environment where swiotlb=force
is used, the MAY_BLOCK attribute enables swiotlb throttling. But if
throttling is not enabled or useful, storage device drivers probably
do not want to set BLK_MQ_F_BLOCKING at the blk-mq request queue
level.

Add function dma_recommend_may_block() that indicates whether the
underlying implementation of the DMA map calls would benefit from
allowing blocking. If the kernel was built with
CONFIG_SWIOTLB_THROTTLE, and swiotlb=force is set (on the kernel
command line or due to being a CoCo VM), this function returns true.
Otherwise it returns false.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Petr Tesarik <petr@tesarici.cz>
---
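For illustration, a sketch of the intended call pattern in a storage
driver's probe path. demo_probe() and the surrounding driver
structure are invented for the example; patch 5 shows the real
storvsc usage:

static unsigned long demo_dma_attrs;	/* 0 or DMA_ATTR_MAY_BLOCK */

static int demo_probe(struct device *dev)
{
	/*
	 * Returns true only when CONFIG_SWIOTLB_THROTTLE is built in
	 * and the device is forced to bounce (e.g. in a CoCo VM), so
	 * a non-CoCo VM sees no behavior change.
	 */
	if (dma_recommend_may_block(dev)) {
		demo_dma_attrs = DMA_ATTR_MAY_BLOCK;
		/* ...and also set BLK_MQ_F_BLOCKING on the request
		 * queue, since map requests may now sleep in the
		 * submission path. */
	}
	return 0;
}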
 include/linux/dma-mapping.h |  5 +++++
 kernel/dma/direct.c         |  6 ++++++
 kernel/dma/direct.h         |  1 +
 kernel/dma/mapping.c        | 10 ++++++++++
 4 files changed, 22 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 7b78294813be..ec2edf068218 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -145,6 +145,7 @@ int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 bool dma_addressing_limited(struct device *dev);
+bool dma_recommend_may_block(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
 size_t dma_opt_mapping_size(struct device *dev);
 unsigned long dma_get_merge_boundary(struct device *dev);
@@ -252,6 +253,10 @@ static inline bool dma_addressing_limited(struct device *dev)
 {
 	return false;
 }
+static inline bool dma_recommend_may_block(struct device *dev)
+{
+	return false;
+}
 static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 80e03c0838d4..34d14e4ace64 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -649,6 +649,12 @@ bool dma_direct_all_ram_mapped(struct device *dev)
 				      check_ram_in_range_map);
 }
 
+bool dma_direct_recommend_may_block(struct device *dev)
+{
+	return IS_ENABLED(CONFIG_SWIOTLB_THROTTLE) &&
+			is_swiotlb_force_bounce(dev);
+}
+
 size_t dma_direct_max_mapping_size(struct device *dev)
 {
 	/* If SWIOTLB is active, use its maximum mapping size */
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index d2c0b7e632fc..63516a540276 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -21,6 +21,7 @@ bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
 int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 		enum dma_data_direction dir, unsigned long attrs);
 bool dma_direct_all_ram_mapped(struct device *dev);
+bool dma_direct_recommend_may_block(struct device *dev);
 size_t dma_direct_max_mapping_size(struct device *dev);
 
 #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index b1c18058d55f..832982bafd5a 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -858,6 +858,16 @@ bool dma_addressing_limited(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_addressing_limited);
 
+bool dma_recommend_may_block(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (dma_map_direct(dev, ops))
+		return dma_direct_recommend_may_block(dev);
+	return false;
+}
+EXPORT_SYMBOL_GPL(dma_recommend_may_block);
+
 size_t dma_max_mapping_size(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);

From patchwork Thu Aug 22 18:37:15 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774084
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
Subject: [RFC 4/7] scsi_lib_dma: Add _attrs variant of scsi_dma_map()
Date: Thu, 22 Aug 2024 11:37:15 -0700
Message-Id: <20240822183718.1234-5-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>

From: Michael Kelley <mhklinux@outlook.com>

Extend the SCSI DMA mapping interfaces by adding the "_attrs" variant
of scsi_dma_map(). This variant allows passing DMA_ATTR_* values,
such as is needed to support swiotlb throttling. The existing
scsi_dma_map() interface is unchanged, so no incompatibilities are
introduced.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Petr Tesarik <petr@tesarici.cz>
---
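A sketch of how a caller is expected to use the new variant. The
surrounding queuecommand context is elided, and the choice of
attribute here is illustrative; patch 5 converts storvsc this way:

	int sg_count;

	/* Pass 0 for today's behavior, or DMA_ATTR_MAY_BLOCK to opt in
	 * to swiotlb throttling; scsi_dma_map(cmd) becomes shorthand
	 * for scsi_dma_map_attrs(cmd, 0). */
	sg_count = scsi_dma_map_attrs(scmnd, DMA_ATTR_MAY_BLOCK);
	if (sg_count < 0)
		return SCSI_MLQUEUE_DEVICE_BUSY;

	/* ... issue the I/O ... */

	scsi_dma_unmap(scmnd);		/* unmap path is unchanged */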
 drivers/scsi/scsi_lib_dma.c | 13 +++++++------
 include/scsi/scsi_cmnd.h    |  7 +++++--
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_lib_dma.c b/drivers/scsi/scsi_lib_dma.c
index 5723915275ad..34453a79be97 100644
--- a/drivers/scsi/scsi_lib_dma.c
+++ b/drivers/scsi/scsi_lib_dma.c
@@ -14,30 +14,31 @@
 #include <scsi/scsi_host.h>
 
 /**
- * scsi_dma_map - perform DMA mapping against command's sg lists
+ * scsi_dma_map_attrs - perform DMA mapping against command's sg lists
 * @cmd:	scsi command
+ * @attrs:	DMA attribute flags
 *
 * Returns the number of sg lists actually used, zero if the sg lists
 * is NULL, or -ENOMEM if the mapping failed.
 */
-int scsi_dma_map(struct scsi_cmnd *cmd)
+int scsi_dma_map_attrs(struct scsi_cmnd *cmd, unsigned long attrs)
 {
 	int nseg = 0;
 
 	if (scsi_sg_count(cmd)) {
 		struct device *dev = cmd->device->host->dma_dev;
 
-		nseg = dma_map_sg(dev, scsi_sglist(cmd), scsi_sg_count(cmd),
-				  cmd->sc_data_direction);
+		nseg = dma_map_sg_attrs(dev, scsi_sglist(cmd),
+				scsi_sg_count(cmd), cmd->sc_data_direction, attrs);
 		if (unlikely(!nseg))
 			return -ENOMEM;
 	}
 	return nseg;
 }
-EXPORT_SYMBOL(scsi_dma_map);
+EXPORT_SYMBOL(scsi_dma_map_attrs);
 
 /**
- * scsi_dma_unmap - unmap command's sg lists mapped by scsi_dma_map
+ * scsi_dma_unmap - unmap command's sg lists mapped by scsi_dma_map_attrs
 * @cmd:	scsi command
 */
 void scsi_dma_unmap(struct scsi_cmnd *cmd)
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 45c40d200154..6603003bc588 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -170,11 +170,14 @@ extern void scsi_kunmap_atomic_sg(void *virt);
 blk_status_t scsi_alloc_sgtables(struct scsi_cmnd *cmd);
 void scsi_free_sgtables(struct scsi_cmnd *cmd);
 
+#define scsi_dma_map(cmd) scsi_dma_map_attrs(cmd, 0)
+
 #ifdef CONFIG_SCSI_DMA
-extern int scsi_dma_map(struct scsi_cmnd *cmd);
+extern int scsi_dma_map_attrs(struct scsi_cmnd *cmd, unsigned long attrs);
 extern void scsi_dma_unmap(struct scsi_cmnd *cmd);
 #else /* !CONFIG_SCSI_DMA */
-static inline int scsi_dma_map(struct scsi_cmnd *cmd) { return -ENOSYS; }
+static inline int scsi_dma_map_attrs(struct scsi_cmnd *cmd, unsigned long attrs)
+	{ return -ENOSYS; }
 static inline void scsi_dma_unmap(struct scsi_cmnd *cmd) { }
 #endif /* !CONFIG_SCSI_DMA */
 

From patchwork Thu Aug 22 18:37:16 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774085
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
Subject: [RFC 5/7] scsi: storvsc: Enable swiotlb throttling
Date: Thu, 22 Aug 2024 11:37:16 -0700
Message-Id: <20240822183718.1234-6-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>

From: Michael Kelley <mhklinux@outlook.com>

In a CoCo VM, all DMA-based I/O must use swiotlb bounce buffers
because DMA cannot be done to private (encrypted) portions of VM
memory. The bounce buffer memory is marked shared (decrypted) at boot
time, so I/O is done to/from the bounce buffer memory and then copied
by the CPU to/from the final target memory (i.e., "bounced"). Storage
devices can be large consumers of bounce buffer memory because it is
possible to have large numbers of I/Os in flight across multiple
devices. Bounce buffer memory must be pre-allocated at boot time, and
it is difficult to know how much memory to allocate to handle peak
storage I/O loads. Consequently, bounce buffer memory is typically
over-provisioned, which wastes memory, and may still not avoid a peak
that exhausts bounce buffer memory and causes storage I/O errors.

To solve this problem for CoCo VMs running on Hyper-V, update the
storvsc driver to permit bounce buffer throttling. First, use
scsi_dma_map_attrs() instead of scsi_dma_map(). Then gate the
throttling behavior on a DMA layer check indicating that throttling
is useful, so that no change occurs in a non-CoCo VM. If throttling
is useful, pass the DMA_ATTR_MAY_BLOCK attribute, and set the block
queue flag indicating that the I/O request submission path may sleep,
which could happen when throttling. With these options in place, DMA
map requests are pended when necessary to reduce the likelihood of
usage peaks caused by storvsc that could exhaust bounce buffer memory
and generate errors.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
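Putting the series together, the MAY_BLOCK attribute flows through
the mapping layers roughly as follows. This is an abbreviated call
chain for orientation, not literal code; intermediate helpers are
elided:

/*
 * storvsc_queuecommand()
 *   -> scsi_dma_map_attrs(scmnd, DMA_ATTR_MAY_BLOCK)      [patch 4]
 *     -> dma_map_sg_attrs(dev, sgl, nents, dir, attrs)
 *       -> dma_direct_map_sg()  clears MAY_BLOCK for      [patch 2]
 *                               entries after the 0th
 *         -> swiotlb_tbl_map_single()                     [patch 1]
 *              may sleep on throttle_sem when usage is
 *              above the throttle level
 */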
 drivers/scsi/storvsc_drv.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 7ceb982040a5..7bedd5502d07 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -457,6 +457,7 @@ struct hv_host_device {
 	struct workqueue_struct *handle_error_wq;
 	struct work_struct host_scan_work;
 	struct Scsi_Host *host;
+	unsigned long dma_attrs;
 };
 
 struct storvsc_scan_work {
@@ -1810,7 +1811,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
 		payload->range.len = length;
 		payload->range.offset = offset_in_hvpg;
 
-		sg_count = scsi_dma_map(scmnd);
+		sg_count = scsi_dma_map_attrs(scmnd, host_dev->dma_attrs);
 		if (sg_count < 0) {
 			ret = SCSI_MLQUEUE_DEVICE_BUSY;
 			goto err_free_payload;
@@ -2030,6 +2031,12 @@ static int storvsc_probe(struct hv_device *device,
 	 * have an offset that is a multiple of HV_HYP_PAGE_SIZE.
 	 */
 	host->sg_tablesize = (max_xfer_bytes >> HV_HYP_PAGE_SHIFT) + 1;
+
+	if (dma_recommend_may_block(&device->device)) {
+		host->queuecommand_may_block = true;
+		host_dev->dma_attrs = DMA_ATTR_MAY_BLOCK;
+	}
+
 	/*
 	 * For non-IDE disks, the host supports multiple channels.
 	 * Set the number of HW queues we are supporting.

From patchwork Thu Aug 22 18:37:17 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774086
From patchwork Thu Aug 22 18:37:17 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774086

From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
To: kbusch@kernel.org, axboe@kernel.dk, sagi@grimberg.me,
 James.Bottomley@HansenPartnership.com, martin.petersen@oracle.com,
 kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
 decui@microsoft.com, robin.murphy@arm.com, hch@lst.de,
 m.szyprowski@samsung.com, petr@tesarici.cz, iommu@lists.linux.dev,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-scsi@vger.kernel.org, linux-hyperv@vger.kernel.org,
 linux-coco@lists.linux.dev
Subject: [RFC 6/7] nvme: Move BLK_MQ_F_BLOCKING indicator to struct nvme_ctrl
Date: Thu, 22 Aug 2024 11:37:17 -0700
Message-Id: <20240822183718.1234-7-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>
Reply-To: mhklinux@outlook.com

From: Michael Kelley

The NVMe setting that controls the BLK_MQ_F_BLOCKING flag on the
request queue is currently a flag in struct nvme_ctrl_ops, where it is
not writable. A new use case needs this flag to be writable based on a
determination made during the NVMe device probe function. Move this
setting to struct nvme_ctrl, and update the only user (the NVMe TCP
transport) to set it in the new location. No functional change.
Signed-off-by: Michael Kelley
Reviewed-by: Petr Tesarik
---
 drivers/nvme/host/core.c | 4 ++--
 drivers/nvme/host/nvme.h | 2 +-
 drivers/nvme/host/tcp.c  | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 33fa01c599ad..f1ce325471f1 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4495,7 +4495,7 @@ int nvme_alloc_admin_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 	set->reserved_tags = 2;
 	set->numa_node = ctrl->numa_node;
 	set->flags = BLK_MQ_F_NO_SCHED;
-	if (ctrl->ops->flags & NVME_F_BLOCKING)
+	if (ctrl->blocking)
 		set->flags |= BLK_MQ_F_BLOCKING;
 	set->cmd_size = cmd_size;
 	set->driver_data = ctrl;
@@ -4565,7 +4565,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
 	set->reserved_tags = 1;
 	set->numa_node = ctrl->numa_node;
 	set->flags = BLK_MQ_F_SHOULD_MERGE;
-	if (ctrl->ops->flags & NVME_F_BLOCKING)
+	if (ctrl->blocking)
 		set->flags |= BLK_MQ_F_BLOCKING;
 	set->cmd_size = cmd_size,
 	set->driver_data = ctrl;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ae5314d32943..28709f166cab 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -338,6 +338,7 @@ struct nvme_ctrl {
 	unsigned int shutdown_timeout;
 	unsigned int kato;
 	bool subsystem;
+	bool blocking;
 	unsigned long quirks;
 	struct nvme_id_power_state psd[32];
 	struct nvme_effects_log *effects;
@@ -546,7 +547,6 @@ struct nvme_ctrl_ops {
 	unsigned int flags;
 #define NVME_F_FABRICS			(1 << 0)
 #define NVME_F_METADATA_SUPPORTED	(1 << 1)
-#define NVME_F_BLOCKING			(1 << 2)
 
 	const struct attribute_group **dev_attr_groups;
 	int (*reg_read32)(struct nvme_ctrl *ctrl, u32 off, u32 *val);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9ea6be0b0392..6b9fdf7dc1ac 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2658,7 +2658,7 @@ static const struct blk_mq_ops nvme_tcp_admin_mq_ops = {
 static const struct nvme_ctrl_ops nvme_tcp_ctrl_ops = {
 	.name			= "tcp",
 	.module			= THIS_MODULE,
-	.flags			= NVME_F_FABRICS | NVME_F_BLOCKING,
+	.flags			= NVME_F_FABRICS,
 	.reg_read32		= nvmf_reg_read32,
 	.reg_read64		= nvmf_reg_read64,
 	.reg_write32		= nvmf_reg_write32,
@@ -2762,6 +2762,7 @@ static struct nvme_tcp_ctrl *nvme_tcp_alloc_ctrl(struct device *dev,
 	if (ret)
 		goto out_kfree_queues;
 
+	ctrl->ctrl.blocking = true;
 	return ctrl;
 out_kfree_queues:
 	kfree(ctrl->queues);
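
Since ctrl->blocking is consulted when the tag sets are allocated, a
transport that needs a sleeping queue_rq must set it during probe,
before calling nvme_alloc_admin_tag_set() or nvme_alloc_io_tag_set(). A
hedged illustration of the pattern (the predicate name below is
invented for the example and is not part of this patch):

    /* Illustration only: enable BLK_MQ_F_BLOCKING per controller at
     * probe time. my_transport_may_sleep() is a hypothetical stand-in
     * for whatever probe-time determination a transport makes.
     */
    if (my_transport_may_sleep(ctrl))
            ctrl->blocking = true;  /* must precede tag set allocation */

    ret = nvme_alloc_admin_tag_set(ctrl, &admin_set, &my_admin_mq_ops,
                                   sizeof(struct my_iod));

This is exactly what the tcp.c hunk above does unconditionally, and
what the next patch does conditionally for NVMe PCI devices.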
From patchwork Thu Aug 22 18:37:18 2024
X-Patchwork-Submitter: Michael Kelley
X-Patchwork-Id: 13774087
From: mhkelley58@gmail.com
X-Google-Original-From: mhklinux@outlook.com
To: kbusch@kernel.org, axboe@kernel.dk, sagi@grimberg.me,
 James.Bottomley@HansenPartnership.com, martin.petersen@oracle.com,
 kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
 decui@microsoft.com, robin.murphy@arm.com, hch@lst.de,
 m.szyprowski@samsung.com, petr@tesarici.cz, iommu@lists.linux.dev,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-scsi@vger.kernel.org, linux-hyperv@vger.kernel.org,
 linux-coco@lists.linux.dev
Subject: [RFC 7/7] nvme: Enable swiotlb throttling for NVMe PCI devices
Date: Thu, 22 Aug 2024 11:37:18 -0700
Message-Id: <20240822183718.1234-8-mhklinux@outlook.com>
In-Reply-To: <20240822183718.1234-1-mhklinux@outlook.com>
References: <20240822183718.1234-1-mhklinux@outlook.com>
Reply-To: mhklinux@outlook.com

From: Michael Kelley

In a CoCo VM, all DMA-based I/O must use swiotlb bounce buffers because
DMA cannot be done to private (encrypted) portions of VM memory. The
bounce buffer memory is marked shared (decrypted) at boot time, so I/O
is done to/from the bounce buffer memory and then copied by the CPU
to/from the final target memory (i.e., "bounced"). Storage devices can
be large consumers of bounce buffer memory because it is possible to
have large numbers of I/Os in flight across multiple devices. Bounce
buffer memory must be pre-allocated at boot time, and it is difficult
to know how much memory to allocate to handle peak storage I/O loads.
Consequently, bounce buffer memory is typically over-provisioned, which
wastes memory, and may still not avoid a peak that exhausts bounce
buffer memory and causes storage I/O errors.

For CoCo VMs running with NVMe PCI devices, update the driver to permit
bounce buffer throttling. Gate the throttling behavior on a DMA layer
check indicating that throttling is useful, so that no change occurs in
a non-CoCo VM. If throttling is useful, enable the BLK_MQ_F_BLOCKING
flag, and pass the DMA_ATTR_MAY_BLOCK attribute into dma_map_bvec() and
dma_map_sgtable() calls. With these options in place, DMA map requests
are pended when necessary to reduce the likelihood of usage peaks
caused by the NVMe driver that could exhaust bounce buffer memory and
generate errors.
Signed-off-by: Michael Kelley
Reviewed-by: Petr Tesarik
---
 drivers/nvme/host/pci.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6cd9395ba9ec..2c39943a87f8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -156,6 +156,7 @@ struct nvme_dev {
 	dma_addr_t host_mem_descs_dma;
 	struct nvme_host_mem_buf_desc *host_mem_descs;
 	void **host_mem_desc_bufs;
+	unsigned long dma_attrs;
 	unsigned int nr_allocated_queues;
 	unsigned int nr_write_queues;
 	unsigned int nr_poll_queues;
@@ -735,7 +736,8 @@ static blk_status_t nvme_setup_prp_simple(struct nvme_dev *dev,
 	unsigned int offset = bv->bv_offset & (NVME_CTRL_PAGE_SIZE - 1);
 	unsigned int first_prp_len = NVME_CTRL_PAGE_SIZE - offset;
 
-	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
+	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req),
+				      dev->dma_attrs);
 	if (dma_mapping_error(dev->dev, iod->first_dma))
 		return BLK_STS_RESOURCE;
 	iod->dma_len = bv->bv_len;
@@ -754,7 +756,8 @@ static blk_status_t nvme_setup_sgl_simple(struct nvme_dev *dev,
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 
-	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
+	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req),
+				      dev->dma_attrs);
 	if (dma_mapping_error(dev->dev, iod->first_dma))
 		return BLK_STS_RESOURCE;
 	iod->dma_len = bv->bv_len;
@@ -800,7 +803,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 		goto out_free_sg;
 
 	rc = dma_map_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req),
-			     DMA_ATTR_NO_WARN);
+			     dev->dma_attrs | DMA_ATTR_NO_WARN);
 	if (rc) {
 		if (rc == -EREMOTEIO)
 			ret = BLK_STS_TARGET;
@@ -828,7 +831,8 @@ static blk_status_t nvme_map_metadata(struct nvme_dev *dev, struct request *req,
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	struct bio_vec bv = rq_integrity_vec(req);
 
-	iod->meta_dma = dma_map_bvec(dev->dev, &bv, rq_dma_dir(req), 0);
+	iod->meta_dma = dma_map_bvec(dev->dev, &bv, rq_dma_dir(req),
+				     dev->dma_attrs);
 	if (dma_mapping_error(dev->dev, iod->meta_dma))
 		return BLK_STS_IOERR;
 	cmnd->rw.metadata = cpu_to_le64(iod->meta_dma);
@@ -3040,6 +3044,12 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
 	 * a single integrity segment for the separate metadata pointer.
 	 */
 	dev->ctrl.max_integrity_segments = 1;
+
+	if (dma_recommend_may_block(dev->dev)) {
+		dev->ctrl.blocking = true;
+		dev->dma_attrs = DMA_ATTR_MAY_BLOCK;
+	}
+
 	return dev;
 
 out_put_device:
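
For reference, dma_recommend_may_block() is introduced by an earlier
patch in this series that is not shown here. A hedged sketch of what it
presumably reduces to -- a check that the device is forced through
swiotlb bounce buffering, the case where throttling can help -- is:

    /* Sketch only; the real helper from earlier in this series may also
     * verify that a throttling-enabled swiotlb pool is configured.
     */
    bool dma_recommend_may_block(struct device *dev)
    {
            /* is_swiotlb_force_bounce() is an existing swiotlb query
             * that is true when all DMA for the device must bounce, as
             * in a CoCo VM with encrypted private memory.
             */
            return is_swiotlb_force_bounce(dev);
    }

Under this sketch, the probe-time gate above is a no-op in a non-CoCo
VM, matching the commit message's claim of no behavior change there.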