[1/1] scsi: target: Fix write perf due to unneeded throttling

Message ID 20230817192902.346791-1-michael.christie@oracle.com (mailing list archive)
State Accepted
Series [1/1] scsi: target: Fix write perf due to unneeded throttling

Commit Message

Mike Christie Aug. 17, 2023, 7:29 p.m. UTC
The write back throttling (WBT) code checks if REQ_SYNC | REQ_IDLE is set
to determine whether a write is O_DIRECT or buffered. If the bits are not
set, it assumes the write is buffered and throttles LIO once certain
metrics are hit. LIO itself does not use the buffer cache and does direct
IO, so set the direct bits so that we are not throttled.

When the initiator application is doing direct IO this can greatly
improve performance. The gain depends on the backend device, but we have
seen the WBT code throttle writes to only 20K IOPS with 4K IOs when the
device can support 100K+.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/target/target_core_iblock.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
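
The flag check described in the commit message can be illustrated with a
small userspace sketch. This is not the kernel's WBT code: the names
sketch_should_throttle and sketch_selfcheck and the SKETCH_* constants and
their bit values are hypothetical stand-ins chosen for illustration. It
only models the rule stated above, that a write is treated as direct IO,
and so left unthrottled, when both the sync and idle bits are set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel's request op and flag bits.
 * The values here are assumptions for this sketch only and do not
 * match the kernel's definitions.
 */
#define SKETCH_OP_MASK   0xffu
#define SKETCH_OP_READ   0u
#define SKETCH_OP_WRITE  1u
#define SKETCH_REQ_SYNC  (1u << 8)
#define SKETCH_REQ_IDLE  (1u << 9)

/*
 * Return true if a request with these op/flag bits would be a
 * candidate for throttling under the rule in the commit message.
 */
static bool sketch_should_throttle(uint32_t opf)
{
	const uint32_t odirect = SKETCH_REQ_SYNC | SKETCH_REQ_IDLE;

	/* Only writes are modeled; everything else is left alone. */
	if ((opf & SKETCH_OP_MASK) != SKETCH_OP_WRITE)
		return false;

	/* Both bits set: looks like O_DIRECT, so do not throttle. */
	if ((opf & odirect) == odirect)
		return false;

	/* Otherwise the write is assumed to be buffered writeback. */
	return true;
}

/* Usage example: the before and after cases from the patch. */
void sketch_selfcheck(void)
{
	/* Before the patch: a plain write, treated as buffered. */
	assert(sketch_should_throttle(SKETCH_OP_WRITE));

	/* After the patch: write with both direct bits set. */
	assert(!sketch_should_throttle(SKETCH_OP_WRITE |
				       SKETCH_REQ_SYNC | SKETCH_REQ_IDLE));
}
```

In this model, setting only one of the two bits still leaves the write
eligible for throttling, which is why the patch sets both.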

Comments

Martin K. Petersen Aug. 21, 2023, 9:21 p.m. UTC | #1
Mike,

> The write back throttling (WBT) code checks if REQ_SYNC | REQ_IDLE is
> set to determine if a write is O_DIRECT vs buffered. If the bits are
> not set then it assumes it's a buffered write and will throttle LIO if
> we hit certain metrics. LIO itself is not using the buffer cache and
> is doing direct IO, so this has us set the direct bits so we are not
> throttled.
>
> When the initiator application is doing direct IO this can greatly
> improve performance. It depends on the backend device but we have seen
> where the WBT code is throttling writes to only 20K IOPs with 4K IOs
> when the device can support 100K+.

Applied to 6.6/scsi-staging, thanks!

Martin K. Petersen Aug. 25, 2023, 1:12 a.m. UTC | #2
On Thu, 17 Aug 2023 14:29:02 -0500, Mike Christie wrote:

> The write back throttling (WBT) code checks if REQ_SYNC | REQ_IDLE is set
> to determine if a write is O_DIRECT vs buffered. If the bits are not set
> then it assumes it's a buffered write and will throttle LIO if we hit
> certain metrics. LIO itself is not using the buffer cache and is doing
> direct IO, so this has us set the direct bits so we are not throttled.
> 
> When the initiator application is doing direct IO this can greatly
> improve performance. It depends on the backend device but we have seen
> where the WBT code is throttling writes to only 20K IOPs with 4K IOs when
> the device can support 100K+.
> 
> [...]

Applied to 6.6/scsi-queue, thanks!

[1/1] scsi: target: Fix write perf due to unneeded throttling
      https://git.kernel.org/mkp/scsi/c/84c073fd89de
Patch

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 3d1b511ea284..5937a7ed6989 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -740,11 +740,16 @@  iblock_execute_rw(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 
 	if (data_direction == DMA_TO_DEVICE) {
 		struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
+
+		/*
+		 * Set bits to indicate WRITE_ODIRECT so we are not throttled
+		 * by WBT.
+		 */
+		opf = REQ_OP_WRITE | REQ_SYNC | REQ_IDLE;
 		/*
 		 * Force writethrough using REQ_FUA if a volatile write cache
 		 * is not enabled, or if initiator set the Force Unit Access bit.
 		 */
-		opf = REQ_OP_WRITE;
 		miter_dir = SG_MITER_TO_SG;
 		if (bdev_fua(ib_dev->ibd_bd)) {
 			if (cmd->se_cmd_flags & SCF_FUA)