
[v2,2/2] block: Add iocontext priority to request

Message ID 1475694007-11999-3-git-send-email-adam.manzanares@hgst.com (mailing list archive)
State New, archived

Commit Message

Adam Manzanares Oct. 5, 2016, 7 p.m. UTC
Patch adds an association between iocontext ioprio and the ioprio of
a request. This feature is only enabled if a queue flag is set to
indicate that requests should have ioprio associated with them. The
queue flag is exposed as the req_prio queue sysfs entry.

Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
---
 block/blk-core.c       |  8 +++++++-
 block/blk-sysfs.c      | 32 ++++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  2 ++
 3 files changed, 41 insertions(+), 1 deletion(-)
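
As a usage illustration (not part of the patch): per the blk-sysfs.c hunk
below, the knob would appear as /sys/block/<dev>/queue/req_prio. A minimal
userspace sketch to enable it, with "sda" as a placeholder device name:

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		/* Placeholder device; the attribute is per request queue. */
		int fd = open("/sys/block/sda/queue/req_prio", O_WRONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* Writing "1" sets QUEUE_FLAG_REQ_PRIO, "0" clears it. */
		if (write(fd, "1", 1) != 1) {
			perror("write");
			close(fd);
			return 1;
		}
		close(fd);
		return 0;
	}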

Comments

Hannes Reinecke Oct. 6, 2016, 6:24 a.m. UTC | #1
On 10/05/2016 09:00 PM, Adam Manzanares wrote:
> Patch adds an association between iocontext ioprio and the ioprio of
> a request. This feature is only enabled if a queue flag is set to
> indicate that requests should have ioprio associated with them. The
> queue flag is exposed as the req_prio queue sysfs entry.
>
> Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
> ---
>  block/blk-core.c       |  8 +++++++-
>  block/blk-sysfs.c      | 32 ++++++++++++++++++++++++++++++++
>  include/linux/blkdev.h |  2 ++
>  3 files changed, 41 insertions(+), 1 deletion(-)
>
As the previous patch depends on this one, it should actually be the 
first in the series.

But other than that:

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
Jeff Moyer Oct. 6, 2016, 7:46 p.m. UTC | #2
Hi, Adam,

Adam Manzanares <adam.manzanares@hgst.com> writes:

> Patch adds an association between iocontext ioprio and the ioprio of
> a request. This feature is only enabled if a queue flag is set to
> indicate that requests should have ioprio associated with them. The
> queue flag is exposed as the req_prio queue sysfs entry.
>
> Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>

I like the idea of the patch, but I have a few comments.

First, don't add a tunable; there's no need for it.  (And in the future,
if you do add tunables, document them.)  That should make your patch
much smaller.

> @@ -1648,6 +1649,7 @@ out:
>  
>  void init_request_from_bio(struct request *req, struct bio *bio)
>  {
> +	struct io_context *ioc = rq_ioc(bio);

That can return NULL, and you blindly dereference it later.

> @@ -1656,7 +1658,11 @@ void init_request_from_bio(struct request *req, struct bio *bio)
>  
>  	req->errors = 0;
>  	req->__sector = bio->bi_iter.bi_sector;
> -	req->ioprio = bio_prio(bio);
> +	if (blk_queue_req_prio(req->q))
> +		req->ioprio = ioprio_best(bio_prio(bio), ioc->ioprio);
> +	else
> +		req->ioprio = bio_prio(bio);
> +

If the bio actually has an ioprio (only happens for bcache at this
point), you should use it.  Something like this:

        req->ioprio = bio_prio(bio);
        if (!req->ioprio && ioc)
		req->ioprio = ioc->ioprio;

Finally, please re-order your series as Hannes suggested.

Thanks!
Jeff
Adam Manzanares Oct. 10, 2016, 8:37 p.m. UTC | #3
Hello Jeff,

On 10/06/2016 15:46, Jeff Moyer wrote:
> Hi, Adam,
> 
> Adam Manzanares <adam.manzanares@hgst.com> writes:
> 
> > Patch adds an association between iocontext ioprio and the ioprio of
> > a request. This feature is only enabled if a queue flag is set to
> > indicate that requests should have ioprio associated with them. The
> > queue flag is exposed as the req_prio queue sysfs entry.
> >
> > Signed-off-by: Adam Manzanares <adam.manzanares@hgst.com>
> 
> I like the idea of the patch, but I have a few comments.
> 
> First, don't add a tunable; there's no need for it.  (And in the future,
> if you do add tunables, document them.)  That should make your patch
> much smaller.
> 

I have a strong preference for making this a tunable, for the following 
reasons. I am concerned that this could negatively impact performance if the 
feature is not properly implemented on a device. In addition, this feature 
can make a dramatic difference in the relative performance of prioritized vs 
non-prioritized IO: prioritized IO improves, but at the cost of 
non-prioritized IO. If someone has tuned a system in such a way that things 
work well as is, I do not want to cause any surprises.

I can see the argument for not having the tunable in the block layer, but 
then we would need to add a tunable to every request-based driver that may 
leverage the iopriority information, which has the potential to generate a 
lot more code and documentation. I would also like the tunable to be 
consulted when the iopriority is set on the request, so we can preserve the 
default behavior. This matters for drivers that already use request 
iopriority information, such as the Fusion MPT SAS (mptsas) driver.
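
For illustration only (this is not code from any in-tree driver), a sketch
of how a request-based driver might consume the request iopriority; the
priority encoding returned here is invented:

	/* Sketch: map a request's ioprio class to a hypothetical device
	 * command priority. IOPRIO_PRIO_CLASS() and IOPRIO_CLASS_RT come
	 * from <linux/ioprio.h>; struct request from <linux/blkdev.h>. */
	static u8 example_cmd_prio(struct request *req)
	{
		if (IOPRIO_PRIO_CLASS(req->ioprio) == IOPRIO_CLASS_RT)
			return 1;	/* device high-priority queue */
		return 0;		/* normal priority */
	}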

I will also document the tunable :) if we agree that it is necessary.

> > @@ -1648,6 +1649,7 @@ out:
> >  
> >  void init_request_from_bio(struct request *req, struct bio *bio)
> >  {
> > +	struct io_context *ioc = rq_ioc(bio);
> 
> That can return NULL, and you blindly dereference it later.
>

Ouch, this will be cleaned up in the next revision.

> > @@ -1656,7 +1658,11 @@ void init_request_from_bio(struct request *req, struct bio *bio)
> >  
> >  	req->errors = 0;
> >  	req->__sector = bio->bi_iter.bi_sector;
> > -	req->ioprio = bio_prio(bio);
> > +	if (blk_queue_req_prio(req->q))
> > +		req->ioprio = ioprio_best(bio_prio(bio), ioc->ioprio);
> > +	else
> > +		req->ioprio = bio_prio(bio);
> > +
> 
> If the bio actually has an ioprio (only happens for bcache at this
> point), you should use it.  Something like this:
> 
>         req->ioprio = bio_prio(bio);
>         if (!req->ioprio && ioc)
> 		req->ioprio = ioc->ioprio;
>

I caught this in the explanation of the first patch I sent out. I am still
assuming that this will be a tunable, but I will have the bio_prio take 
precedence in the next patch.
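
Something like this, keeping the queue flag and guarding the possibly-NULL
ioc returned by rq_ioc(bio):

	/* Sketch of the planned rework: the bio prio wins; fall back to
	 * the iocontext prio only when the queue flag is set and an
	 * io_context exists. */
	req->ioprio = bio_prio(bio);
	if (!req->ioprio && blk_queue_req_prio(req->q) && ioc)
		req->ioprio = ioc->ioprio;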

> Finally, please re-order your series as Hannes suggested.

Will do. 

> 
> Thanks!
> Jeff

Take care,
Adam

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 36c7ac3..17c3ce5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -33,6 +33,7 @@ 
 #include <linux/ratelimit.h>
 #include <linux/pm_runtime.h>
 #include <linux/blk-cgroup.h>
+#include <linux/ioprio.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/block.h>
@@ -1648,6 +1649,7 @@  out:
 
 void init_request_from_bio(struct request *req, struct bio *bio)
 {
+	struct io_context *ioc = rq_ioc(bio);
 	req->cmd_type = REQ_TYPE_FS;
 
 	req->cmd_flags |= bio->bi_opf & REQ_COMMON_MASK;
@@ -1656,7 +1658,11 @@  void init_request_from_bio(struct request *req, struct bio *bio)
 
 	req->errors = 0;
 	req->__sector = bio->bi_iter.bi_sector;
-	req->ioprio = bio_prio(bio);
+	if (blk_queue_req_prio(req->q))
+		req->ioprio = ioprio_best(bio_prio(bio), ioc->ioprio);
+	else
+		req->ioprio = bio_prio(bio);
+
 	blk_rq_bio_prep(req->q, req, bio);
 }
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f87a7e7..268a71a 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -384,6 +384,31 @@  static ssize_t queue_dax_show(struct request_queue *q, char *page)
 	return queue_var_show(blk_queue_dax(q), page);
 }
 
+static ssize_t queue_req_prio_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(blk_queue_req_prio(q), page);
+}
+
+static ssize_t queue_req_prio_store(struct request_queue *q, const char *page,
+				    size_t count)
+{
+	unsigned long req_prio_on;
+	ssize_t ret;
+
+	ret = queue_var_store(&req_prio_on, page, count);
+	if (ret < 0)
+		return ret;
+
+	spin_lock_irq(q->queue_lock);
+	if (req_prio_on)
+		queue_flag_set(QUEUE_FLAG_REQ_PRIO, q);
+	else
+		queue_flag_clear(QUEUE_FLAG_REQ_PRIO, q);
+	spin_unlock_irq(q->queue_lock);
+
+	return ret;
+}
+
 static struct queue_sysfs_entry queue_requests_entry = {
 	.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_requests_show,
@@ -526,6 +551,12 @@  static struct queue_sysfs_entry queue_dax_entry = {
 	.show = queue_dax_show,
 };
 
+static struct queue_sysfs_entry queue_req_prio_entry = {
+	.attr = {.name = "req_prio", .mode = S_IRUGO | S_IWUSR },
+	.show = queue_req_prio_show,
+	.store = queue_req_prio_store,
+};
+
 static struct attribute *default_attrs[] = {
 	&queue_requests_entry.attr,
 	&queue_ra_entry.attr,
@@ -553,6 +584,7 @@  static struct attribute *default_attrs[] = {
 	&queue_poll_entry.attr,
 	&queue_wc_entry.attr,
 	&queue_dax_entry.attr,
+	&queue_req_prio_entry.attr,
 	NULL,
 };
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e79055c..23e1e2d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -505,6 +505,7 @@  struct request_queue {
 #define QUEUE_FLAG_FUA	       24	/* device supports FUA writes */
 #define QUEUE_FLAG_FLUSH_NQ    25	/* flush not queueuable */
 #define QUEUE_FLAG_DAX         26	/* device supports DAX */
+#define QUEUE_FLAG_REQ_PRIO    27	/* Use iocontext ioprio */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -595,6 +596,7 @@  static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_secure_erase(q) \
 	(test_bit(QUEUE_FLAG_SECERASE, &(q)->queue_flags))
 #define blk_queue_dax(q)	test_bit(QUEUE_FLAG_DAX, &(q)->queue_flags)
+#define blk_queue_req_prio(q)	test_bit(QUEUE_FLAG_REQ_PRIO, &(q)->queue_flags)
 
 #define blk_noretry_request(rq) \
 	((rq)->cmd_flags & (REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT| \