[V2,1/2] scsi: core: avoid pre-allocating a big chunk for protection metadata

Message ID 20190424093540.15526-2-ming.lei@redhat.com (mailing list archive)
State New, archived
Series scsi: core: avoid big pre-allocation for sg list

Commit Message

Ming Lei April 24, 2019, 9:35 a.m. UTC
Currently scsi_mq_setup_tags() pre-allocates a big buffer for protection
sg entries, and the buffer size is scsi_mq_sgl_size().

This isn't correct: scsi_mq_sgl_size() sizes the pre-allocation for the
IO data sg entries, and the protection data buffer needed is much
smaller. For example, one 512-byte sector needs 8 bytes of protection
data, and the maximum sector count for one request is 2560
(BLK_DEF_MAX_SECTORS), so the maximum protection data size is just 20KB.
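(For reference, a standalone sketch of that arithmetic; the constants
mirror the block-layer defaults named above, and this is not kernel
code:)

	#include <stdio.h>

	#define BLK_DEF_MAX_SECTORS	2560	/* default max sectors per request */
	#define PI_TUPLE_SIZE		8	/* T10 PI bytes per 512-byte sector */

	int main(void)
	{
		/* worst-case protection data carried by a single request */
		unsigned int max_prot = BLK_DEF_MAX_SECTORS * PI_TUPLE_SIZE;

		printf("%u bytes = %uKB\n", max_prot, max_prot / 1024);
		/* prints: 20480 bytes = 20KB */
		return 0;
	}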

The usual case is that one bio builds a single bip segment. Due to bio
splitting, bio merging is seldom done for big IO and only happens for
small bios, and the protection data segment count is usually the same as
the bio count in the request. So the count won't be very big, and
allocating from the slab is fast enough.

Reduce the pre-allocation to one sg entry for protection data, and
switch to runtime allocation when the protection data segment count is
bigger than 1. This saves a huge amount of pre-allocated memory; for
example, 500+MB is saved on a single lpfc HBA.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/scsi_lib.c | 43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

Comments

Christoph Hellwig April 24, 2019, 2:37 p.m. UTC | #1
> -	if (scsi_prot_sg_count(cmd))
> -		sg_free_table_chained(&cmd->prot_sdb->table, true);
> +	if (scsi_prot_sg_count(cmd) && cmd->prot_sdb->table.sgl !=
> +			scsi_prot_inline_sg(cmd))
> +		sg_free_table_chained(&cmd->prot_sdb->table, false);

Nitpick: I usually find it easier to read if we break around conditions
instead of inside them:

	if (scsi_prot_sg_count(cmd) &&
	    cmd->prot_sdb->table.sgl != scsi_prot_inline_sg(cmd))

> +		if (ivecs <= SCSI_INLINE_PROT_SG_CNT) {
> +			scsi_init_inline_sg_table(&prot_sdb->table,
> +						  scsi_prot_inline_sg(cmd),
> +						  SCSI_INLINE_PROT_SG_CNT);
> +		} else if (sg_alloc_table_chained(&prot_sdb->table,
> +						  ivecs, NULL)) {

Hmm.  Maybe we just need to pass an nr_inline_vecs argument
to sg_alloc_table_chained to replace the SG_CHUNK_SIZE argument instead
of open coding this logic?

If we'd also pass it to sg_free_table_chained we could also simplify
those checks in all callers.
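(A sketch of what that interface might look like; hypothetical
signatures following the suggestion above, not the final API:)

	int sg_alloc_table_chained(struct sg_table *table, int nents,
				   struct scatterlist *first_chunk,
				   unsigned nents_first_chunk);
	void sg_free_table_chained(struct sg_table *table,
				   unsigned nents_first_chunk);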
Ming Lei April 25, 2019, 9:37 a.m. UTC | #2
On Wed, Apr 24, 2019 at 04:37:04PM +0200, Christoph Hellwig wrote:
> > -	if (scsi_prot_sg_count(cmd))
> > -		sg_free_table_chained(&cmd->prot_sdb->table, true);
> > +	if (scsi_prot_sg_count(cmd) && cmd->prot_sdb->table.sgl !=
> > +			scsi_prot_inline_sg(cmd))
> > +		sg_free_table_chained(&cmd->prot_sdb->table, false);
> 
> Nitpick: I usually find it easier to read if we break around conditions
> instead of inside them:
> 
> 	if (scsi_prot_sg_count(cmd) &&
> 	    cmd->prot_sdb->table.sgl != scsi_prot_inline_sg(cmd))

OK.

> 
> > +		if (ivecs <= SCSI_INLINE_PROT_SG_CNT) {
> > +			scsi_init_inline_sg_table(&prot_sdb->table,
> > +						  scsi_prot_inline_sg(cmd),
> > +						  SCSI_INLINE_PROT_SG_CNT);
> > +		} else if (sg_alloc_table_chained(&prot_sdb->table,
> > +						  ivecs, NULL)) {
> 
> Hmm.  Maybe we just need to pass an nr_inline_vecs argument
> to sg_alloc_table_chained to replace the SG_CHUNK_SIZE argument instead
> of open coding this logic?

We can do that; however, the current scatterlist code assumes that the
size of each SGL is fixed. We may change the logic to support this
feature; it doesn't seem too difficult to do.

Please let us know if you are fine with changing the scatterlist code
to support a customized size for the 1st SGL.


Thanks,
Ming
Ming Lei April 25, 2019, 11:20 a.m. UTC | #3
On Thu, Apr 25, 2019 at 05:37:29PM +0800, Ming Lei wrote:
> On Wed, Apr 24, 2019 at 04:37:04PM +0200, Christoph Hellwig wrote:
> > > -	if (scsi_prot_sg_count(cmd))
> > > -		sg_free_table_chained(&cmd->prot_sdb->table, true);
> > > +	if (scsi_prot_sg_count(cmd) && cmd->prot_sdb->table.sgl !=
> > > +			scsi_prot_inline_sg(cmd))
> > > +		sg_free_table_chained(&cmd->prot_sdb->table, false);
> > 
> > Nitpick: I usually find it easier to read if we break around conditions
> > instead of inside them:
> > 
> > 	if (scsi_prot_sg_count(cmd) &&
> > 	    cmd->prot_sdb->table.sgl != scsi_prot_inline_sg(cmd))
> 
> OK.
> 
> > 
> > > +		if (ivecs <= SCSI_INLINE_PROT_SG_CNT) {
> > > +			scsi_init_inline_sg_table(&prot_sdb->table,
> > > +						  scsi_prot_inline_sg(cmd),
> > > +						  SCSI_INLINE_PROT_SG_CNT);
> > > +		} else if (sg_alloc_table_chained(&prot_sdb->table,
> > > +						  ivecs, NULL)) {
> > 
> > Hmm.  Maybe we just need to pass an nr_inline_vecs argument
> > to sg_alloc_table_chained to replace the SG_CHUNK_SIZE argument instead
> > of open coding this logic?
> 
> We can do that; however, the current scatterlist code assumes that the
> size of each SGL is fixed. We may change the logic to support this
> feature; it doesn't seem too difficult to do.
> 
> Please let us know if you are fine with changing the scatterlist code
> to support a customized size for the 1st SGL.

Specifically, something like the following change is needed:

 include/linux/scatterlist.h | 27 ++++++++++++++++++++-----
 lib/scatterlist.c           | 36 ++++++++++++++++++++++------------
 lib/sg_pool.c               | 48 ++++++++++++++++++++++++++++++---------------
 3 files changed, 77 insertions(+), 34 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index b4be960c7e5d..045d7aa81f2c 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -266,10 +266,11 @@ int sg_split(struct scatterlist *in, const int in_mapped_nents,
 typedef struct scatterlist *(sg_alloc_fn)(unsigned int, gfp_t);
 typedef void (sg_free_fn)(struct scatterlist *, unsigned int);
 
-void __sg_free_table(struct sg_table *, unsigned int, bool, sg_free_fn *);
+void __sg_free_table(struct sg_table *, unsigned int, unsigned int,
+		     sg_free_fn *);
 void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
-		     struct scatterlist *, gfp_t, sg_alloc_fn *);
+		     struct scatterlist *, unsigned int, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
 				unsigned int n_pages, unsigned int offset,
@@ -331,9 +332,25 @@ size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
 #endif
 
 #ifdef CONFIG_SG_POOL
-void sg_free_table_chained(struct sg_table *table, bool first_chunk);
-int sg_alloc_table_chained(struct sg_table *table, int nents,
-			   struct scatterlist *first_chunk);
+void __sg_free_table_chained(struct sg_table *table,
+			     unsigned nents_first_chunk);
+int __sg_alloc_table_chained(struct sg_table *table, int nents,
+			     struct scatterlist *first_chunk,
+			     unsigned nents_first_chunk);
+
+static inline void sg_free_table_chained(struct sg_table *table,
+					 bool first_chunk)
+{
+	__sg_free_table_chained(table, first_chunk ? SG_CHUNK_SIZE : 0);
+}
+
+static inline int sg_alloc_table_chained(struct sg_table *table,
+					 int nents,
+					 struct scatterlist *first_chunk)
+{
+	return __sg_alloc_table_chained(table, nents, first_chunk,
+					first_chunk ? SG_CHUNK_SIZE : 0);
+}
 #endif
 
 /*
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 739dc9fe2c55..401dd38080bc 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -181,7 +181,8 @@ static void sg_kfree(struct scatterlist *sg, unsigned int nents)
  * __sg_free_table - Free a previously mapped sg table
  * @table:	The sg table header to use
  * @max_ents:	The maximum number of entries per single scatterlist
- * @skip_first_chunk: don't free the (preallocated) first scatterlist chunk
+ * @nents_first_chunk: Number of entries in the (preallocated) first
+ * 	scatterlist chunk; 0 means there is no preallocated first chunk
  * @free_fn:	Free function
  *
  *  Description:
@@ -191,9 +192,10 @@ static void sg_kfree(struct scatterlist *sg, unsigned int nents)
  *
  **/
 void __sg_free_table(struct sg_table *table, unsigned int max_ents,
-		     bool skip_first_chunk, sg_free_fn *free_fn)
+		     unsigned int nents_first_chunk, sg_free_fn *free_fn)
 {
 	struct scatterlist *sgl, *next;
+	unsigned curr_max_ents = nents_first_chunk ?: max_ents;
 
 	if (unlikely(!table->sgl))
 		return;
@@ -209,9 +211,9 @@ void __sg_free_table(struct sg_table *table, unsigned int max_ents,
 		 * sg_size is then one less than alloc size, since the last
 		 * element is the chain pointer.
 		 */
-		if (alloc_size > max_ents) {
-			next = sg_chain_ptr(&sgl[max_ents - 1]);
-			alloc_size = max_ents;
+		if (alloc_size > curr_max_ents) {
+			next = sg_chain_ptr(&sgl[curr_max_ents - 1]);
+			alloc_size = curr_max_ents;
 			sg_size = alloc_size - 1;
 		} else {
 			sg_size = alloc_size;
@@ -219,11 +221,12 @@ void __sg_free_table(struct sg_table *table, unsigned int max_ents,
 		}
 
 		table->orig_nents -= sg_size;
-		if (skip_first_chunk)
-			skip_first_chunk = false;
+		if (nents_first_chunk)
+			nents_first_chunk = 0;
 		else
 			free_fn(sgl, alloc_size);
 		sgl = next;
+		curr_max_ents = max_ents;
 	}
 
 	table->sgl = NULL;
@@ -246,6 +249,8 @@ EXPORT_SYMBOL(sg_free_table);
  * @table:	The sg table header to use
  * @nents:	Number of entries in sg list
  * @max_ents:	The maximum number of entries the allocator returns per call
+ * @nents_first_chunk: Number of entries in the (preallocated) first
+ * 	scatterlist chunk; 0 means no such first chunk is provided
  * @gfp_mask:	GFP allocation mask
  * @alloc_fn:	Allocator to use
  *
@@ -262,10 +267,13 @@ EXPORT_SYMBOL(sg_free_table);
  **/
 int __sg_alloc_table(struct sg_table *table, unsigned int nents,
 		     unsigned int max_ents, struct scatterlist *first_chunk,
-		     gfp_t gfp_mask, sg_alloc_fn *alloc_fn)
+		     unsigned int nents_first_chunk, gfp_t gfp_mask,
+		     sg_alloc_fn *alloc_fn)
 {
 	struct scatterlist *sg, *prv;
 	unsigned int left;
+	unsigned curr_max_ents = nents_first_chunk ?: max_ents;
+	unsigned prv_max_ents;
 
 	memset(table, 0, sizeof(*table));
 
@@ -281,8 +289,8 @@ int __sg_alloc_table(struct sg_table *table, unsigned int nents,
 	do {
 		unsigned int sg_size, alloc_size = left;
 
-		if (alloc_size > max_ents) {
-			alloc_size = max_ents;
+		if (alloc_size > curr_max_ents) {
+			alloc_size = curr_max_ents;
 			sg_size = alloc_size - 1;
 		} else
 			sg_size = alloc_size;
@@ -316,7 +324,7 @@ int __sg_alloc_table(struct sg_table *table, unsigned int nents,
 		 * If this is not the first mapping, chain previous part.
 		 */
 		if (prv)
-			sg_chain(prv, max_ents, sg);
+			sg_chain(prv, prv_max_ents, sg);
 		else
 			table->sgl = sg;
 
@@ -327,6 +335,8 @@ int __sg_alloc_table(struct sg_table *table, unsigned int nents,
 			sg_mark_end(&sg[sg_size - 1]);
 
 		prv = sg;
+		prv_max_ents = curr_max_ents;
+		curr_max_ents = max_ents;
 	} while (left);
 
 	return 0;
@@ -349,9 +359,9 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 	int ret;
 
 	ret = __sg_alloc_table(table, nents, SG_MAX_SINGLE_ALLOC,
-			       NULL, gfp_mask, sg_kmalloc);
+			       NULL, 0, gfp_mask, sg_kmalloc);
 	if (unlikely(ret))
-		__sg_free_table(table, SG_MAX_SINGLE_ALLOC, false, sg_kfree);
+		__sg_free_table(table, SG_MAX_SINGLE_ALLOC, 0, sg_kfree);
 
 	return ret;
 }
diff --git a/lib/sg_pool.c b/lib/sg_pool.c
index d1c1e6388eaa..e0f4bc0de8bd 100644
--- a/lib/sg_pool.c
+++ b/lib/sg_pool.c
@@ -67,56 +67,72 @@ static struct scatterlist *sg_pool_alloc(unsigned int nents, gfp_t gfp_mask)
 }
 
 /**
- * sg_free_table_chained - Free a previously mapped sg table
+ * __sg_free_table_chained - Free a previously mapped sg table
  * @table:	The sg table header to use
- * @first_chunk: was first_chunk not NULL in sg_alloc_table_chained?
+ * @nents_first_chunk: size of the first_chunk SGL passed to
+ *		__sg_alloc_table_chained
  *
  *  Description:
  *    Free an sg table previously allocated and setup with
- *    sg_alloc_table_chained().
+ *    __sg_alloc_table_chained().
+ *
+ *    @nents_first_chunk has to be the same value that was passed to
+ *    __sg_alloc_table_chained().
  *
  **/
-void sg_free_table_chained(struct sg_table *table, bool first_chunk)
+void __sg_free_table_chained(struct sg_table *table,
+		unsigned nents_first_chunk)
 {
-	if (first_chunk && table->orig_nents <= SG_CHUNK_SIZE)
+	if (table->orig_nents <= nents_first_chunk)
 		return;
-	__sg_free_table(table, SG_CHUNK_SIZE, first_chunk, sg_pool_free);
+
+	if (nents_first_chunk == 1)
+		nents_first_chunk = 0;
+
+	__sg_free_table(table, SG_CHUNK_SIZE, nents_first_chunk, sg_pool_free);
 }
-EXPORT_SYMBOL_GPL(sg_free_table_chained);
+EXPORT_SYMBOL_GPL(__sg_free_table_chained);
 
 /**
- * sg_alloc_table_chained - Allocate and chain SGLs in an sg table
+ * __sg_alloc_table_chained - Allocate and chain SGLs in an sg table
  * @table:	The sg table header to use
  * @nents:	Number of entries in sg list
  * @first_chunk: first SGL
+ * @nents_first_chunk: number of entries in @first_chunk
  *
  *  Description:
  *    Allocate and chain SGLs in an sg table. If @nents@ is larger than
- *    SG_CHUNK_SIZE a chained sg table will be setup.
+ *    @nents_first_chunk a chained sg table will be setup.
  *
  **/
-int sg_alloc_table_chained(struct sg_table *table, int nents,
-		struct scatterlist *first_chunk)
+int __sg_alloc_table_chained(struct sg_table *table, int nents,
+		struct scatterlist *first_chunk, unsigned nents_first_chunk)
 {
 	int ret;
 
 	BUG_ON(!nents);
 
-	if (first_chunk) {
-		if (nents <= SG_CHUNK_SIZE) {
+	if (first_chunk && nents_first_chunk) {
+		if (nents <= nents_first_chunk) {
 			table->nents = table->orig_nents = nents;
 			sg_init_table(table->sgl, nents);
 			return 0;
 		}
 	}
 
+	if (nents_first_chunk == 1) {
+		first_chunk = NULL;
+		nents_first_chunk = 0;
+	}
+
 	ret = __sg_alloc_table(table, nents, SG_CHUNK_SIZE,
-			       first_chunk, GFP_ATOMIC, sg_pool_alloc);
+			       first_chunk, nents_first_chunk,
+			       GFP_ATOMIC, sg_pool_alloc);
 	if (unlikely(ret))
-		sg_free_table_chained(table, (bool)first_chunk);
+		__sg_free_table_chained(table, nents_first_chunk);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(sg_alloc_table_chained);
+EXPORT_SYMBOL_GPL(__sg_alloc_table_chained);
 
 static __init int sg_pool_init(void)
 {

Thanks,
Ming
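
(With the reworked helpers above, the open-coded inline-sg special case
in scsi_init_io() in the patch below could collapse to a single call; a
sketch under the proposed interface, not part of the posted patch:)

	if (__sg_alloc_table_chained(&prot_sdb->table, ivecs,
				     scsi_prot_inline_sg(cmd),
				     SCSI_INLINE_PROT_SG_CNT)) {
		ret = BLK_STS_RESOURCE;
		goto out_free_sgtables;
	}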

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 07dfc17d4824..9814eee8014c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -39,6 +39,12 @@ 
 #include "scsi_priv.h"
 #include "scsi_logging.h"
 
+/*
+ * The size of integrity metadata is usually small; one inline sg entry
+ * should cover normal cases.
+ */
+#define  SCSI_INLINE_PROT_SG_CNT  1
+
 static struct kmem_cache *scsi_sdb_cache;
 static struct kmem_cache *scsi_sense_cache;
 static struct kmem_cache *scsi_sense_isadma_cache;
@@ -553,12 +559,27 @@  static void scsi_uninit_cmd(struct scsi_cmnd *cmd)
 	}
 }
 
+static void scsi_init_inline_sg_table(struct sg_table *table,
+				      struct scatterlist *sgl,
+				      unsigned nents)
+{
+	table->nents = table->orig_nents = nents;
+	table->sgl = sgl;
+	sg_init_table(sgl, nents);
+}
+
+static inline struct scatterlist *scsi_prot_inline_sg(struct scsi_cmnd *cmd)
+{
+	return (struct scatterlist *)(cmd->prot_sdb + 1);
+}
+
 static void scsi_mq_free_sgtables(struct scsi_cmnd *cmd)
 {
 	if (cmd->sdb.table.nents)
 		sg_free_table_chained(&cmd->sdb.table, true);
-	if (scsi_prot_sg_count(cmd))
-		sg_free_table_chained(&cmd->prot_sdb->table, true);
+	if (scsi_prot_sg_count(cmd) && cmd->prot_sdb->table.sgl !=
+			scsi_prot_inline_sg(cmd))
+		sg_free_table_chained(&cmd->prot_sdb->table, false);
 }
 
 static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd)
@@ -1044,9 +1065,12 @@  blk_status_t scsi_init_io(struct scsi_cmnd *cmd)
 		}
 
 		ivecs = blk_rq_count_integrity_sg(rq->q, rq->bio);
-
-		if (sg_alloc_table_chained(&prot_sdb->table, ivecs,
-				prot_sdb->table.sgl)) {
+		if (ivecs <= SCSI_INLINE_PROT_SG_CNT) {
+			scsi_init_inline_sg_table(&prot_sdb->table,
+						  scsi_prot_inline_sg(cmd),
+						  SCSI_INLINE_PROT_SG_CNT);
+		} else if (sg_alloc_table_chained(&prot_sdb->table,
+						  ivecs, NULL)) {
 			ret = BLK_STS_RESOURCE;
 			goto out_free_sgtables;
 		}
@@ -1579,13 +1603,9 @@  static blk_status_t scsi_mq_prep_fn(struct request *req)
 	sg = (void *)cmd + sizeof(struct scsi_cmnd) + shost->hostt->cmd_size;
 	cmd->sdb.table.sgl = sg;
 
-	if (scsi_host_get_prot(shost)) {
+	if (scsi_host_get_prot(shost))
 		memset(cmd->prot_sdb, 0, sizeof(struct scsi_data_buffer));
 
-		cmd->prot_sdb->table.sgl =
-			(struct scatterlist *)(cmd->prot_sdb + 1);
-	}
-
 	blk_mq_start_request(req);
 
 	return scsi_setup_cmnd(sdev, req);
@@ -1846,7 +1866,8 @@  int scsi_mq_setup_tags(struct Scsi_Host *shost)
 	sgl_size = scsi_mq_sgl_size(shost);
 	cmd_size = sizeof(struct scsi_cmnd) + shost->hostt->cmd_size + sgl_size;
 	if (scsi_host_get_prot(shost))
-		cmd_size += sizeof(struct scsi_data_buffer) + sgl_size;
+		cmd_size += sizeof(struct scsi_data_buffer) +
+			sizeof(struct scatterlist) * SCSI_INLINE_PROT_SG_CNT;
 
 	memset(&shost->tag_set, 0, sizeof(shost->tag_set));
 	shost->tag_set.ops = &scsi_mq_ops;
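
(To put the "500+MB saved on a single lpfc HBA" figure in perspective, a
rough standalone model of the per-command pre-allocation before and
after this patch; the 32-byte scatterlist size and 128-entry
SG_CHUNK_SIZE are typical x86-64 values, and the total tag count is
hypothetical:)

	#include <stdio.h>

	int main(void)
	{
		unsigned long sg_entry = 32;		/* sizeof(struct scatterlist) */
		unsigned long chunk = 128;		/* SG_CHUNK_SIZE */
		unsigned long tags = 128UL * 1024;	/* hypothetical tags across all hw queues */

		unsigned long before = chunk * sg_entry;	/* full prot sgl per command */
		unsigned long after = 1 * sg_entry;		/* one inline prot sg entry */

		printf("per command: %lu -> %lu bytes\n", before, after);
		printf("saved: %lu MB\n", (before - after) * tags >> 20);
		/* prints: per command: 4096 -> 32 bytes, saved: 508 MB */
		return 0;
	}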