Message ID | d27da881-e615-b081-d8db-17ac9b91d4be@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | dm-crypt support for per-sector NVMe metadata | expand |
On Thu, May 23, 2024 at 06:54:47PM +0200, Mikulas Patocka wrote: > > However, the function rq_integrity_vec has a bug - it returns the first > vector of the bio's metadata and completely disregards the metadata > iterator that was advanced when the bio was split. Thus, the second bio > uses the same metadata as the first bio and this leads to metadata > corruption. > > This commit changes rq_integrity_vec, so that it calls mp_bvec_iter_bvec > instead of returning the first vector. mp_bvec_iter_bvec reads the > iterator and advances the vector by the iterator. mp_bvec_iter_bvec does not advance the bvec_iter, it just uses the iter to build a bvec for the current position in the iter. Also please fix the commit log to not use more than 73 characters, as-is it will be unreadable in git show output or email replies. > -static inline struct bio_vec *rq_integrity_vec(struct request *rq) > +static inline struct bio_vec rq_integrity_vec(struct request *rq) > { > if (WARN_ON_ONCE(queue_max_integrity_segments(rq->q) > 1)) > + return (struct bio_vec){ }; > + return mp_bvec_iter_bvec(rq->bio->bi_integrity->bip_vec, > + rq->bio->bi_integrity->bip_iter); The queue_max_integrity_segments check can go away now. Once you use the bvec_iter the function works just fine for multiple integrity segments and always returns the one at the current iter position. That should preferably also documented in a comment. (I'm also pretty sure I've already written this in reply to Anuj's version of the patch)
On Thu, 23 May 2024, Christoph Hellwig wrote: > On Thu, May 23, 2024 at 06:54:47PM +0200, Mikulas Patocka wrote: > > > > However, the function rq_integrity_vec has a bug - it returns the first > > vector of the bio's metadata and completely disregards the metadata > > iterator that was advanced when the bio was split. Thus, the second bio > > uses the same metadata as the first bio and this leads to metadata > > corruption. > > > > This commit changes rq_integrity_vec, so that it calls mp_bvec_iter_bvec > > instead of returning the first vector. mp_bvec_iter_bvec reads the > > iterator and advances the vector by the iterator. > > mp_bvec_iter_bvec does not advance the bvec_iter, it just uses the > iter to build a bvec for the current position in the iter. > > Also please fix the commit log to not use more than 73 characters, > as-is it will be unreadable in git show output or email replies. OK. I thought that the limit was 74. > > -static inline struct bio_vec *rq_integrity_vec(struct request *rq) > > +static inline struct bio_vec rq_integrity_vec(struct request *rq) > > { > > if (WARN_ON_ONCE(queue_max_integrity_segments(rq->q) > 1)) > > + return (struct bio_vec){ }; > > + return mp_bvec_iter_bvec(rq->bio->bi_integrity->bip_vec, > > + rq->bio->bi_integrity->bip_iter); > > The queue_max_integrity_segments check can go away now. Once you > use the bvec_iter the function works just fine for multiple > integrity segments and always returns the one at the current iter > position. That should preferably also documented in a comment. Yes, I dropped this and I am resending a new version. > (I'm also pretty sure I've already written this in reply to Anuj's > version of the patch) Mikulas
Index: linux-2.6/drivers/nvme/host/pci.c =================================================================== --- linux-2.6.orig/drivers/nvme/host/pci.c +++ linux-2.6/drivers/nvme/host/pci.c @@ -825,9 +825,9 @@ static blk_status_t nvme_map_metadata(st struct nvme_command *cmnd) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); + struct bio_vec bv = rq_integrity_vec(req); - iod->meta_dma = dma_map_bvec(dev->dev, rq_integrity_vec(req), - rq_dma_dir(req), 0); + iod->meta_dma = dma_map_bvec(dev->dev, &bv, rq_dma_dir(req), 0); if (dma_mapping_error(dev->dev, iod->meta_dma)) return BLK_STS_IOERR; cmnd->rw.metadata = cpu_to_le64(iod->meta_dma); @@ -966,7 +966,7 @@ static __always_inline void nvme_pci_unm struct nvme_iod *iod = blk_mq_rq_to_pdu(req); dma_unmap_page(dev->dev, iod->meta_dma, - rq_integrity_vec(req)->bv_len, rq_dma_dir(req)); + rq_integrity_vec(req).bv_len, rq_dma_dir(req)); } if (blk_rq_nr_phys_segments(req)) Index: linux-2.6/include/linux/blk-integrity.h =================================================================== --- linux-2.6.orig/include/linux/blk-integrity.h +++ linux-2.6/include/linux/blk-integrity.h @@ -109,11 +109,12 @@ static inline bool blk_integrity_rq(stru * Return the first bvec that contains integrity data. Only drivers that are * limited to a single integrity segment should use this helper. */ -static inline struct bio_vec *rq_integrity_vec(struct request *rq) +static inline struct bio_vec rq_integrity_vec(struct request *rq) { if (WARN_ON_ONCE(queue_max_integrity_segments(rq->q) > 1)) - return NULL; - return rq->bio->bi_integrity->bip_vec; + return (struct bio_vec){ }; + return mp_bvec_iter_bvec(rq->bio->bi_integrity->bip_vec, + rq->bio->bi_integrity->bip_iter); } #else /* CONFIG_BLK_DEV_INTEGRITY */ static inline int blk_rq_count_integrity_sg(struct request_queue *q, @@ -179,7 +180,8 @@ static inline int blk_integrity_rq(struc static inline struct bio_vec *rq_integrity_vec(struct request *rq) { - return NULL; + /* the optimizer will remove all calls to this function */ + return (struct bio_vec){ }; } #endif /* CONFIG_BLK_DEV_INTEGRITY */ #endif /* _LINUX_BLK_INTEGRITY_H */
If we allocate a bio that is larger than NVMe maximum request size, attach integrity metadata to it and send it to the NVMe subsystem, the integrity metadata will be corrupted. Splitting the bio works correctly. The function bio_split will clone the bio, trim the iterator of the first bio and advance the iterator of the second bio. However, the function rq_integrity_vec has a bug - it returns the first vector of the bio's metadata and completely disregards the metadata iterator that was advanced when the bio was split. Thus, the second bio uses the same metadata as the first bio and this leads to metadata corruption. This commit changes rq_integrity_vec, so that it calls mp_bvec_iter_bvec instead of returning the first vector. mp_bvec_iter_bvec reads the iterator and advances the vector by the iterator. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> --- drivers/nvme/host/pci.c | 6 +++--- include/linux/blk-integrity.h | 10 ++++++---- 2 files changed, 9 insertions(+), 7 deletions(-)