Message ID | 20240425183943.6319-3-joshi.k@samsung.com (mailing list archive) |
---|---|
State | New |
Series | Read/Write with meta/integrity |
On Fri, Apr 26, 2024 at 12:09:35AM +0530, Kanchan Joshi wrote:
> From: Anuj Gupta <anuj20.g@samsung.com>
>
> If bio_integrity_copy_user is used to process the meta buffer, bip_max_vcnt
> is one greater than bip_vcnt. In this case bip_max_vcnt vecs needs to be
> copied to cloned bip.

Can you explain this a bit more? The clone should only allocate what
is actually used, so this leaves me a bit confused.

On 4/27/2024 12:33 PM, Christoph Hellwig wrote:
>> If bio_integrity_copy_user is used to process the meta buffer, bip_max_vcnt
>> is one greater than bip_vcnt. In this case bip_max_vcnt vecs needs to be
>> copied to cloned bip.
> Can you explain this a bit more? The clone should only allocate what
> is actually used, so this leaves me a bit confused.
>

Will expand the commit description.

Usually the meta buffer is pinned and used directly (say N bio vecs).
In case kernel has to make a copy (in bio_integrity_copy_user), it
factors these N vecs in, and one extra for the bounce buffer.
So for read IO, bip_max_vcnt is N+1, while bip_vcnt is N.

The clone bio also needs to be aware of all N+1 vecs, so that we can
copy the data from the bounce buffer to pinned user pages correctly
during read-completion.

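The N versus N+1 accounting described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `model_vec`, `model_bip`, `model_alloc`, `model_clone` and `model_build` are hypothetical stand-ins, not kernel types or functions, and the layout (N in-use entries followed by one extra entry) simply follows the description in this reply rather than the actual kernel source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct bio_vec. */
struct model_vec {
	int tag;            /* identifies which buffer this entry describes */
	unsigned int len;   /* length in bytes */
};

/* Hypothetical stand-in for struct bio_integrity_payload. */
struct model_bip {
	unsigned short vcnt;      /* entries counted as in use */
	unsigned short max_vcnt;  /* entries allocated */
	struct model_vec vec[];   /* max_vcnt entries follow */
};

struct model_bip *model_alloc(unsigned short nr_vecs)
{
	struct model_bip *bip;

	bip = calloc(1, sizeof(*bip) + nr_vecs * sizeof(bip->vec[0]));
	if (bip)
		bip->max_vcnt = nr_vecs;
	return bip;
}

/*
 * Clone src, allocating and copying nr_copy entries.
 * nr_copy == src->vcnt models the clone before the patch,
 * nr_copy == src->max_vcnt models the clone after it.
 */
struct model_bip *model_clone(const struct model_bip *src,
			      unsigned short nr_copy)
{
	struct model_bip *bip;

	assert(nr_copy <= src->max_vcnt);
	bip = model_alloc(nr_copy);
	if (!bip)
		return NULL;
	memcpy(bip->vec, src->vec, nr_copy * sizeof(src->vec[0]));
	bip->vcnt = src->vcnt;
	return bip;
}

/* Build a payload with n in-use entries plus one extra trailing entry. */
struct model_bip *model_build(unsigned short n)
{
	struct model_bip *bip = model_alloc((unsigned short)(n + 1));
	unsigned short i;

	if (!bip)
		return NULL;
	for (i = 0; i < n; i++)
		bip->vec[i] = (struct model_vec){ .tag = i, .len = 8 };
	/* The extra entry sits beyond vcnt, so a vcnt-sized copy misses it. */
	bip->vec[n] = (struct model_vec){ .tag = 99, .len = 8u * n };
	bip->vcnt = n;
	return bip;
}
```

In this model, `model_clone(src, src->vcnt)` yields a clone that has no slot for the trailing entry, while `model_clone(src, src->max_vcnt)` carries it along. Whether the clone should carry it at all is exactly what the rest of the thread debates.
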
On Mon, Apr 29, 2024 at 04:58:37PM +0530, Kanchan Joshi wrote:
> On 4/27/2024 12:33 PM, Christoph Hellwig wrote:
> >> If bio_integrity_copy_user is used to process the meta buffer, bip_max_vcnt
> >> is one greater than bip_vcnt. In this case bip_max_vcnt vecs needs to be
> >> copied to cloned bip.
> > Can you explain this a bit more? The clone should only allocate what
> > is actually used, so this leaves me a bit confused.
> >
>
> Will expand the commit description.
>
> Usually the meta buffer is pinned and used directly (say N bio vecs).
> In case kernel has to make a copy (in bio_integrity_copy_user), it
> factors these N vecs in, and one extra for the bounce buffer.
> So for read IO, bip_max_vcnt is N+1, while bip_vcnt is N.
>
> The clone bio also needs to be aware of all N+1 vecs, so that we can
> copy the data from the bounce buffer to pinned user pages correctly
> during read-completion.

An earlier version added a field in the bip to point to the original
bvec from the user address. That extra field wouldn't be used in the far
majority of cases, so moving the user bvec to the end of the existing
bip_vec is a spatial optimization. The code may look a little more
confusing that way, but I think it's better than making the bip bigger.

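The space argument can be made concrete with a rough comparison of the two layouts. This is a sketch only: both structs below are hypothetical, heavily simplified stand-ins (the real bip has more fields), and `footprint_with_field` and `footprint_trailing` are illustrative helpers rather than anything in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio_vec. */
struct model_vec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

/* Variant A: a dedicated pointer to the user vecs, paid for by every payload. */
struct bip_with_field {
	unsigned short vcnt;
	unsigned short max_vcnt;
	struct model_vec *user_vec;
	struct model_vec vec[];
};

/* Variant B: no extra field; extra entries live in trailing vec slots. */
struct bip_trailing {
	unsigned short vcnt;
	unsigned short max_vcnt;
	struct model_vec vec[];
};

/* Bytes needed for variant A with nr_vecs data entries. */
size_t footprint_with_field(size_t nr_vecs)
{
	return sizeof(struct bip_with_field) +
	       nr_vecs * sizeof(struct model_vec);
}

/* Bytes needed for variant B with nr_vecs data entries and nr_extra trailing ones. */
size_t footprint_trailing(size_t nr_vecs, size_t nr_extra)
{
	return sizeof(struct bip_trailing) +
	       (nr_vecs + nr_extra) * sizeof(struct model_vec);
}
```

In the common case with no bounce buffer (`nr_extra == 0`), variant B is smaller than variant A for any number of entries. Variant B only pays for the extra slots when a copy is actually made, which is the trade-off being weighed against readability here.
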
On Mon, Apr 29, 2024 at 01:04:12PM +0100, Keith Busch wrote:
> An earlier version added a field in the bip to point to the original
> bvec from the user address. That extra field wouldn't be used in the far
> majority of cases, so moving the user bvec to the end of the existing
> bip_vec is a spatial optimization. The code may look a little more
> confusing that way, but I think it's better than making the bip bigger.

I think we need to do something like that - just hiding the bounce
buffer is not really maintainable once we get multiple levels of stacking
and other creative bio cloning.

On Mon, Apr 29, 2024 at 07:07:29PM +0200, Christoph Hellwig wrote:
> On Mon, Apr 29, 2024 at 01:04:12PM +0100, Keith Busch wrote:
> > An earlier version added a field in the bip to point to the original
> > bvec from the user address. That extra field wouldn't be used in the far
> > majority of cases, so moving the user bvec to the end of the existing
> > bip_vec is a spatial optimization. The code may look a little more
> > confusing that way, but I think it's better than making the bip bigger.
>
> I think we need to do something like that - just hiding the bounce
> buffer is not really maintainable once we get multiple levels of stacking
> and other creative bio cloning.

Not sure I follow that. From patches 2-4 here, I think that pretty much
covers it. It's just missing a good code comment, but the implementation
side looks complete for any amount of stacking and splitting.

On Tue, Apr 30, 2024 at 09:25:38AM +0100, Keith Busch wrote:
> > I think we need to do something like that - just hiding the bounce
> > buffer is not really maintainable once we get multiple levels of stacking
> > and other creative bio cloning.
>
> Not sure I follow that. From patches 2-4 here, I think that pretty much
> covers it. It's just missing a good code comment, but the implementation
> side looks complete for any amount of stacking and splitting.

I can't see how it would work, but I'll gladly wait for a better
description.

On Mon, Apr 29, 2024 at 04:58:37PM +0530, Kanchan Joshi wrote:
> On 4/27/2024 12:33 PM, Christoph Hellwig wrote:
> >> If bio_integrity_copy_user is used to process the meta buffer, bip_max_vcnt
> >> is one greater than bip_vcnt. In this case bip_max_vcnt vecs needs to be
> >> copied to cloned bip.
> > Can you explain this a bit more? The clone should only allocate what
> > is actually used, so this leaves me a bit confused.
> >
>
> Will expand the commit description.
>
> Usually the meta buffer is pinned and used directly (say N bio vecs).
> In case kernel has to make a copy (in bio_integrity_copy_user), it
> factors these N vecs in, and one extra for the bounce buffer.
> So for read IO, bip_max_vcnt is N+1, while bip_vcnt is N.
>
> The clone bio also needs to be aware of all N+1 vecs, so that we can
> copy the data from the bounce buffer to pinned user pages correctly
> during read-completion.

No. The underlying layer below the clone/split/etc should never have
to care about your bounce buffer. The bvecs are just data containers,
and if they are mapped, copied or used in any other way should remain
entirely encapsulated in the caller.

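One way to picture the encapsulation being asked for here is a caller-owned context that remembers the bounce buffer and the user segments, so that nothing below the caller needs to know a copy was made. This is purely an illustration in userspace C, not a design proposed in the thread: `user_seg`, `bounce_ctx` and `bounce_copy_back` are hypothetical names, and plain memory buffers stand in for pinned user pages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one destination segment (stands in for a pinned user page range). */
struct user_seg {
	unsigned char *base;
	size_t len;
};

/* Hypothetical caller-private state; lower layers never see it. */
struct bounce_ctx {
	const unsigned char *bounce;  /* buffer the read actually landed in */
	size_t bounce_len;
	struct user_seg *segs;        /* where the data finally belongs */
	size_t nr_segs;
};

/*
 * Completion-time helper owned by the caller: scatter the bounce buffer
 * into the user segments in order. Returns the number of bytes copied.
 */
size_t bounce_copy_back(const struct bounce_ctx *ctx)
{
	size_t done = 0;
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < ctx->nr_segs && done < ctx->bounce_len; i++) {
		size_t chunk = ctx->segs[i].len;

		/* Never copy more than what is left in the bounce buffer. */
		if (chunk > ctx->bounce_len - done)
			chunk = ctx->bounce_len - done;
		memcpy(ctx->segs[i].base, ctx->bounce + done, chunk);
		done += chunk;
	}
	return done;
}
```

With this shape, a clone or split of the lower-level request would only ever describe the bounce buffer itself, and the copy back to the user segments would happen once, in the caller's completion path.
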
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index e3390424e6b5..c1955f01412e 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -622,12 +622,12 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
 
 	BUG_ON(bip_src == NULL);
 
-	bip = bio_integrity_alloc(bio, gfp_mask, bip_src->bip_vcnt);
+	bip = bio_integrity_alloc(bio, gfp_mask, bip_src->bip_max_vcnt);
 	if (IS_ERR(bip))
 		return PTR_ERR(bip);
 
 	memcpy(bip->bip_vec, bip_src->bip_vec,
-	       bip_src->bip_vcnt * sizeof(struct bio_vec));
+	       bip_src->bip_max_vcnt * sizeof(struct bio_vec));
 
 	bip->bip_vcnt = bip_src->bip_vcnt;
 	bip->bip_iter = bip_src->bip_iter;