Message ID | 20250321161407.3333724-1-dhowells@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | iov_iter: Add composite, scatterlist and skbuff iterator types | expand |
This is going entirely in the wrong direction. We don't need more iter types but less. The reason why we have to many is because the underlying representation of the ranges is a mess which goes deeper than just the iterator, because it also means we have to convert between the underlying representations all the time. E.g. the socket code should have (and either has for a while or at least there were patches) been using bio_vecs instead of reinventing them as sk fragment. The crypto code should not be using scatterlists, which are a horrible data structure because they mix up the physical memory description and the dma mapping information which isn't even used for most uses, etc. So instead of more iters let's convert everyone to a common scatter/gather memory definition, which simplifies the iters. For now that is the bio_vec, which really should be converted from storing a struct page to a phys_addr_t (and maybe renamed if that helps adoption). That allows to trivially kill the kvec for example. As for the head/tail - that seems to be a odd NFS/sunrpc fetish. I've actually started a little project to just convert the sunrpc code to use bio_vecs, which massively simplifies the code, and allows directly passing it to the iters in the socket API. It doesn't quite work yet but shows how all these custom (and in this case rather ad-hoc) memory fragment representation cause a huge mess. I don't think the iterlist can work in practice, but it would be nice to have for a few use cases. If it worked it should hopefully allow to kill off the odd xarray iterator.
On 3/23/25 07:21, Christoph Hellwig wrote: > This is going entirely in the wrong direction. We don't need more iter > types but less. The reason why we have to many is because the underlying > representation of the ranges is a mess which goes deeper than just the > iterator, because it also means we have to convert between the > underlying representations all the time. > > E.g. the socket code should have (and either has for a while or at least > there were patches) been using bio_vecs instead of reinventing them as sk > fragment. The crypto code should not be using scatterlists, which are a > horrible data structure because they mix up the physical memory > description and the dma mapping information which isn't even used for > most uses, etc. > > So instead of more iters let's convert everyone to a common > scatter/gather memory definition, which simplifies the iters. For now > that is the bio_vec, which really should be converted from storing a > struct page to a phys_addr_t (and maybe renamed if that helps adoption). > That allows to trivially kill the kvec for example. > > As for the head/tail - that seems to be a odd NFS/sunrpc fetish. I've > actually started a little project to just convert the sunrpc code to > use bio_vecs, which massively simplifies the code, and allows directly > passing it to the iters in the socket API. It doesn't quite work yet > but shows how all these custom (and in this case rather ad-hoc) memory > fragment representation cause a huge mess. > > I don't think the iterlist can work in practice, but it would be nice > to have for a few use cases. If it worked it should hopefully allow > to kill off the odd xarray iterator. > Can we have a session around this? IE define how iterators should be used, and what the iterator elements should be. If we do it properly this will also fix the frozen page discussion we're having; if we define iterators whose data elements are _not_ pages then clearly one cannot take a reference to them. But in either case, we should define the long-term goal such that people can start converting stuff. Cheers, Hannes
On Sat, Mar 22, 2025 at 11:21:28PM -0700, Christoph Hellwig wrote: > This is going entirely in the wrong direction. We don't need more iter > types but less. The reason why we have to many is because the underlying > representation of the ranges is a mess which goes deeper than just the > iterator, because it also means we have to convert between the > underlying representations all the time. > > E.g. the socket code should have (and either has for a while or at least > there were patches) been using bio_vecs instead of reinventing them as sk > fragment. The crypto code should not be using scatterlists, which are a I did this work six years ago -- see 8842d285bafa Unfortunately, networking is full of inconsiderate arseholes who backed it out without even talking to me in 21d2e6737c97
On Sun, Mar 23, 2025 at 02:39:25PM +0100, Hannes Reinecke wrote:
> Can we have a session around this?
Wednesday, 10:30