Message ID | 20230712211115.2174650-5-kent.overstreet@linux.dev (mailing list archive) |
---|---|
State | New, archived |
On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
> From: Kent Overstreet <kent.overstreet@gmail.com>
>
> - bio_set_pages_dirty(), bio_check_pages_dirty() - dio path

Why? We've so far been able to get away without file systems
reinventing their own DIO path. I'd really like to keep it that way,
so if you think the iomap dio code should be improved please explain
why. And please also cycle the fsdevel list in.

> - blk_status_to_str() - error messages
> - bio_add_folio() - this should definitely be exported for everyone,
>   it's the modern version of bio_add_page()

These look ok to me to go in when the actual user shows up.
On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
> - bio_add_folio() - this should definitely be exported for everyone,
>   it's the modern version of bio_add_page()

Looks like this one got added in cd57b77197a4, so it just needs to be
dropped from the changelog.
On Mon, Jul 24, 2023 at 10:31:04AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 05:10:59PM -0400, Kent Overstreet wrote:
> > From: Kent Overstreet <kent.overstreet@gmail.com>
> >
> > - bio_set_pages_dirty(), bio_check_pages_dirty() - dio path
>
> Why? We've so far been able to get away without file systems
> reinventing their own DIO path. I'd really like to keep it that way,
> so if you think the iomap dio code should be improved please explain
> why. And please also cycle the fsdevel list in.

It's been discussed at length why bcachefs doesn't use iomap.

In short, iomap is heavily callback based, the bcachefs io paths are
not - we pass around data structures instead. I discussed this with
people when iomap was first being written, but iomap ended up being a
much more conservative approach, more in line with the old buffer heads
code where the generic code calls into the filesystem to obtain
mappings.

I'm gradually convincing people of the merits of the bcachefs approach -
in particular reducing indirect function calls is getting more attention
these days.
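For readers following the callback-versus-data-structure distinction drawn above, here is a toy userspace sketch of the two shapes. It is not iomap or bcachefs code: struct mapping, struct map_ops, struct toy_fs and every function name are hypothetical, and the fixed-size-extent "filesystem" exists only to give both loops something to walk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapping record: file offset -> device offset. */
struct mapping {
	uint64_t file_off;
	uint64_t dev_off;
	uint64_t len;
};

/* Toy filesystem (an assumption): every extent has the same size. */
struct toy_fs {
	uint64_t extent_size;
	uint64_t dev_base;
};

/* Map as much of [off, off + len) as fits in the extent containing off. */
static struct mapping toy_lookup(const struct toy_fs *fs, uint64_t off,
				 uint64_t len)
{
	assert(fs->extent_size > 0);
	uint64_t avail = fs->extent_size - off % fs->extent_size;
	struct mapping m = {
		.file_off = off,
		.dev_off  = fs->dev_base + off,
		.len      = len < avail ? len : avail,
	};
	return m;
}

/* Style 1: generic code owns the loop and calls into the filesystem. */
struct map_ops {
	int (*map_begin)(void *fs_private, uint64_t off, uint64_t len,
			 struct mapping *out);
};

/* Returns the number of mappings visited; one indirect call per mapping. */
static unsigned generic_walk(const struct map_ops *ops, void *fs_private,
			     uint64_t off, uint64_t len)
{
	uint64_t done = 0;
	unsigned nr = 0;

	while (done < len) {
		struct mapping m;

		if (ops->map_begin(fs_private, off + done, len - done, &m))
			break;
		assert(m.len > 0);	/* zero length would never terminate */
		done += m.len;
		nr++;
	}
	return nr;
}

/* Adapter exposing the toy filesystem through the callback interface. */
static int toy_map_begin(void *fs_private, uint64_t off, uint64_t len,
			 struct mapping *out)
{
	*out = toy_lookup(fs_private, off, len);
	return 0;
}

/* Style 2: the filesystem owns the loop and passes the struct directly. */
static unsigned toy_fs_walk(const struct toy_fs *fs, uint64_t off,
			    uint64_t len)
{
	uint64_t done = 0;
	unsigned nr = 0;

	while (done < len) {
		struct mapping m = toy_lookup(fs, off + done, len - done);

		done += m.len;
		nr++;
	}
	return nr;
}
```

Both walkers visit the same mappings for the same range. The only difference the sketch is meant to show is who owns the loop and whether each step goes through a function pointer.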
On Mon, Jul 24, 2023 at 11:00:37PM -0400, Kent Overstreet wrote:
> In short, iomap is heavily callback based, the bcachefs io paths are
> not - we pass around data structures instead. I discussed this with
> people when iomap was first being written, but iomap ended up being a
> much more conservative approach, more in line with the old buffer heads
> code where the generic code calls into the filesystem to obtain
> mappings.
>
> I'm gradually convincing people of the merits of the bcachefs approach -
> in particular reducing indirect function calls is getting more attention
> these days.

FYI, Matthew has had patches that convert iomap to be an iterator,
and I've massaged the first half of them and actually got them in
before. I'd much rather finish off that work (even if only for
direct I/O initially) than add another direct I/O code path. But
even without that we should be able to easily pass more private
data, in fact btrfs makes pretty heavy use of that.
On Wed, Jul 26, 2023 at 06:20:42AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 24, 2023 at 11:00:37PM -0400, Kent Overstreet wrote:
> > In short, iomap is heavily callback based, the bcachefs io paths are
> > not - we pass around data structures instead. I discussed this with
> > people when iomap was first being written, but iomap ended up being a
> > much more conservative approach, more in line with the old buffer heads
> > code where the generic code calls into the filesystem to obtain
> > mappings.
> >
> > I'm gradually convincing people of the merits of the bcachefs approach -
> > in particular reducing indirect function calls is getting more attention
> > these days.
>
> FYI, Matthew has had patches that convert iomap to be an iterator,
> and I've massaged the first half of them and actually got them in
> before. I'd much rather finish off that work (even if only for
> direct I/O initially) than add another direct I/O code path. But
> even without that we should be able to easily pass more private
> data, in fact btrfs makes pretty heavy use of that.

That's wonderful, but getting iomap up to the level of what bcachefs
needs is still going to be a pretty big project and it's not going to be
my highest priority.

bcachefs also hangs more state off of the pagecache, in bcachefs's
equivalent of iomap_page - we store reservations for dirty data there
and a few other things, which means the buffered IO paths don't have to
walk any other data structures.

I think that's another idea you guys will want to steal, but a higher
priority for me is getting a proper FUSE port done - and making bcachefs
more tightly wedded to VFS library code is not likely to make that
process any easier.

Once a proper FUSE port is done and we know what that looks like, that
will be a better time for some consolidation.
diff --git a/block/bio.c b/block/bio.c
index 043944fd46..1e75840d17 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1481,6 +1481,7 @@ void bio_set_pages_dirty(struct bio *bio)
 			set_page_dirty_lock(bvec->bv_page);
 	}
 }
+EXPORT_SYMBOL_GPL(bio_set_pages_dirty);
 
 /*
  * bio_check_pages_dirty() will check that all the BIO's pages are still dirty.
@@ -1540,6 +1541,7 @@ void bio_check_pages_dirty(struct bio *bio)
 	spin_unlock_irqrestore(&bio_dirty_lock, flags);
 	schedule_work(&bio_dirty_work);
 }
+EXPORT_SYMBOL_GPL(bio_check_pages_dirty);
 
 static inline bool bio_remaining_done(struct bio *bio)
 {
diff --git a/block/blk-core.c b/block/blk-core.c
index 1da77e7d62..b7b0237c36 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -205,6 +205,7 @@ const char *blk_status_to_str(blk_status_t status)
 		return "<null>";
 	return blk_errors[idx].name;
 }
+EXPORT_SYMBOL_GPL(blk_status_to_str);
 
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
diff --git a/block/blk.h b/block/blk.h
index 45547bcf11..f20f9ca03e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -251,7 +251,6 @@ static inline void bio_integrity_free(struct bio *bio)
 
 unsigned long blk_rq_timeout(unsigned long timeout);
 void blk_add_timer(struct request *req);
-const char *blk_status_to_str(blk_status_t status);
 
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c0ffe203a6..7a32dc98e1 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -854,6 +854,7 @@ extern const char *blk_op_str(enum req_op op);
 
 int blk_status_to_errno(blk_status_t status);
 blk_status_t errno_to_blk_status(int errno);
+const char *blk_status_to_str(blk_status_t status);
 
 /* only poll the hardware once, don't continue until a completion was found */
 #define BLK_POLL_ONESHOT		(1 << 0)
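
The context lines of the blk-core.c hunk show the tail of blk_status_to_str(): an out-of-range index yields "<null>", otherwise the name comes from a table. A userspace sketch of that bounds-checked lookup pattern follows. It is only an illustration: toy_status_t, the TOY_STS_* values, the toy_errors table and its strings are all made up here and are not the kernel's blk_status_t or blk_errors contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for blk_status_t; the values are invented. */
typedef unsigned char toy_status_t;

enum {
	TOY_STS_OK,
	TOY_STS_IOERR,
	TOY_STS_NOSPC,
	TOY_STS_COUNT,	/* number of defined statuses */
};

/* Name table indexed by status, mirroring the shape of a status table. */
static const struct {
	const char *name;
} toy_errors[] = {
	[TOY_STS_OK]    = { "" },
	[TOY_STS_IOERR] = { "I/O" },
	[TOY_STS_NOSPC] = { "no space" },
};

/* Compile-time check that the table has one entry per defined status. */
static_assert(sizeof(toy_errors) / sizeof(toy_errors[0]) == TOY_STS_COUNT,
	      "toy_errors must have one entry per status");

/* Bounds-checked lookup: unknown statuses map to "<null>". */
static const char *toy_status_to_str(toy_status_t status)
{
	size_t idx = status;

	if (idx >= sizeof(toy_errors) / sizeof(toy_errors[0]))
		return "<null>";
	return toy_errors[idx].name;
}

/* True if the status has a table entry rather than the fallback string. */
static int toy_status_known(toy_status_t status)
{
	return strcmp(toy_status_to_str(status), "<null>") != 0;
}
```

A caller building an error message would pass the status through the lookup and print the returned string, which is the "error messages" use the changelog gives for exporting the real helper.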