Message ID | 20240304191046.157464-10-aalbersh@redhat.com (mailing list archive)
---|---
State | New, archived
Series | fs-verity support for XFS
On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> For XFS, fsverity's global workqueue is not really suitable due to:
>
> 1. High priority workqueues are used within XFS to ensure that data
>    IO completion cannot stall processing of journal IO completions.
>    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
>    path is a potential filesystem livelock/deadlock vector.
>
> 2. The fsverity workqueue is global - it creates a cross-filesystem
>    contention point.
>
> This patch adds per-filesystem, per-cpu workqueue for fsverity
> work. This allows iomap to add verification work in the read path on
> BIO completion.
>
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>

Should ext4 and f2fs switch over to this by converting
fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to
maintain two separate workqueue strategies as part of the fs/verity/
infrastructure.

> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1fbc72c5f112..5863519ffd51 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1223,6 +1223,8 @@ struct super_block {
>  #endif
>  #ifdef CONFIG_FS_VERITY
>  	const struct fsverity_operations *s_vop;
> +	/* Completion queue for post read verification */
> +	struct workqueue_struct *s_read_done_wq;
>  #endif

Maybe s_verity_wq?  Or do you anticipate this being used for other
purposes too, such as fscrypt?  Note that it's behind CONFIG_FS_VERITY
and is allocated by an fsverity_* function, so at least at the moment
it doesn't feel very generic.

> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> index 0973b521ac5a..45b7c613148a 100644
> --- a/include/linux/fsverity.h
> +++ b/include/linux/fsverity.h
> @@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
>  void fsverity_invalidate_block(struct inode *inode,
>  			       struct fsverity_blockbuf *block);
>
> +static inline int fsverity_set_ops(struct super_block *sb,
> +				   const struct fsverity_operations *ops)

This doesn't just set the ops, but also allocates a workqueue too.  A
better name for this function might be fsverity_init_sb.

Also this shouldn't really be an inline function.

> +{
> +	sb->s_vop = ops;
> +
> +	/* Create per-sb workqueue for post read bio verification */
> +	struct workqueue_struct *wq = alloc_workqueue(
> +		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);

"pread" is short for "post read", I guess?  Should it really be this
generic?

> +static inline int fsverity_set_ops(struct super_block *sb,
> +				   const struct fsverity_operations *ops)
> +{
> +	return -EOPNOTSUPP;
> +}

I think it would be better to just not have a !CONFIG_FS_VERITY stub
for this.

You *could* make it work like fscrypt_set_ops(), which the ubifs folks
added, where it can be called unconditionally if the filesystem has a
declaration for the operations (but not necessarily a definition).  In
that case it would need to return 0, rather than an error.  But I think
I prefer just omitting the stub and having filesystems guard the call
to this by CONFIG_FS_VERITY, as you've already done in XFS.

- Eric
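[Editor's sketch] The rename Eric suggests can be modeled in userspace. This is a hedged illustration, not kernel code: `struct super_block`, the workqueue allocation, and the error value are stubbed, and the names `fsverity_init_sb` and the `"fsverity/%s"` queue name are hypothetical, taken from the review comments. Only the ops-assignment plus allocate-or-fail shape of the helper is shown; the real function would live out-of-line in fs/verity/.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ENOMEM_STUB 12			/* stand-in for the kernel's ENOMEM */

struct workqueue_struct { char name[40]; };
struct fsverity_operations { int unused; };

struct super_block {
	const struct fsverity_operations *s_vop;
	struct workqueue_struct *s_read_done_wq;
	char s_id[32];			/* e.g. "sda1" */
};

/* stub standing in for alloc_workqueue("fsverity/%s", ..., sb->s_id) */
static struct workqueue_struct *stub_alloc_workqueue(const char *id)
{
	struct workqueue_struct *wq = calloc(1, sizeof(*wq));

	if (wq)
		snprintf(wq->name, sizeof(wq->name), "fsverity/%s", id);
	return wq;
}

/*
 * The shape Eric asks for: an out-of-line helper named for what it
 * actually does (initialize verity state on the superblock), rather
 * than "set_ops" which hides the workqueue allocation.
 */
int fsverity_init_sb(struct super_block *sb,
		     const struct fsverity_operations *ops)
{
	struct workqueue_struct *wq = stub_alloc_workqueue(sb->s_id);

	if (!wq)
		return -ENOMEM_STUB;
	/* only publish state once allocation has succeeded */
	sb->s_vop = ops;
	sb->s_read_done_wq = wq;
	return 0;
}
```

One deliberate (assumed) choice in the sketch: the ops pointer is assigned only after the allocation succeeds, so a failed init leaves the superblock untouched; the posted patch assigns `s_vop` first.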
On Mon, Mar 04, 2024 at 05:08:05PM -0800, Eric Biggers wrote:
> On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> > For XFS, fsverity's global workqueue is not really suitable due to:
> >
> > 1. High priority workqueues are used within XFS to ensure that data
> >    IO completion cannot stall processing of journal IO completions.
> >    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
> >    path is a potential filesystem livelock/deadlock vector.
> >
> > 2. The fsverity workqueue is global - it creates a cross-filesystem
> >    contention point.
> >
> > This patch adds per-filesystem, per-cpu workqueue for fsverity
> > work. This allows iomap to add verification work in the read path on
> > BIO completion.
> >
> > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
>
> Should ext4 and f2fs switch over to this by converting
> fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to
> maintain two separate workqueue strategies as part of the fs/verity/
> infrastructure.

(Agreed.)

> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 1fbc72c5f112..5863519ffd51 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1223,6 +1223,8 @@ struct super_block {
> >  #endif
> >  #ifdef CONFIG_FS_VERITY
> >  	const struct fsverity_operations *s_vop;
> > +	/* Completion queue for post read verification */
> > +	struct workqueue_struct *s_read_done_wq;
> >  #endif
>
> Maybe s_verity_wq?  Or do you anticipate this being used for other
> purposes too, such as fscrypt?  Note that it's behind CONFIG_FS_VERITY
> and is allocated by an fsverity_* function, so at least at the moment
> it doesn't feel very generic.

Doesn't fscrypt already create its own workqueues?

> > diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> > index 0973b521ac5a..45b7c613148a 100644
> > --- a/include/linux/fsverity.h
> > +++ b/include/linux/fsverity.h
> > @@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
> >  void fsverity_invalidate_block(struct inode *inode,
> >  			       struct fsverity_blockbuf *block);
> >
> > +static inline int fsverity_set_ops(struct super_block *sb,
> > +				   const struct fsverity_operations *ops)
>
> This doesn't just set the ops, but also allocates a workqueue too.  A
> better name for this function might be fsverity_init_sb.
>
> Also this shouldn't really be an inline function.

Yeah.

> > +{
> > +	sb->s_vop = ops;
> > +
> > +	/* Create per-sb workqueue for post read bio verification */
> > +	struct workqueue_struct *wq = alloc_workqueue(
> > +		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);
>
> "pread" is short for "post read", I guess?  Should it really be this
> generic?

I think it shouldn't use a term that already means "positioned read" to
userspace.

> > +static inline int fsverity_set_ops(struct super_block *sb,
> > +				   const struct fsverity_operations *ops)
> > +{
> > +	return -EOPNOTSUPP;
> > +}
>
> I think it would be better to just not have a !CONFIG_FS_VERITY stub
> for this.
>
> You *could* make it work like fscrypt_set_ops(), which the ubifs folks
> added, where it can be called unconditionally if the filesystem has a
> declaration for the operations (but not necessarily a definition).  In
> that case it would need to return 0, rather than an error.  But I think
> I prefer just omitting the stub and having filesystems guard the call
> to this by CONFIG_FS_VERITY, as you've already done in XFS.

Aha, I was going to ask why XFS had its own #ifdef guards when this
already exists. :)

--D

> - Eric
>
On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> > Maybe s_verity_wq?  Or do you anticipate this being used for other
> > purposes too, such as fscrypt?  Note that it's behind CONFIG_FS_VERITY
> > and is allocated by an fsverity_* function, so at least at the moment
> > it doesn't feel very generic.
>
> Doesn't fscrypt already create its own workqueues?

There's a global workqueue in fs/crypto/ too.  IIRC, it has to be
separate from the fs/verity/ workqueue to avoid deadlocks when a file
has both encryption and verity enabled.  This is because the verity
step can involve reading and decrypting Merkle tree blocks.

The main thing I was wondering was whether there was some plan to use
this new per-sb workqueue as a generic post-read processing queue, as
seemed to be implied by the chosen naming.  If there's no clear plan,
it should instead just be named after what it's actually used for,
fsverity.

- Eric
On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> On Mon, Mar 04, 2024 at 05:08:05PM -0800, Eric Biggers wrote:
> > On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> > > For XFS, fsverity's global workqueue is not really suitable due to:
> > >
> > > 1. High priority workqueues are used within XFS to ensure that data
> > >    IO completion cannot stall processing of journal IO completions.
> > >    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
> > >    path is a potential filesystem livelock/deadlock vector.
> > >
> > > 2. The fsverity workqueue is global - it creates a cross-filesystem
> > >    contention point.
> > >
> > > This patch adds per-filesystem, per-cpu workqueue for fsverity
> > > work. This allows iomap to add verification work in the read path on
> > > BIO completion.
> > >
> > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> >
> > Should ext4 and f2fs switch over to this by converting
> > fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to
> > maintain two separate workqueue strategies as part of the fs/verity/
> > infrastructure.
>
> (Agreed.)
>
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index 1fbc72c5f112..5863519ffd51 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -1223,6 +1223,8 @@ struct super_block {
> > >  #endif
> > >  #ifdef CONFIG_FS_VERITY
> > >  	const struct fsverity_operations *s_vop;
> > > +	/* Completion queue for post read verification */
> > > +	struct workqueue_struct *s_read_done_wq;
> > >  #endif
> >
> > Maybe s_verity_wq?  Or do you anticipate this being used for other
> > purposes too, such as fscrypt?  Note that it's behind CONFIG_FS_VERITY
> > and is allocated by an fsverity_* function, so at least at the moment
> > it doesn't feel very generic.
>
> Doesn't fscrypt already create its own workqueues?

Yes, but iomap really needs it to be a generic sb workqueue like the
sb->s_dio_done_wq.  i.e. the completion processing in task context is
not isolated to fsverity - it will likely also be needed for fscrypt
completion and decompression on read completion, when they get
supported by the generic iomap IO infrastructure.  i.e. any read IO
that needs a task context for completion side data processing will
need this.

The reality is that the major filesystems are already using either
generic or internal per-fs workqueues for DIO completion and buffered
write completions.  Some already use buried workqueues on read IO
completion, too, so moving this to being supported by the generic IO
infrastructure rather than reimplementing it just a little bit
differently in every filesystem makes a lot of sense.

-Dave.
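[Editor's sketch] Dave's reference point, sb->s_dio_done_wq, is (from memory) initialized lazily, with an atomic compare-and-swap so that racing DIO submitters cannot leak or double-install a workqueue. The userspace model below illustrates that pattern only; the kernel types and `alloc_workqueue`/`destroy_workqueue`/`cmpxchg()` calls are stubbed (here with GCC's `__sync_val_compare_and_swap`), and this is not the actual fs/super.c code.

```c
#include <assert.h>
#include <stdlib.h>

struct workqueue_struct { int live; };

struct super_block {
	struct workqueue_struct *s_dio_done_wq;
};

static struct workqueue_struct *stub_alloc_workqueue(void)
{
	struct workqueue_struct *wq = calloc(1, sizeof(*wq));

	if (wq)
		wq->live = 1;
	return wq;
}

static void stub_destroy_workqueue(struct workqueue_struct *wq)
{
	free(wq);
}

/*
 * Model of the lazy, race-safe init pattern: every caller may
 * allocate, but only the first publisher wins; losers tear their
 * queue straight back down.  Either way the sb ends up with exactly
 * one workqueue installed.
 */
int sb_init_done_wq(struct super_block *sb)
{
	struct workqueue_struct *old;
	struct workqueue_struct *wq = stub_alloc_workqueue();

	if (!wq)
		return -1;	/* would be -ENOMEM in the kernel */

	/* atomic publish, standing in for the kernel's cmpxchg() */
	old = __sync_val_compare_and_swap(&sb->s_dio_done_wq, NULL, wq);
	if (old)
		stub_destroy_workqueue(wq);	/* lost the race */
	return 0;
}
```

The per-sb verity workqueue in this patch is allocated unconditionally at mount instead, so it doesn't need the cmpxchg dance; the trade-off is paying for the queue even on filesystems with no verity files.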
On Thu, Mar 07, 2024 at 02:26:22PM -0800, Eric Biggers wrote:
> On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> > > Maybe s_verity_wq?  Or do you anticipate this being used for other
> > > purposes too, such as fscrypt?  Note that it's behind CONFIG_FS_VERITY
> > > and is allocated by an fsverity_* function, so at least at the moment
> > > it doesn't feel very generic.
> >
> > Doesn't fscrypt already create its own workqueues?
>
> There's a global workqueue in fs/crypto/ too.  IIRC, it has to be
> separate from the fs/verity/ workqueue to avoid deadlocks when a file
> has both encryption and verity enabled.  This is because the verity
> step can involve reading and decrypting Merkle tree blocks.
>
> The main thing I was wondering was whether there was some plan to use
> this new per-sb workqueue as a generic post-read processing queue, as
> seemed to be implied by the chosen naming.  If there's no clear plan,
> it should instead just be named after what it's actually used for,
> fsverity.

My guess is that there's a lot less complexity if each subsystem
(crypt, compression, verity, etc) gets its own workqueues instead of
sharing so that there aren't starvation issues.  It's too bad that's
sort of wasteful, when what we really want is to submit a chain of
dependent work_structs and have the workqueue(s) run that chain on the
same cpu if possible.

--D

> - Eric
>
diff --git a/fs/super.c b/fs/super.c
index d6efeba0d0ce..03795ee4d9b9 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -637,6 +637,13 @@ void generic_shutdown_super(struct super_block *sb)
 		sb->s_dio_done_wq = NULL;
 	}
 
+#ifdef CONFIG_FS_VERITY
+	if (sb->s_read_done_wq) {
+		destroy_workqueue(sb->s_read_done_wq);
+		sb->s_read_done_wq = NULL;
+	}
+#endif
+
 	if (sop->put_super)
 		sop->put_super(sb);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1fbc72c5f112..5863519ffd51 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1223,6 +1223,8 @@ struct super_block {
 #endif
 #ifdef CONFIG_FS_VERITY
 	const struct fsverity_operations *s_vop;
+	/* Completion queue for post read verification */
+	struct workqueue_struct *s_read_done_wq;
 #endif
 #if IS_ENABLED(CONFIG_UNICODE)
 	struct unicode_map *s_encoding;
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 0973b521ac5a..45b7c613148a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
 void fsverity_invalidate_block(struct inode *inode,
 			       struct fsverity_blockbuf *block);
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	sb->s_vop = ops;
+
+	/* Create per-sb workqueue for post read bio verification */
+	struct workqueue_struct *wq = alloc_workqueue(
+		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);
+	if (!wq)
+		return -ENOMEM;
+
+	sb->s_read_done_wq = wq;
+
+	return 0;
+}
+
 #else /* !CONFIG_FS_VERITY */
 
 static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
@@ -318,6 +334,12 @@ static inline void fsverity_enqueue_verify_work(struct work_struct *work)
 	WARN_ON_ONCE(1);
 }
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif /* !CONFIG_FS_VERITY */
 
 static inline bool fsverity_verify_folio(struct folio *folio)
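[Editor's sketch] The fs/super.c hunk's destroy-and-NULL pairing has a property worth making explicit: teardown is safe whether or not verity was ever initialized on the superblock, and safe to reach more than once. The userspace model below demonstrates that; the kernel types and `destroy_workqueue()` are stubbed, and the instrumentation counter is purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct workqueue_struct { int unused; };

struct super_block {
	struct workqueue_struct *s_read_done_wq;
};

static int destroy_count;	/* instrumentation for the model only */

static void stub_destroy_workqueue(struct workqueue_struct *wq)
{
	destroy_count++;
	free(wq);
}

/*
 * Mirrors the generic_shutdown_super() hunk: destroy the queue once,
 * then NULL the pointer so a verity-less mount (or a repeated call)
 * falls through as a no-op.
 */
void shutdown_read_done_wq(struct super_block *sb)
{
	if (sb->s_read_done_wq) {
		stub_destroy_workqueue(sb->s_read_done_wq);
		sb->s_read_done_wq = NULL;
	}
}
```

NULLing the pointer after destruction is the same idiom the function already uses for `s_dio_done_wq` a few lines above the new hunk.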
For XFS, fsverity's global workqueue is not really suitable due to:

1. High priority workqueues are used within XFS to ensure that data
   IO completion cannot stall processing of journal IO completions.
   Hence using a WQ_HIGHPRI workqueue directly in the user data IO
   path is a potential filesystem livelock/deadlock vector.

2. The fsverity workqueue is global - it creates a cross-filesystem
   contention point.

This patch adds per-filesystem, per-cpu workqueue for fsverity
work. This allows iomap to add verification work in the read path on
BIO completion.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 fs/super.c               |  7 +++++++
 include/linux/fs.h       |  2 ++
 include/linux/fsverity.h | 22 ++++++++++++++++++++++
 3 files changed, 31 insertions(+)