Message ID | 20250325-fr_pending-race-v2-1-487945a6c197@ddn.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] fuse: {io-uring} Fix a possible req cancellation race | expand |
Hi Bernd! On Tue, Mar 25 2025, Bernd Schubert wrote: > task-A (application) might be in request_wait_answer and > try to remove the request when it has FR_PENDING set. > > task-B (a fuse-server io-uring task) might handle this > request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when > fetching the next request and accessed the req from > the pending list in fuse_uring_ent_assign_req(). > That code path was not protected by fiq->lock and so > might race with task-A. > > For scaling reasons we better don't use fiq->lock, but > add a handler to remove canceled requests from the queue. > > This also removes usage of fiq->lock from > fuse_uring_add_req_to_ring_ent() altogether, as it was > there just to protect against this race and incomplete. > > Also added is a comment why FR_PENDING is not cleared. At first, this patch looked OK to me. However, after looking closer, I'm not sure if this doesn't break fuse_abort_conn(). Because that function assumes it is safe to walk through all the requests using fiq->lock, it could race against fuse_uring_remove_pending_req(), which uses queue->lock instead. Am I missing something (quite likely!), or does fuse_abort_conn() also needs to be modified? [ Another scenario that is not problematic, but that could become messy in the future is if we want to add support for the FUSE_NOTIFY_RESEND operation through uring. Obviously, that's not an issue right now, but this patch probably will make it harder to add that support. ] Cheers,
On 3/25/25 22:38, Luis Henriques wrote: > Hi Bernd! > > On Tue, Mar 25 2025, Bernd Schubert wrote: > >> task-A (application) might be in request_wait_answer and >> try to remove the request when it has FR_PENDING set. >> >> task-B (a fuse-server io-uring task) might handle this >> request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when >> fetching the next request and accessed the req from >> the pending list in fuse_uring_ent_assign_req(). >> That code path was not protected by fiq->lock and so >> might race with task-A. >> >> For scaling reasons we better don't use fiq->lock, but >> add a handler to remove canceled requests from the queue. >> >> This also removes usage of fiq->lock from >> fuse_uring_add_req_to_ring_ent() altogether, as it was >> there just to protect against this race and incomplete. >> >> Also added is a comment why FR_PENDING is not cleared. Hi Luis, thanks for your review! > > At first, this patch looked OK to me. However, after looking closer, I'm > not sure if this doesn't break fuse_abort_conn(). Because that function > assumes it is safe to walk through all the requests using fiq->lock, it > could race against fuse_uring_remove_pending_req(), which uses queue->lock > instead. Am I missing something (quite likely!), or does fuse_abort_conn() > also needs to be modified? I don't think there is an issue with abort fuse_abort_conn() spin_lock(&fiq->lock); list_for_each_entry(req, &fiq->pending, list) ... spin_unlock(&fiq->lock); ... fuse_uring_abort(fc); Iterating fiq->pending will not handle any uring request, as these are in per queue lists. The per uring queues are then handled in fuse_uring_abort(). I.e. I don't think this commit changes anything for abort. > > [ Another scenario that is not problematic, but that could become messy in > the future is if we want to add support for the FUSE_NOTIFY_RESEND > operation through uring. Obviously, that's not an issue right now, but > this patch probably will make it harder to add that support. ] Oh yeah, this needs to be fixed. Though I don't think that this patch changes much. We need to iterate through the per fpq and apply the same logic? Thanks, Bernd
On Tue, Mar 25 2025, Bernd Schubert wrote: > On 3/25/25 22:38, Luis Henriques wrote: >> Hi Bernd! >> >> On Tue, Mar 25 2025, Bernd Schubert wrote: >> >>> task-A (application) might be in request_wait_answer and >>> try to remove the request when it has FR_PENDING set. >>> >>> task-B (a fuse-server io-uring task) might handle this >>> request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when >>> fetching the next request and accessed the req from >>> the pending list in fuse_uring_ent_assign_req(). >>> That code path was not protected by fiq->lock and so >>> might race with task-A. >>> >>> For scaling reasons we better don't use fiq->lock, but >>> add a handler to remove canceled requests from the queue. >>> >>> This also removes usage of fiq->lock from >>> fuse_uring_add_req_to_ring_ent() altogether, as it was >>> there just to protect against this race and incomplete. >>> >>> Also added is a comment why FR_PENDING is not cleared. > > Hi Luis, > > thanks for your review! > >> >> At first, this patch looked OK to me. However, after looking closer, I'm >> not sure if this doesn't break fuse_abort_conn(). Because that function >> assumes it is safe to walk through all the requests using fiq->lock, it >> could race against fuse_uring_remove_pending_req(), which uses queue->lock >> instead. Am I missing something (quite likely!), or does fuse_abort_conn() >> also needs to be modified? > > I don't think there is an issue with abort > > fuse_abort_conn() > spin_lock(&fiq->lock); > list_for_each_entry(req, &fiq->pending, list) > ... > spin_unlock(&fiq->lock); > > ... > > fuse_uring_abort(fc); > > Iterating fiq->pending will not handle any uring request, as these are > in per queue lists. The per uring queues are then handled in > fuse_uring_abort(). > > I.e. I don't think this commit changes anything for abort. Yeah, you're right. Thanks for looking, Bernd. >> >> [ Another scenario that is not problematic, but that could become messy in >> the future is if we want to add support for the FUSE_NOTIFY_RESEND >> operation through uring. Obviously, that's not an issue right now, but >> this patch probably will make it harder to add that support. ] > > Oh yeah, this needs to be fixed. Though I don't think that this patch > changes much. We need to iterate through the per fpq and apply the > same logic? Right, I agree this patch doesn't change anything here. And I guess I also misunderstood the problem here as well -- I thought this would be an issue when adding support for iouring, but in fact it is already a problem. The ideal solution would be to implement NOTIFY_RESEND over iouring, but that would be a bit more evolving. Cheers,
On Tue, Mar 25, 2025 at 10:30 AM Bernd Schubert <bschubert@ddn.com> wrote: > > task-A (application) might be in request_wait_answer and > try to remove the request when it has FR_PENDING set. > > task-B (a fuse-server io-uring task) might handle this > request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when > fetching the next request and accessed the req from > the pending list in fuse_uring_ent_assign_req(). > That code path was not protected by fiq->lock and so > might race with task-A. > > For scaling reasons we better don't use fiq->lock, but > add a handler to remove canceled requests from the queue. > > This also removes usage of fiq->lock from > fuse_uring_add_req_to_ring_ent() altogether, as it was > there just to protect against this race and incomplete. > > Also added is a comment why FR_PENDING is not cleared. > > Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support") > Reported-by: Joanne Koong <joannelkoong@gmail.com> > Closes: https://lore.kernel.org/all/CAJnrk1ZgHNb78dz-yfNTpxmW7wtT88A=m-zF0ZoLXKLUHRjNTw@mail.gmail.com/ > Signed-off-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Joanne Koong <joannelkoong@gmail.com> > --- > Changes in v2: > - Removed patch 1 that unset FR_PENDING > - Added a comment as part of this patch why FR_PENDING > is not cleared > - Replaced function pointer by direct call of > fuse_remove_pending_req > --- > fs/fuse/dev.c | 34 +++++++++++++++++++++++++--------- > fs/fuse/dev_uring.c | 15 +++++++++++---- > fs/fuse/dev_uring_i.h | 6 ++++++ > fs/fuse/fuse_dev_i.h | 1 + > fs/fuse/fuse_i.h | 3 +++ > 5 files changed, 46 insertions(+), 13 deletions(-) > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c > index 2c3a4d09e500f98232d5d9412a012235af6bec2e..2645cd8accfd081c518d3e22127e899ad5a09127 100644 > --- a/fs/fuse/dev.c > +++ b/fs/fuse/dev.c > @@ -407,6 +407,24 @@ static int queue_interrupt(struct fuse_req *req) > return 0; > } > > +bool fuse_remove_pending_req(struct fuse_req *req, spinlock_t *lock) > +{ > + spin_lock(lock); > + if (test_bit(FR_PENDING, &req->flags)) { > + /* > + * FR_PENDING does not get cleared as the request will end > + * up in destruction anyway. > + */ > + list_del(&req->list); > + spin_unlock(lock); > + __fuse_put_request(req); > + req->out.h.error = -EINTR; > + return true; > + } > + spin_unlock(lock); > + return false; > +} > + > static void request_wait_answer(struct fuse_req *req) > { > struct fuse_conn *fc = req->fm->fc; > @@ -428,22 +446,20 @@ static void request_wait_answer(struct fuse_req *req) > } > > if (!test_bit(FR_FORCE, &req->flags)) { > + bool removed; > + > /* Only fatal signals may interrupt this */ > err = wait_event_killable(req->waitq, > test_bit(FR_FINISHED, &req->flags)); > if (!err) > return; > > - spin_lock(&fiq->lock); > - /* Request is not yet in userspace, bail out */ > - if (test_bit(FR_PENDING, &req->flags)) { > - list_del(&req->list); > - spin_unlock(&fiq->lock); > - __fuse_put_request(req); > - req->out.h.error = -EINTR; > + if (test_bit(FR_URING, &req->flags)) > + removed = fuse_uring_remove_pending_req(req); > + else > + removed = fuse_remove_pending_req(req, &fiq->lock); > + if (removed) > return; > - } > - spin_unlock(&fiq->lock); > } > > /* > diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c > index ebd2931b4f2acac461091b6b1f1176cde759e2d1..add7273c8dc4a23a23e50b879db470fc06bd3d20 100644 > --- a/fs/fuse/dev_uring.c > +++ b/fs/fuse/dev_uring.c > @@ -726,8 +726,6 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent, > struct fuse_req *req) > { > struct fuse_ring_queue *queue = ent->queue; > - struct fuse_conn *fc = req->fm->fc; > - struct fuse_iqueue *fiq = &fc->iq; > > lockdep_assert_held(&queue->lock); > > @@ -737,9 +735,7 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent, > ent->state); > } > > - spin_lock(&fiq->lock); > clear_bit(FR_PENDING, &req->flags); > - spin_unlock(&fiq->lock); > ent->fuse_req = req; > ent->state = FRRS_FUSE_REQ; > list_move(&ent->list, &queue->ent_w_req_queue); > @@ -1238,6 +1234,8 @@ void fuse_uring_queue_fuse_req(struct fuse_iqueue *fiq, struct fuse_req *req) > if (unlikely(queue->stopped)) > goto err_unlock; > > + set_bit(FR_URING, &req->flags); > + req->ring_queue = queue; > ent = list_first_entry_or_null(&queue->ent_avail_queue, > struct fuse_ring_ent, list); > if (ent) > @@ -1276,6 +1274,8 @@ bool fuse_uring_queue_bq_req(struct fuse_req *req) > return false; > } > > + set_bit(FR_URING, &req->flags); > + req->ring_queue = queue; > list_add_tail(&req->list, &queue->fuse_req_bg_queue); > > ent = list_first_entry_or_null(&queue->ent_avail_queue, > @@ -1306,6 +1306,13 @@ bool fuse_uring_queue_bq_req(struct fuse_req *req) > return true; > } > > +bool fuse_uring_remove_pending_req(struct fuse_req *req) > +{ > + struct fuse_ring_queue *queue = req->ring_queue; > + > + return fuse_remove_pending_req(req, &queue->lock); > +} > + > static const struct fuse_iqueue_ops fuse_io_uring_ops = { > /* should be send over io-uring as enhancement */ > .send_forget = fuse_dev_queue_forget, > diff --git a/fs/fuse/dev_uring_i.h b/fs/fuse/dev_uring_i.h > index 2102b3d0c1aed1105e9c1200c91e1cb497b9a597..e5b39a92b7ca0e371512e8071f15c89bb30caf59 100644 > --- a/fs/fuse/dev_uring_i.h > +++ b/fs/fuse/dev_uring_i.h > @@ -142,6 +142,7 @@ void fuse_uring_abort_end_requests(struct fuse_ring *ring); > int fuse_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags); > void fuse_uring_queue_fuse_req(struct fuse_iqueue *fiq, struct fuse_req *req); > bool fuse_uring_queue_bq_req(struct fuse_req *req); > +bool fuse_uring_remove_pending_req(struct fuse_req *req); > > static inline void fuse_uring_abort(struct fuse_conn *fc) > { > @@ -200,6 +201,11 @@ static inline bool fuse_uring_ready(struct fuse_conn *fc) > return false; > } > > +static inline bool fuse_uring_remove_pending_req(struct fuse_req *req) > +{ > + return false; > +} > + > #endif /* CONFIG_FUSE_IO_URING */ > > #endif /* _FS_FUSE_DEV_URING_I_H */ > diff --git a/fs/fuse/fuse_dev_i.h b/fs/fuse/fuse_dev_i.h > index 3b2bfe1248d3573abe3b144a6d4bf6a502f56a40..2481da3388c5feec944143bfabb8d430a447d322 100644 > --- a/fs/fuse/fuse_dev_i.h > +++ b/fs/fuse/fuse_dev_i.h > @@ -61,6 +61,7 @@ int fuse_copy_out_args(struct fuse_copy_state *cs, struct fuse_args *args, > void fuse_dev_queue_forget(struct fuse_iqueue *fiq, > struct fuse_forget_link *forget); > void fuse_dev_queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req); > +bool fuse_remove_pending_req(struct fuse_req *req, spinlock_t *lock); > > #endif > > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h > index fee96fe7887b30cd57b8a6bbda11447a228cf446..2086dac7243ba82e1ce6762e2d1406014566aaaa 100644 > --- a/fs/fuse/fuse_i.h > +++ b/fs/fuse/fuse_i.h > @@ -378,6 +378,7 @@ struct fuse_io_priv { > * FR_FINISHED: request is finished > * FR_PRIVATE: request is on private list > * FR_ASYNC: request is asynchronous > + * FR_URING: request is handled through fuse-io-uring > */ > enum fuse_req_flag { > FR_ISREPLY, > @@ -392,6 +393,7 @@ enum fuse_req_flag { > FR_FINISHED, > FR_PRIVATE, > FR_ASYNC, > + FR_URING, > }; > > /** > @@ -441,6 +443,7 @@ struct fuse_req { > > #ifdef CONFIG_FUSE_IO_URING > void *ring_entry; > + void *ring_queue; > #endif > }; > > > --- > base-commit: 81e4f8d68c66da301bb881862735bd74c6241a19 > change-id: 20250218-fr_pending-race-3e362f22f319 > > Best regards, > -- > Bernd Schubert <bschubert@ddn.com> >
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 2c3a4d09e500f98232d5d9412a012235af6bec2e..2645cd8accfd081c518d3e22127e899ad5a09127 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -407,6 +407,24 @@ static int queue_interrupt(struct fuse_req *req) return 0; } +bool fuse_remove_pending_req(struct fuse_req *req, spinlock_t *lock) +{ + spin_lock(lock); + if (test_bit(FR_PENDING, &req->flags)) { + /* + * FR_PENDING does not get cleared as the request will end + * up in destruction anyway. + */ + list_del(&req->list); + spin_unlock(lock); + __fuse_put_request(req); + req->out.h.error = -EINTR; + return true; + } + spin_unlock(lock); + return false; +} + static void request_wait_answer(struct fuse_req *req) { struct fuse_conn *fc = req->fm->fc; @@ -428,22 +446,20 @@ static void request_wait_answer(struct fuse_req *req) } if (!test_bit(FR_FORCE, &req->flags)) { + bool removed; + /* Only fatal signals may interrupt this */ err = wait_event_killable(req->waitq, test_bit(FR_FINISHED, &req->flags)); if (!err) return; - spin_lock(&fiq->lock); - /* Request is not yet in userspace, bail out */ - if (test_bit(FR_PENDING, &req->flags)) { - list_del(&req->list); - spin_unlock(&fiq->lock); - __fuse_put_request(req); - req->out.h.error = -EINTR; + if (test_bit(FR_URING, &req->flags)) + removed = fuse_uring_remove_pending_req(req); + else + removed = fuse_remove_pending_req(req, &fiq->lock); + if (removed) return; - } - spin_unlock(&fiq->lock); } /* diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c index ebd2931b4f2acac461091b6b1f1176cde759e2d1..add7273c8dc4a23a23e50b879db470fc06bd3d20 100644 --- a/fs/fuse/dev_uring.c +++ b/fs/fuse/dev_uring.c @@ -726,8 +726,6 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent, struct fuse_req *req) { struct fuse_ring_queue *queue = ent->queue; - struct fuse_conn *fc = req->fm->fc; - struct fuse_iqueue *fiq = &fc->iq; lockdep_assert_held(&queue->lock); @@ -737,9 +735,7 @@ static void fuse_uring_add_req_to_ring_ent(struct fuse_ring_ent *ent, ent->state); } - spin_lock(&fiq->lock); clear_bit(FR_PENDING, &req->flags); - spin_unlock(&fiq->lock); ent->fuse_req = req; ent->state = FRRS_FUSE_REQ; list_move(&ent->list, &queue->ent_w_req_queue); @@ -1238,6 +1234,8 @@ void fuse_uring_queue_fuse_req(struct fuse_iqueue *fiq, struct fuse_req *req) if (unlikely(queue->stopped)) goto err_unlock; + set_bit(FR_URING, &req->flags); + req->ring_queue = queue; ent = list_first_entry_or_null(&queue->ent_avail_queue, struct fuse_ring_ent, list); if (ent) @@ -1276,6 +1274,8 @@ bool fuse_uring_queue_bq_req(struct fuse_req *req) return false; } + set_bit(FR_URING, &req->flags); + req->ring_queue = queue; list_add_tail(&req->list, &queue->fuse_req_bg_queue); ent = list_first_entry_or_null(&queue->ent_avail_queue, @@ -1306,6 +1306,13 @@ bool fuse_uring_queue_bq_req(struct fuse_req *req) return true; } +bool fuse_uring_remove_pending_req(struct fuse_req *req) +{ + struct fuse_ring_queue *queue = req->ring_queue; + + return fuse_remove_pending_req(req, &queue->lock); +} + static const struct fuse_iqueue_ops fuse_io_uring_ops = { /* should be send over io-uring as enhancement */ .send_forget = fuse_dev_queue_forget, diff --git a/fs/fuse/dev_uring_i.h b/fs/fuse/dev_uring_i.h index 2102b3d0c1aed1105e9c1200c91e1cb497b9a597..e5b39a92b7ca0e371512e8071f15c89bb30caf59 100644 --- a/fs/fuse/dev_uring_i.h +++ b/fs/fuse/dev_uring_i.h @@ -142,6 +142,7 @@ void fuse_uring_abort_end_requests(struct fuse_ring *ring); int fuse_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags); void fuse_uring_queue_fuse_req(struct fuse_iqueue *fiq, struct fuse_req *req); bool fuse_uring_queue_bq_req(struct fuse_req *req); +bool fuse_uring_remove_pending_req(struct fuse_req *req); static inline void fuse_uring_abort(struct fuse_conn *fc) { @@ -200,6 +201,11 @@ static inline bool fuse_uring_ready(struct fuse_conn *fc) return false; } +static inline bool fuse_uring_remove_pending_req(struct fuse_req *req) +{ + return false; +} + #endif /* CONFIG_FUSE_IO_URING */ #endif /* _FS_FUSE_DEV_URING_I_H */ diff --git a/fs/fuse/fuse_dev_i.h b/fs/fuse/fuse_dev_i.h index 3b2bfe1248d3573abe3b144a6d4bf6a502f56a40..2481da3388c5feec944143bfabb8d430a447d322 100644 --- a/fs/fuse/fuse_dev_i.h +++ b/fs/fuse/fuse_dev_i.h @@ -61,6 +61,7 @@ int fuse_copy_out_args(struct fuse_copy_state *cs, struct fuse_args *args, void fuse_dev_queue_forget(struct fuse_iqueue *fiq, struct fuse_forget_link *forget); void fuse_dev_queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req); +bool fuse_remove_pending_req(struct fuse_req *req, spinlock_t *lock); #endif diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index fee96fe7887b30cd57b8a6bbda11447a228cf446..2086dac7243ba82e1ce6762e2d1406014566aaaa 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -378,6 +378,7 @@ struct fuse_io_priv { * FR_FINISHED: request is finished * FR_PRIVATE: request is on private list * FR_ASYNC: request is asynchronous + * FR_URING: request is handled through fuse-io-uring */ enum fuse_req_flag { FR_ISREPLY, @@ -392,6 +393,7 @@ enum fuse_req_flag { FR_FINISHED, FR_PRIVATE, FR_ASYNC, + FR_URING, }; /** @@ -441,6 +443,7 @@ struct fuse_req { #ifdef CONFIG_FUSE_IO_URING void *ring_entry; + void *ring_queue; #endif };
task-A (application) might be in request_wait_answer and try to remove the request when it has FR_PENDING set. task-B (a fuse-server io-uring task) might handle this request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when fetching the next request and accessed the req from the pending list in fuse_uring_ent_assign_req(). That code path was not protected by fiq->lock and so might race with task-A. For scaling reasons we better don't use fiq->lock, but add a handler to remove canceled requests from the queue. This also removes usage of fiq->lock from fuse_uring_add_req_to_ring_ent() altogether, as it was there just to protect against this race and incomplete. Also added is a comment why FR_PENDING is not cleared. Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support") Reported-by: Joanne Koong <joannelkoong@gmail.com> Closes: https://lore.kernel.org/all/CAJnrk1ZgHNb78dz-yfNTpxmW7wtT88A=m-zF0ZoLXKLUHRjNTw@mail.gmail.com/ Signed-off-by: Bernd Schubert <bschubert@ddn.com> --- Changes in v2: - Removed patch 1 that unset FR_PENDING - Added a comment as part of this patch why FR_PENDING is not cleared - Replaced function pointer by direct call of fuse_remove_pending_req --- fs/fuse/dev.c | 34 +++++++++++++++++++++++++--------- fs/fuse/dev_uring.c | 15 +++++++++++---- fs/fuse/dev_uring_i.h | 6 ++++++ fs/fuse/fuse_dev_i.h | 1 + fs/fuse/fuse_i.h | 3 +++ 5 files changed, 46 insertions(+), 13 deletions(-) --- base-commit: 81e4f8d68c66da301bb881862735bd74c6241a19 change-id: 20250218-fr_pending-race-3e362f22f319 Best regards,