Message ID | 1442493587-32499-2-git-send-email-jeff.layton@primarydata.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, 17 Sep 2015 08:39:44 -0400
Jeff Layton <jlayton@poochiereds.net> wrote:

> I think there's a potential race in flush_delayed_fput. A kthread does
> an fput() and that file gets added to the list and the delayed work is
> scheduled. More than 1 jiffy passes, and the workqueue thread picks up
> the work and starts running it. Then the kthread calls
> flush_delayed_fput. It sees that the list is empty and returns
> immediately, even though the __fput for its file may not have run yet.
>
> Close this by making flush_delayed_fput use flush_delayed_work instead,
> which should immediately schedule the work to run if it's not already,
> and block until the workqueue job completes.
>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> ---
>  fs/file_table.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>

It should be noted that the only current user of flush_delayed_fput()
can call it before the workqueue threads are ever started. Looking at
the code, I *think* this will still do the right thing -- block until
those threads are started and then flush the work as usual, but do let
me know if I've misread it.

> diff --git a/fs/file_table.c b/fs/file_table.c
> index ad17e05ebf95..52cc6803c07a 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -244,6 +244,8 @@ static void ____fput(struct callback_head *work)
>  	__fput(container_of(work, struct file, f_u.fu_rcuhead));
>  }
>
> +static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
> +
>  /*
>   * If kernel thread really needs to have the final fput() it has done
>   * to complete, call this. The only user right now is the boot - we
> @@ -256,11 +258,9 @@ static void ____fput(struct callback_head *work)
>   */
>  void flush_delayed_fput(void)
>  {
> -	delayed_fput(NULL);
> +	flush_delayed_work(&delayed_fput_work);
>  }
>
> -static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
> -
>  void fput(struct file *file)
>  {
>  	if (atomic_long_dec_and_test(&file->f_count)) {
diff --git a/fs/file_table.c b/fs/file_table.c
index ad17e05ebf95..52cc6803c07a 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -244,6 +244,8 @@ static void ____fput(struct callback_head *work)
 	__fput(container_of(work, struct file, f_u.fu_rcuhead));
 }
 
+static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
+
 /*
  * If kernel thread really needs to have the final fput() it has done
  * to complete, call this. The only user right now is the boot - we
@@ -256,11 +258,9 @@ static void ____fput(struct callback_head *work)
  */
 void flush_delayed_fput(void)
 {
-	delayed_fput(NULL);
+	flush_delayed_work(&delayed_fput_work);
 }
 
-static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
-
 void fput(struct file *file)
 {
 	if (atomic_long_dec_and_test(&file->f_count)) {
I think there's a potential race in flush_delayed_fput. A kthread does
an fput() and that file gets added to the list and the delayed work is
scheduled. More than 1 jiffy passes, and the workqueue thread picks up
the work and starts running it. Then the kthread calls
flush_delayed_fput. It sees that the list is empty and returns
immediately, even though the __fput for its file may not have run yet.

Close this by making flush_delayed_fput use flush_delayed_work instead,
which should immediately schedule the work to run if it's not already,
and block until the workqueue job completes.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/file_table.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
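
For anyone who wants to step through the interleaving described above, here is a rough userspace sketch. It is a single-threaded model, not kernel code: struct fput_model and every function name in it are hypothetical and exist only for illustration. The workqueue thread is reduced to two explicit steps (detach the list, then release what was detached), which is an assumption about how the work function behaves. "Waiting for the work" is modeled by simply running the worker to completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the delayed-fput state; not a kernel structure. */
struct fput_model {
	int queued;	/* files on the delayed list, not yet picked up */
	int in_flight;	/* files the worker detached but has not released */
	int released;	/* files whose final release has completed */
};

/* A kthread's fput(): the final release is deferred by queueing the file. */
static void model_fput(struct fput_model *m)
{
	m->queued++;
}

/* Worker step 1: detach the whole list (assumed to happen on entry). */
static void worker_detach(struct fput_model *m)
{
	m->in_flight += m->queued;
	m->queued = 0;
}

/* Worker step 2: release every file it detached. */
static void worker_finish(struct fput_model *m)
{
	m->released += m->in_flight;
	m->in_flight = 0;
}

/* Flush in the style of the old code: drain the list, ignore the worker. */
static void flush_by_draining_list(struct fput_model *m)
{
	m->released += m->queued;
	m->queued = 0;
}

/* Flush in the style of the patch: run the work and wait until it is done. */
static void flush_by_waiting_for_work(struct fput_model *m)
{
	worker_detach(m);
	worker_finish(m);
	assert(m->queued == 0 && m->in_flight == 0);
}

/*
 * Replay the interleaving: one fput(), the worker picks the file up, and
 * only then does the kthread flush. Returns how many files have been
 * released at the moment the flush returns.
 */
static int replay_race(bool wait_for_work)
{
	struct fput_model m = {0};

	model_fput(&m);
	worker_detach(&m);	/* worker starts before the flush is called */
	if (wait_for_work)
		flush_by_waiting_for_work(&m);
	else
		flush_by_draining_list(&m);
	return m.released;
}
```

In this model, replay_race(false) returns with the file still in flight, while replay_race(true) returns only after it has been released. The model says nothing about the case where the flush is called before the workqueue threads exist.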