Message ID | 20221117115023.1350181-1-dwysocha@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, 17 Nov 2022 at 11:52, Dave Wysochanski <dwysocha@redhat.com> wrote:
>
> This patch should fix the oops seen by Daire while testing the latest
> NFS netfs fscache conversion patches [1][2].  What follows is a detailed
> explanation of the analysis, mostly for reference and in case any of
> the patch header is unclear.

I can now confirm that this does indeed fix the issue I was hitting - it
has been over 4 days and I have not seen the crash that I was reliably
reproducing at least once a day.

Many thanks for tracking this down Dave. I will try to switch over to
the v2 patch sometime this week but I don't expect a change in
functionality. The important thing is you found the place where it was
going wrong and why.

Tested-by: Daire Byrne <daire@dneg.com>

Daire
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 451d8a077e12..a90c743fec79 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -605,6 +605,13 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 			set_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags);
 			queue = true;
 		}
+		/*
+		 * We could race with cookie_lru which may set LRU_DISCARD bit
+		 * but has yet to run the cookie state machine.  If this happens
+		 * and another thread tries to use the cookie, clear LRU_DISCARD
+		 * so we don't end up withdrawing the cookie while in use.
+		 */
+		clear_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags);
 		break;
 
 	case FSCACHE_COOKIE_STATE_FAILED: