Message ID | 20191122235324.17245-1-agruenba@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | Rework the gfs2 read and page fault locking | expand |
On Sat, Nov 23, 2019 at 12:53:21AM +0100, Andreas Gruenbacher wrote: > Hello, > > this patch series moves the glock lock taking in gfs2 from the > ->readpage and ->readpages inode operations to the ->read_iter file and > ->fault vm operations. To achieve that, we add flags to the > generic_file_read_iter and filemap_fault generic helpers. > > This proposal was triggered by the following discussion: > > https://lore.kernel.org/linux-fsdevel/157225677483.3442.4227193290486305330.stgit@buzz/ > > In that thread, Linus argued that filesystems should make sure the inode > size is sufficiently up-to-date before calling the generic helpers, and > that filesystems can do it themselves if they want more than that. > That's surely doable. However, implementing those operations properly > at the filesystem level quickly becomes complicated when it gets to > things like readahead. In addition, those slightly modified copies of > those helpers would surely diverge from their originals over time, and > maintaining them properly would become hard. So I hope the relatively > small changes to make the original helpers slightly more flexible will > be acceptable instead. > > With the IOCB_CACHED flag added by one of the patches in this series, > the code that Konstantin's initial patch adds to > generic_file_buffered_read could be made conditional on the IOCB_CACHED > flag being cleared. That way, it won't misfire on filesystems that > allow a stale inode size. (I'm not sure if any filesystems other than > gfs2 are actually affected.) > > Some additional explanation: > > The cache consistency model of filesystems like gfs2 is such that if > pages are found in an inode's address space, those pages as well as the > inode size are up to date and can be used without taking any filesystem > locks. If a page is not cached, filesystem locks must be taken before > the page can be read; this will also bring the inode size up to date. > > Thus far, gfs2 has taken the filesystem locks inside the ->readpage and > ->readpages address space operations. A better approach seems to be to > take those locks earlier, in the ->read_iter file and ->fault vm > operations. This would also avoid a lock inversion in ->readpages. > > We obviously want to avoid taking the filesystem locks unnecessarily > when the pages we are looking for are already cached; otherwise, we > would cripple performance. So we need to check if those pages are > present first. That's actually exactly what the generic_file_read_iter > and filemap_fault helpers do already anyway, except that they will call > into ->readpage and ->readpages when they find pages missing. Instead > of that, we'd like those helpers to return with an error code that > allows us to retry the operation after taking the filesystem locks. Do you see IOCB_CACHED/FAULT_FLAG_CACHED semantics being usable for anyting beyond gfs2?