Message ID | 20200320142231.2402-13-willy@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Change readahead API | expand |
On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote: > From: "Matthew Wilcox (Oracle)" <willy@infradead.org> > > By reducing nr_to_read, we can eliminate this check from inside the loop. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > Reviewed-by: John Hubbard <jhubbard@nvidia.com> > Reviewed-by: William Kucharski <william.kucharski@oracle.com> > --- > mm/readahead.c | 17 +++++++++++------ > 1 file changed, 11 insertions(+), 6 deletions(-) > > diff --git a/mm/readahead.c b/mm/readahead.c > index d01531ef9f3c..a37b68f66233 100644 > --- a/mm/readahead.c > +++ b/mm/readahead.c > @@ -167,8 +167,6 @@ void __do_page_cache_readahead(struct address_space *mapping, > unsigned long lookahead_size) > { > struct inode *inode = mapping->host; > - struct page *page; > - unsigned long end_index; /* The last page we want to read */ > LIST_HEAD(page_pool); > loff_t isize = i_size_read(inode); > gfp_t gfp_mask = readahead_gfp_mask(mapping); > @@ -178,22 +176,29 @@ void __do_page_cache_readahead(struct address_space *mapping, > ._index = index, > }; > unsigned long i; > + pgoff_t end_index; /* The last page we want to read */ > > if (isize == 0) > return; > > - end_index = ((isize - 1) >> PAGE_SHIFT); > + end_index = (isize - 1) >> PAGE_SHIFT; > + if (index > end_index) > + return; > + /* Avoid wrapping to the beginning of the file */ > + if (index + nr_to_read < index) > + nr_to_read = ULONG_MAX - index + 1; > + /* Don't read past the page containing the last byte of the file */ > + if (index + nr_to_read >= end_index) > + nr_to_read = end_index - index + 1; There seem to be a couple off-by-one errors here. Shouldn't it be: /* Avoid wrapping to the beginning of the file */ if (index + nr_to_read < index) nr_to_read = ULONG_MAX - index; /* Don't read past the page containing the last byte of the file */ if (index + nr_to_read > end_index) nr_to_read = end_index - index + 1; I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0. Then 'index + nr_to_read > end_index' rather 'index + nr_to_read >= end_index', since otherwise nr_to_read can be increased by 1 rather than decreased or stay the same as expected. - Eric
On Fri, Mar 20, 2020 at 09:58:28AM -0700, Eric Biggers wrote: > On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote: > > + /* Avoid wrapping to the beginning of the file */ > > + if (index + nr_to_read < index) > > + nr_to_read = ULONG_MAX - index + 1; > > + /* Don't read past the page containing the last byte of the file */ > > + if (index + nr_to_read >= end_index) > > + nr_to_read = end_index - index + 1; > > There seem to be a couple off-by-one errors here. Shouldn't it be: > > /* Avoid wrapping to the beginning of the file */ > if (index + nr_to_read < index) > nr_to_read = ULONG_MAX - index; I think it's right. Imagine that index is ULONG_MAX. We should read one page (the one at ULONG_MAX). That would be ULONG_MAX - ULONG_MAX + 1. > /* Don't read past the page containing the last byte of the file */ > if (index + nr_to_read > end_index) > nr_to_read = end_index - index + 1; > > I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that > 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0. > > Then 'index + nr_to_read > end_index' rather 'index + nr_to_read >= end_index', > since otherwise nr_to_read can be increased by 1 rather than decreased or stay > the same as expected. Ooh, I missed the overflow case here. It should be: + if (index + nr_to_read - 1 > end_index) + nr_to_read = end_index - index + 1; Let's say index comes in at ULONG_MAX - 2, end_index is ULONG_MAX - 1 and nr_to_read is 8. The first condition triggers and nr_to_read is reduced to 3. But then the second condition wouldn't trigger because ULONG_MAX - 2 + 3 is 0. With the rewrite I have in this message, ULONG_MAX - 2 + 3 - 1 is ULONG_MAX, which is > ULONG_MAX - 1. So the condition triggers and nr_to_read becomes (ULONG_MAX - 1) - (ULONG_MAX - 2) + 1. Which is -1 + 2 + 1, which is 2. Which is the right answer because we want to read two pages; the one at ULONG_MAX - 2 and the one at ULONG_MAX - 1. Thank you!
On Fri, Mar 20, 2020 at 10:30:40AM -0700, Matthew Wilcox wrote: > On Fri, Mar 20, 2020 at 09:58:28AM -0700, Eric Biggers wrote: > > On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote: > > > + /* Avoid wrapping to the beginning of the file */ > > > + if (index + nr_to_read < index) > > > + nr_to_read = ULONG_MAX - index + 1; > > > + /* Don't read past the page containing the last byte of the file */ > > > + if (index + nr_to_read >= end_index) > > > + nr_to_read = end_index - index + 1; > > > > There seem to be a couple off-by-one errors here. Shouldn't it be: > > > > /* Avoid wrapping to the beginning of the file */ > > if (index + nr_to_read < index) > > nr_to_read = ULONG_MAX - index; > > I think it's right. Imagine that index is ULONG_MAX. We should read one > page (the one at ULONG_MAX). That would be ULONG_MAX - ULONG_MAX + 1. > > > /* Don't read past the page containing the last byte of the file */ > > if (index + nr_to_read > end_index) > > nr_to_read = end_index - index + 1; > > > > I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that > > 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0. > > > > Then 'index + nr_to_read > end_index' rather 'index + nr_to_read >= end_index', > > since otherwise nr_to_read can be increased by 1 rather than decreased or stay > > the same as expected. > > Ooh, I missed the overflow case here. It should be: > > + if (index + nr_to_read - 1 > end_index) > + nr_to_read = end_index - index + 1; > But then if someone passes index=0 and nr_to_read=0, this underflows and the entire file gets read. The page cache isn't actually supposed to contain a page at index ULONG_MAX, since MAX_LFS_FILESIZE is at most ((loff_t)ULONG_MAX << PAGE_SHIFT), right? So I don't think we need to worry about reading the page with index ULONG_MAX. I.e. I think it's fine to limit nr_to_read to 'ULONG_MAX - index', if that makes it easier to avoid an overflow or underflow in the next check. - Eric
On Fri, Mar 20, 2020 at 11:00:17AM -0700, Eric Biggers wrote: > On Fri, Mar 20, 2020 at 10:30:40AM -0700, Matthew Wilcox wrote: > > On Fri, Mar 20, 2020 at 09:58:28AM -0700, Eric Biggers wrote: > > > On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote: > > > > + /* Avoid wrapping to the beginning of the file */ > > > > + if (index + nr_to_read < index) > > > > + nr_to_read = ULONG_MAX - index + 1; > > > > + /* Don't read past the page containing the last byte of the file */ > > > > + if (index + nr_to_read >= end_index) > > > > + nr_to_read = end_index - index + 1; > > > > > > There seem to be a couple off-by-one errors here. Shouldn't it be: > > > > > > /* Avoid wrapping to the beginning of the file */ > > > if (index + nr_to_read < index) > > > nr_to_read = ULONG_MAX - index; > > > > I think it's right. Imagine that index is ULONG_MAX. We should read one > > page (the one at ULONG_MAX). That would be ULONG_MAX - ULONG_MAX + 1. > > > > > /* Don't read past the page containing the last byte of the file */ > > > if (index + nr_to_read > end_index) > > > nr_to_read = end_index - index + 1; > > > > > > I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that > > > 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0. > > > > > > Then 'index + nr_to_read > end_index' rather 'index + nr_to_read >= end_index', > > > since otherwise nr_to_read can be increased by 1 rather than decreased or stay > > > the same as expected. > > > > Ooh, I missed the overflow case here. It should be: > > > > + if (index + nr_to_read - 1 > end_index) > > + nr_to_read = end_index - index + 1; > > > > But then if someone passes index=0 and nr_to_read=0, this underflows and the > entire file gets read. nr_to_read == 0 doesn't make sense ... I thought we filtered that out earlier, but I can't find anywhere that does that right now. I'd rather return early from __do_page_cache_readahead() to fix that. > The page cache isn't actually supposed to contain a page at index ULONG_MAX, > since MAX_LFS_FILESIZE is at most ((loff_t)ULONG_MAX << PAGE_SHIFT), right? So > I don't think we need to worry about reading the page with index ULONG_MAX. > I.e. I think it's fine to limit nr_to_read to 'ULONG_MAX - index', if that makes > it easier to avoid an overflow or underflow in the next check. I think we can get a page at ULONG_MAX on 32-bit systems? I mean, we can buy hard drives which are larger than 16TiB these days: https://www.pcmag.com/news/seagate-will-ship-18tb-and-20tb-hard-drives-in-2020 (even ignoring RAID devices)
On Fri, Mar 20, 2020 at 11:11:32AM -0700, Matthew Wilcox wrote: > On Fri, Mar 20, 2020 at 11:00:17AM -0700, Eric Biggers wrote: > > On Fri, Mar 20, 2020 at 10:30:40AM -0700, Matthew Wilcox wrote: > > > On Fri, Mar 20, 2020 at 09:58:28AM -0700, Eric Biggers wrote: > > > > On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote: > > > > > + /* Avoid wrapping to the beginning of the file */ > > > > > + if (index + nr_to_read < index) > > > > > + nr_to_read = ULONG_MAX - index + 1; > > > > > + /* Don't read past the page containing the last byte of the file */ > > > > > + if (index + nr_to_read >= end_index) > > > > > + nr_to_read = end_index - index + 1; > > > > > > > > There seem to be a couple off-by-one errors here. Shouldn't it be: > > > > > > > > /* Avoid wrapping to the beginning of the file */ > > > > if (index + nr_to_read < index) > > > > nr_to_read = ULONG_MAX - index; > > > > > > I think it's right. Imagine that index is ULONG_MAX. We should read one > > > page (the one at ULONG_MAX). That would be ULONG_MAX - ULONG_MAX + 1. > > > > > > > /* Don't read past the page containing the last byte of the file */ > > > > if (index + nr_to_read > end_index) > > > > nr_to_read = end_index - index + 1; > > > > > > > > I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that > > > > 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0. > > > > > > > > Then 'index + nr_to_read > end_index' rather 'index + nr_to_read >= end_index', > > > > since otherwise nr_to_read can be increased by 1 rather than decreased or stay > > > > the same as expected. > > > > > > Ooh, I missed the overflow case here. It should be: > > > > > > + if (index + nr_to_read - 1 > end_index) > > > + nr_to_read = end_index - index + 1; > > > > > > > But then if someone passes index=0 and nr_to_read=0, this underflows and the > > entire file gets read. > > nr_to_read == 0 doesn't make sense ... I thought we filtered that out > earlier, but I can't find anywhere that does that right now. I'd > rather return early from __do_page_cache_readahead() to fix that. > > > The page cache isn't actually supposed to contain a page at index ULONG_MAX, > > since MAX_LFS_FILESIZE is at most ((loff_t)ULONG_MAX << PAGE_SHIFT), right? So > > I don't think we need to worry about reading the page with index ULONG_MAX. > > I.e. I think it's fine to limit nr_to_read to 'ULONG_MAX - index', if that makes > > it easier to avoid an overflow or underflow in the next check. > > I think we can get a page at ULONG_MAX on 32-bit systems? I mean, we can buy > hard drives which are larger than 16TiB these days: > https://www.pcmag.com/news/seagate-will-ship-18tb-and-20tb-hard-drives-in-2020 > (even ignoring RAID devices) The max file size is ((loff_t)ULONG_MAX << PAGE_SHIFT) which means the maximum page *index* is ULONG_MAX - 1, not ULONG_MAX. Anyway, I think we may be making this much too complicated. How about just: pgoff_t i_nrpages = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); if (index >= i_nrpages) return; /* Don't read past the end of the file */ nr_to_read = min(nr_to_read, i_nrpages - index); That's 2 branches instead of 4. (Note that assigning to i_nrpages can't overflow, since the max number of pages is ULONG_MAX not ULONG_MAX + 1.) - Eric
On Fri, Mar 20, 2020 at 11:24:52AM -0700, Eric Biggers wrote: > On Fri, Mar 20, 2020 at 11:11:32AM -0700, Matthew Wilcox wrote: > > On Fri, Mar 20, 2020 at 11:00:17AM -0700, Eric Biggers wrote: > > > But then if someone passes index=0 and nr_to_read=0, this underflows and the > > > entire file gets read. > > > > nr_to_read == 0 doesn't make sense ... I thought we filtered that out > > earlier, but I can't find anywhere that does that right now. I'd > > rather return early from __do_page_cache_readahead() to fix that. > > > > > The page cache isn't actually supposed to contain a page at index ULONG_MAX, > > > since MAX_LFS_FILESIZE is at most ((loff_t)ULONG_MAX << PAGE_SHIFT), right? So > > > I don't think we need to worry about reading the page with index ULONG_MAX. > > > I.e. I think it's fine to limit nr_to_read to 'ULONG_MAX - index', if that makes > > > it easier to avoid an overflow or underflow in the next check. > > > > I think we can get a page at ULONG_MAX on 32-bit systems? I mean, we can buy > > hard drives which are larger than 16TiB these days: > > https://www.pcmag.com/news/seagate-will-ship-18tb-and-20tb-hard-drives-in-2020 > > (even ignoring RAID devices) > > The max file size is ((loff_t)ULONG_MAX << PAGE_SHIFT) which means the maximum > page *index* is ULONG_MAX - 1, not ULONG_MAX. I see where we set that for _files_ ... I can't find anywhere that we prevent i_size getting that big for block devices. Maybe I'm missing something. > Anyway, I think we may be making this much too complicated. How about just: > > pgoff_t i_nrpages = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); > > if (index >= i_nrpages) > return; > /* Don't read past the end of the file */ > nr_to_read = min(nr_to_read, i_nrpages - index); > > That's 2 branches instead of 4. (Note that assigning to i_nrpages can't > overflow, since the max number of pages is ULONG_MAX not ULONG_MAX + 1.) I like where you're going with this. Just to be on the safe side, I'd prefer to do this: @@ -266,11 +266,8 @@ void __do_page_cache_readahead(struct address_space *mapping, end_index = (isize - 1) >> PAGE_SHIFT; if (index > end_index) return; - /* Avoid wrapping to the beginning of the file */ - if (index + nr_to_read < index) - nr_to_read = ULONG_MAX - index + 1; /* Don't read past the page containing the last byte of the file */ - if (index + nr_to_read >= end_index) + if (nr_to_read > end_index - index) nr_to_read = end_index - index + 1; page_cache_readahead_unbounded(mapping, file, index, nr_to_read, end_index - index + 1 could only overflow if end_index is ULONG_MAX and index is 0. But if end_index is ULONG_MAX and index is 0, then nr_to_read is necessarily <= ULONG_MAX, so the condition is false. And if nr_to_read is 0, then the condition is also false, so it won't increase nr_to_read from 0 to 1. It might assign x to nr_to_read when nr_to_read is already x, but that's harmless. Thanks!
diff --git a/mm/readahead.c b/mm/readahead.c index d01531ef9f3c..a37b68f66233 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -167,8 +167,6 @@ void __do_page_cache_readahead(struct address_space *mapping, unsigned long lookahead_size) { struct inode *inode = mapping->host; - struct page *page; - unsigned long end_index; /* The last page we want to read */ LIST_HEAD(page_pool); loff_t isize = i_size_read(inode); gfp_t gfp_mask = readahead_gfp_mask(mapping); @@ -178,22 +176,29 @@ void __do_page_cache_readahead(struct address_space *mapping, ._index = index, }; unsigned long i; + pgoff_t end_index; /* The last page we want to read */ if (isize == 0) return; - end_index = ((isize - 1) >> PAGE_SHIFT); + end_index = (isize - 1) >> PAGE_SHIFT; + if (index > end_index) + return; + /* Avoid wrapping to the beginning of the file */ + if (index + nr_to_read < index) + nr_to_read = ULONG_MAX - index + 1; + /* Don't read past the page containing the last byte of the file */ + if (index + nr_to_read >= end_index) + nr_to_read = end_index - index + 1; /* * Preallocate as many pages as we will need. */ for (i = 0; i < nr_to_read; i++) { - if (index + i > end_index) - break; + struct page *page = xa_load(&mapping->i_pages, index + i); BUG_ON(index + i != rac._index + rac._nr_pages); - page = xa_load(&mapping->i_pages, index + i); if (page && !xa_is_value(page)) { /* * Page already present? Kick off the current batch of