Message ID | 20200908075230.86856-16-wqu@suse.com (mailing list archive)
---|---
State | New, archived
Series | btrfs: add read-only support for subpage sector size
Hi Qu,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.9-rc4]
[also build test WARNING on next-20200903]
[cannot apply to kdave/for-next btrfs/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Qu-Wenruo/btrfs-add-read-only-support-for-subpage-sector-size/20200908-155601
base:   f4d51dffc6c01a9e94650d95ce0104964f8ae822
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=xtensa

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/btrfs/extent_io.c:6309:5: warning: no previous prototype for 'try_release_subpage_ebs' [-Wmissing-prototypes]
    6309 | int try_release_subpage_ebs(struct page *page)
         |     ^~~~~~~~~~~~~~~~~~~~~~~

# https://github.com/0day-ci/linux/commit/3ef1cb4eb96ce4dce4dc94e3f06c4dd41879e977
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Qu-Wenruo/btrfs-add-read-only-support-for-subpage-sector-size/20200908-155601
git checkout 3ef1cb4eb96ce4dce4dc94e3f06c4dd41879e977
vim +/try_release_subpage_ebs +6309 fs/btrfs/extent_io.c

  6308	
> 6309	int try_release_subpage_ebs(struct page *page)
  6310	{
  6311		struct subpage_eb_mapping *mapping;
  6312		int i;
  6313	
  6314		assert_spin_locked(&page->mapping->private_lock);
  6315		if (!PagePrivate(page))
  6316			return 1;
  6317	
  6318		mapping = (struct subpage_eb_mapping *)page->private;
  6319		for (i = 0; i < SUBPAGE_NR_EXTENT_BUFFERS && PagePrivate(page); i++) {
  6320			struct btrfs_fs_info *fs_info = page_to_fs_info(page);
  6321			struct extent_buffer *eb;
  6322			int ret;
  6323	
  6324			if (!test_bit(i, &mapping->bitmap))
  6325				continue;
  6326	
  6327			eb = mapping->buffers[i];
  6328			spin_unlock(&page->mapping->private_lock);
  6329			spin_lock(&eb->refs_lock);
  6330			ret = release_extent_buffer(eb);
  6331			spin_lock(&page->mapping->private_lock);
  6332	
  6333			/*
  6334			 * Extent buffer can't be freed yet, must jump to next slot
  6335			 * and avoid calling release_extent_buffer().
  6336			 */
  6337			if (!ret)
  6338				i += (fs_info->nodesize / fs_info->sectorsize - 1);
  6339		}
  6340		/*
  6341		 * detach_subpage_mapping() from release_extent_buffer() has detached
  6342		 * all ebs from this page. All related ebs are released.
  6343		 */
  6344		if (!PagePrivate(page))
  6345			return 1;
  6346		return 0;
  6347	}

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
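Not part of the robot report, but since try_release_subpage_ebs() seems to have no callers outside extent_io.c in this patch, one plausible way to silence the -Wmissing-prototypes warning would be to give it static linkage (a sketch, not compile-tested):

```diff
-int try_release_subpage_ebs(struct page *page)
+static int try_release_subpage_ebs(struct page *page)
```

Alternatively, if external callers are planned later, a prototype in extent_io.h would do the same.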
Hi Qu,

url:    https://github.com/0day-ci/linux/commits/Qu-Wenruo/btrfs-add-read-only-support-for-subpage-sector-size/20200908-155601
base:   f4d51dffc6c01a9e94650d95ce0104964f8ae822
config: x86_64-randconfig-m001-20200907 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

New smatch warnings:
fs/btrfs/extent_io.c:5516 alloc_extent_buffer() error: uninitialized symbol 'num_pages'.

Old smatch warnings:
fs/btrfs/extent_io.c:6397 try_release_extent_buffer() warn: inconsistent returns 'eb->refs_lock'.

# https://github.com/0day-ci/linux/commit/3ef1cb4eb96ce4dce4dc94e3f06c4dd41879e977
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Qu-Wenruo/btrfs-add-read-only-support-for-subpage-sector-size/20200908-155601
git checkout 3ef1cb4eb96ce4dce4dc94e3f06c4dd41879e977
vim +/subpage_mapping +5511 fs/btrfs/extent_io.c

f28491e0a6c46d Josef Bacik         2013-12-16  5389  struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
ce3e69847e3ec7 David Sterba        2014-06-15  5390  					  u64 start)
d1310b2e0cd98e Chris Mason         2008-01-24  5391  {
da17066c40472c Jeff Mahoney        2016-06-15  5392  	unsigned long len = fs_info->nodesize;
cc5e31a4775d0d David Sterba        2018-03-01  5393  	int num_pages;
cc5e31a4775d0d David Sterba        2018-03-01  5394  	int i;
09cbfeaf1a5a67 Kirill A. Shutemov  2016-04-01  5395  	unsigned long index = start >> PAGE_SHIFT;
d1310b2e0cd98e Chris Mason         2008-01-24  5396  	struct extent_buffer *eb;
6af118ce51b52c Chris Mason         2008-07-22  5397  	struct extent_buffer *exists = NULL;
d1310b2e0cd98e Chris Mason         2008-01-24  5398  	struct page *p;
f28491e0a6c46d Josef Bacik         2013-12-16  5399  	struct address_space *mapping = fs_info->btree_inode->i_mapping;
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5400  	struct subpage_eb_mapping *subpage_mapping = NULL;
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5401  	bool subpage = (fs_info->sectorsize < PAGE_SIZE);
d1310b2e0cd98e Chris Mason         2008-01-24  5402  	int uptodate = 1;
19fe0a8b787d7c Miao Xie            2010-10-26  5403  	int ret;
d1310b2e0cd98e Chris Mason         2008-01-24  5404  
da17066c40472c Jeff Mahoney        2016-06-15  5405  	if (!IS_ALIGNED(start, fs_info->sectorsize)) {
c871b0f2fd27e7 Liu Bo              2016-06-06  5406  		btrfs_err(fs_info, "bad tree block start %llu", start);
c871b0f2fd27e7 Liu Bo              2016-06-06  5407  		return ERR_PTR(-EINVAL);
c871b0f2fd27e7 Liu Bo              2016-06-06  5408  	}
039b297b76d816 Qu Wenruo           2020-09-08  5409  	if (fs_info->sectorsize < PAGE_SIZE && round_down(start, PAGE_SIZE) !=
039b297b76d816 Qu Wenruo           2020-09-08  5410  	    round_down(start + len - 1, PAGE_SIZE)) {
039b297b76d816 Qu Wenruo           2020-09-08  5411  		btrfs_err(fs_info,
039b297b76d816 Qu Wenruo           2020-09-08  5412  		"tree block crosses page boundary, start %llu nodesize %lu",
039b297b76d816 Qu Wenruo           2020-09-08  5413  			  start, len);
039b297b76d816 Qu Wenruo           2020-09-08  5414  		return ERR_PTR(-EINVAL);
039b297b76d816 Qu Wenruo           2020-09-08  5415  	}
c871b0f2fd27e7 Liu Bo              2016-06-06  5416  
f28491e0a6c46d Josef Bacik         2013-12-16  5417  	eb = find_extent_buffer(fs_info, start);
452c75c3d21870 Chandra Seetharaman 2013-10-07  5418  	if (eb)
6af118ce51b52c Chris Mason         2008-07-22  5419  		return eb;
6af118ce51b52c Chris Mason         2008-07-22  5420  
23d79d81b13431 David Sterba        2014-06-15  5421  	eb = __alloc_extent_buffer(fs_info, start, len);
2b114d1d33551a Peter               2008-04-01  5422  	if (!eb)
c871b0f2fd27e7 Liu Bo              2016-06-06  5423  		return ERR_PTR(-ENOMEM);
d1310b2e0cd98e Chris Mason         2008-01-24  5424  
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5425  	if (subpage) {
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5426  		subpage_mapping = kmalloc(sizeof(*subpage_mapping), GFP_NOFS);
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5427  		if (!mapping) {

This was probably supposed to be if "subpage_mapping" instead of
"mapping".

3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5428  			exists = ERR_PTR(-ENOMEM);
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5429  			goto free_eb;

The "num_pages" variable is uninitialized on this goto path.

3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5430  		}
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5431  	}
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5432  
65ad010488a5cc David Sterba        2018-06-29  5433  	num_pages = num_extent_pages(eb);
727011e07cbdf8 Chris Mason         2010-08-06  5434  	for (i = 0; i < num_pages; i++, index++) {
d1b5c5671d010d Michal Hocko        2015-08-19  5435  		p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
c871b0f2fd27e7 Liu Bo              2016-06-06  5436  		if (!p) {
c871b0f2fd27e7 Liu Bo              2016-06-06  5437  			exists = ERR_PTR(-ENOMEM);
6af118ce51b52c Chris Mason         2008-07-22  5438  			goto free_eb;
c871b0f2fd27e7 Liu Bo              2016-06-06  5439  		}
4f2de97acee653 Josef Bacik         2012-03-07  5440  
4f2de97acee653 Josef Bacik         2012-03-07  5441  		spin_lock(&mapping->private_lock);
4f2de97acee653 Josef Bacik         2012-03-07  5442  		if (PagePrivate(p)) {
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5443  			exists = grab_extent_buffer_from_page(p, start);
b76bd038ff9290 Qu Wenruo           2020-09-08  5444  			if (exists && atomic_inc_not_zero(&exists->refs)) {
                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This increment doesn't pair perfectly with the decrement in the error
handling.  In other words, we sometimes decrement it when it was never
incremented.  This presumably will lead to a use after free.

4f2de97acee653 Josef Bacik         2012-03-07  5445  				spin_unlock(&mapping->private_lock);
4f2de97acee653 Josef Bacik         2012-03-07  5446  				unlock_page(p);
09cbfeaf1a5a67 Kirill A. Shutemov  2016-04-01  5447  				put_page(p);
2457aec63745e2 Mel Gorman          2014-06-04  5448  				mark_extent_buffer_accessed(exists, p);
4f2de97acee653 Josef Bacik         2012-03-07  5449  				goto free_eb;
4f2de97acee653 Josef Bacik         2012-03-07  5450  			}
5ca64f45e92dc5 Omar Sandoval       2015-02-24  5451  			exists = NULL;
4f2de97acee653 Josef Bacik         2012-03-07  5452  
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5453  			if (!subpage) {
4f2de97acee653 Josef Bacik         2012-03-07  5454  				/*
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5455  				 * Do this so attach doesn't complain and we
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5456  				 * need to drop the ref the old guy had.
4f2de97acee653 Josef Bacik         2012-03-07  5457  				 */
4f2de97acee653 Josef Bacik         2012-03-07  5458  				ClearPagePrivate(p);
0b32f4bbb423f0 Josef Bacik         2012-03-13  5459  				WARN_ON(PageDirty(p));
09cbfeaf1a5a67 Kirill A. Shutemov  2016-04-01  5460  				put_page(p);
d1310b2e0cd98e Chris Mason         2008-01-24  5461  			}
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5462  		}
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5463  		attach_extent_buffer_page(eb, p, subpage_mapping);
4f2de97acee653 Josef Bacik         2012-03-07  5464  		spin_unlock(&mapping->private_lock);
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5465  		subpage_mapping = NULL;
0b32f4bbb423f0 Josef Bacik         2012-03-13  5466  		WARN_ON(PageDirty(p));
727011e07cbdf8 Chris Mason         2010-08-06  5467  		eb->pages[i] = p;
d1310b2e0cd98e Chris Mason         2008-01-24  5468  		if (!PageUptodate(p))
d1310b2e0cd98e Chris Mason         2008-01-24  5469  			uptodate = 0;
eb14ab8ed24a04 Chris Mason         2011-02-10  5470  
eb14ab8ed24a04 Chris Mason         2011-02-10  5471  		/*
b16d011e79fb35 Nikolay Borisov     2018-07-04  5472  		 * We can't unlock the pages just yet since the extent buffer
b16d011e79fb35 Nikolay Borisov     2018-07-04  5473  		 * hasn't been properly inserted in the radix tree, this
b16d011e79fb35 Nikolay Borisov     2018-07-04  5474  		 * opens a race with btree_releasepage which can free a page
b16d011e79fb35 Nikolay Borisov     2018-07-04  5475  		 * while we are still filling in all pages for the buffer and
b16d011e79fb35 Nikolay Borisov     2018-07-04  5476  		 * we could crash.
eb14ab8ed24a04 Chris Mason         2011-02-10  5477  		 */
d1310b2e0cd98e Chris Mason         2008-01-24  5478  	}
d1310b2e0cd98e Chris Mason         2008-01-24  5479  	if (uptodate)
b4ce94de9b4d64 Chris Mason         2009-02-04  5480  		set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
115391d2315239 Josef Bacik         2012-03-09  5481  again:
e1860a7724828a David Sterba        2016-05-09  5482  	ret = radix_tree_preload(GFP_NOFS);
c871b0f2fd27e7 Liu Bo              2016-06-06  5483  	if (ret) {
c871b0f2fd27e7 Liu Bo              2016-06-06  5484  		exists = ERR_PTR(ret);
19fe0a8b787d7c Miao Xie            2010-10-26  5485  		goto free_eb;
c871b0f2fd27e7 Liu Bo              2016-06-06  5486  	}
19fe0a8b787d7c Miao Xie            2010-10-26  5487  
f28491e0a6c46d Josef Bacik         2013-12-16  5488  	spin_lock(&fs_info->buffer_lock);
f28491e0a6c46d Josef Bacik         2013-12-16  5489  	ret = radix_tree_insert(&fs_info->buffer_radix,
d31cf6896006df Qu Wenruo           2020-09-08  5490  				start / fs_info->sectorsize, eb);
f28491e0a6c46d Josef Bacik         2013-12-16  5491  	spin_unlock(&fs_info->buffer_lock);
19fe0a8b787d7c Miao Xie            2010-10-26  5492  	radix_tree_preload_end();
452c75c3d21870 Chandra Seetharaman 2013-10-07  5493  	if (ret == -EEXIST) {
f28491e0a6c46d Josef Bacik         2013-12-16  5494  		exists = find_extent_buffer(fs_info, start);
452c75c3d21870 Chandra Seetharaman 2013-10-07  5495  		if (exists)
6af118ce51b52c Chris Mason         2008-07-22  5496  			goto free_eb;
452c75c3d21870 Chandra Seetharaman 2013-10-07  5497  		else
452c75c3d21870 Chandra Seetharaman 2013-10-07  5498  			goto again;
6af118ce51b52c Chris Mason         2008-07-22  5499  	}
6af118ce51b52c Chris Mason         2008-07-22  5500  	/* add one reference for the tree */
0b32f4bbb423f0 Josef Bacik         2012-03-13  5501  	check_buffer_tree_ref(eb);
34b41acec1ccc0 Josef Bacik         2013-12-13  5502  	set_bit(EXTENT_BUFFER_IN_TREE, &eb->bflags);
eb14ab8ed24a04 Chris Mason         2011-02-10  5503  
eb14ab8ed24a04 Chris Mason         2011-02-10  5504  	/*
b16d011e79fb35 Nikolay Borisov     2018-07-04  5505  	 * Now it's safe to unlock the pages because any calls to
b16d011e79fb35 Nikolay Borisov     2018-07-04  5506  	 * btree_releasepage will correctly detect that a page belongs to a
b16d011e79fb35 Nikolay Borisov     2018-07-04  5507  	 * live buffer and won't free them prematurely.
eb14ab8ed24a04 Chris Mason         2011-02-10  5508  	 */
28187ae569e8a6 Nikolay Borisov     2018-07-04  5509  	for (i = 0; i < num_pages; i++)
28187ae569e8a6 Nikolay Borisov     2018-07-04  5510  		unlock_page(eb->pages[i]);
d1310b2e0cd98e Chris Mason         2008-01-24 @5511  	return eb;
d1310b2e0cd98e Chris Mason         2008-01-24  5512  
6af118ce51b52c Chris Mason         2008-07-22  5513  free_eb:
5ca64f45e92dc5 Omar Sandoval       2015-02-24  5514  	WARN_ON(!atomic_dec_and_test(&eb->refs));
3ef1cb4eb96ce4 Qu Wenruo           2020-09-08  5515  	kfree(subpage_mapping);
727011e07cbdf8 Chris Mason         2010-08-06 @5516  	for (i = 0; i < num_pages; i++) {
727011e07cbdf8 Chris Mason         2010-08-06  5517  		if (eb->pages[i])
727011e07cbdf8 Chris Mason         2010-08-06  5518  			unlock_page(eb->pages[i]);
727011e07cbdf8 Chris Mason         2010-08-06  5519  	}
eb14ab8ed24a04 Chris Mason         2011-02-10  5520  
897ca6e9b4fef8 Miao Xie            2010-10-26  5521  	btrfs_release_extent_buffer(eb);
6af118ce51b52c Chris Mason         2008-07-22  5522  	return exists;
d1310b2e0cd98e Chris Mason         2008-01-24  5523  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
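For what it's worth, the two new findings above could plausibly be addressed together in alloc_extent_buffer(): check the freshly allocated subpage_mapping rather than the pre-existing mapping variable, and give num_pages a safe initial value so the early goto free_eb path does not read it uninitialized. An untested sketch:

```diff
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ alloc_extent_buffer()
-	int num_pages;
+	int num_pages = 0;
@@
 	if (subpage) {
 		subpage_mapping = kmalloc(sizeof(*subpage_mapping), GFP_NOFS);
-		if (!mapping) {
+		if (!subpage_mapping) {
 			exists = ERR_PTR(-ENOMEM);
 			goto free_eb;
 		}
 	}
```

The atomic_inc_not_zero() pairing issue on the goto free_eb paths would still need a separate fix, since the WARN_ON(!atomic_dec_and_test(&eb->refs)) in free_eb assumes the increment always happened.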
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a83b63ecc5f8..87b3bb781532 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -29,6 +29,34 @@ static struct kmem_cache *extent_state_cache;
 static struct kmem_cache *extent_buffer_cache;
 static struct bio_set btrfs_bioset;
 
+/* Upper limit of how many extent buffers can be stored in one page */
+#define SUBPAGE_NR_EXTENT_BUFFERS (SZ_64K / SZ_4K)
+/*
+ * Structure for subpage support, recording the page -> extent buffer mapping
+ *
+ * For subpage support, one 64K page can contain several tree blocks, other than
+ * 1:1 page <-> extent buffer mapping from sectorsize == PAGE_SIZE case.
+ */
+struct subpage_eb_mapping {
+	/*
+	 * Which range has extent buffer.
+	 *
+	 * One bit represents one sector, bit nr represents the offset in page.
+	 * At most 16 bits are utilized.
+	 */
+	unsigned long bitmap;
+
+	/* We only support 64K PAGE_SIZE system to mount 4K sectorsize fs */
+	struct extent_buffer *buffers[SUBPAGE_NR_EXTENT_BUFFERS];
+};
+
+struct btrfs_fs_info *page_to_fs_info(struct page *page)
+{
+	ASSERT(page && page->mapping);
+
+	return BTRFS_I(page->mapping->host)->root->fs_info;
+}
+
 static inline bool extent_state_in_tree(const struct extent_state *state)
 {
 	return !RB_EMPTY_NODE(&state->rb_node);
@@ -3098,12 +3126,50 @@ static int submit_extent_page(unsigned int opf,
 	return ret;
 }
 
+static void attach_subpage_mapping(struct extent_buffer *eb,
+				   struct page *page,
+				   struct subpage_eb_mapping *mapping)
+{
+	u32 sectorsize = eb->fs_info->sectorsize;
+	u32 nodesize = eb->fs_info->nodesize;
+	int index_start = (eb->start - page_offset(page)) / sectorsize;
+	int nr_bits = nodesize / sectorsize;
+	int i;
+
+	ASSERT(mapping);
+	if (!PagePrivate(page)) {
+		/* Attach mapping to page::private and initialize */
+		memset(mapping, 0, sizeof(*mapping));
+		attach_page_private(page, mapping);
+	} else {
+		/* Use the existing page::private as mapping */
+		kfree(mapping);
+		mapping = (struct subpage_eb_mapping *)page->private;
+	}
+
+	/* Set the bitmap and pointers */
+	for (i = index_start; i < index_start + nr_bits; i++) {
+		set_bit(i, &mapping->bitmap);
+		mapping->buffers[i] = eb;
+	}
+}
+
 static void attach_extent_buffer_page(struct extent_buffer *eb,
-				      struct page *page)
+				      struct page *page,
+				      struct subpage_eb_mapping *mapping)
 {
+	bool subpage = (eb->fs_info->sectorsize < PAGE_SIZE);
+
 	if (page->mapping)
 		assert_spin_locked(&page->mapping->private_lock);
 
+	if (subpage && page->mapping) {
+		attach_subpage_mapping(eb, page, mapping);
+		return;
+	}
+
+	/*
+	 * Anonymous page and sectorsize == PAGE_SIZE uses page::private as a
+	 * pointer to eb directly.
+	 */
 	if (!PagePrivate(page))
 		attach_page_private(page, eb);
 	else
@@ -4928,16 +4994,61 @@ int extent_buffer_under_io(const struct extent_buffer *eb)
 		test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
 }
 
+static void detach_subpage_mapping(struct extent_buffer *eb, struct page *page)
+{
+	struct subpage_eb_mapping *mapping;
+	u32 sectorsize = eb->fs_info->sectorsize;
+	int start_index;
+	int nr_bits = eb->fs_info->nodesize / sectorsize;
+	int i;
+
+	/* Page already detached */
+	if (!PagePrivate(page))
+		return;
+
+	assert_spin_locked(&page->mapping->private_lock);
+	ASSERT(eb->start >= page_offset(page) &&
+	       eb->start < page_offset(page) + PAGE_SIZE);
+
+	mapping = (struct subpage_eb_mapping *)page->private;
+	start_index = (eb->start - page_offset(page)) / sectorsize;
+
+	for (i = start_index; i < start_index + nr_bits; i++) {
+		if (test_bit(i, &mapping->bitmap) &&
+		    mapping->buffers[i] == eb) {
+			clear_bit(i, &mapping->bitmap);
+			mapping->buffers[i] = NULL;
+		}
+	}
+
+	/* Are we the last owner ? */
+	if (mapping->bitmap == 0) {
+		kfree(mapping);
+		detach_page_private(page);
+		/* One for the first time allocated the page */
+		put_page(page);
+	}
+}
+
 static void detach_extent_buffer_page(struct extent_buffer *eb, struct page *page)
 {
 	bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
+	bool subpage = (eb->fs_info->sectorsize < PAGE_SIZE);
 
 	if (!page)
 		return;
 
 	if (mapped)
 		spin_lock(&page->mapping->private_lock);
+
+	if (subpage && page->mapping) {
+		detach_subpage_mapping(eb, page);
+		if (mapped)
+			spin_unlock(&page->mapping->private_lock);
+		return;
+	}
+
 	if (PagePrivate(page) && page->private == (unsigned long)eb) {
 		BUG_ON(test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
 		BUG_ON(PageDirty(page));
@@ -5035,7 +5146,7 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
 			btrfs_release_extent_buffer(new);
 			return NULL;
 		}
-		attach_extent_buffer_page(new, p);
+		attach_extent_buffer_page(new, p, NULL);
 		WARN_ON(PageDirty(p));
 		SetPageUptodate(p);
 		new->pages[i] = p;
@@ -5243,8 +5354,31 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
  * The function here is to ensure we have proper locking and detect such race
  * so we won't allocating an eb twice.
  */
-static struct extent_buffer *grab_extent_buffer_from_page(struct page *page)
+static struct extent_buffer *grab_extent_buffer_from_page(struct page *page,
+							  u64 bytenr)
 {
+	struct btrfs_fs_info *fs_info = page_to_fs_info(page);
+	bool subpage = (fs_info->sectorsize < PAGE_SIZE);
+
+	if (!PagePrivate(page))
+		return NULL;
+
+	if (subpage) {
+		struct subpage_eb_mapping *mapping;
+		u32 sectorsize = fs_info->sectorsize;
+		int start_index;
+
+		ASSERT(bytenr >= page_offset(page) &&
+		       bytenr < page_offset(page) + PAGE_SIZE);
+
+		start_index = (bytenr - page_offset(page)) / sectorsize;
+		mapping = (struct subpage_eb_mapping *)page->private;
+
+		if (test_bit(start_index, &mapping->bitmap))
+			return mapping->buffers[start_index];
+		return NULL;
+	}
+
 	/*
 	 * For PAGE_SIZE == sectorsize case, a btree_inode page should have its
 	 * private pointer as extent buffer who owns this page.
@@ -5263,6 +5397,8 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	struct extent_buffer *exists = NULL;
 	struct page *p;
 	struct address_space *mapping = fs_info->btree_inode->i_mapping;
+	struct subpage_eb_mapping *subpage_mapping = NULL;
+	bool subpage = (fs_info->sectorsize < PAGE_SIZE);
 	int uptodate = 1;
 	int ret;
 
@@ -5286,6 +5422,14 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	if (!eb)
 		return ERR_PTR(-ENOMEM);
 
+	if (subpage) {
+		subpage_mapping = kmalloc(sizeof(*subpage_mapping), GFP_NOFS);
+		if (!mapping) {
+			exists = ERR_PTR(-ENOMEM);
+			goto free_eb;
+		}
+	}
+
 	num_pages = num_extent_pages(eb);
 	for (i = 0; i < num_pages; i++, index++) {
 		p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
@@ -5296,7 +5440,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 
 		spin_lock(&mapping->private_lock);
 		if (PagePrivate(p)) {
-			exists = grab_extent_buffer_from_page(p);
+			exists = grab_extent_buffer_from_page(p, start);
 			if (exists && atomic_inc_not_zero(&exists->refs)) {
 				spin_unlock(&mapping->private_lock);
 				unlock_page(p);
@@ -5306,16 +5450,19 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 			}
 			exists = NULL;
 
-			/*
-			 * Do this so attach doesn't complain and we need to
-			 * drop the ref the old guy had.
-			 */
-			ClearPagePrivate(p);
-			WARN_ON(PageDirty(p));
-			put_page(p);
+			if (!subpage) {
+				/*
+				 * Do this so attach doesn't complain and we
+				 * need to drop the ref the old guy had.
+				 */
+				ClearPagePrivate(p);
+				WARN_ON(PageDirty(p));
+				put_page(p);
+			}
 		}
-		attach_extent_buffer_page(eb, p);
+		attach_extent_buffer_page(eb, p, subpage_mapping);
 		spin_unlock(&mapping->private_lock);
+		subpage_mapping = NULL;
 		WARN_ON(PageDirty(p));
 		eb->pages[i] = p;
 		if (!PageUptodate(p))
@@ -5365,6 +5512,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 
 free_eb:
 	WARN_ON(!atomic_dec_and_test(&eb->refs));
+	kfree(subpage_mapping);
 	for (i = 0; i < num_pages; i++) {
 		if (eb->pages[i])
 			unlock_page(eb->pages[i]);
@@ -6158,8 +6306,49 @@ void memmove_extent_buffer(const struct extent_buffer *dst,
 	}
 }
 
+int try_release_subpage_ebs(struct page *page)
+{
+	struct subpage_eb_mapping *mapping;
+	int i;
+
+	assert_spin_locked(&page->mapping->private_lock);
+	if (!PagePrivate(page))
+		return 1;
+
+	mapping = (struct subpage_eb_mapping *)page->private;
+	for (i = 0; i < SUBPAGE_NR_EXTENT_BUFFERS && PagePrivate(page); i++) {
+		struct btrfs_fs_info *fs_info = page_to_fs_info(page);
+		struct extent_buffer *eb;
+		int ret;
+
+		if (!test_bit(i, &mapping->bitmap))
+			continue;
+
+		eb = mapping->buffers[i];
+		spin_unlock(&page->mapping->private_lock);
+		spin_lock(&eb->refs_lock);
+		ret = release_extent_buffer(eb);
+		spin_lock(&page->mapping->private_lock);
+
+		/*
+		 * Extent buffer can't be freed yet, must jump to next slot
+		 * and avoid calling release_extent_buffer().
+		 */
+		if (!ret)
+			i += (fs_info->nodesize / fs_info->sectorsize - 1);
+	}
+	/*
+	 * detach_subpage_mapping() from release_extent_buffer() has detached
+	 * all ebs from this page. All related ebs are released.
+	 */
+	if (!PagePrivate(page))
+		return 1;
+	return 0;
+}
+
 int try_release_extent_buffer(struct page *page)
 {
+	bool subpage = (page_to_fs_info(page)->sectorsize < PAGE_SIZE);
 	struct extent_buffer *eb;
 
 	/*
@@ -6172,6 +6361,14 @@ int try_release_extent_buffer(struct page *page)
 		return 1;
 	}
 
+	if (subpage) {
+		int ret;
+
+		ret = try_release_subpage_ebs(page);
+		spin_unlock(&page->mapping->private_lock);
+		return ret;
+	}
+
 	eb = (struct extent_buffer *)page->private;
 	BUG_ON(!eb);
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index e16c5449ba48..6593b6883438 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -184,6 +184,9 @@ static inline int extent_compress_type(unsigned long bio_flags)
 	return bio_flags >> EXTENT_BIO_FLAG_SHIFT;
 }
 
+/* Unable to inline it due to the requirement for both ASSERT() and BTRFS_I() */
+struct btrfs_fs_info *page_to_fs_info(struct page *page);
+
 struct extent_map_tree;
 typedef struct extent_map *(get_extent_t)(struct btrfs_inode *inode,
One of the design blockers for subpage support is the btree inode
page::private mapping. Currently page::private for the btree inode is a
pointer to the extent buffer that owns the page. This is fine for the
sectorsize == PAGE_SIZE case, but not suitable for subpage support, as
in that case one page can hold multiple tree blocks.

So to support subpage, here we introduce a new structure,
subpage_eb_mapping, to record which extent buffers are referring to one
page. It uses a bitmap (at most 16 bits used) to record tree blocks, and
an array of extent buffer pointers (at most 16 as well) to record the
owners.

This patch modifies the following functions to add subpage support using
the subpage_eb_mapping structure:
- attach_extent_buffer_page()
- detach_extent_buffer_page()
- grab_extent_buffer_from_page()
- try_release_extent_buffer()

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 221 ++++++++++++++++++++++++++++++++++++++++---
 fs/btrfs/extent_io.h |   3 +
 2 files changed, 212 insertions(+), 12 deletions(-)