Message ID | 20200908075230.86856-1-wqu@suse.com (mailing list archive) |
---|---|
Headers | show |
Series | btrfs: add read-only support for subpage sector size | expand |
On 2020/9/8 下午3:52, Qu Wenruo wrote: > Patches can be fetched from github: > https://github.com/adam900710/linux/tree/subpage > > Currently btrfs only allows to mount fs with sectorsize == PAGE_SIZE. > > That means, for 64K page size system, they can only use 64K sector size > fs. > This brings a big compatible problem for btrfs. > > This patch is going to slightly solve the problem by, allowing 64K > system to mount 4K sectorsize fs in read-only mode. > > The main objective here, is to remove the blockage in the code base, and > pave the road to full RW mount support. > > == What works == > > Existing regular page sized sector size support > Subpage read-only Mount (with all self tests and ASSERT) > Subpage metadata read (including all trees and inline extents, and csum checking) > Subpage uncompressed data read (with csum checking) > > == What doesn't work == > > Read-write mount (see the subject) > Compressed data read > > == Challenge we meet == > > The main problem is metadata, where we have several limitations: > - We always read the full page of a metadata > In subpage case, one full page can contain several tree blocks. > > - We use page::private to point to extent buffer > This means we currently can only support one-page-to-one-extent-buffer > mapping. > For subpage size support, we need one-page-to-multiple-extent-buffer > mapping. > > > == Solutions == > > So here for the metadata part, we use the following methods to > workaround the problem: This is pretty different from what Chanda submitted several years ago. Chanda chooses to base its work on Josef's attempt to kill btree_inode and use kmalloc memory for tree blocks. That idea is in fact pretty awesome, it solves a lot of problem and makes btree read/write way easier. The problem is, that attempt to kill btree_inode is exposing a big behavior change, which brings a lot of uncertainty to the following development. While this patchset choose to use the existing btree_inode mechanism to make it easier to be merged. Personally speaking, I still like the idea of btree_inode kill. If we could get an agreement on the direction we take, it would be much better for the future. Thanks, Qu > > - Introduce subpage_eb_mapping structure to do bitmap > Now for subpage, page::private points to a subpage_eb_mapping > structure, which has a bitmap to mapping one page to multiple extent > buffers. > > - Still do full page read for metadata > This means, at read time, we're not reading just one extent buffer, > but possibly many more. > In that case, we first do tree block verification for the tree blocks > triggering the read, and mark the page uptodate. > > For newly incoming tree block read, they will check if the tree block > is verified. If not verified, even if the page is uptodate, we still > need to check the extent buffer. > > By this all subpage extent buffers are verified properly. > > For data part, it's pretty simple, all existing infrastructure can be > easily converted to support subpage read, without any subpage specific > handing yet. > > == Patchset structure == > > The structure of the patchset: > Patch 01~11: Preparation patches for metadata subpage read support. > These patches can be merged without problem, and work for > both regular and subpage case. > Patch 12~14: Patches for data subpage read support. > These patches works for both cases. > > That means, patch 01~14 can be applied to current kernel, and shouldn't > affect any existing behavior. > > Patch 15~17: Subpage metadata read specific patches. > These patches introduces the main part of the subpage > metadata read support. > > The number of patches is the main reason I'm submitting them to the mail > list. As there are too many preparation patches already. > > Qu Wenruo (17): > btrfs: extent-io-tests: remove invalid tests > btrfs: calculate inline extent buffer page size based on page size > btrfs: remove the open-code to read disk-key > btrfs: make btrfs_fs_info::buffer_radix to take sector size devided > values > btrfs: don't allow tree block to cross page boundary for subpage > support > btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors > btrfs: make csum_tree_block() handle sectorsize smaller than page size > btrfs: refactor how we extract extent buffer from page for > alloc_extent_buffer() > btrfs: refactor btrfs_release_extent_buffer_pages() > btrfs: add assert_spin_locked() for attach_extent_buffer_page() > btrfs: extract the extent buffer verification from > btree_readpage_end_io_hook() > btrfs: remove the unnecessary parameter @start and @len for > check_data_csum() > btrfs: extent_io: only require sector size alignment for page read > btrfs: make btrfs_readpage_end_io_hook() follow sector size > btrfs: introduce subpage_eb_mapping for extent buffers > btrfs: handle extent buffer verification proper for subpage size > btrfs: allow RO mount of 4K sector size fs on 64K page system > > fs/btrfs/ctree.c | 13 +- > fs/btrfs/ctree.h | 38 ++- > fs/btrfs/disk-io.c | 111 ++++--- > fs/btrfs/disk-io.h | 1 + > fs/btrfs/extent_io.c | 538 +++++++++++++++++++++++++------ > fs/btrfs/extent_io.h | 19 +- > fs/btrfs/inode.c | 40 ++- > fs/btrfs/struct-funcs.c | 18 +- > fs/btrfs/super.c | 7 + > fs/btrfs/tests/extent-io-tests.c | 26 +- > 10 files changed, 633 insertions(+), 178 deletions(-) >
On 2020/9/8 下午3:52, Qu Wenruo wrote: > Patches can be fetched from github: > https://github.com/adam900710/linux/tree/subpage > > Currently btrfs only allows to mount fs with sectorsize == PAGE_SIZE. > > That means, for 64K page size system, they can only use 64K sector size > fs. > This brings a big compatible problem for btrfs. > > This patch is going to slightly solve the problem by, allowing 64K > system to mount 4K sectorsize fs in read-only mode. > > The main objective here, is to remove the blockage in the code base, and > pave the road to full RW mount support. > > == What works == > > Existing regular page sized sector size support > Subpage read-only Mount (with all self tests and ASSERT) > Subpage metadata read (including all trees and inline extents, and csum checking) > Subpage uncompressed data read (with csum checking) > > == What doesn't work == > > Read-write mount (see the subject) > Compressed data read > > == Challenge we meet == > > The main problem is metadata, where we have several limitations: > - We always read the full page of a metadata > In subpage case, one full page can contain several tree blocks. > > - We use page::private to point to extent buffer > This means we currently can only support one-page-to-one-extent-buffer > mapping. > For subpage size support, we need one-page-to-multiple-extent-buffer > mapping. > > > == Solutions == > > So here for the metadata part, we use the following methods to > workaround the problem: > > - Introduce subpage_eb_mapping structure to do bitmap > Now for subpage, page::private points to a subpage_eb_mapping > structure, which has a bitmap to mapping one page to multiple extent > buffers. > > - Still do full page read for metadata > This means, at read time, we're not reading just one extent buffer, > but possibly many more. > In that case, we first do tree block verification for the tree blocks > triggering the read, and mark the page uptodate. > > For newly incoming tree block read, they will check if the tree block > is verified. If not verified, even if the page is uptodate, we still > need to check the extent buffer. > > By this all subpage extent buffers are verified properly. > > For data part, it's pretty simple, all existing infrastructure can be > easily converted to support subpage read, without any subpage specific > handing yet. > > == Patchset structure == > > The structure of the patchset: > Patch 01~11: Preparation patches for metadata subpage read support. > These patches can be merged without problem, and work for > both regular and subpage case. > Patch 12~14: Patches for data subpage read support. > These patches works for both cases. > > That means, patch 01~14 can be applied to current kernel, and shouldn't > affect any existing behavior. > > Patch 15~17: Subpage metadata read specific patches. > These patches introduces the main part of the subpage > metadata read support. For the last 3 patches, it turns out that, we may get rid of page::private pointer completely, and greatly simplify the bits handling by relying on extent_io_tree. So if possible, please ignore these last 3 patches for now. They would be the backup solution if the extent_io_tree idea doesn't go well. Thanks, Qu > > The number of patches is the main reason I'm submitting them to the mail > list. As there are too many preparation patches already. > > Qu Wenruo (17): > btrfs: extent-io-tests: remove invalid tests > btrfs: calculate inline extent buffer page size based on page size > btrfs: remove the open-code to read disk-key > btrfs: make btrfs_fs_info::buffer_radix to take sector size devided > values > btrfs: don't allow tree block to cross page boundary for subpage > support > btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors > btrfs: make csum_tree_block() handle sectorsize smaller than page size > btrfs: refactor how we extract extent buffer from page for > alloc_extent_buffer() > btrfs: refactor btrfs_release_extent_buffer_pages() > btrfs: add assert_spin_locked() for attach_extent_buffer_page() > btrfs: extract the extent buffer verification from > btree_readpage_end_io_hook() > btrfs: remove the unnecessary parameter @start and @len for > check_data_csum() > btrfs: extent_io: only require sector size alignment for page read > btrfs: make btrfs_readpage_end_io_hook() follow sector size > btrfs: introduce subpage_eb_mapping for extent buffers > btrfs: handle extent buffer verification proper for subpage size > btrfs: allow RO mount of 4K sector size fs on 64K page system > > fs/btrfs/ctree.c | 13 +- > fs/btrfs/ctree.h | 38 ++- > fs/btrfs/disk-io.c | 111 ++++--- > fs/btrfs/disk-io.h | 1 + > fs/btrfs/extent_io.c | 538 +++++++++++++++++++++++++------ > fs/btrfs/extent_io.h | 19 +- > fs/btrfs/inode.c | 40 ++- > fs/btrfs/struct-funcs.c | 18 +- > fs/btrfs/super.c | 7 + > fs/btrfs/tests/extent-io-tests.c | 26 +- > 10 files changed, 633 insertions(+), 178 deletions(-) >