Message ID | 2f1ccb99432d65ebd1500f685e53ec98bdc259e7.1425382454.git.sebth@naju.se (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Tue, Mar 03, 2015 at 12:47:39PM +0100, Sebastian Thorarensen wrote: > Allow btrfs-convert to use nodesizes other than 4096. It defaults to > max(16384, pagesize), like mkfs. The constant DEFAULT_MKFS_LEAF_SIZE has > been moved to utils.h and renamed to BTRFS_MKFS_DEFAULT_NODE_SIZE for > consistency. convert-tests now test both 4096 and 16384 nodesizes. Thanks, this sounds very useful. > +static int ext2_free_block_range(ext2_filsys fs, u64 block, int num) > +{ > + BUG_ON(block != (blk_t)block); What's the purpose of this? Some kind of overflow check? > + ext2fs_fast_unmark_block_bitmap_range(fs->block_map, block, num); > + return 0; > +} > + > static int cache_free_extents(struct btrfs_root *root, ext2_filsys ext2_fs) > > { > @@ -2231,10 +2257,28 @@ err: > return ret; > } > > +static int check_nodesize(u32 nodesize, u32 sectorsize) This is duplicating check_leaf_or_node_size from mkfs.c, the right way is to move that to utils.c and then use in convert. > +{ > + if (nodesize < sectorsize) { > + fprintf(stderr, "illegal nodesize %u (smaller than %u)\n", > + nodesize, sectorsize); > + return -1; > + } else if (nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) { > + fprintf(stderr, "illegal nodesize %u (larger than %u)\n", > + nodesize, BTRFS_MAX_METADATA_BLOCKSIZE); > + return -1; > + } else if (nodesize & (sectorsize - 1)) { > + fprintf(stderr, "illegal nodesize %u (not aligned to %u)\n", > + nodesize, sectorsize); > + return -1; > + } > + return 0; > +} Otherwise looks good. If you're going to resend, please split the patch into one logical change per patch: 1) generic and preparatory changes, like moving the check_node to utils.c, moving the mkfs default leaf size 2) the changes to convert, ie. most of this patch minus 1 and 3 3) the tests Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 03, 2015 at 12:57:36PM +0100, Sebastian Thorarensen wrote: > This patch passes `make test', and I have also tested the patched > btrfs-convert on populated ext4 filesystems and reverting back again, > without errors; but note that I have very little experience with ext2/3/4 > and btrfs, so there is a large possibility that I have missed something. > I'm also a bit unsure if my implementation of `ext2_alloc_block_range' is > the best way to find continuous blocks. I think it's right, however I'm not sure if there are any guarantees about the range alignment. Btrfs should be fine with 4k (ie. sectorsize which is PAGE_SIZE) alignment for a node, but this could lead to some pathological layout and I'm not 100% sure the exact nodesize alignment is not required somewhere. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 16 Mar 2015, David Sterba wrote: > I think it's right, however I'm not sure if there are any guarantees > about the range alignment. Btrfs should be fine with 4k (ie. sectorsize > which is PAGE_SIZE) alignment for a node, but this could lead to some > pathological layout and I'm not 100% sure the exact nodesize alignment > is not required somewhere. Ah, I didn't think of that. Yes, ext2_alloc_block_range can allocate unaligned (with respect to nodesize) ranges. I made a test with a conversion that resulted in unaligned blocks[], and neither btrfs check nor scrub complained on the converted filesystem, but I will do some research to find out whether blocks[] must be aligned or not.
On Mon, 16 Mar 2015, David Sterba wrote: > > +static int ext2_free_block_range(ext2_filsys fs, u64 block, int num) > > +{ > > + BUG_ON(block != (blk_t)block); > > What's the purpose of this? Some kind of overflow check? ext2_free_block contains the same check, so I put one in ext2_free_block_range for consistency. I assumed it was an overflow check as blk_t is u32. > Otherwise looks good. If you're going to resend, please split the patch > into one logical change per patch: > > 1) generic and preparatory changes, like moving the check_node to > utils.c, moving the mkfs default leaf size > > 2) the changes to convert, ie. most of this patch minus 1 and 3 > > 3) the tests Thanks for the review! Will resend with the fixes when I know whether blocks[] must be aligned to nodesize or not.
On Wed, Mar 18, 2015 at 02:57:31AM +0100, Sebastian Thorarensen wrote: > On Mon, 16 Mar 2015, David Sterba wrote: > > I think it's right, however I'm not sure if there are any guarantees > > about the range alignment. Btrfs should be fine with 4k (ie. sectorsize > > which is PAGE_SIZE) alignment for a node, but this could lead to some > > pathological layout and I'm not 100% sure the exact nodesize alignment > > is not required somewhere. > > Ah, I didn't think of that. Yes, ext2_alloc_block_range can allocate > unaligned (with respect to nodesize) ranges. > > I made a test with a conversion that resulted in unaligned blocks[], and > neither btrfs check nor scrub complained on the converted filesystem, but > I will do some research to find out whether blocks[] must be aligned or > not. Thanks for the test. So far I haven't found any requirements for the exact nodesize alignment for metadata blocsk, I think it's safe. Given that the node allocation is opencoded inside convert, we can easily add a fallback to request a bit larger range and satisfy the alignment. This would leave the small unallocatable ranges behind and outside of the btrfs' sight. I'm fine with current code for first implementation. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Mar 18, 2015 at 03:12:28AM +0100, Sebastian Thorarensen wrote: > On Mon, 16 Mar 2015, David Sterba wrote: > > > +static int ext2_free_block_range(ext2_filsys fs, u64 block, int num) > > > +{ > > > + BUG_ON(block != (blk_t)block); > > > > What's the purpose of this? Some kind of overflow check? > > ext2_free_block contains the same check, so I put one in > ext2_free_block_range for consistency. I assumed it was an overflow check > as blk_t is u32. Ok, keep it there so we don't forget. A better error handling is desirable though. > > Otherwise looks good. If you're going to resend, please split the patch > > into one logical change per patch: > > > > 1) generic and preparatory changes, like moving the check_node to > > utils.c, moving the mkfs default leaf size > > > > 2) the changes to convert, ie. most of this patch minus 1 and 3 > > > > 3) the tests > > Thanks for the review! Will resend with the fixes when I know whether > blocks[] must be aligned to nodesize or not. As mentioned in the other mail, current implementation is fine. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/btrfs-convert.txt b/Documentation/btrfs-convert.txt index e508c34..5cf9d03 100644 --- a/Documentation/btrfs-convert.txt +++ b/Documentation/btrfs-convert.txt @@ -23,6 +23,10 @@ Disable data checksum. Ignore xattrs and ACLs. -n:: Disable packing of small files. +-N <SIZE>:: +Set filesystem nodesize, the tree block size in which btrfs stores data. +The default value is 16KB (16384) or the page size, whichever is bigger. +Must be a multiple of the ext2/3/4 block size, but not larger than 65536. -r:: Roll back to ext2fs. -l <LABEL>:: diff --git a/btrfs-convert.c b/btrfs-convert.c index da10ad6..4163076 100644 --- a/btrfs-convert.c +++ b/btrfs-convert.c @@ -140,6 +140,25 @@ static int ext2_alloc_block(ext2_filsys fs, u64 goal, u64 *block_ret) return -ENOSPC; } +static int ext2_alloc_block_range(ext2_filsys fs, u64 goal, int num, + u64 *block_ret) +{ + blk_t block; + ext2fs_block_bitmap bitmap = fs->block_map; + blk_t start = ext2fs_get_block_bitmap_start(bitmap); + blk_t end = ext2fs_get_block_bitmap_end(bitmap); + + for (block = max_t(u64, goal, start); block + num < end; block++) { + if (ext2fs_fast_test_block_bitmap_range(bitmap, block, num)) { + ext2fs_fast_mark_block_bitmap_range(bitmap, block, + num); + *block_ret = block; + return 0; + } + } + return -ENOSPC; +} + static int ext2_free_block(ext2_filsys fs, u64 block) { BUG_ON(block != (blk_t)block); @@ -147,6 +166,13 @@ static int ext2_free_block(ext2_filsys fs, u64 block) return 0; } +static int ext2_free_block_range(ext2_filsys fs, u64 block, int num) +{ + BUG_ON(block != (blk_t)block); + ext2fs_fast_unmark_block_bitmap_range(fs->block_map, block, num); + return 0; +} + static int cache_free_extents(struct btrfs_root *root, ext2_filsys ext2_fs) { @@ -2231,10 +2257,28 @@ err: return ret; } +static int check_nodesize(u32 nodesize, u32 sectorsize) +{ + if (nodesize < sectorsize) { + fprintf(stderr, "illegal nodesize %u (smaller than %u)\n", + nodesize, sectorsize); + return -1; + } else if (nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) { + fprintf(stderr, "illegal nodesize %u (larger than %u)\n", + nodesize, BTRFS_MAX_METADATA_BLOCKSIZE); + return -1; + } else if (nodesize & (sectorsize - 1)) { + fprintf(stderr, "illegal nodesize %u (not aligned to %u)\n", + nodesize, sectorsize); + return -1; + } + return 0; +} + static int do_convert(const char *devname, int datacsum, int packing, int noxattr, - int copylabel, const char *fslabel, int progress) + u32 nodesize, int copylabel, const char *fslabel, int progress) { - int i, ret; + int i, ret, blocks_per_node; int fd = -1; u32 blocksize; u64 blocks[7]; @@ -2261,8 +2305,17 @@ static int do_convert(const char *devname, int datacsum, int packing, int noxatt fprintf(stderr, "filetype feature is missing\n"); goto fail; } + if (check_nodesize(nodesize, blocksize)) + goto fail; + blocks_per_node = nodesize / blocksize; + ret = -blocks_per_node; for (i = 0; i < 7; i++) { - ret = ext2_alloc_block(ext2_fs, 0, blocks + i); + if (nodesize == blocksize) + ret = ext2_alloc_block(ext2_fs, 0, blocks + i); + else + ret = ext2_alloc_block_range(ext2_fs, + ret + blocks_per_node, blocks_per_node, + blocks + i); if (ret) { fprintf(stderr, "not enough free space\n"); goto fail; @@ -2276,7 +2329,7 @@ static int do_convert(const char *devname, int datacsum, int packing, int noxatt goto fail; } ret = make_btrfs(fd, devname, ext2_fs->super->s_volume_name, - NULL, blocks, total_bytes, blocksize, blocksize, + NULL, blocks, total_bytes, nodesize, nodesize, blocksize, blocksize, 0); if (ret) { fprintf(stderr, "unable to create initial ctree: %s\n", @@ -2303,7 +2356,11 @@ static int do_convert(const char *devname, int datacsum, int packing, int noxatt /* recover block allocation bitmap */ for (i = 0; i < 7; i++) { blocks[i] /= blocksize; - ext2_free_block(ext2_fs, blocks[i]); + if (nodesize == blocksize) + ext2_free_block(ext2_fs, blocks[i]); + else + ext2_free_block_range(ext2_fs, blocks[i], + blocks_per_node); } ret = init_btrfs(root); if (ret) { @@ -2759,10 +2816,11 @@ fail: static void print_usage(void) { - printf("usage: btrfs-convert [-d] [-i] [-n] [-r] [-l label] [-L] device\n"); + printf("usage: btrfs-convert [-d] [-i] [-n] [-r] [-N size] [-l label] [-L] device\n"); printf("\t-d disable data checksum\n"); printf("\t-i ignore xattrs and ACLs\n"); printf("\t-n disable packing of small files\n"); + printf("\t-N SIZE set filesystem nodesize\n"); printf("\t-r roll back to ext2fs\n"); printf("\t-l LABEL set filesystem label\n"); printf("\t-L use label from converted fs\n"); @@ -2775,6 +2833,8 @@ int main(int argc, char *argv[]) int packing = 1; int noxattr = 0; int datacsum = 1; + u32 nodesize = max_t(u32, sysconf(_SC_PAGESIZE), + BTRFS_MKFS_DEFAULT_NODE_SIZE); int rollback = 0; int copylabel = 0; int usage_error = 0; @@ -2783,7 +2843,7 @@ int main(int argc, char *argv[]) char *fslabel = NULL; while(1) { - int c = getopt(argc, argv, "dinrl:Lp"); + int c = getopt(argc, argv, "dinN:rl:Lp"); if (c < 0) break; switch(c) { @@ -2796,6 +2856,9 @@ int main(int argc, char *argv[]) case 'n': packing = 0; break; + case 'N': + nodesize = parse_size(optarg); + break; case 'r': rollback = 1; break; @@ -2852,7 +2915,8 @@ int main(int argc, char *argv[]) if (rollback) { ret = do_rollback(file); } else { - ret = do_convert(file, datacsum, packing, noxattr, copylabel, fslabel, progress); + ret = do_convert(file, datacsum, packing, noxattr, nodesize, + copylabel, fslabel, progress); } if (ret) return 1; diff --git a/mkfs.c b/mkfs.c index f83554d..3c1ed7e 100644 --- a/mkfs.c +++ b/mkfs.c @@ -47,8 +47,6 @@ static u64 index_cnt = 2; #define DEFAULT_MKFS_FEATURES (BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF \ | BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA) -#define DEFAULT_MKFS_LEAF_SIZE 16384 - struct directory_name_entry { char *dir_name; char *path; @@ -1230,7 +1228,8 @@ int main(int ac, char **av) u64 alloc_start = 0; u64 metadata_profile = 0; u64 data_profile = 0; - u32 leafsize = max_t(u32, sysconf(_SC_PAGESIZE), DEFAULT_MKFS_LEAF_SIZE); + u32 leafsize = max_t(u32, sysconf(_SC_PAGESIZE), + BTRFS_MKFS_DEFAULT_NODE_SIZE); u32 sectorsize = 4096; u32 nodesize = leafsize; u32 stripesize = 4096; diff --git a/tests/convert-tests.sh b/tests/convert-tests.sh index 6094287..3d912f3 100644 --- a/tests/convert-tests.sh +++ b/tests/convert-tests.sh @@ -16,7 +16,8 @@ rm -f convert-tests-results.txt test(){ echo " [TEST] $1" - shift + nodesize=$2 + shift 2 echo "creating ext image with: $*" >> convert-tests-results.txt # 256MB is the smallest acceptable btrfs image. rm -f $here/test.img >> convert-tests-results.txt 2>&1 \ @@ -25,13 +26,17 @@ test(){ || _fail "could not create test image file" $* -F $here/test.img >> convert-tests-results.txt 2>&1 \ || _fail "filesystem create failed" - $here/btrfs-convert $here/test.img >> convert-tests-results.txt 2>&1 \ + $here/btrfs-convert -N "$nodesize" $here/test.img \ + >> convert-tests-results.txt 2>&1 \ || _fail "btrfs-convert failed" $here/btrfs check $here/test.img >> convert-tests-results.txt 2>&1 \ || _fail "btrfs check detected errors" } # btrfs-convert requires 4k blocksize. -test "ext2" mke2fs -b 4096 -test "ext3" mke2fs -j -b 4096 -test "ext4" mke2fs -t ext4 -b 4096 +test "ext2 4k nodesize" 4096 mke2fs -b 4096 +test "ext3 4k nodesize" 4096 mke2fs -j -b 4096 +test "ext4 4k nodesize" 4096 mke2fs -t ext4 -b 4096 +test "ext2 16k nodesize" 16384 mke2fs -b 4096 +test "ext3 16k nodesize" 16384 mke2fs -j -b 4096 +test "ext4 16k nodesize" 16384 mke2fs -t ext4 -b 4096 diff --git a/utils.h b/utils.h index 82ab5e8..170e5cd 100644 --- a/utils.h +++ b/utils.h @@ -25,6 +25,7 @@ #define BTRFS_MKFS_SYSTEM_GROUP_SIZE (4 * 1024 * 1024) #define BTRFS_MKFS_SMALL_VOLUME_SIZE (1024 * 1024 * 1024) +#define BTRFS_MKFS_DEFAULT_NODE_SIZE 16384 #define BTRFS_SCAN_MOUNTED (1ULL << 0) #define BTRFS_SCAN_LBLKID (1ULL << 1)
Allow btrfs-convert to use nodesizes other than 4096. It defaults to max(16384, pagesize), like mkfs. The constant DEFAULT_MKFS_LEAF_SIZE has been moved to utils.h and renamed to BTRFS_MKFS_DEFAULT_NODE_SIZE for consistency. convert-tests now test both 4096 and 16384 nodesizes. Signed-off-by: Sebastian Thorarensen <sebth@naju.se> --- Documentation/btrfs-convert.txt | 4 ++ btrfs-convert.c | 80 +++++++++++++++++++++++++++++++++++---- mkfs.c | 5 +-- tests/convert-tests.sh | 15 +++++--- utils.h | 1 + 5 files changed, 89 insertions(+), 16 deletions(-)