From patchwork Wed Nov 17 20:19:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DBC9C433F5 for ; Wed, 17 Nov 2021 20:20:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3B9C761AD2 for ; Wed, 17 Nov 2021 20:20:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240651AbhKQUXF (ORCPT ); Wed, 17 Nov 2021 15:23:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239397AbhKQUXB (ORCPT ); Wed, 17 Nov 2021 15:23:01 -0500 Received: from mail-qt1-x82d.google.com (mail-qt1-x82d.google.com [IPv6:2607:f8b0:4864:20::82d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2E99C061764 for ; Wed, 17 Nov 2021 12:20:02 -0800 (PST) Received: by mail-qt1-x82d.google.com with SMTP id n15so3942665qta.0 for ; Wed, 17 Nov 2021 12:20:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Nn41RlDLf23cRu2RSsH/3qOMW/nWsLUvpnY8raPHOAk=; b=ndpvYXi0HuS2R7+Imnd5XIK1TYTXFORVNGW4IL4KMUOajJaPJjZkNtMkYP/4XHXQUK EJNw/5sVGJtW2SpgaTRf6vuIIwmKB8JgYexQFfnkBXK/1Xc71NAQITNl/iRDjVvFSKya Nb2BKyda8iCtbNAEhqprct3EOPxwaGnIYqWhpTg4xTIUiJYHkuUJs+QNjOFLADUuSlQx n/EeG8uYAfy9iaZ4K6Un2MI02rgXibxgfx8UH2uVBDg3BNKXNH44cJobVnsoucNn/Yqr RjtxL2170BQluIomimifT8LLgglc4iohHSDROTKi2Yc0Adn2T+Tbu+dky5VNmccwbTUY 9nFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Nn41RlDLf23cRu2RSsH/3qOMW/nWsLUvpnY8raPHOAk=; b=j3GYf1HJCkauTae3GeiQNOrJXL16lngUuUnk/1MY5rsPnslWlr9rVmt+/uXa6qehTg nerrM5hLe81IxuxheBzDNJ6CYWVWUtblsYQQHKeWEkMDHd+8K278152Ek3AbXIocZlQS 2oOMHpO5CUhZyoo83mlvDEoU+EaEe4CvL4D84XKGwMIataBgLuS9M51nafajm66yH55h fxWLjyCKF0kiytuWQPPH7BT8eHEg9KrZPvNFvbPh4l3MOReNIKTT3n/XGBMgjW5rzhQT fFkMkTZc0YSxaBZRvbrCUxULPbCOXzcGRwgHJLDlnaybeOmHIfQtSIpZapZsGMYFNH5E dQGQ== X-Gm-Message-State: AOAM532ZFjKTrb+mL1TW7LzBZ7W0FCahKCpjHW47SlDj8FHanjcIVIoH 6ZNur2hcjzDDFO4qcdv937NfL/FZ31VHoA== X-Google-Smtp-Source: ABdhPJwxLelo73O4hH7ab21Mam27T0JL4acZM9Y5YDtohcmBGV6BGN8TzXE08HzjnOz3zk5eCZisow== X-Received: by 2002:ac8:4e8c:: with SMTP id 12mr20120407qtp.45.1637180401905; Wed, 17 Nov 2021 12:20:01 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:01 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 01/17] fs: export rw_verify_area() Date: Wed, 17 Nov 2021 12:19:11 -0800 Message-Id: <11bc0fc15490afc6ce15c405cca3e16582f2f0ec.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval I'm adding Btrfs ioctls to read and write compressed data, and rather than duplicating the checks in rw_verify_area(), let's just export it. Reviewed-by: Josef Bacik Signed-off-by: Omar Sandoval --- fs/internal.h | 5 ----- fs/read_write.c | 1 + include/linux/fs.h | 1 + 3 files changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/internal.h b/fs/internal.h index 7979ff8d168c..8e400574401e 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -157,11 +157,6 @@ extern char *simple_dname(struct dentry *, char *, int); extern void dput_to_list(struct dentry *, struct list_head *); extern void shrink_dentry_list(struct list_head *); -/* - * read_write.c - */ -extern int rw_verify_area(int, struct file *, const loff_t *, size_t); - /* * pipe.c */ diff --git a/fs/read_write.c b/fs/read_write.c index 0074afa7ecb3..4d60146243df 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -385,6 +385,7 @@ int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t return security_file_permission(file, read_write == READ ? MAY_READ : MAY_WRITE); } +EXPORT_SYMBOL(rw_verify_area); static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos) { diff --git a/include/linux/fs.h b/include/linux/fs.h index 1cb616fc1105..364940c6a299 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3244,6 +3244,7 @@ extern loff_t fixed_size_llseek(struct file *file, loff_t offset, int whence, loff_t size); extern loff_t no_seek_end_llseek_size(struct file *, loff_t, int, loff_t); extern loff_t no_seek_end_llseek(struct file *, loff_t, int); +extern int rw_verify_area(int, struct file *, const loff_t *, size_t); extern int generic_file_open(struct inode * inode, struct file * filp); extern int nonseekable_open(struct inode * inode, struct file * filp); extern int stream_open(struct inode * inode, struct file * filp); From patchwork Wed Nov 17 20:19:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 250F3C433FE for ; Wed, 17 Nov 2021 20:20:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 076E461AEC for ; Wed, 17 Nov 2021 20:20:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239482AbhKQUXH (ORCPT ); Wed, 17 Nov 2021 15:23:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239434AbhKQUXC (ORCPT ); Wed, 17 Nov 2021 15:23:02 -0500 Received: from mail-qv1-xf2a.google.com (mail-qv1-xf2a.google.com [IPv6:2607:f8b0:4864:20::f2a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4FECC061570 for ; Wed, 17 Nov 2021 12:20:03 -0800 (PST) Received: by mail-qv1-xf2a.google.com with SMTP id bu11so2959401qvb.0 for ; Wed, 17 Nov 2021 12:20:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/XGzVuim02psWU08/3cLBhWlbJSU0KoiFU34gYkJx0g=; b=6vAtGrpYVz7rJviUpZuRQnzXmIuRL54QcGpYYkTLAlQPJiqpzsHhoDGH4mxuPPCqrZ cKzUWj0kbO0pLOdEJIW1jmSWpJQhF7msH190DofUVipBxTRBkFDLrNyJMd3ok/cU3yH3 fo5Y5CgqwC/32cDV75FvtnuLdeLILOBTT1c9BgkcJjmJAF4HCSkrEFSc2PtToL+dshl/ Ev/MMFa3dszxvdpDRep6UsO6NB8y9ggGui3ejOizEd+rLKNwxdj8KL0JuIY4hDS0L0pY QmfKs0vV7knwp8Sh4fPTg8rfmGoaIgQYUghgZm/0pYEQ7lzkL0Hfg63glqwOnDErQ7ag 0Xcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/XGzVuim02psWU08/3cLBhWlbJSU0KoiFU34gYkJx0g=; b=LoR2QFGNPXs6gxhYJT5NukR5OX+JlNnlVv5DPgxPiinGxIZpBmG2HTmNWqQHB4WJzT wBfFT26oba71KJgG4NUE3heCLp8g7VAFae74WluzGFocflM0LO4i5R7fkNQELuG1lalZ ydo5/3nnsPHVZdA5ewuf4aNG0ja57aBNCh7dgFVWxaCxu+L+q0UtHvZf+oetn4ylgPz1 N55k4Tdhd1hai9fXaSq/VQSYrdMPbK7+S0tWIC+CzS1PD2brB2pvZ8LOVIM3QkYhr85C 3LHPfKEorc8ctTwP97e4pRLaL/jOsHaa2xR5WlXAaapz4VsGrJo3rB/NpYhziKOfc2h5 K3Gg== X-Gm-Message-State: AOAM530VuxFflym0yxQRisJKHzDuyd2BJHePHKMBBZ+vZIAurJveEWtc 4hqUyN46iYkbeZPBIqKVjy7na7F14dYn2g== X-Google-Smtp-Source: ABdhPJw1ISYEQ5gZ1/0+g3rxWkESzUBSRG6YAERm1IUyuTqXbfwR5NpzmabTRcYnBT2/aBc5I2e/qQ== X-Received: by 2002:a05:6214:1ccb:: with SMTP id g11mr51610270qvd.31.1637180402768; Wed, 17 Nov 2021 12:20:02 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:02 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 02/17] fs: export variant of generic_write_checks without iov_iter Date: Wed, 17 Nov 2021 12:19:12 -0800 Message-Id: <6116cb20426370265251d18f2ccfbd7632bb5462.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Encoded I/O in Btrfs needs to check a write with a given logical size without an iov_iter that matches that size (because the iov_iter we have is for the compressed data). So, factor out the parts of generic_write_check() that don't need an iov_iter into a new generic_write_checks_count() function and export that. Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/read_write.c | 33 ++++++++++++++++++++------------- include/linux/fs.h | 1 + 2 files changed, 21 insertions(+), 13 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 4d60146243df..dc5000173b80 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1618,24 +1618,16 @@ int generic_write_check_limits(struct file *file, loff_t pos, loff_t *count) return 0; } -/* - * Performs necessary checks before doing a write - * - * Can adjust writing position or amount of bytes to write. - * Returns appropriate error code that caller should return or - * zero in case that write should be allowed. - */ -ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) +/* Like generic_write_checks(), but takes size of write instead of iter. */ +int generic_write_checks_count(struct kiocb *iocb, loff_t *count) { struct file *file = iocb->ki_filp; struct inode *inode = file->f_mapping->host; - loff_t count; - int ret; if (IS_SWAPFILE(inode)) return -ETXTBSY; - if (!iov_iter_count(from)) + if (!*count) return 0; /* FIXME: this is for backwards compatibility with 2.4 */ @@ -1645,8 +1637,23 @@ ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) if ((iocb->ki_flags & IOCB_NOWAIT) && !(iocb->ki_flags & IOCB_DIRECT)) return -EINVAL; - count = iov_iter_count(from); - ret = generic_write_check_limits(file, iocb->ki_pos, &count); + return generic_write_check_limits(iocb->ki_filp, iocb->ki_pos, count); +} +EXPORT_SYMBOL(generic_write_checks_count); + +/* + * Performs necessary checks before doing a write + * + * Can adjust writing position or amount of bytes to write. + * Returns appropriate error code that caller should return or + * zero in case that write should be allowed. + */ +ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from) +{ + loff_t count = iov_iter_count(from); + int ret; + + ret = generic_write_checks_count(iocb, &count); if (ret) return ret; diff --git a/include/linux/fs.h b/include/linux/fs.h index 364940c6a299..368d537409ab 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3201,6 +3201,7 @@ extern int sb_min_blocksize(struct super_block *, int); extern int generic_file_mmap(struct file *, struct vm_area_struct *); extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *); extern ssize_t generic_write_checks(struct kiocb *, struct iov_iter *); +extern int generic_write_checks_count(struct kiocb *iocb, loff_t *count); extern int generic_write_check_limits(struct file *file, loff_t pos, loff_t *count); extern int generic_file_rw_checks(struct file *file_in, struct file *file_out); From patchwork Wed Nov 17 20:19:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0CBDC433EF for ; Wed, 17 Nov 2021 20:20:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D085861AD2 for ; Wed, 17 Nov 2021 20:20:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240656AbhKQUXI (ORCPT ); Wed, 17 Nov 2021 15:23:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240644AbhKQUXD (ORCPT ); Wed, 17 Nov 2021 15:23:03 -0500 Received: from mail-qt1-x82f.google.com (mail-qt1-x82f.google.com [IPv6:2607:f8b0:4864:20::82f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB6FAC061570 for ; Wed, 17 Nov 2021 12:20:04 -0800 (PST) Received: by mail-qt1-x82f.google.com with SMTP id 8so3893154qtx.5 for ; Wed, 17 Nov 2021 12:20:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2+tU5Hk9m1KO64+xVReW2Yaf9drjl1ZINo1hzu/HhZ0=; b=LbvbrDx768RBA6QUe1JAHSnbdsbghFdhoQTllM5vvM72DWEymz7sxYz/WPrnwQLmLH p7Wty9R8KDphSzrj4SrxdZYZ+3gYVQsOFltVr4gGb6VHCxJmf+xy3fAKV/HMzjwqBtQh B6IffU+leCiSPzWt8b+dr+SS7pMYnjH1tmW/WL2n4dDTYIWoVrvNG00XUSKjMOWFzB+G QrFqvrNrFo+sPxuWjiDcqyIkmsNjMxHhzyaZGYfwvnXBEbdX/eEsQPX3FW44/rTsUN+E eI1ieIDX10lqC1ErzydCqpDe/pm6Qq9LDsePKL6d4lnKjDLhkXVMBtEFqLr0IHyRjTYo iN/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2+tU5Hk9m1KO64+xVReW2Yaf9drjl1ZINo1hzu/HhZ0=; b=klvcjjKKJwJ8+XnE3hSz/4kb+mGasgz5zRpNkwlQZ1NVr2W3F0+YCzLZ3qXorB/JUt NuHLmtsp62OkHZKqhzHlrV75juTOOfEUdx/OUnwHTxu2yQkBMVpVJP70R1+1iQooDIf+ +7+R3nRACdkG4Q4tiLMpG6Swmqo6vVtKguwPVDmXC0h+RhDqzhEmybWSgrjRp+UgtzM7 PAVx/lSwN9T6ZHHB6orEBrHCwbXuq+X/Ds1ynEAVVLnPf76woYTs8ovV0LG7YxPIXm9F npohY4h0pM55EYtlzQILbQrCjB/ZB9O8Kj41DNolggdv9BfFiOXKJ30WN/1COIAEICoi +Rdg== X-Gm-Message-State: AOAM532PFSvYURd2ZD6gOyqIUoaFWgoXe3BAhSqXjYUCXJ88ibj8eZ2D byDlc7C+NTyjbcV1I9nh75Q9CkQC0+aqjQ== X-Google-Smtp-Source: ABdhPJyQ1w6PPCTdo8y5lPYkCFsH/cTJzi4+hgYduZEngXkcDOeqCksmkMlluBRR2jk4ti9eNxD2gA== X-Received: by 2002:ac8:7f4f:: with SMTP id g15mr20268654qtk.309.1637180403614; Wed, 17 Nov 2021 12:20:03 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:03 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 03/17] btrfs: don't advance offset for compressed bios in btrfs_csum_one_bio() Date: Wed, 17 Nov 2021 12:19:13 -0800 Message-Id: <3e142cc82fbabe36bbb0862f0fd702d1c8a58f48.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval btrfs_csum_one_bio() loops over each filesystem block in the bio while keeping a cursor of its current logical position in the file in order to look up the ordered extent to add the checksums to. However, this doesn't make much sense for compressed extents, as a sector on disk does not correspond to a sector of decompressed file data. It happens to work because 1) the compressed bio always covers one ordered extent and 2) the size of the bio is always less than the size of the ordered extent. However, the second point will not always be true for encoded writes. Let's add a boolean parameter to btrfs_csum_one_bio() to indicate that it can assume that the bio only covers one ordered extent. Since we're already changing the signature, let's get rid of the contig parameter and make it implied by the offset parameter, similar to the change we recently made to btrfs_lookup_bio_sums(). Additionally, let's rename nr_sectors to blockcount to make it clear that it's the number of filesystem blocks, not the number of 512-byte sectors. Reviewed-by: Josef Bacik Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/btrfs/compression.c | 3 ++- fs/btrfs/ctree.h | 2 +- fs/btrfs/file-item.c | 32 ++++++++++++++------------------ fs/btrfs/inode.c | 8 ++++---- 4 files changed, 21 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 32da97c3c19d..73350f524fb8 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -590,7 +590,8 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, if (submit) { if (!skip_sum) { - ret = btrfs_csum_one_bio(inode, bio, start, 1); + ret = btrfs_csum_one_bio(inode, bio, start, + true); if (ret) goto finish_cb; } diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index f1dd2486dcb3..9fd677f2ce15 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3169,7 +3169,7 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans, struct btrfs_root *root, struct btrfs_ordered_sum *sums); blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio, - u64 file_start, int contig); + u64 offset, bool one_ordered); int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end, struct list_head *list, int search_commit); void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index 0f2e2ab34828..6203c85f5a1b 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -613,28 +613,28 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end, * btrfs_csum_one_bio - Calculates checksums of the data contained inside a bio * @inode: Owner of the data inside the bio * @bio: Contains the data to be checksummed - * @file_start: offset in file this bio begins to describe - * @contig: Boolean. If true/1 means all bio vecs in this bio are - * contiguous and they begin at @file_start in the file. False/0 - * means this bio can contain potentially discontiguous bio vecs - * so the logical offset of each should be calculated separately. + * @offset: If (u64)-1, @bio may contain discontiguous bio vecs, so the + * file offsets are determined from the page offsets in the bio. + * Otherwise, this is the starting file offset of the bio vecs in + * @bio, which must be contiguous. + * @one_ordered: If true, @bio only refers to one ordered extent. */ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio, - u64 file_start, int contig) + u64 offset, bool one_ordered) { struct btrfs_fs_info *fs_info = inode->root->fs_info; SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); struct btrfs_ordered_sum *sums; struct btrfs_ordered_extent *ordered = NULL; + const bool use_page_offsets = (offset == (u64)-1); char *data; struct bvec_iter iter; struct bio_vec bvec; int index; - int nr_sectors; + int blockcount; unsigned long total_bytes = 0; unsigned long this_sum_bytes = 0; int i; - u64 offset; unsigned nofs_flag; nofs_flag = memalloc_nofs_save(); @@ -648,18 +648,13 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio, sums->len = bio->bi_iter.bi_size; INIT_LIST_HEAD(&sums->list); - if (contig) - offset = file_start; - else - offset = 0; /* shut up gcc */ - sums->bytenr = bio->bi_iter.bi_sector << 9; index = 0; shash->tfm = fs_info->csum_shash; bio_for_each_segment(bvec, bio, iter) { - if (!contig) + if (use_page_offsets) offset = page_offset(bvec.bv_page) + bvec.bv_offset; if (!ordered) { @@ -678,13 +673,14 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio, } } - nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info, + blockcount = BTRFS_BYTES_TO_BLKS(fs_info, bvec.bv_len + fs_info->sectorsize - 1); - for (i = 0; i < nr_sectors; i++) { - if (offset >= ordered->file_offset + ordered->num_bytes || - offset < ordered->file_offset) { + for (i = 0; i < blockcount; i++) { + if (!one_ordered && + !in_range(offset, ordered->file_offset, + ordered->num_bytes)) { unsigned long bytes_left; sums->len = this_sum_bytes; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 91f7ed27e421..0bd992835cf5 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2308,7 +2308,7 @@ void btrfs_clear_delalloc_extent(struct inode *vfs_inode, static blk_status_t btrfs_submit_bio_start(struct inode *inode, struct bio *bio, u64 dio_file_offset) { - return btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); + return btrfs_csum_one_bio(BTRFS_I(inode), bio, (u64)-1, false); } /* @@ -2560,7 +2560,7 @@ blk_status_t btrfs_submit_data_bio(struct inode *inode, struct bio *bio, 0, btrfs_submit_bio_start); goto out; } else if (!skip_sum) { - ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); + ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, (u64)-1, false); if (ret) goto out; } @@ -8162,7 +8162,7 @@ static blk_status_t btrfs_submit_bio_start_direct_io(struct inode *inode, struct bio *bio, u64 dio_file_offset) { - return btrfs_csum_one_bio(BTRFS_I(inode), bio, dio_file_offset, 1); + return btrfs_csum_one_bio(BTRFS_I(inode), bio, dio_file_offset, false); } static void btrfs_end_dio_bio(struct bio *bio) @@ -8219,7 +8219,7 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, * If we aren't doing async submit, calculate the csum of the * bio now. */ - ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, file_offset, 1); + ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, file_offset, false); if (ret) goto err; } else { From patchwork Wed Nov 17 20:19:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0DADC433F5 for ; Wed, 17 Nov 2021 20:20:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 966B861B43 for ; Wed, 17 Nov 2021 20:20:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240658AbhKQUXJ (ORCPT ); Wed, 17 Nov 2021 15:23:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240647AbhKQUXE (ORCPT ); Wed, 17 Nov 2021 15:23:04 -0500 Received: from mail-qt1-x82c.google.com (mail-qt1-x82c.google.com [IPv6:2607:f8b0:4864:20::82c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6C3EC061764 for ; Wed, 17 Nov 2021 12:20:05 -0800 (PST) Received: by mail-qt1-x82c.google.com with SMTP id t11so3899300qtw.3 for ; Wed, 17 Nov 2021 12:20:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pi4j4V4l7C2Awm0t18FqNK5W2ZCvzHMUNiXr6cO8fFc=; b=PdI2LY3PYjcC7Q9v+cjV0dF6wRGnJe+EAXwzCHlnTojHf0EWneGeJpW+YVUVBnu0uO AdtTPW79hzsiNX5egx5FFCfKMTGQpV4G0Xd8imayEsBQUxnRQIm/Cs1RdPvLt3mpdLc+ ecIDGxMsBUhtaeAxbL1s8CwIbRf26pUcstrxkBJGneOwBL3YYGERa9qtXmGAf/yKBPCB PgB6GY24G/N9picrc7tpQ7juAEoH0KeWriGTaP9JpLd6N55LNo0ky2hKO+ePC3gAtg6t MPh4ut7xoixePbJeRoinWZDZSPArVd4ZDJteeU93SR3f7j8h/xLRhf5ggUiCl8tzNUey nb+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pi4j4V4l7C2Awm0t18FqNK5W2ZCvzHMUNiXr6cO8fFc=; b=RlcLNd1FttsJCn5erRGsN2OmI8Qf86gH+sZP7QAB3qhI+6/cZdOyN74XfIRlUuDOCX ebugng8J9piIRqwkGHN6cB7yIfKTOmuTGgVLjOkGsn2rPJxHkypZtfqbUse6zr7uRlpr neSYzGdTg2v94vr0sJABrhcKV6zti1dxpYt8l44ZKS+ALODi6Au3VTc5v6B3U5iSLltl j1GxsAUY1CZSo65or6IZdmdhqksXsCxS4JwWUWUkIAT7xx6jwXz8+SnRyOfhWaPTIM6H Os4Y4EI6JnrePIhs/HbfnH6pEg372Lx8ffnAh3brete3yQWIATf5RLlE0c/MMn2EtnFL QMvA== X-Gm-Message-State: AOAM531Dx1hEup9q5uP2d4dRPsJDTwdDu9tlLM9EJohLyFSfo5ARzlSo /3HO/nIWsXrtjOEky2n2C3HOGPIwMu5i+g== X-Google-Smtp-Source: ABdhPJyZEPUM0XpHxaY0FFZozE6AJRMuYfR1DXds18pkv81tRBb3MkJgy7a4KTsl89+9/zGLSYFkzg== X-Received: by 2002:a05:622a:15cc:: with SMTP id d12mr19798550qty.387.1637180404633; Wed, 17 Nov 2021 12:20:04 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:04 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 04/17] btrfs: add ram_bytes and offset to btrfs_ordered_extent Date: Wed, 17 Nov 2021 12:19:14 -0800 Message-Id: X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Currently, we only create ordered extents when ram_bytes == num_bytes and offset == 0. However, BTRFS_IOC_ENCODED_WRITE writes may create extents which only refer to a subset of the full unencoded extent, so we need to plumb these fields through the ordered extent infrastructure and pass them down to insert_reserved_file_extent(). Since we're changing the btrfs_add_ordered_extent* signature, let's get rid of the trivial wrappers and add a kernel-doc. Reviewed-by: Nikolay Borisov Reviewed-by: Josef Bacik Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 58 ++++++++++++-------- fs/btrfs/ordered-data.c | 119 ++++++++++++---------------------------- fs/btrfs/ordered-data.h | 22 +++++--- 3 files changed, 82 insertions(+), 117 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0bd992835cf5..1afadc7afff3 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -980,11 +980,14 @@ static int submit_one_async_extent(struct btrfs_inode *inode, } free_extent_map(em); - ret = btrfs_add_ordered_extent_compress(inode, start, /* file_offset */ - ins.objectid, /* disk_bytenr */ - async_extent->ram_size, /* num_bytes */ - ins.offset, /* disk_num_bytes */ - async_extent->compress_type); + ret = btrfs_add_ordered_extent(inode, start, /* file_offset */ + async_extent->ram_size, /* num_bytes */ + async_extent->ram_size, /* ram_bytes */ + ins.objectid, /* disk_bytenr */ + ins.offset, /* disk_num_bytes */ + 0, /* offset */ + 1 << BTRFS_ORDERED_COMPRESSED, + async_extent->compress_type); if (ret) { btrfs_drop_extent_cache(inode, start, end, 0); goto out_free_reserve; @@ -1233,9 +1236,10 @@ static noinline int cow_file_range(struct btrfs_inode *inode, } free_extent_map(em); - ret = btrfs_add_ordered_extent(inode, start, ins.objectid, - ram_size, cur_alloc_size, - BTRFS_ORDERED_REGULAR); + ret = btrfs_add_ordered_extent(inode, start, ram_size, ram_size, + ins.objectid, cur_alloc_size, 0, + 1 << BTRFS_ORDERED_REGULAR, + BTRFS_COMPRESS_NONE); if (ret) goto out_drop_extent_cache; @@ -1893,10 +1897,11 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, goto error; } free_extent_map(em); - ret = btrfs_add_ordered_extent(inode, cur_offset, - disk_bytenr, num_bytes, - num_bytes, - BTRFS_ORDERED_PREALLOC); + ret = btrfs_add_ordered_extent(inode, + cur_offset, num_bytes, num_bytes, + disk_bytenr, num_bytes, 0, + 1 << BTRFS_ORDERED_PREALLOC, + BTRFS_COMPRESS_NONE); if (ret) { btrfs_drop_extent_cache(inode, cur_offset, cur_offset + num_bytes - 1, @@ -1905,9 +1910,11 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, } } else { ret = btrfs_add_ordered_extent(inode, cur_offset, + num_bytes, num_bytes, disk_bytenr, num_bytes, - num_bytes, - BTRFS_ORDERED_NOCOW); + 0, + 1 << BTRFS_ORDERED_NOCOW, + BTRFS_COMPRESS_NONE); if (ret) goto error; } @@ -2864,6 +2871,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, struct btrfs_key ins; u64 disk_num_bytes = btrfs_stack_file_extent_disk_num_bytes(stack_fi); u64 disk_bytenr = btrfs_stack_file_extent_disk_bytenr(stack_fi); + u64 offset = btrfs_stack_file_extent_offset(stack_fi); u64 num_bytes = btrfs_stack_file_extent_num_bytes(stack_fi); u64 ram_bytes = btrfs_stack_file_extent_ram_bytes(stack_fi); struct btrfs_drop_extents_args drop_args = { 0 }; @@ -2938,7 +2946,8 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans, goto out; ret = btrfs_alloc_reserved_file_extent(trans, root, btrfs_ino(inode), - file_pos, qgroup_reserved, &ins); + file_pos - offset, + qgroup_reserved, &ins); out: btrfs_free_path(path); @@ -2964,20 +2973,20 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans, struct btrfs_ordered_extent *oe) { struct btrfs_file_extent_item stack_fi; - u64 logical_len; bool update_inode_bytes; + u64 num_bytes = oe->num_bytes; + u64 ram_bytes = oe->ram_bytes; memset(&stack_fi, 0, sizeof(stack_fi)); btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_REG); btrfs_set_stack_file_extent_disk_bytenr(&stack_fi, oe->disk_bytenr); btrfs_set_stack_file_extent_disk_num_bytes(&stack_fi, oe->disk_num_bytes); + btrfs_set_stack_file_extent_offset(&stack_fi, oe->offset); if (test_bit(BTRFS_ORDERED_TRUNCATED, &oe->flags)) - logical_len = oe->truncated_len; - else - logical_len = oe->num_bytes; - btrfs_set_stack_file_extent_num_bytes(&stack_fi, logical_len); - btrfs_set_stack_file_extent_ram_bytes(&stack_fi, logical_len); + num_bytes = ram_bytes = oe->truncated_len; + btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes); + btrfs_set_stack_file_extent_ram_bytes(&stack_fi, ram_bytes); btrfs_set_stack_file_extent_compression(&stack_fi, oe->compress_type); /* Encryption and other encoding is reserved and all 0 */ @@ -7399,8 +7408,11 @@ static struct extent_map *btrfs_create_dio_extent(struct btrfs_inode *inode, if (IS_ERR(em)) goto out; } - ret = btrfs_add_ordered_extent_dio(inode, start, block_start, len, - block_len, type); + ret = btrfs_add_ordered_extent(inode, start, len, len, block_start, + block_len, 0, + (1 << type) | + (1 << BTRFS_ORDERED_DIRECT), + BTRFS_COMPRESS_NONE); if (ret) { if (em) { free_extent_map(em); diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 6b51fd2ec5ac..5e4c59b00b01 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -143,16 +143,27 @@ static inline struct rb_node *tree_search(struct btrfs_ordered_inode_tree *tree, return ret; } -/* - * Allocate and add a new ordered_extent into the per-inode tree. +/** + * btrfs_add_ordered_extent - Add an ordered extent to the per-inode tree. + * @inode: inode that this extent is for. + * @file_offset: Logical offset in file where the extent starts. + * @num_bytes: Logical length of extent in file. + * @ram_bytes: Full length of unencoded data. + * @disk_bytenr: Offset of extent on disk. + * @disk_num_bytes: Size of extent on disk. + * @offset: Offset into unencoded data where file data starts. + * @flags: Flags specifying type of extent (1 << BTRFS_ORDERED_*). + * @compress_type: Compression algorithm used for data. * - * The tree is given a single reference on the ordered extent that was - * inserted. + * Most of these parameters correspond to &struct btrfs_file_extent_item. The + * tree is given a single reference on the ordered extent that was inserted. + * + * Return: 0 or -ENOMEM. */ -static int __btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, - u64 disk_num_bytes, int type, int dio, - int compress_type) +int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset, + u64 num_bytes, u64 ram_bytes, u64 disk_bytenr, + u64 disk_num_bytes, u64 offset, int flags, + int compress_type) { struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; @@ -161,7 +172,8 @@ static int __btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset struct btrfs_ordered_extent *entry; int ret; - if (type == BTRFS_ORDERED_NOCOW || type == BTRFS_ORDERED_PREALLOC) { + if (flags & + ((1 << BTRFS_ORDERED_NOCOW) | (1 << BTRFS_ORDERED_PREALLOC))) { /* For nocow write, we can release the qgroup rsv right now */ ret = btrfs_qgroup_free_data(inode, NULL, file_offset, num_bytes); if (ret < 0) @@ -181,9 +193,11 @@ static int __btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset return -ENOMEM; entry->file_offset = file_offset; - entry->disk_bytenr = disk_bytenr; entry->num_bytes = num_bytes; + entry->ram_bytes = ram_bytes; + entry->disk_bytenr = disk_bytenr; entry->disk_num_bytes = disk_num_bytes; + entry->offset = offset; entry->bytes_left = num_bytes; entry->inode = igrab(&inode->vfs_inode); entry->compress_type = compress_type; @@ -191,18 +205,12 @@ static int __btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset entry->qgroup_rsv = ret; entry->physical = (u64)-1; - ASSERT(type == BTRFS_ORDERED_REGULAR || - type == BTRFS_ORDERED_NOCOW || - type == BTRFS_ORDERED_PREALLOC || - type == BTRFS_ORDERED_COMPRESSED); - set_bit(type, &entry->flags); + ASSERT((flags & ~BTRFS_ORDERED_TYPE_FLAGS) == 0); + entry->flags = flags; percpu_counter_add_batch(&fs_info->ordered_bytes, num_bytes, fs_info->delalloc_batch); - if (dio) - set_bit(BTRFS_ORDERED_DIRECT, &entry->flags); - /* one ref for the tree */ refcount_set(&entry->refs, 1); init_waitqueue_head(&entry->wait); @@ -247,41 +255,6 @@ static int __btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset return 0; } -int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, u64 disk_num_bytes, - int type) -{ - ASSERT(type == BTRFS_ORDERED_REGULAR || - type == BTRFS_ORDERED_NOCOW || - type == BTRFS_ORDERED_PREALLOC); - return __btrfs_add_ordered_extent(inode, file_offset, disk_bytenr, - num_bytes, disk_num_bytes, type, 0, - BTRFS_COMPRESS_NONE); -} - -int btrfs_add_ordered_extent_dio(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, - u64 disk_num_bytes, int type) -{ - ASSERT(type == BTRFS_ORDERED_REGULAR || - type == BTRFS_ORDERED_NOCOW || - type == BTRFS_ORDERED_PREALLOC); - return __btrfs_add_ordered_extent(inode, file_offset, disk_bytenr, - num_bytes, disk_num_bytes, type, 1, - BTRFS_COMPRESS_NONE); -} - -int btrfs_add_ordered_extent_compress(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, - u64 disk_num_bytes, int compress_type) -{ - ASSERT(compress_type != BTRFS_COMPRESS_NONE); - return __btrfs_add_ordered_extent(inode, file_offset, disk_bytenr, - num_bytes, disk_num_bytes, - BTRFS_ORDERED_COMPRESSED, 0, - compress_type); -} - /* * Add a struct btrfs_ordered_sum into the list of checksums to be inserted * when an ordered extent is finished. If the list covers more than one @@ -1052,42 +1025,18 @@ static int clone_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pos, struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; u64 file_offset = ordered->file_offset + pos; u64 disk_bytenr = ordered->disk_bytenr + pos; - u64 num_bytes = len; - u64 disk_num_bytes = len; - int type; - unsigned long flags_masked = ordered->flags & ~(1 << BTRFS_ORDERED_DIRECT); - int compress_type = ordered->compress_type; - unsigned long weight; - int ret; - - weight = hweight_long(flags_masked); - WARN_ON_ONCE(weight > 1); - if (!weight) - type = 0; - else - type = __ffs(flags_masked); + unsigned long flags = ordered->flags & BTRFS_ORDERED_TYPE_FLAGS; /* - * The splitting extent is already counted and will be added again - * in btrfs_add_ordered_extent_*(). Subtract num_bytes to avoid - * double counting. + * The splitting extent is already counted and will be added again in + * btrfs_add_ordered_extent_*(). Subtract len to avoid double counting. */ - percpu_counter_add_batch(&fs_info->ordered_bytes, -num_bytes, + percpu_counter_add_batch(&fs_info->ordered_bytes, -len, fs_info->delalloc_batch); - if (test_bit(BTRFS_ORDERED_COMPRESSED, &ordered->flags)) { - WARN_ON_ONCE(1); - ret = btrfs_add_ordered_extent_compress(BTRFS_I(inode), - file_offset, disk_bytenr, num_bytes, - disk_num_bytes, compress_type); - } else if (test_bit(BTRFS_ORDERED_DIRECT, &ordered->flags)) { - ret = btrfs_add_ordered_extent_dio(BTRFS_I(inode), file_offset, - disk_bytenr, num_bytes, disk_num_bytes, type); - } else { - ret = btrfs_add_ordered_extent(BTRFS_I(inode), file_offset, - disk_bytenr, num_bytes, disk_num_bytes, type); - } - - return ret; + WARN_ON_ONCE(flags & (1 << BTRFS_ORDERED_COMPRESSED)); + return btrfs_add_ordered_extent(BTRFS_I(inode), file_offset, len, len, + disk_bytenr, len, 0, flags, + ordered->compress_type); } int btrfs_split_ordered_extent(struct btrfs_ordered_extent *ordered, u64 pre, diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h index 4194e960ff61..0feb0c29839e 100644 --- a/fs/btrfs/ordered-data.h +++ b/fs/btrfs/ordered-data.h @@ -76,6 +76,13 @@ enum { BTRFS_ORDERED_PENDING, }; +/* BTRFS_ORDERED_* flags that specify the type of the extent. */ +#define BTRFS_ORDERED_TYPE_FLAGS ((1UL << BTRFS_ORDERED_REGULAR) | \ + (1UL << BTRFS_ORDERED_NOCOW) | \ + (1UL << BTRFS_ORDERED_PREALLOC) | \ + (1UL << BTRFS_ORDERED_COMPRESSED) | \ + (1UL << BTRFS_ORDERED_DIRECT)) + struct btrfs_ordered_extent { /* logical offset in the file */ u64 file_offset; @@ -84,9 +91,11 @@ struct btrfs_ordered_extent { * These fields directly correspond to the same fields in * btrfs_file_extent_item. */ - u64 disk_bytenr; u64 num_bytes; + u64 ram_bytes; + u64 disk_bytenr; u64 disk_num_bytes; + u64 offset; /* number of bytes that still need writing */ u64 bytes_left; @@ -179,14 +188,9 @@ bool btrfs_dec_test_ordered_pending(struct btrfs_inode *inode, struct btrfs_ordered_extent **cached, u64 file_offset, u64 io_size); int btrfs_add_ordered_extent(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, u64 disk_num_bytes, - int type); -int btrfs_add_ordered_extent_dio(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, - u64 disk_num_bytes, int type); -int btrfs_add_ordered_extent_compress(struct btrfs_inode *inode, u64 file_offset, - u64 disk_bytenr, u64 num_bytes, - u64 disk_num_bytes, int compress_type); + u64 num_bytes, u64 ram_bytes, u64 disk_bytenr, + u64 disk_num_bytes, u64 offset, int flags, + int compress_type); void btrfs_add_ordered_sum(struct btrfs_ordered_extent *entry, struct btrfs_ordered_sum *sum); struct btrfs_ordered_extent *btrfs_lookup_ordered_extent(struct btrfs_inode *inode, From patchwork Wed Nov 17 20:19:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625335 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BEADC433F5 for ; Wed, 17 Nov 2021 20:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7F7B261AF0 for ; Wed, 17 Nov 2021 20:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240661AbhKQUXK (ORCPT ); Wed, 17 Nov 2021 15:23:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53056 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240650AbhKQUXF (ORCPT ); Wed, 17 Nov 2021 15:23:05 -0500 Received: from mail-qv1-xf34.google.com (mail-qv1-xf34.google.com [IPv6:2607:f8b0:4864:20::f34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9EDBBC061766 for ; Wed, 17 Nov 2021 12:20:06 -0800 (PST) Received: by mail-qv1-xf34.google.com with SMTP id v2so2885420qve.11 for ; Wed, 17 Nov 2021 12:20:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Mr/qxi8LZKYtL1UDp9JV8wqH9H2ofq9cHo044ox5hNY=; b=ry6+oXpCRsTrQ7LebQ4uAijV1tQh2IxULh+lPfuC7YSzb/kqkTFW/xgEQq4eOyWV/9 vaGniA3WDrTfkDdJUU+Hc1l2FoCLUM2tmKCb8ZbFYJpiFnBaGavauWa9/O+CDXv+PZjo u7R2y/l+jGOx6gwMgn4xbwEchX42onsuiQWFn5UPoy48Y9mnsbBzdmPMhU4yk9WBzsK+ DUv89bRXg6eExiIK9stZHI4T2/6vq+9Zih4Xf4Bu2MU7Nsgzdozm1aOM5Rk7MJAqecU7 YJkkwjC6MQsZGOvmoJsh8VoEgvIoipjJtih6lg3lWhOly8y+8iki6EH5wPmvs7+sPYC8 hkBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Mr/qxi8LZKYtL1UDp9JV8wqH9H2ofq9cHo044ox5hNY=; b=J1nJdiz1gZ9Upft7+DSzwxjR6T8m6d9n6iLgqlfk3B6htRrqhEmjYMkac3AwlsBVtR d79ynpipEd8LxEhteNaqW5kfUFWxv7UDBEOz19WB7Cce9kP4DmZ5tHb7vbBmOaQn6Xd0 zJGeaS9SVguy6ryx3Cghh+f9BENHtoWJkb3mzVBEZ1oeCsdfP/EMNlbf6nvyzqKnnP1o jEubwqUW5PdAydOGxNPx5v+v+xmaB9NqOHj2oSRAYNI27uJMngVkutdAFQ+UiINuLzUb ggx7+v7hHFK/kWyt9ruDd8eYanIyBdVu5N5ibrLQ+PdzBgFJDHit+ro2Lqe86iLH5UeK TG6w== X-Gm-Message-State: AOAM532r3f6N5YAKlYjNIcOZSn0mTMfUnmXL7BxHD2Qeduy5RHlJKlvV WMmpUztfstrVrZzicSpZPweFBmNE79C49Q== X-Google-Smtp-Source: ABdhPJxkV6G0LeKV++akmYtcod1tHejzrg07lXp4iAuClQSM8tf2WhGDaCZVP/4bf3Xek0TidbHnYg== X-Received: by 2002:a05:6214:f62:: with SMTP id iy2mr57514876qvb.25.1637180405601; Wed, 17 Nov 2021 12:20:05 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:05 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 05/17] btrfs: support different disk extent size for delalloc Date: Wed, 17 Nov 2021 12:19:15 -0800 Message-Id: <3bdadb3fb344495323fbb4a098fe97c08d967a31.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Currently, we always reserve the same extent size in the file and extent size on disk for delalloc because the former is the worst case for the latter. For BTRFS_IOC_ENCODED_WRITE writes, we know the exact size of the extent on disk, which may be less than or greater than (for bookends) the size in the file. Add a disk_num_bytes parameter to btrfs_delalloc_reserve_metadata() so that we can reserve the correct amount of csum bytes. No functional change. Reviewed-by: Nikolay Borisov Reviewed-by: Josef Bacik Signed-off-by: Omar Sandoval --- fs/btrfs/ctree.h | 3 ++- fs/btrfs/delalloc-space.c | 18 ++++++++++-------- fs/btrfs/file.c | 3 ++- fs/btrfs/inode.c | 4 ++-- fs/btrfs/relocation.c | 2 +- 5 files changed, 17 insertions(+), 13 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9fd677f2ce15..2e7f74060a14 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2823,7 +2823,8 @@ void btrfs_subvolume_release_metadata(struct btrfs_root *root, struct btrfs_block_rsv *rsv); void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes); -int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes); +int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes, + u64 disk_num_bytes); u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo); int btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u64 end); diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c index bca438c7c972..ec96f1b342e0 100644 --- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -267,11 +267,11 @@ static void btrfs_calculate_inode_block_rsv_size(struct btrfs_fs_info *fs_info, } static void calc_inode_reservations(struct btrfs_fs_info *fs_info, - u64 num_bytes, u64 *meta_reserve, - u64 *qgroup_reserve) + u64 num_bytes, u64 disk_num_bytes, + u64 *meta_reserve, u64 *qgroup_reserve) { u64 nr_extents = count_max_extents(num_bytes); - u64 csum_leaves = btrfs_csum_bytes_to_leaves(fs_info, num_bytes); + u64 csum_leaves = btrfs_csum_bytes_to_leaves(fs_info, disk_num_bytes); u64 inode_update = btrfs_calc_metadata_size(fs_info, 1); *meta_reserve = btrfs_calc_insert_metadata_size(fs_info, @@ -285,7 +285,8 @@ static void calc_inode_reservations(struct btrfs_fs_info *fs_info, *qgroup_reserve = nr_extents * fs_info->nodesize; } -int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes) +int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes, + u64 disk_num_bytes) { struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; @@ -315,6 +316,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes) } num_bytes = ALIGN(num_bytes, fs_info->sectorsize); + disk_num_bytes = ALIGN(disk_num_bytes, fs_info->sectorsize); /* * We always want to do it this way, every other way is wrong and ends @@ -326,8 +328,8 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes) * everything out and try again, which is bad. This way we just * over-reserve slightly, and clean up the mess when we are done. */ - calc_inode_reservations(fs_info, num_bytes, &meta_reserve, - &qgroup_reserve); + calc_inode_reservations(fs_info, num_bytes, disk_num_bytes, + &meta_reserve, &qgroup_reserve); ret = btrfs_qgroup_reserve_meta_prealloc(root, qgroup_reserve, true); if (ret) return ret; @@ -346,7 +348,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes) spin_lock(&inode->lock); nr_extents = count_max_extents(num_bytes); btrfs_mod_outstanding_extents(inode, nr_extents); - inode->csum_bytes += num_bytes; + inode->csum_bytes += disk_num_bytes; btrfs_calculate_inode_block_rsv_size(fs_info, inode); spin_unlock(&inode->lock); @@ -451,7 +453,7 @@ int btrfs_delalloc_reserve_space(struct btrfs_inode *inode, ret = btrfs_check_data_free_space(inode, reserved, start, len); if (ret < 0) return ret; - ret = btrfs_delalloc_reserve_metadata(inode, len); + ret = btrfs_delalloc_reserve_metadata(inode, len, len); if (ret < 0) btrfs_free_reserved_data_space(inode, *reserved, start, len); return ret; diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 11204dbbe053..5fbf0a2aba2e 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1749,7 +1749,8 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, fs_info->sectorsize); WARN_ON(reserve_bytes == 0); ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), - reserve_bytes); + reserve_bytes, + reserve_bytes); if (ret) { if (!only_release_metadata) btrfs_free_reserved_data_space(BTRFS_I(inode), diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1afadc7afff3..0c5b9832f975 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5050,7 +5050,7 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len, goto out; } } - ret = btrfs_delalloc_reserve_metadata(inode, blocksize); + ret = btrfs_delalloc_reserve_metadata(inode, blocksize, blocksize); if (ret < 0) { if (!only_release_metadata) btrfs_free_reserved_data_space(inode, data_reserved, @@ -7812,7 +7812,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map, struct extent_map *em2; /* We can NOCOW, so only need to reserve metadata space. */ - ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), len); + ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), len, len); if (ret < 0) { /* Our caller expects us to free the input extent map. */ free_extent_map(em); diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index a455a1ead0d6..fa8dcc3375b5 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2996,7 +2996,7 @@ static int relocate_one_page(struct inode *inode, struct file_ra_state *ra, /* Reserve metadata for this range */ ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), - clamped_len); + clamped_len, clamped_len); if (ret) goto release_page; From patchwork Wed Nov 17 20:19:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7C2AC433EF for ; Wed, 17 Nov 2021 20:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AE06C61AF0 for ; Wed, 17 Nov 2021 20:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240664AbhKQUXL (ORCPT ); Wed, 17 Nov 2021 15:23:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240653AbhKQUXG (ORCPT ); Wed, 17 Nov 2021 15:23:06 -0500 Received: from mail-qk1-x735.google.com (mail-qk1-x735.google.com [IPv6:2607:f8b0:4864:20::735]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90646C061570 for ; Wed, 17 Nov 2021 12:20:07 -0800 (PST) Received: by mail-qk1-x735.google.com with SMTP id p4so3896053qkm.7 for ; Wed, 17 Nov 2021 12:20:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5bYjNSJeFPNMOWrUYgw2GmfA6HDuESrxPqPiv0fJ0og=; b=uFdkXbFz1RbRav2HyOoSaZAKlfhfiTy0krGOe5dUzac6FtIKBwhtPwb8I3C6BFSuGU uoYGhF4KetzqcOaPdkKX17iIm7ALZxVF33VxeKJAcFUC1f5m78jYC0HDDzYWy5fxEf0h +2Hmj/pwa6bnbu/j0HbjCSTqkKpPibfRlZkZu5LRXbD6YvMFOwer8jp4Q66iwBPV4gUv 2QX1sx/0IQSgxMsx4kJyqWFeNN2xD9u8q1YtRlq/XTpFUsiR+8nSfFbDaESS/sVPo/uZ Q/wxZ5jg5MjfROhLsgnMGY4r561XkiIRMWnOFi6o8+ZXZWnJ3yfThi8WpKXnhb/6j+Hv 2Jvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5bYjNSJeFPNMOWrUYgw2GmfA6HDuESrxPqPiv0fJ0og=; b=7yxoFUuiitrpkImOQFc3frKxmgTo6PlcFaxt7TqYWmq+pXD+gxhaCYWjku/3n4btsE icbdsfR2obbBVsckhVNLpicI3amWmiHldT6iUJ/K4CRB+hXKS5O9Ch5LsDBbcN2WqZR9 qa3/RSVsfDAI9JOJMAfuQBX0vlXk1Zg9CGaRugqFZvNsy6PF61PEmtReBD4Mk6rj0Uzm vXy+uYZFOvMboVrEbqIgm/uI57OvUmR77CNbPW+0ZqbMNafxpvADazTJTwU7Tp4JaKwe sxcfkkNW50fteNuzsKsbzK9Eg9cSt5vDC3eSP0pAuvdY81hcoJFgL4mKmYyJD9fgtIk7 HqVA== X-Gm-Message-State: AOAM533y2vUqkPz/l0cPEjmuQi6km1a8awhk43AK5WDyxx9OM6nu37pv B5urV5mxGbMELghgVNrKmWgSOC0r95DFIQ== X-Google-Smtp-Source: ABdhPJxpQ13e9iTT+YyCXpjns+THfydPplxrniRATZpqdzLrmjppcUlQQ40SSgzsV3EfhapOQPOz6w== X-Received: by 2002:a05:620a:17aa:: with SMTP id ay42mr15868451qkb.481.1637180406485; Wed, 17 Nov 2021 12:20:06 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:06 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 06/17] btrfs: clean up cow_file_range_inline() Date: Wed, 17 Nov 2021 12:19:16 -0800 Message-Id: <6ee20a1022fe350038b2b0ee1c464285959512b8.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval The start parameter to cow_file_range_inline() (and insert_inline_extent()) is always 0, so get rid of it and simplify the logic in those two functions. Pass btrfs_inode to insert_inline_extent() and remove the redundant root parameter. Also document the requirements for creating an inline extent. No functional change. Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 88 +++++++++++++++++++++--------------------------- 1 file changed, 38 insertions(+), 50 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0c5b9832f975..a5cae0c6d992 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -233,12 +233,13 @@ static int btrfs_init_inode_security(struct btrfs_trans_handle *trans, * no overlapping inline items exist in the btree */ static int insert_inline_extent(struct btrfs_trans_handle *trans, - struct btrfs_path *path, bool extent_inserted, - struct btrfs_root *root, struct inode *inode, - u64 start, size_t size, size_t compressed_size, + struct btrfs_path *path, + struct btrfs_inode *inode, bool extent_inserted, + size_t size, size_t compressed_size, int compress_type, struct page **compressed_pages) { + struct btrfs_root *root = inode->root; struct extent_buffer *leaf; struct page *page = NULL; char *kaddr; @@ -246,7 +247,6 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, struct btrfs_file_extent_item *ei; int ret; size_t cur_size = size; - unsigned long offset; ASSERT((compressed_size > 0 && compressed_pages) || (compressed_size == 0 && !compressed_pages)); @@ -258,8 +258,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, struct btrfs_key key; size_t datasize; - key.objectid = btrfs_ino(BTRFS_I(inode)); - key.offset = start; + key.objectid = btrfs_ino(inode); + key.offset = 0; key.type = BTRFS_EXTENT_DATA_KEY; datasize = btrfs_file_extent_calc_inline_size(cur_size); @@ -297,12 +297,10 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, btrfs_set_file_extent_compression(leaf, ei, compress_type); } else { - page = find_get_page(inode->i_mapping, - start >> PAGE_SHIFT); + page = find_get_page(inode->vfs_inode.i_mapping, 0); btrfs_set_file_extent_compression(leaf, ei, 0); kaddr = kmap_atomic(page); - offset = offset_in_page(start); - write_extent_buffer(leaf, kaddr + offset, ptr, size); + write_extent_buffer(leaf, kaddr, ptr, size); kunmap_atomic(kaddr); put_page(page); } @@ -313,8 +311,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, * We align size to sectorsize for inline extents just for simplicity * sake. */ - size = ALIGN(size, root->fs_info->sectorsize); - ret = btrfs_inode_set_file_extent_range(BTRFS_I(inode), start, size); + ret = btrfs_inode_set_file_extent_range(inode, 0, + ALIGN(size, root->fs_info->sectorsize)); if (ret) goto fail; @@ -327,7 +325,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, * before we unlock the pages. Otherwise we * could end up racing with unlink. */ - BTRFS_I(inode)->disk_i_size = inode->i_size; + inode->disk_i_size = i_size_read(&inode->vfs_inode); + fail: return ret; } @@ -338,8 +337,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, * does the checks required to make sure the data is small enough * to fit as an inline extent. */ -static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 start, - u64 end, size_t compressed_size, +static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size, + size_t compressed_size, int compress_type, struct page **compressed_pages) { @@ -347,26 +346,21 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 start, struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; struct btrfs_trans_handle *trans; - u64 isize = i_size_read(&inode->vfs_inode); - u64 actual_end = min(end + 1, isize); - u64 inline_len = actual_end - start; - u64 aligned_end = ALIGN(end, fs_info->sectorsize); - u64 data_len = inline_len; + u64 data_len = compressed_size ? compressed_size : size; int ret; struct btrfs_path *path; - if (compressed_size) - data_len = compressed_size; - - if (start > 0 || - actual_end > fs_info->sectorsize || + /* + * We can create an inline extent if it ends at or beyond the current + * i_size, is no larger than a sector (decompressed), and the (possibly + * compressed) data fits in a leaf and the configured maximum inline + * size. + */ + if (size < i_size_read(&inode->vfs_inode) || + size > fs_info->sectorsize || data_len > BTRFS_MAX_INLINE_DATA_SIZE(fs_info) || - (!compressed_size && - (actual_end & (fs_info->sectorsize - 1)) == 0) || - end + 1 < isize || - data_len > fs_info->max_inline) { + data_len > fs_info->max_inline) return 1; - } path = btrfs_alloc_path(); if (!path) @@ -380,30 +374,21 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 start, trans->block_rsv = &inode->block_rsv; drop_args.path = path; - drop_args.start = start; - drop_args.end = aligned_end; + drop_args.start = 0; + drop_args.end = fs_info->sectorsize; drop_args.drop_cache = true; drop_args.replace_extent = true; - - if (compressed_size && compressed_pages) - drop_args.extent_item_size = btrfs_file_extent_calc_inline_size( - compressed_size); - else - drop_args.extent_item_size = btrfs_file_extent_calc_inline_size( - inline_len); - + drop_args.extent_item_size = btrfs_file_extent_calc_inline_size(data_len); ret = btrfs_drop_extents(trans, root, inode, &drop_args); if (ret) { btrfs_abort_transaction(trans, ret); goto out; } - if (isize > actual_end) - inline_len = min_t(u64, isize, actual_end); - ret = insert_inline_extent(trans, path, drop_args.extent_inserted, - root, &inode->vfs_inode, start, - inline_len, compressed_size, - compress_type, compressed_pages); + ret = insert_inline_extent(trans, path, inode, + drop_args.extent_inserted, size, + compressed_size, compress_type, + compressed_pages); if (ret && ret != -ENOSPC) { btrfs_abort_transaction(trans, ret); goto out; @@ -412,7 +397,7 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 start, goto out; } - btrfs_update_inode_bytes(inode, inline_len, drop_args.bytes_found); + btrfs_update_inode_bytes(inode, size, drop_args.bytes_found); ret = btrfs_update_inode(trans, root, inode); if (ret && ret != -ENOSPC) { btrfs_abort_transaction(trans, ret); @@ -734,12 +719,12 @@ static noinline int compress_file_range(struct async_chunk *async_chunk) /* we didn't compress the entire range, try * to make an uncompressed inline extent. */ - ret = cow_file_range_inline(BTRFS_I(inode), start, end, + ret = cow_file_range_inline(BTRFS_I(inode), actual_end, 0, BTRFS_COMPRESS_NONE, NULL); } else { /* try making a compressed inline extent */ - ret = cow_file_range_inline(BTRFS_I(inode), start, end, + ret = cow_file_range_inline(BTRFS_I(inode), actual_end, total_compressed, compress_type, pages); } @@ -1154,8 +1139,11 @@ static noinline int cow_file_range(struct btrfs_inode *inode, * So here we skip inline extent creation completely. */ if (start == 0 && fs_info->sectorsize == PAGE_SIZE) { + u64 actual_end = min_t(u64, i_size_read(&inode->vfs_inode), + end + 1); + /* lets try to make an inline extent */ - ret = cow_file_range_inline(inode, start, end, 0, + ret = cow_file_range_inline(inode, actual_end, 0, BTRFS_COMPRESS_NONE, NULL); if (ret == 0) { /* From patchwork Wed Nov 17 20:19:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C529C433FE for ; Wed, 17 Nov 2021 20:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 378F861AD2 for ; Wed, 17 Nov 2021 20:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240690AbhKQUXT (ORCPT ); Wed, 17 Nov 2021 15:23:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53066 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240649AbhKQUXH (ORCPT ); Wed, 17 Nov 2021 15:23:07 -0500 Received: from mail-qt1-x836.google.com (mail-qt1-x836.google.com [IPv6:2607:f8b0:4864:20::836]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76B7EC061764 for ; Wed, 17 Nov 2021 12:20:08 -0800 (PST) Received: by mail-qt1-x836.google.com with SMTP id a2so3847807qtx.11 for ; Wed, 17 Nov 2021 12:20:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mSkrOXEvpNqF1zdxEsTNF4ZMegNRF+/ApMheM38Qvv4=; b=C9gUjDfR45vu1fWiZGITidNEWSS2HOZVxKBU02EQ9im61hWOBlLCR8BmAfgY1aKcJ7 TTJOl/g7YBiu0Kay9xXaBOP5riXKBZqy174xW/F4UZ4d+tdOBc5/fN5W99ARIWANGDoc Qbo8+6QYSFjqdg8ouinCKsnCqxc+EHQiE7z8CE2AB7sMSRudsINtfPdAg9aDvvR4nrO9 iWPrvdfMo/upixmONgU+CktEyC0H8OOzDp30TwEGi6M0oYP8PiuoyHosI+yDPV6e89iq IGjQl0SR22F8muP1r0mkycxfJ0VdqvwpFekXPxHNIfLeeBzwk3b15pf1ZLfdjYpUdPdq iz9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mSkrOXEvpNqF1zdxEsTNF4ZMegNRF+/ApMheM38Qvv4=; b=BX1LvVOr0G0lAwxAZ9IfZUpAIJHZg+Gf9LPj6uoFVduAd2enACrgfWcwBkziwXEy6G WxyQ5CIsbITtfSQR7ezy4kATHB1zu8MdV0+t75Q7RisK4hr28jOljKWsUI6pwQU8KWqH G06mmSbWUikuCB4vxsa/F6kAcmtYfslCnRTKit19Q2bsH5OKVoMe3Vzi51xdOMttgSCl 2SNg6yBglS0YhCQHmBGIRle6JWPjFZuJ/PoMStqufPtiD5BMjW+y1FH/qlLg+BwCru9K Uy1hXyCJWT/haWldcPM4aIJLCU0qVIsdzZbP6WBtuz2XCxb6vB6IygDBPCmL2HinlnOB rS3Q== X-Gm-Message-State: AOAM532nEb41rST+bLdoYQVhYUFtr6D+Svv/kMOHT5j5+C6OK2Y5WKks /8lkeI2L7USANmCYWVVrGJbRABnXFlr+uQ== X-Google-Smtp-Source: ABdhPJwpejxhmI3wyyNYgaXjECbStLwcMOaltcGBNl5IYXr7aeBoDSkNUxzvcWJL/0jseaX08iPDrQ== X-Received: by 2002:a05:622a:2c9:: with SMTP id a9mr19589878qtx.28.1637180407408; Wed, 17 Nov 2021 12:20:07 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:07 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 07/17] btrfs: optionally extend i_size in cow_file_range_inline() Date: Wed, 17 Nov 2021 12:19:17 -0800 Message-Id: <1040f2fba5861130283b16aa915a9ce42f234d9e.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Currently, an inline extent is always created after i_size is extended from btrfs_dirty_pages(). However, for encoded writes, we only want to update i_size after we successfully created the inline extent. Add an update_i_size parameter to cow_file_range_inline() and insert_inline_extent() and pass in the size of the extent rather than determining it from i_size. Reviewed-by: Josef Bacik Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index a5cae0c6d992..c2efea101f61 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -237,7 +237,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, struct btrfs_inode *inode, bool extent_inserted, size_t size, size_t compressed_size, int compress_type, - struct page **compressed_pages) + struct page **compressed_pages, + bool update_i_size) { struct btrfs_root *root = inode->root; struct extent_buffer *leaf; @@ -247,6 +248,7 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, struct btrfs_file_extent_item *ei; int ret; size_t cur_size = size; + u64 i_size; ASSERT((compressed_size > 0 && compressed_pages) || (compressed_size == 0 && !compressed_pages)); @@ -325,7 +327,12 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, * before we unlock the pages. Otherwise we * could end up racing with unlink. */ - inode->disk_i_size = i_size_read(&inode->vfs_inode); + i_size = i_size_read(&inode->vfs_inode); + if (update_i_size && size > i_size) { + i_size_write(&inode->vfs_inode, size); + i_size = size; + } + inode->disk_i_size = i_size; fail: return ret; @@ -340,7 +347,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans, static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size, size_t compressed_size, int compress_type, - struct page **compressed_pages) + struct page **compressed_pages, + bool update_i_size) { struct btrfs_drop_extents_args drop_args = { 0 }; struct btrfs_root *root = inode->root; @@ -388,7 +396,7 @@ static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size, ret = insert_inline_extent(trans, path, inode, drop_args.extent_inserted, size, compressed_size, compress_type, - compressed_pages); + compressed_pages, update_i_size); if (ret && ret != -ENOSPC) { btrfs_abort_transaction(trans, ret); goto out; @@ -721,12 +729,13 @@ static noinline int compress_file_range(struct async_chunk *async_chunk) */ ret = cow_file_range_inline(BTRFS_I(inode), actual_end, 0, BTRFS_COMPRESS_NONE, - NULL); + NULL, false); } else { /* try making a compressed inline extent */ ret = cow_file_range_inline(BTRFS_I(inode), actual_end, total_compressed, - compress_type, pages); + compress_type, pages, + false); } if (ret <= 0) { unsigned long clear_flags = EXTENT_DELALLOC | @@ -1144,7 +1153,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, /* lets try to make an inline extent */ ret = cow_file_range_inline(inode, actual_end, 0, - BTRFS_COMPRESS_NONE, NULL); + BTRFS_COMPRESS_NONE, NULL, false); if (ret == 0) { /* * We use DO_ACCOUNTING here because we need the From patchwork Wed Nov 17 20:19:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B2E2C433EF for ; Wed, 17 Nov 2021 20:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5F7A161AD2 for ; Wed, 17 Nov 2021 20:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240667AbhKQUXU (ORCPT ); Wed, 17 Nov 2021 15:23:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239434AbhKQUXI (ORCPT ); Wed, 17 Nov 2021 15:23:08 -0500 Received: from mail-qt1-x836.google.com (mail-qt1-x836.google.com [IPv6:2607:f8b0:4864:20::836]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5AC73C061570 for ; Wed, 17 Nov 2021 12:20:09 -0800 (PST) Received: by mail-qt1-x836.google.com with SMTP id z9so3852655qtj.9 for ; Wed, 17 Nov 2021 12:20:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=16AJr8eJMyUTL7KFAYHrcl6ITlDIzYBXKh9YST7ElNs=; b=ilOdJ4/XNqAEYFS+G+8HrEpzA78VwOGPpUGG8n7V73uuUB/l71ljlD9Hh+TzlsDdq9 KROIkqYVZzCydCLBwegqSbsSWnD2SoZ5S+fTRRPyqFbV0o5BCdOuNaa3Wl+p7sN4BxQN xvCDEm35Vxe3mF3u5fGidtvxa4h2IoknWXFzeKlkd5yabEovY1Ww/ReczGk9qj92EvJ1 2pHOTdDUO2AAYU1HXDyqkvZn5B4q3dwhstIAoEcH1ZBpds+hlt07KFS81EJQr59arjVf qN+HU7A0Tm31avioMcL+4sJZmWSBTkmMkChDCxpazDTKrTKK7KDGgobBeknpmkd2yqsx JyTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=16AJr8eJMyUTL7KFAYHrcl6ITlDIzYBXKh9YST7ElNs=; b=qa67i0YBZly/FT6R3A98ksjW94Q7f/PD364T+mNaxGaMFXAuzT9Hm0REbdUUlTQUBp dyr4hla7R3CVQS5A9nbCIAIHTSfid+taSupBYqSQsnxwKpL5egE7tzwr6Llr4OAGcrwh TO2JRrNR9gRCv6BXW94AlCd/gW3F2m5Fv+7eBvo3IXqF9YDaBKMEx105ub/aZmt/q902 jkbnfg9dCTd2uSGCnEQAf5e+oANW/SqMHVXB0G8u2/bJajawcZ6WpvmvdyOBjlmq23Kn WazNI0IT3/CoZqDViGtBR0yLLDRJ7uQoEsHaqDeQJ8cKV2OaheZbumLZy4ILeu9ooFmc VJpA== X-Gm-Message-State: AOAM5337Goz730hdWxwTT/4sIVk5MTkdADrzSiS4ZOBRi00yKHh3RhGc 9NS/lu7NErjhu0Kgvw6Jv6fJOham5bOtew== X-Google-Smtp-Source: ABdhPJxUcRM3ShmqXu6S1Pau1Dlmfnh5sCJ7ECK0cwfJo+7KYuU3E+lYh00DKrDWqW98tnVtuHM1/Q== X-Received: by 2002:a05:622a:40f:: with SMTP id n15mr20549510qtx.296.1637180408264; Wed, 17 Nov 2021 12:20:08 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:07 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 08/17] btrfs: add definitions + documentation for encoded I/O ioctls Date: Wed, 17 Nov 2021 12:19:18 -0800 Message-Id: <87e8a3e6268ad5115728ba5c8ef3a2387a0595b9.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval In order to allow sending and receiving compressed data without decompressing it, we need an interface to write pre-compressed data directly to the filesystem and the matching interface to read compressed data without decompressing it. This adds the definitions for ioctls to do that and detailed explanations of how to use them. Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- include/uapi/linux/btrfs.h | 132 +++++++++++++++++++++++++++++++++++++ 1 file changed, 132 insertions(+) diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 738619994e26..7505acfa18d7 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -868,6 +868,134 @@ struct btrfs_ioctl_get_subvol_rootref_args { __u8 align[7]; }; +/* + * Data and metadata for an encoded read or write. + * + * Encoded I/O bypasses any encoding automatically done by the filesystem (e.g., + * compression). This can be used to read the compressed contents of a file or + * write pre-compressed data directly to a file. + * + * BTRFS_IOC_ENCODED_READ and BTRFS_IOC_ENCODED_WRITE are essentially + * preadv/pwritev with additional metadata about how the data is encoded and the + * size of the unencoded data. + * + * BTRFS_IOC_ENCODED_READ fills the given iovecs with the encoded data, fills + * the metadata fields, and returns the size of the encoded data. It reads one + * extent per call. It can also read data which is not encoded. + * + * BTRFS_IOC_ENCODED_WRITE uses the metadata fields, writes the encoded data + * from the iovecs, and returns the size of the encoded data. Note that the + * encoded data is not validated when it is written; if it is not valid (e.g., + * it cannot be decompressed), then a subsequent read may return an error. + * + * Since the filesystem page cache contains decoded data, encoded I/O bypasses + * the page cache. Encoded I/O requires CAP_SYS_ADMIN. + */ +struct btrfs_ioctl_encoded_io_args { + /* Input parameters for both reads and writes. */ + + /* + * iovecs containing encoded data. + * + * For reads, if the size of the encoded data is larger than the sum of + * iov[n].iov_len for 0 <= n < iovcnt, then the ioctl fails with + * ENOBUFS. + * + * For writes, the size of the encoded data is the sum of iov[n].iov_len + * for 0 <= n < iovcnt. This must be less than 128 KiB (this limit may + * increase in the future). This must also be less than or equal to + * unencoded_len. + */ + const struct iovec __user *iov; + /* Number of iovecs. */ + unsigned long iovcnt; + /* + * Offset in file. + * + * For writes, must be aligned to the sector size of the filesystem. + */ + __s64 offset; + /* Currently must be zero. */ + __u64 flags; + + /* + * For reads, the following members are output parameters that will + * contain the returned metadata for the encoded data. + * For writes, the following members must be set to the metadata for the + * encoded data. + */ + + /* + * Length of the data in the file. + * + * Must be less than or equal to unencoded_len - unencoded_offset. For + * writes, must be aligned to the sector size of the filesystem unless + * the data ends at or beyond the current end of the file. + */ + __u64 len; + /* + * Length of the unencoded (i.e., decrypted and decompressed) data. + * + * For writes, must be no more than 128 KiB (this limit may increase in + * the future). If the unencoded data is actually longer than + * unencoded_len, then it is truncated; if it is shorter, then it is + * extended with zeroes. + */ + __u64 unencoded_len; + /* + * Offset from the first byte of the unencoded data to the first byte of + * logical data in the file. + * + * Must be less than unencoded_len. + */ + __u64 unencoded_offset; + /* + * BTRFS_ENCODED_IO_COMPRESSION_* type. + * + * For writes, must not be BTRFS_ENCODED_IO_COMPRESSION_NONE. + */ + __u32 compression; + /* Currently always BTRFS_ENCODED_IO_ENCRYPTION_NONE. */ + __u32 encryption; + /* + * Reserved for future expansion. + * + * For reads, always returned as zero. Users should check for non-zero + * bytes. If there are any, then the kernel has a newer version of this + * structure with additional information that the user definition is + * missing. + * + * For writes, must be zeroed. + */ + __u8 reserved[32]; +}; + +/* Data is not compressed. */ +#define BTRFS_ENCODED_IO_COMPRESSION_NONE 0 +/* Data is compressed as a single zlib stream. */ +#define BTRFS_ENCODED_IO_COMPRESSION_ZLIB 1 +/* + * Data is compressed as a single zstd frame with the windowLog compression + * parameter set to no more than 17. + */ +#define BTRFS_ENCODED_IO_COMPRESSION_ZSTD 2 +/* + * Data is compressed sector by sector (using the sector size indicated by the + * name of the constant) with LZO1X and wrapped in the format documented in + * fs/btrfs/lzo.c. For writes, the compression sector size must match the + * filesystem sector size. + */ +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_4K 3 +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_8K 4 +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_16K 5 +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_32K 6 +#define BTRFS_ENCODED_IO_COMPRESSION_LZO_64K 7 +#define BTRFS_ENCODED_IO_COMPRESSION_TYPES 8 + +/* Data is not encrypted. */ +#define BTRFS_ENCODED_IO_ENCRYPTION_NONE 0 +#define BTRFS_ENCODED_IO_ENCRYPTION_TYPES 1 + /* Error codes as returned by the kernel */ enum btrfs_err_code { BTRFS_ERROR_DEV_RAID1_MIN_NOT_MET = 1, @@ -996,5 +1124,9 @@ enum btrfs_err_code { struct btrfs_ioctl_ino_lookup_user_args) #define BTRFS_IOC_SNAP_DESTROY_V2 _IOW(BTRFS_IOCTL_MAGIC, 63, \ struct btrfs_ioctl_vol_args_v2) +#define BTRFS_IOC_ENCODED_READ _IOR(BTRFS_IOCTL_MAGIC, 64, \ + struct btrfs_ioctl_encoded_io_args) +#define BTRFS_IOC_ENCODED_WRITE _IOW(BTRFS_IOCTL_MAGIC, 64, \ + struct btrfs_ioctl_encoded_io_args) #endif /* _UAPI_LINUX_BTRFS_H */ From patchwork Wed Nov 17 20:19:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16E4BC4332F for ; Wed, 17 Nov 2021 20:20:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ED77A61AF0 for ; Wed, 17 Nov 2021 20:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240692AbhKQUXU (ORCPT ); Wed, 17 Nov 2021 15:23:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240659AbhKQUXJ (ORCPT ); Wed, 17 Nov 2021 15:23:09 -0500 Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com [IPv6:2607:f8b0:4864:20::833]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92C68C061570 for ; Wed, 17 Nov 2021 12:20:10 -0800 (PST) Received: by mail-qt1-x833.google.com with SMTP id l8so3879216qtk.6 for ; Wed, 17 Nov 2021 12:20:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0hoPRYbrQBg7t1Z0dlVofROP7rDI28vrZtPzXYK9wKE=; b=OuquikrqtaAKxW/bwVQUI3jtwwwTEj+4yQ6F32F0VzmNVMCnmpTf2wovgWTK6rg5+H Ki4V0ovbD6A3kewt9Pqkt54/tkGodcYW7VJf+JZu+BzHWbAVxh3BoNDFT1oo+cjX+ZBv wJNM3n1yTorsE2s9uCNtY6lpPd103U/E4rYHkD3Ncehy3hchiCDgoSSptebHId/onHUd 7XSjojPRYAQlP1owkBzGcFxdPIEB4fqYMCgkSLEG2xQ99ev1HEVYcto4gXxLYpg0j/5G fo+HCuVqbglrN9RHzc9+q0181zM0SnuPcoGjKTRVPcoISGbrDm5ieDm6ZzPR4KVpgp0J Z/KA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0hoPRYbrQBg7t1Z0dlVofROP7rDI28vrZtPzXYK9wKE=; b=aEcJuMNsx2hyl66RMgLllkjazFxk18NYcKKevfjU7Os+GtmV7UiYSVO+XVSGcZxVak O/7VF9kB1zJy0RwYIOl6dm8Et6fw82ycnoMzhmu3KTKp8eCQtM1KXpJlrDBt2Nparpb5 28faQHe9l5rlKPtNLBlhNVvnRPihOXzVFx3xs+B0IHmyfCN2FGGinOYqDCEUrUNYvRwl ah2jRQUVflz0k6ek0YjyJHjfaDqVhD+lBZALlbto9YsxQKkno94zbUTLKi90t8EMWAjV lmcWg2TA8kZHf39vvx23gb8W/iUw2x6JRKenANuBCHH0oy8/aeo5UGM8BwmEbJn93SRZ Y7Xg== X-Gm-Message-State: AOAM531pLdW2u3KJohHavukhnpGgiPNvwhISUDlscUE4PO0cC0RmAN6L rECDNYEf1EsWIz8U5sITJtJFy7X0WgD4cQ== X-Google-Smtp-Source: ABdhPJyNKA0mh1E047Z1MQ+uVWQqDu6gqI2N9T65lQfKbjvXolJYvp6uCJJQ7RfZ2uDgqb32G5zFCA== X-Received: by 2002:ac8:5d89:: with SMTP id d9mr20384060qtx.49.1637180409373; Wed, 17 Nov 2021 12:20:09 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:08 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 09/17] btrfs: add BTRFS_IOC_ENCODED_READ Date: Wed, 17 Nov 2021 12:19:19 -0800 Message-Id: <111c7aadb3fa218b7926a49acf3fb34654d89a75.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval There are 4 main cases: 1. Inline extents: we copy the data straight out of the extent buffer. 2. Hole/preallocated extents: we fill in zeroes. 3. Regular, uncompressed extents: we read the sectors we need directly from disk. 4. Regular, compressed extents: we read the entire compressed extent from disk and indicate what subset of the decompressed extent is in the file. This initial implementation simplifies a few things that can be improved in the future: - We hold the inode lock during the operation. - Cases 1, 3, and 4 allocate temporary memory to read into before copying out to userspace. - We don't do read repair, because it turns out that read repair is currently broken for compressed data. Signed-off-by: Omar Sandoval --- fs/btrfs/ctree.h | 4 + fs/btrfs/inode.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/ioctl.c | 106 ++++++++++ 3 files changed, 606 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 2e7f74060a14..70034e33abe6 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3275,6 +3275,10 @@ int btrfs_writepage_cow_fixup(struct page *page); void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, struct page *page, u64 start, u64 end, bool uptodate); +struct btrfs_ioctl_encoded_io_args; +ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, + struct btrfs_ioctl_encoded_io_args *encoded); + extern const struct dentry_operations btrfs_dentry_operations; extern const struct iomap_ops btrfs_dio_iomap_ops; extern const struct iomap_dio_ops btrfs_dio_ops; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c2efea101f61..d29e968fd18b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -10525,6 +10525,502 @@ void btrfs_set_range_writeback(struct btrfs_inode *inode, u64 start, u64 end) } } +static int btrfs_encoded_io_compression_from_extent(int compress_type) +{ + switch (compress_type) { + case BTRFS_COMPRESS_NONE: + return BTRFS_ENCODED_IO_COMPRESSION_NONE; + case BTRFS_COMPRESS_ZLIB: + return BTRFS_ENCODED_IO_COMPRESSION_ZLIB; + case BTRFS_COMPRESS_LZO: + /* + * The LZO format depends on the page size. 64k is the maximum + * sectorsize (and thus page size) that we support. + */ + if (PAGE_SIZE < SZ_4K || PAGE_SIZE > SZ_64K) + return -EINVAL; + return BTRFS_ENCODED_IO_COMPRESSION_LZO_4K + (PAGE_SHIFT - 12); + case BTRFS_COMPRESS_ZSTD: + return BTRFS_ENCODED_IO_COMPRESSION_ZSTD; + default: + return -EUCLEAN; + } +} + +static ssize_t btrfs_encoded_read_inline( + struct kiocb *iocb, + struct iov_iter *iter, u64 start, + u64 lockend, + struct extent_state **cached_state, + u64 extent_start, size_t count, + struct btrfs_ioctl_encoded_io_args *encoded, + bool *unlocked) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + struct btrfs_path *path; + struct extent_buffer *leaf; + struct btrfs_file_extent_item *item; + u64 ram_bytes; + unsigned long ptr; + void *tmp; + ssize_t ret; + + path = btrfs_alloc_path(); + if (!path) { + ret = -ENOMEM; + goto out; + } + ret = btrfs_lookup_file_extent(NULL, BTRFS_I(inode)->root, path, + btrfs_ino(BTRFS_I(inode)), extent_start, + 0); + if (ret) { + if (ret > 0) { + /* The extent item disappeared? */ + ret = -EIO; + } + goto out; + } + leaf = path->nodes[0]; + item = btrfs_item_ptr(leaf, path->slots[0], + struct btrfs_file_extent_item); + + ram_bytes = btrfs_file_extent_ram_bytes(leaf, item); + ptr = btrfs_file_extent_inline_start(item); + + encoded->len = (min_t(u64, extent_start + ram_bytes, inode->i_size) - + iocb->ki_pos); + ret = btrfs_encoded_io_compression_from_extent( + btrfs_file_extent_compression(leaf, item)); + if (ret < 0) + goto out; + encoded->compression = ret; + if (encoded->compression) { + size_t inline_size; + + inline_size = btrfs_file_extent_inline_item_len(leaf, + path->slots[0]); + if (inline_size > count) { + ret = -ENOBUFS; + goto out; + } + count = inline_size; + encoded->unencoded_len = ram_bytes; + encoded->unencoded_offset = iocb->ki_pos - extent_start; + } else { + encoded->len = encoded->unencoded_len = count = + min_t(u64, count, encoded->len); + ptr += iocb->ki_pos - extent_start; + } + + tmp = kmalloc(count, GFP_NOFS); + if (!tmp) { + ret = -ENOMEM; + goto out; + } + read_extent_buffer(leaf, tmp, ptr, count); + btrfs_release_path(path); + unlock_extent_cached(io_tree, start, lockend, cached_state); + inode_unlock_shared(inode); + *unlocked = true; + + ret = copy_to_iter(tmp, count, iter); + if (ret != count) + ret = -EFAULT; + kfree(tmp); +out: + btrfs_free_path(path); + return ret; +} + +struct btrfs_encoded_read_private { + struct inode *inode; + u64 file_offset; + wait_queue_head_t wait; + atomic_t pending; + blk_status_t status; + bool skip_csum; +}; + +static blk_status_t submit_encoded_read_bio(struct inode *inode, + struct bio *bio, int mirror_num, + unsigned long bio_flags) +{ + struct btrfs_encoded_read_private *priv = bio->bi_private; + struct btrfs_bio *bbio = btrfs_bio(bio); + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + blk_status_t ret; + + if (!priv->skip_csum) { + ret = btrfs_lookup_bio_sums(inode, bio, NULL); + if (ret) + return ret; + } + + ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA); + if (ret) { + btrfs_bio_free_csum(bbio); + return ret; + } + + atomic_inc(&priv->pending); + ret = btrfs_map_bio(fs_info, bio, mirror_num); + if (ret) { + atomic_dec(&priv->pending); + btrfs_bio_free_csum(bbio); + } + return ret; +} + +static blk_status_t btrfs_encoded_read_verify_csum(struct btrfs_bio *bbio) +{ + const bool uptodate = bbio->bio.bi_status == BLK_STS_OK; + struct btrfs_encoded_read_private *priv = bbio->bio.bi_private; + struct inode *inode = priv->inode; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + u32 sectorsize = fs_info->sectorsize; + struct bio_vec *bvec; + struct bvec_iter_all iter_all; + u64 start = priv->file_offset; + u32 bio_offset = 0; + + if (priv->skip_csum || !uptodate) + return bbio->bio.bi_status; + + bio_for_each_segment_all(bvec, &bbio->bio, iter_all) { + unsigned int i, nr_sectors, pgoff; + + nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info, bvec->bv_len); + pgoff = bvec->bv_offset; + for (i = 0; i < nr_sectors; i++) { + ASSERT(pgoff < PAGE_SIZE); + if (check_data_csum(inode, bbio, bio_offset, + bvec->bv_page, pgoff, start)) + return BLK_STS_IOERR; + start += sectorsize; + bio_offset += sectorsize; + pgoff += sectorsize; + } + } + return BLK_STS_OK; +} + +static void btrfs_encoded_read_endio(struct bio *bio) +{ + struct btrfs_encoded_read_private *priv = bio->bi_private; + struct btrfs_bio *bbio = btrfs_bio(bio); + blk_status_t status; + + status = btrfs_encoded_read_verify_csum(bbio); + if (status) { + /* + * The memory barrier implied by the atomic_dec_return() here + * pairs with the memory barrier implied by the + * atomic_dec_return() or io_wait_event() in + * btrfs_encoded_read_regular_fill_pages() to ensure that this + * write is observed before the load of status in + * btrfs_encoded_read_regular_fill_pages(). + */ + WRITE_ONCE(priv->status, status); + } + if (!atomic_dec_return(&priv->pending)) + wake_up(&priv->wait); + btrfs_bio_free_csum(bbio); + bio_put(bio); +} + +static int btrfs_encoded_read_regular_fill_pages(struct inode *inode, + u64 file_offset, + u64 disk_bytenr, + u64 disk_io_size, + struct page **pages) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_encoded_read_private priv = { + .inode = inode, + .file_offset = file_offset, + .pending = ATOMIC_INIT(1), + .skip_csum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM, + }; + unsigned long i = 0; + u64 cur = 0; + int ret; + + init_waitqueue_head(&priv.wait); + /* + * Submit bios for the extent, splitting due to bio or stripe limits as + * necessary. + */ + while (cur < disk_io_size) { + struct extent_map *em; + struct btrfs_io_geometry geom; + struct bio *bio = NULL; + u64 remaining; + + em = btrfs_get_chunk_map(fs_info, disk_bytenr + cur, + disk_io_size - cur); + if (IS_ERR(em)) { + ret = PTR_ERR(em); + } else { + ret = btrfs_get_io_geometry(fs_info, em, BTRFS_MAP_READ, + disk_bytenr + cur, &geom); + free_extent_map(em); + } + if (ret) { + WRITE_ONCE(priv.status, errno_to_blk_status(ret)); + break; + } + remaining = min(geom.len, disk_io_size - cur); + while (bio || remaining) { + size_t bytes = min_t(u64, remaining, PAGE_SIZE); + + if (!bio) { + bio = btrfs_bio_alloc(BIO_MAX_VECS); + bio->bi_iter.bi_sector = + (disk_bytenr + cur) >> SECTOR_SHIFT; + bio->bi_end_io = btrfs_encoded_read_endio; + bio->bi_private = &priv; + bio->bi_opf = REQ_OP_READ; + } + + if (!bytes || + bio_add_page(bio, pages[i], bytes, 0) < bytes) { + blk_status_t status; + + status = submit_encoded_read_bio(inode, bio, 0, + 0); + if (status) { + WRITE_ONCE(priv.status, status); + bio_put(bio); + goto out; + } + bio = NULL; + continue; + } + + i++; + cur += bytes; + remaining -= bytes; + } + } + +out: + if (atomic_dec_return(&priv.pending)) + io_wait_event(priv.wait, !atomic_read(&priv.pending)); + /* See btrfs_encoded_read_endio() for ordering. */ + return blk_status_to_errno(READ_ONCE(priv.status)); +} + +static ssize_t btrfs_encoded_read_regular(struct kiocb *iocb, + struct iov_iter *iter, + u64 start, u64 lockend, + struct extent_state **cached_state, + u64 disk_bytenr, u64 disk_io_size, + size_t count, bool compressed, + bool *unlocked) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + struct page **pages; + unsigned long nr_pages, i; + u64 cur; + size_t page_offset; + ssize_t ret; + + nr_pages = DIV_ROUND_UP(disk_io_size, PAGE_SIZE); + pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS); + if (!pages) + return -ENOMEM; + for (i = 0; i < nr_pages; i++) { + pages[i] = alloc_page(GFP_NOFS | __GFP_HIGHMEM); + if (!pages[i]) { + ret = -ENOMEM; + goto out; + } + } + + ret = btrfs_encoded_read_regular_fill_pages(inode, start, disk_bytenr, + disk_io_size, pages); + if (ret) + goto out; + + unlock_extent_cached(io_tree, start, lockend, cached_state); + inode_unlock_shared(inode); + *unlocked = true; + + if (compressed) { + i = 0; + page_offset = 0; + } else { + i = (iocb->ki_pos - start) >> PAGE_SHIFT; + page_offset = (iocb->ki_pos - start) & (PAGE_SIZE - 1); + } + cur = 0; + while (cur < count) { + size_t bytes = min_t(size_t, count - cur, + PAGE_SIZE - page_offset); + + if (copy_page_to_iter(pages[i], page_offset, bytes, + iter) != bytes) { + ret = -EFAULT; + goto out; + } + i++; + cur += bytes; + page_offset = 0; + } + ret = count; +out: + for (i = 0; i < nr_pages; i++) { + if (pages[i]) + __free_page(pages[i]); + } + kfree(pages); + return ret; +} + +ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, + struct btrfs_ioctl_encoded_io_args *encoded) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + ssize_t ret; + size_t count = iov_iter_count(iter); + u64 start, lockend, disk_bytenr, disk_io_size; + struct extent_state *cached_state = NULL; + struct extent_map *em; + bool unlocked = false; + + file_accessed(iocb->ki_filp); + + inode_lock_shared(inode); + + if (iocb->ki_pos >= inode->i_size) { + inode_unlock_shared(inode); + return 0; + } + start = ALIGN_DOWN(iocb->ki_pos, fs_info->sectorsize); + /* + * We don't know how long the extent containing iocb->ki_pos is, but if + * it's compressed we know that it won't be longer than this. + */ + lockend = start + BTRFS_MAX_UNCOMPRESSED - 1; + + for (;;) { + struct btrfs_ordered_extent *ordered; + + ret = btrfs_wait_ordered_range(inode, start, + lockend - start + 1); + if (ret) + goto out_unlock_inode; + lock_extent_bits(io_tree, start, lockend, &cached_state); + ordered = btrfs_lookup_ordered_range(BTRFS_I(inode), start, + lockend - start + 1); + if (!ordered) + break; + btrfs_put_ordered_extent(ordered); + unlock_extent_cached(io_tree, start, lockend, &cached_state); + cond_resched(); + } + + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, + lockend - start + 1); + if (IS_ERR(em)) { + ret = PTR_ERR(em); + goto out_unlock_extent; + } + + if (em->block_start == EXTENT_MAP_INLINE) { + u64 extent_start = em->start; + + /* + * For inline extents we get everything we need out of the + * extent item. + */ + free_extent_map(em); + em = NULL; + ret = btrfs_encoded_read_inline(iocb, iter, start, lockend, + &cached_state, extent_start, + count, encoded, &unlocked); + goto out; + } + + /* + * We only want to return up to EOF even if the extent extends beyond + * that. + */ + encoded->len = (min_t(u64, extent_map_end(em), inode->i_size) - + iocb->ki_pos); + if (em->block_start == EXTENT_MAP_HOLE || + test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) { + disk_bytenr = EXTENT_MAP_HOLE; + encoded->len = encoded->unencoded_len = count = + min_t(u64, count, encoded->len); + } else if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) { + disk_bytenr = em->block_start; + /* + * Bail if the buffer isn't large enough to return the whole + * compressed extent. + */ + if (em->block_len > count) { + ret = -ENOBUFS; + goto out_em; + } + disk_io_size = count = em->block_len; + encoded->unencoded_len = em->ram_bytes; + encoded->unencoded_offset = iocb->ki_pos - em->orig_start; + ret = btrfs_encoded_io_compression_from_extent( + em->compress_type); + if (ret < 0) + goto out_em; + encoded->compression = ret; + } else { + disk_bytenr = em->block_start + (start - em->start); + if (encoded->len > count) + encoded->len = count; + /* + * Don't read beyond what we locked. This also limits the page + * allocations that we'll do. + */ + disk_io_size = min(lockend + 1, + iocb->ki_pos + encoded->len) - start; + encoded->len = encoded->unencoded_len = count = + start + disk_io_size - iocb->ki_pos; + disk_io_size = ALIGN(disk_io_size, fs_info->sectorsize); + } + free_extent_map(em); + em = NULL; + + if (disk_bytenr == EXTENT_MAP_HOLE) { + unlock_extent_cached(io_tree, start, lockend, &cached_state); + inode_unlock_shared(inode); + unlocked = true; + ret = iov_iter_zero(count, iter); + if (ret != count) + ret = -EFAULT; + } else { + ret = btrfs_encoded_read_regular(iocb, iter, start, lockend, + &cached_state, disk_bytenr, + disk_io_size, count, + encoded->compression, + &unlocked); + } + +out: + if (ret >= 0) + iocb->ki_pos += encoded->len; +out_em: + free_extent_map(em); +out_unlock_extent: + if (!unlocked) + unlock_extent_cached(io_tree, start, lockend, &cached_state); +out_unlock_inode: + if (!unlocked) + inode_unlock_shared(inode); + return ret; +} + #ifdef CONFIG_SWAP /* * Add an entry indicating a block group or device which is pinned by a diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 05c77a1979a9..f0c575223d88 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -28,6 +28,7 @@ #include #include #include +#include #include "ctree.h" #include "disk-io.h" #include "export.h" @@ -88,6 +89,22 @@ struct btrfs_ioctl_send_args_32 { #define BTRFS_IOC_SEND_32 _IOW(BTRFS_IOCTL_MAGIC, 38, \ struct btrfs_ioctl_send_args_32) + +struct btrfs_ioctl_encoded_io_args_32 { + compat_uptr_t iov; + compat_ulong_t iovcnt; + __s64 offset; + __u64 flags; + __u64 len; + __u64 unencoded_len; + __u64 unencoded_offset; + __u32 compression; + __u32 encryption; + __u32 reserved[8]; +}; + +#define BTRFS_IOC_ENCODED_READ_32 _IOR(BTRFS_IOCTL_MAGIC, 64, \ + struct btrfs_ioctl_encoded_io_args_32) #endif /* Mask out flags that are inappropriate for the given type of inode. */ @@ -4861,6 +4878,89 @@ static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat) return ret; } +static int btrfs_ioctl_encoded_read(struct file *file, void __user *argp, + bool compat) +{ + struct btrfs_ioctl_encoded_io_args args = {}; + size_t copy_end_kernel = offsetofend(struct btrfs_ioctl_encoded_io_args, + flags); + size_t copy_end; + struct iovec iovstack[UIO_FASTIOV]; + struct iovec *iov = iovstack; + struct iov_iter iter; + loff_t pos; + struct kiocb kiocb; + ssize_t ret; + + if (!capable(CAP_SYS_ADMIN)) { + ret = -EPERM; + goto out_acct; + } + + if (compat) { +#if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT) + struct btrfs_ioctl_encoded_io_args_32 args32; + + copy_end = offsetofend(struct btrfs_ioctl_encoded_io_args_32, + flags); + if (copy_from_user(&args32, argp, copy_end)) { + ret = -EFAULT; + goto out_acct; + } + args.iov = compat_ptr(args32.iov); + args.iovcnt = args32.iovcnt; + args.offset = args32.offset; + args.flags = args32.flags; +#else + return -ENOTTY; +#endif + } else { + copy_end = copy_end_kernel; + if (copy_from_user(&args, argp, copy_end)) { + ret = -EFAULT; + goto out_acct; + } + } + if (args.flags != 0) { + ret = -EINVAL; + goto out_acct; + } + + ret = import_iovec(READ, args.iov, args.iovcnt, ARRAY_SIZE(iovstack), + &iov, &iter); + if (ret < 0) + goto out_acct; + + if (iov_iter_count(&iter) == 0) { + ret = 0; + goto out_iov; + } + pos = args.offset; + ret = rw_verify_area(READ, file, &pos, args.len); + if (ret < 0) + goto out_iov; + + init_sync_kiocb(&kiocb, file); + kiocb.ki_pos = pos; + + ret = btrfs_encoded_read(&kiocb, &iter, &args); + if (ret >= 0) { + fsnotify_access(file); + if (copy_to_user(argp + copy_end, + (char *)&args + copy_end_kernel, + sizeof(args) - copy_end_kernel)) + ret = -EFAULT; + } + +out_iov: + kfree(iov); +out_acct: + if (ret > 0) + add_rchar(current, ret); + inc_syscr(current); + return ret; +} + long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -5005,6 +5105,12 @@ long btrfs_ioctl(struct file *file, unsigned int return fsverity_ioctl_enable(file, (const void __user *)argp); case FS_IOC_MEASURE_VERITY: return fsverity_ioctl_measure(file, argp); + case BTRFS_IOC_ENCODED_READ: + return btrfs_ioctl_encoded_read(file, argp, false); +#if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT) + case BTRFS_IOC_ENCODED_READ_32: + return btrfs_ioctl_encoded_read(file, argp, true); +#endif } return -ENOTTY; From patchwork Wed Nov 17 20:19:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECE3AC433FE for ; Wed, 17 Nov 2021 20:20:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D665A61AEC for ; Wed, 17 Nov 2021 20:20:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240671AbhKQUXV (ORCPT ); Wed, 17 Nov 2021 15:23:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53084 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240662AbhKQUXK (ORCPT ); Wed, 17 Nov 2021 15:23:10 -0500 Received: from mail-qv1-xf33.google.com (mail-qv1-xf33.google.com [IPv6:2607:f8b0:4864:20::f33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 852D2C061570 for ; Wed, 17 Nov 2021 12:20:11 -0800 (PST) Received: by mail-qv1-xf33.google.com with SMTP id gu12so2913789qvb.6 for ; Wed, 17 Nov 2021 12:20:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mZCctGKvsKBsf6kZVF7EU9RY+omIU+FdlXjGHWEI3DY=; b=mNgUummmbzggy7wsDSCjuoHU6m0p850bWHCEak+sJHPqEraAZJq2LDKcmq97udGJnY pLrs038EBaS/v0UqU7pdYYenmcGpouSctyQjDYPq3usvQPgAoiClEOCOGHPxugilDTj+ 0DN8DWxo85dNlciQlQk4AWuc8a57auxQsz/Lzq/Gq/hCGvQqLa49bHGGdHoZdwy6zmKa 4GNmJfMm+tB6ZFFz8jsqfDt/bfQLR/4pt0FgV8gZRPxdj/txU0mf8292qCgwCGxxTvl4 Br02h7exsZ9fE6VakLvjRPX8ARaHN4sr7u77EBZTIVn8vVCT4c6K2a0DG5Kt/aBkM6Bd I+ug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mZCctGKvsKBsf6kZVF7EU9RY+omIU+FdlXjGHWEI3DY=; b=hwkTs6clmwfLLKMUR0hI5zxH2f+pCVyAhn2GBM4w2G0QTj4k4QMDAUmK5u2lBnT477 Mx1s7HR5Npdrkmk6ZoyMmiOqYjU2PSCRRc7qA9R1KgfiVZFsSUVAobY0M54u0rzpxPyB iJSK0ZCPrxdsZkjSOYthcJhoPIHO3OqsqACSNH87Cp/8UCj3N7HBO2fuFhu/ypACf6Ha 1CK1VZ85B0nXk+iFka9g1LbCMDIieXYUHfKR4amOIp2HK6D1ScX/8cGr09J5oFFRgh+4 tV4Z0ZbAlEQpYR6C5lnJ9P4kzNqxdpDGUUgGw+nJpMKXh6Q1IA7YZZhsIZJw1+6Qpzua 3myw== X-Gm-Message-State: AOAM533cSgzou21D4OQ+OPI0N6lBTvnP7MXY/Q6ixbRg9pJDaFIeay+c hFtglQ2HcZoQR+M9EEyi+dIQJST3bY6ufw== X-Google-Smtp-Source: ABdhPJwtaGwcXEYrSu9+NrZpLvzojiZPZbMbCUbrIO2f8ooubGO06P7/HM6EowxKlhOxOtRFRxk2Iw== X-Received: by 2002:a05:6214:226c:: with SMTP id gs12mr59207199qvb.49.1637180410332; Wed, 17 Nov 2021 12:20:10 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:10 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 10/17] btrfs: add BTRFS_IOC_ENCODED_WRITE Date: Wed, 17 Nov 2021 12:19:20 -0800 Message-Id: X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval The implementation resembles direct I/O: we have to flush any ordered extents, invalidate the page cache, and do the io tree/delalloc/extent map/ordered extent dance. From there, we can reuse the compression code with a minor modification to distinguish the write from writeback. This also creates inline extents when possible. Signed-off-by: Omar Sandoval --- fs/btrfs/compression.c | 7 +- fs/btrfs/compression.h | 6 +- fs/btrfs/ctree.h | 4 + fs/btrfs/file.c | 65 ++++++++-- fs/btrfs/inode.c | 256 +++++++++++++++++++++++++++++++++++++++- fs/btrfs/ioctl.c | 102 ++++++++++++++++ fs/btrfs/ordered-data.c | 12 +- fs/btrfs/ordered-data.h | 5 +- 8 files changed, 437 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 73350f524fb8..ac5656392c2c 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -382,7 +382,8 @@ static void finish_compressed_bio_write(struct compressed_bio *cb) cb->start, cb->start + cb->len - 1, !cb->errors); - end_compressed_writeback(inode, cb); + if (cb->writeback) + end_compressed_writeback(inode, cb); /* Note, our inode could be gone now */ /* @@ -505,7 +506,8 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, struct page **compressed_pages, unsigned int nr_pages, unsigned int write_flags, - struct cgroup_subsys_state *blkcg_css) + struct cgroup_subsys_state *blkcg_css, + bool writeback) { struct btrfs_fs_info *fs_info = inode->root->fs_info; struct bio *bio = NULL; @@ -530,6 +532,7 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, cb->mirror_num = 0; cb->compressed_pages = compressed_pages; cb->compressed_len = compressed_len; + cb->writeback = writeback; cb->orig_bio = NULL; cb->nr_pages = nr_pages; diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h index 56eef0821e3e..a50dd19e764e 100644 --- a/fs/btrfs/compression.h +++ b/fs/btrfs/compression.h @@ -52,6 +52,9 @@ struct compressed_bio { /* The compression algorithm for this bio */ u8 compress_type; + /* Whether this is a write for writeback. */ + bool writeback; + /* IO errors */ u8 errors; int mirror_num; @@ -95,7 +98,8 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, struct page **compressed_pages, unsigned int nr_pages, unsigned int write_flags, - struct cgroup_subsys_state *blkcg_css); + struct cgroup_subsys_state *blkcg_css, + bool writeback); blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags); diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 70034e33abe6..98defa9cce0e 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3278,6 +3278,8 @@ void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, struct btrfs_ioctl_encoded_io_args; ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, struct btrfs_ioctl_encoded_io_args *encoded); +ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, + const struct btrfs_ioctl_encoded_io_args *encoded); extern const struct dentry_operations btrfs_dentry_operations; extern const struct iomap_ops btrfs_dio_iomap_ops; @@ -3338,6 +3340,8 @@ int btrfs_replace_file_extents(struct btrfs_inode *inode, struct btrfs_trans_handle **trans_out); int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, struct btrfs_inode *inode, u64 start, u64 end); +ssize_t btrfs_do_write_iter(struct kiocb *iocb, struct iov_iter *from, + const struct btrfs_ioctl_encoded_io_args *encoded); int btrfs_release_file(struct inode *inode, struct file *file); int btrfs_dirty_pages(struct btrfs_inode *inode, struct page **pages, size_t num_pages, loff_t pos, size_t write_bytes, diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 5fbf0a2aba2e..d740d559717b 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2067,12 +2067,43 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from) return err < 0 ? err : written; } -static ssize_t btrfs_file_write_iter(struct kiocb *iocb, - struct iov_iter *from) +static ssize_t btrfs_encoded_write(struct kiocb *iocb, struct iov_iter *from, + const struct btrfs_ioctl_encoded_io_args *encoded) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + loff_t count; + ssize_t ret; + + btrfs_inode_lock(inode, 0); + count = encoded->len; + ret = generic_write_checks_count(iocb, &count); + if (ret == 0 && count != encoded->len) { + /* + * The write got truncated by generic_write_checks_count(). We + * can't do a partial encoded write. + */ + ret = -EFBIG; + } + if (ret || encoded->len == 0) + goto out; + + ret = btrfs_write_check(iocb, from, encoded->len); + if (ret < 0) + goto out; + + ret = btrfs_do_encoded_write(iocb, from, encoded); +out: + btrfs_inode_unlock(inode, 0); + return ret; +} + +ssize_t btrfs_do_write_iter(struct kiocb *iocb, struct iov_iter *from, + const struct btrfs_ioctl_encoded_io_args *encoded) { struct file *file = iocb->ki_filp; struct btrfs_inode *inode = BTRFS_I(file_inode(file)); - ssize_t num_written = 0; + ssize_t num_written, num_sync; const bool sync = iocb->ki_flags & IOCB_DSYNC; /* @@ -2083,22 +2114,28 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, if (BTRFS_FS_ERROR(inode->root->fs_info)) return -EROFS; - if (!(iocb->ki_flags & IOCB_DIRECT) && - (iocb->ki_flags & IOCB_NOWAIT)) + if ((iocb->ki_flags & IOCB_NOWAIT) && !(iocb->ki_flags & IOCB_DIRECT)) return -EOPNOTSUPP; if (sync) atomic_inc(&inode->sync_writers); - if (iocb->ki_flags & IOCB_DIRECT) - num_written = btrfs_direct_write(iocb, from); - else - num_written = btrfs_buffered_write(iocb, from); + if (encoded) { + num_written = btrfs_encoded_write(iocb, from, encoded); + num_sync = encoded->len; + } else if (iocb->ki_flags & IOCB_DIRECT) { + num_written = num_sync = btrfs_direct_write(iocb, from); + } else { + num_written = num_sync = btrfs_buffered_write(iocb, from); + } btrfs_set_inode_last_sub_trans(inode); - if (num_written > 0) - num_written = generic_write_sync(iocb, num_written); + if (num_sync > 0) { + num_sync = generic_write_sync(iocb, num_sync); + if (num_sync < 0) + num_written = num_sync; + } if (sync) atomic_dec(&inode->sync_writers); @@ -2107,6 +2144,12 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, return num_written; } +static ssize_t btrfs_file_write_iter(struct kiocb *iocb, + struct iov_iter *from) +{ + return btrfs_do_write_iter(iocb, from, NULL); +} + int btrfs_release_file(struct inode *inode, struct file *filp) { struct btrfs_file_private *private = filp->private_data; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index d29e968fd18b..6919dc170c93 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -999,7 +999,7 @@ static int submit_one_async_extent(struct btrfs_inode *inode, async_extent->pages, /* compressed_pages */ async_extent->nr_pages, async_chunk->write_flags, - async_chunk->blkcg_css)) { + async_chunk->blkcg_css, true)) { const u64 start = async_extent->start; const u64 end = start + async_extent->ram_size - 1; @@ -2994,6 +2994,7 @@ static int insert_ordered_extent_file_extent(struct btrfs_trans_handle *trans, * except if the ordered extent was truncated. */ update_inode_bytes = test_bit(BTRFS_ORDERED_DIRECT, &oe->flags) || + test_bit(BTRFS_ORDERED_ENCODED, &oe->flags) || test_bit(BTRFS_ORDERED_TRUNCATED, &oe->flags); return insert_reserved_file_extent(trans, BTRFS_I(oe->inode), @@ -3028,7 +3029,8 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent) if (!test_bit(BTRFS_ORDERED_NOCOW, &ordered_extent->flags) && !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags) && - !test_bit(BTRFS_ORDERED_DIRECT, &ordered_extent->flags)) + !test_bit(BTRFS_ORDERED_DIRECT, &ordered_extent->flags) && + !test_bit(BTRFS_ORDERED_ENCODED, &ordered_extent->flags)) clear_bits |= EXTENT_DELALLOC_NEW; freespace_inode = btrfs_is_free_space_inode(inode); @@ -11021,6 +11023,256 @@ ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, return ret; } +ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, + const struct btrfs_ioctl_encoded_io_args *encoded) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + struct btrfs_root *root = BTRFS_I(inode)->root; + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + struct extent_changeset *data_reserved = NULL; + struct extent_state *cached_state = NULL; + int compression; + size_t orig_count; + u64 start, end; + u64 num_bytes, ram_bytes, disk_num_bytes; + unsigned long nr_pages, i; + struct page **pages; + struct btrfs_key ins; + bool extent_reserved = false; + struct extent_map *em; + ssize_t ret; + + switch (encoded->compression) { + case BTRFS_ENCODED_IO_COMPRESSION_ZLIB: + compression = BTRFS_COMPRESS_ZLIB; + break; + case BTRFS_ENCODED_IO_COMPRESSION_ZSTD: + compression = BTRFS_COMPRESS_ZSTD; + break; + case BTRFS_ENCODED_IO_COMPRESSION_LZO_4K: + case BTRFS_ENCODED_IO_COMPRESSION_LZO_8K: + case BTRFS_ENCODED_IO_COMPRESSION_LZO_16K: + case BTRFS_ENCODED_IO_COMPRESSION_LZO_32K: + case BTRFS_ENCODED_IO_COMPRESSION_LZO_64K: + /* The page size must match for LZO. */ + if (encoded->compression - + BTRFS_ENCODED_IO_COMPRESSION_LZO_4K + 12 != PAGE_SHIFT) + return -EINVAL; + compression = BTRFS_COMPRESS_LZO; + break; + default: + return -EINVAL; + } + if (encoded->encryption != BTRFS_ENCODED_IO_ENCRYPTION_NONE) + return -EINVAL; + + orig_count = iov_iter_count(from); + + /* The extent size must be sane. */ + if (encoded->unencoded_len > BTRFS_MAX_UNCOMPRESSED || + orig_count > BTRFS_MAX_COMPRESSED || orig_count == 0) + return -EINVAL; + + /* + * The compressed data must be smaller than the decompressed data. + * + * It's of course possible for data to compress to larger or the same + * size, but the buffered I/O path falls back to no compression for such + * data, and we don't want to break any assumptions by creating these + * extents. + * + * Note that this is less strict than the current check we have that the + * compressed data must be at least one sector smaller than the + * decompressed data. We only want to enforce the weaker requirement + * from old kernels that it is at least one byte smaller. + */ + if (orig_count >= encoded->unencoded_len) + return -EINVAL; + + /* The extent must start on a sector boundary. */ + start = iocb->ki_pos; + if (!IS_ALIGNED(start, fs_info->sectorsize)) + return -EINVAL; + + /* + * The extent must end on a sector boundary. However, we allow a write + * which ends at or extends i_size to have an unaligned length; we round + * up the extent size and set i_size to the unaligned end. + */ + if (start + encoded->len < inode->i_size && + !IS_ALIGNED(start + encoded->len, fs_info->sectorsize)) + return -EINVAL; + + /* Finally, the offset in the unencoded data must be sector-aligned. */ + if (!IS_ALIGNED(encoded->unencoded_offset, fs_info->sectorsize)) + return -EINVAL; + + num_bytes = ALIGN(encoded->len, fs_info->sectorsize); + ram_bytes = ALIGN(encoded->unencoded_len, fs_info->sectorsize); + end = start + num_bytes - 1; + + /* + * If the extent cannot be inline, the compressed data on disk must be + * sector-aligned. For convenience, we extend it with zeroes if it + * isn't. + */ + disk_num_bytes = ALIGN(orig_count, fs_info->sectorsize); + nr_pages = DIV_ROUND_UP(disk_num_bytes, PAGE_SIZE); + pages = kvcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL_ACCOUNT); + if (!pages) + return -ENOMEM; + for (i = 0; i < nr_pages; i++) { + size_t bytes = min_t(size_t, PAGE_SIZE, iov_iter_count(from)); + char *kaddr; + + pages[i] = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM); + if (!pages[i]) { + ret = -ENOMEM; + goto out_pages; + } + kaddr = kmap(pages[i]); + if (copy_from_iter(kaddr, bytes, from) != bytes) { + kunmap(pages[i]); + ret = -EFAULT; + goto out_pages; + } + if (bytes < PAGE_SIZE) + memset(kaddr + bytes, 0, PAGE_SIZE - bytes); + kunmap(pages[i]); + } + + for (;;) { + struct btrfs_ordered_extent *ordered; + + ret = btrfs_wait_ordered_range(inode, start, num_bytes); + if (ret) + goto out_pages; + ret = invalidate_inode_pages2_range(inode->i_mapping, + start >> PAGE_SHIFT, + end >> PAGE_SHIFT); + if (ret) + goto out_pages; + lock_extent_bits(io_tree, start, end, &cached_state); + ordered = btrfs_lookup_ordered_range(BTRFS_I(inode), start, + num_bytes); + if (!ordered && + !filemap_range_has_page(inode->i_mapping, start, end)) + break; + if (ordered) + btrfs_put_ordered_extent(ordered); + unlock_extent_cached(io_tree, start, end, &cached_state); + cond_resched(); + } + + /* + * We don't use the higher-level delalloc space functions because our + * num_bytes and disk_num_bytes are different. + */ + ret = btrfs_alloc_data_chunk_ondemand(BTRFS_I(inode), disk_num_bytes); + if (ret) + goto out_unlock; + ret = btrfs_qgroup_reserve_data(BTRFS_I(inode), &data_reserved, start, + num_bytes); + if (ret) + goto out_free_data_space; + ret = btrfs_delalloc_reserve_metadata(BTRFS_I(inode), num_bytes, + disk_num_bytes); + if (ret) + goto out_qgroup_free_data; + + /* Try an inline extent first. */ + if (start == 0 && encoded->unencoded_len == encoded->len && + encoded->unencoded_offset == 0) { + ret = cow_file_range_inline(BTRFS_I(inode), encoded->len, + orig_count, compression, pages, + true); + if (ret <= 0) { + if (ret == 0) + ret = orig_count; + goto out_delalloc_release; + } + } + + ret = btrfs_reserve_extent(root, disk_num_bytes, disk_num_bytes, + disk_num_bytes, 0, 0, &ins, 1, 1); + if (ret) + goto out_delalloc_release; + extent_reserved = true; + + em = create_io_em(BTRFS_I(inode), start, num_bytes, + start - encoded->unencoded_offset, ins.objectid, + ins.offset, ins.offset, ram_bytes, compression, + BTRFS_ORDERED_COMPRESSED); + if (IS_ERR(em)) { + ret = PTR_ERR(em); + goto out_free_reserved; + } + free_extent_map(em); + + ret = btrfs_add_ordered_extent(BTRFS_I(inode), start, num_bytes, + ram_bytes, ins.objectid, ins.offset, + encoded->unencoded_offset, + (1 << BTRFS_ORDERED_ENCODED) | + (1 << BTRFS_ORDERED_COMPRESSED), + compression); + if (ret) { + btrfs_drop_extent_cache(BTRFS_I(inode), start, end, 0); + goto out_free_reserved; + } + btrfs_dec_block_group_reservations(fs_info, ins.objectid); + + if (start + encoded->len > inode->i_size) + i_size_write(inode, start + encoded->len); + + unlock_extent_cached(io_tree, start, end, &cached_state); + + btrfs_delalloc_release_extents(BTRFS_I(inode), num_bytes); + + if (btrfs_submit_compressed_write(BTRFS_I(inode), start, num_bytes, + ins.objectid, ins.offset, pages, + nr_pages, 0, NULL, false)) { + btrfs_writepage_endio_finish_ordered(BTRFS_I(inode), pages[0], + start, end, 0); + ret = -EIO; + goto out_pages; + } + ret = orig_count; + goto out; + +out_free_reserved: + btrfs_dec_block_group_reservations(fs_info, ins.objectid); + btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1); +out_delalloc_release: + btrfs_delalloc_release_extents(BTRFS_I(inode), num_bytes); + btrfs_delalloc_release_metadata(BTRFS_I(inode), disk_num_bytes, + ret < 0); +out_qgroup_free_data: + if (ret < 0) { + btrfs_qgroup_free_data(BTRFS_I(inode), data_reserved, start, + num_bytes); + } +out_free_data_space: + /* + * If btrfs_reserve_extent() succeeded, then we already decremented + * bytes_may_use. + */ + if (!extent_reserved) + btrfs_free_reserved_data_space_noquota(fs_info, disk_num_bytes); +out_unlock: + unlock_extent_cached(io_tree, start, end, &cached_state); +out_pages: + for (i = 0; i < nr_pages; i++) { + if (pages[i]) + __free_page(pages[i]); + } + kvfree(pages); +out: + if (ret >= 0) + iocb->ki_pos += encoded->len; + return ret; +} + #ifdef CONFIG_SWAP /* * Add an entry indicating a block group or device which is pinned by a diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index f0c575223d88..ea78aa2e8ff3 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -105,6 +105,8 @@ struct btrfs_ioctl_encoded_io_args_32 { #define BTRFS_IOC_ENCODED_READ_32 _IOR(BTRFS_IOCTL_MAGIC, 64, \ struct btrfs_ioctl_encoded_io_args_32) +#define BTRFS_IOC_ENCODED_WRITE_32 _IOW(BTRFS_IOCTL_MAGIC, 64, \ + struct btrfs_ioctl_encoded_io_args_32) #endif /* Mask out flags that are inappropriate for the given type of inode. */ @@ -4961,6 +4963,102 @@ static int btrfs_ioctl_encoded_read(struct file *file, void __user *argp, return ret; } +static int btrfs_ioctl_encoded_write(struct file *file, void __user *argp, + bool compat) +{ + struct btrfs_ioctl_encoded_io_args args; + struct iovec iovstack[UIO_FASTIOV]; + struct iovec *iov = iovstack; + struct iov_iter iter; + loff_t pos; + struct kiocb kiocb; + ssize_t ret; + + if (!capable(CAP_SYS_ADMIN)) { + ret = -EPERM; + goto out_acct; + } + + if (!(file->f_mode & FMODE_WRITE)) { + ret = -EBADF; + goto out_acct; + } + + if (compat) { +#if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT) + struct btrfs_ioctl_encoded_io_args_32 args32; + + if (copy_from_user(&args32, argp, sizeof(args32))) { + ret = -EFAULT; + goto out_acct; + } + args.iov = compat_ptr(args32.iov); + args.iovcnt = args32.iovcnt; + memcpy(&args.offset, &args32.offset, + sizeof(args) - + offsetof(struct btrfs_ioctl_encoded_io_args, offset)); +#else + return -ENOTTY; +#endif + } else { + if (copy_from_user(&args, argp, sizeof(args))) { + ret = -EFAULT; + goto out_acct; + } + } + + ret = -EINVAL; + if (args.flags != 0) + goto out_acct; + if (memchr_inv(args.reserved, 0, sizeof(args.reserved))) + goto out_acct; + if (args.compression == BTRFS_ENCODED_IO_COMPRESSION_NONE && + args.encryption == BTRFS_ENCODED_IO_ENCRYPTION_NONE) + goto out_acct; + if (args.compression >= BTRFS_ENCODED_IO_COMPRESSION_TYPES || + args.encryption >= BTRFS_ENCODED_IO_ENCRYPTION_TYPES) + goto out_acct; + if (args.unencoded_offset > args.unencoded_len) + goto out_acct; + if (args.len > args.unencoded_len - args.unencoded_offset) + goto out_acct; + + ret = import_iovec(WRITE, args.iov, args.iovcnt, ARRAY_SIZE(iovstack), + &iov, &iter); + if (ret < 0) + goto out_acct; + + file_start_write(file); + + if (iov_iter_count(&iter) == 0) { + ret = 0; + goto out_end_write; + } + pos = args.offset; + ret = rw_verify_area(WRITE, file, &pos, args.len); + if (ret < 0) + goto out_end_write; + + init_sync_kiocb(&kiocb, file); + ret = kiocb_set_rw_flags(&kiocb, 0); + if (ret) + goto out_end_write; + kiocb.ki_pos = pos; + + ret = btrfs_do_write_iter(&kiocb, &iter, &args); + if (ret > 0) + fsnotify_modify(file); + +out_end_write: + file_end_write(file); + kfree(iov); +out_acct: + if (ret > 0) + add_wchar(current, ret); + inc_syscw(current); + return ret; +} + long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -5107,9 +5205,13 @@ long btrfs_ioctl(struct file *file, unsigned int return fsverity_ioctl_measure(file, argp); case BTRFS_IOC_ENCODED_READ: return btrfs_ioctl_encoded_read(file, argp, false); + case BTRFS_IOC_ENCODED_WRITE: + return btrfs_ioctl_encoded_write(file, argp, false); #if defined(CONFIG_64BIT) && defined(CONFIG_COMPAT) case BTRFS_IOC_ENCODED_READ_32: return btrfs_ioctl_encoded_read(file, argp, true); + case BTRFS_IOC_ENCODED_WRITE_32: + return btrfs_ioctl_encoded_write(file, argp, true); #endif } diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index 5e4c59b00b01..7837336ab4b9 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -521,9 +521,15 @@ void btrfs_remove_ordered_extent(struct btrfs_inode *btrfs_inode, spin_lock(&btrfs_inode->lock); btrfs_mod_outstanding_extents(btrfs_inode, -1); spin_unlock(&btrfs_inode->lock); - if (root != fs_info->tree_root) - btrfs_delalloc_release_metadata(btrfs_inode, entry->num_bytes, - false); + if (root != fs_info->tree_root) { + u64 release; + + if (test_bit(BTRFS_ORDERED_ENCODED, &entry->flags)) + release = entry->disk_num_bytes; + else + release = entry->num_bytes; + btrfs_delalloc_release_metadata(btrfs_inode, release, false); + } percpu_counter_add_batch(&fs_info->ordered_bytes, -entry->num_bytes, fs_info->delalloc_batch); diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h index 0feb0c29839e..aeb17714ba5a 100644 --- a/fs/btrfs/ordered-data.h +++ b/fs/btrfs/ordered-data.h @@ -74,6 +74,8 @@ enum { BTRFS_ORDERED_LOGGED_CSUM, /* We wait for this extent to complete in the current transaction */ BTRFS_ORDERED_PENDING, + /* BTRFS_IOC_ENCODED_WRITE */ + BTRFS_ORDERED_ENCODED, }; /* BTRFS_ORDERED_* flags that specify the type of the extent. */ @@ -81,7 +83,8 @@ enum { (1UL << BTRFS_ORDERED_NOCOW) | \ (1UL << BTRFS_ORDERED_PREALLOC) | \ (1UL << BTRFS_ORDERED_COMPRESSED) | \ - (1UL << BTRFS_ORDERED_DIRECT)) + (1UL << BTRFS_ORDERED_DIRECT) | \ + (1UL << BTRFS_ORDERED_ENCODED)) struct btrfs_ordered_extent { /* logical offset in the file */ From patchwork Wed Nov 17 20:19:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CBD2C433F5 for ; Wed, 17 Nov 2021 20:20:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8454E61AF0 for ; Wed, 17 Nov 2021 20:20:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240659AbhKQUXV (ORCPT ); Wed, 17 Nov 2021 15:23:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240665AbhKQUXL (ORCPT ); Wed, 17 Nov 2021 15:23:11 -0500 Received: from mail-qt1-x836.google.com (mail-qt1-x836.google.com [IPv6:2607:f8b0:4864:20::836]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FD00C061764 for ; Wed, 17 Nov 2021 12:20:12 -0800 (PST) Received: by mail-qt1-x836.google.com with SMTP id l8so3879287qtk.6 for ; Wed, 17 Nov 2021 12:20:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qBQva0FUvt3/5vG/ecPqeKPJFo4tO0Jrs3EWII0VQ54=; b=IHrt7ef/F10KfuYQXnzg26k8VxkjppqEW+UIqeXu1EF5M5waQcu+xtaffU1noxOlcj l7ii6BQYYhwr+tHs+Hinpv3MFJ/kZn67KqSZQg2LBdrWBREDzraoWQ2HvRS7zX4Kozhn qkSCyI7iw7e8RaAQSrCMlIk1e+ZCAqmFDSeOuhlb78EUTg0iGwII8TVcLCLrbX98ADS/ 8T5k0Cr8/t+oR7fTbwv16sePnhXUxIwRxa6Hl+S3QrcQH9VgF+jKniPcMJioRRLGItwx 6Zfao4Gqnjlhhx5Wb4FLuxtPG8e7lRqHJcDMHNThlJcL8rhe5oAvm1nPpW+/8hWb1hKy 4GxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qBQva0FUvt3/5vG/ecPqeKPJFo4tO0Jrs3EWII0VQ54=; b=UNC1IBWWGxYuaYgoEAOcprhBs7pB3ptRuVU6YRJTZq8wOlbO++0Rgbdt2PZtsTXz6e 5uyQfSWIqihnbbO02GeVvV8MKIhAWdohyLjKAv5ywnj6/yb4m0UORKY/fqcgJMeO9eSv mZAdvwGHCBNbFURGAhnNl068Y36JCaUZ8k3oWKUyjOfiyf1DNOaxeHzK7NaYzjIpwtuG DTYmxj3s79M8wIxizP8x3X89Wq6Bx1XMSq/hJ2rkDs4RcdWMcj6/BKJvXXS6zq9drXIX GzzF7RUWo2mUgFRkAIsW7gQDDxJhozkUus+xdsq3bdlWUfkMdWCdMc1mvKwxdsaA0IQH RRtw== X-Gm-Message-State: AOAM532zvD7qwU5rAW7mo3vBfrkbzCzl3fldt0LIh0Km++Poog7sWAmC e58HK47jQIgbIgouO4XcI4dQdiXNf/Oy1Q== X-Google-Smtp-Source: ABdhPJx9r/04O5IehEisXGLFVPtDNv3IDa+/CJYNLMdJA2sOVbMwtnMqx364AstNjI78b9Nj/KO3wQ== X-Received: by 2002:a05:622a:18e:: with SMTP id s14mr20896510qtw.203.1637180411174; Wed, 17 Nov 2021 12:20:11 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:10 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 11/17] btrfs: send: remove unused send_ctx::{total,cmd}_send_size Date: Wed, 17 Nov 2021 12:19:21 -0800 Message-Id: <27b1a93caae1666ea29ea76f3eb6a7aa2878e65e.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval We collect these statistics but have never used them for anything. Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 6bdcb9d481d5..500b866ede43 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -81,8 +81,6 @@ struct send_ctx { char *send_buf; u32 send_size; u32 send_max_size; - u64 total_send_size; - u64 cmd_send_size[BTRFS_SEND_C_MAX + 1]; u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ /* Protocol version compatibility requested */ u32 proto; @@ -722,8 +720,6 @@ static int send_cmd(struct send_ctx *sctx) ret = write_buf(sctx->send_filp, sctx->send_buf, sctx->send_size, &sctx->send_off); - sctx->total_send_size += sctx->send_size; - sctx->cmd_send_size[get_unaligned_le16(&hdr->cmd)] += sctx->send_size; sctx->send_size = 0; return ret; From patchwork Wed Nov 17 20:19:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CD40C433EF for ; Wed, 17 Nov 2021 20:20:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 02D4061AEC for ; Wed, 17 Nov 2021 20:20:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240676AbhKQUXW (ORCPT ); Wed, 17 Nov 2021 15:23:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53090 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240673AbhKQUXM (ORCPT ); Wed, 17 Nov 2021 15:23:12 -0500 Received: from mail-qk1-x733.google.com (mail-qk1-x733.google.com [IPv6:2607:f8b0:4864:20::733]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39D72C061570 for ; Wed, 17 Nov 2021 12:20:13 -0800 (PST) Received: by mail-qk1-x733.google.com with SMTP id a11so3864147qkh.13 for ; Wed, 17 Nov 2021 12:20:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FSd9FSAHvwRStKfG5eBvhbiT9acKh6fdEyQqg0VRLcU=; b=vzuaJNx8HlHEgIWVqBIgPcRxmkatk4PupbaLzlPs9ZdJX0inwPYn22x9nFKmtzZNsD y/V7MBJ3mDSgGlMrGUUYe9N8TH+w2eBMEhl9l38J4xIqeEUdDq0fFv7nMOhCrjnl/vg0 vox7WTYL88mUPEtqeKgDe2CXJeCj76BcEyTWiDyI4COlgvTbsXRdm9f5OHMiS2IwfV4w Z9T93bwlQ8McRwo7CwkxSOCJp+XB0F9z5NBx8Kl6R1ubs54viO47L7cTHu3c/G5tjyAO rskK5JZTy6V2SqTHjBFsyKmCmPW9/H26e8e2jBW/ttyXCcN9CtZkfKLZ2a9sOIKj/pD/ I0/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FSd9FSAHvwRStKfG5eBvhbiT9acKh6fdEyQqg0VRLcU=; b=zEo9Byiurc9I5wqzF6f0KTQAO8OjktTaoLotvgS1WnyQYLyazSfCW/nexi9maK5CjV 4vdkKL9qTO2kOghZU9224YXwXuyipww+r3WLSaUqFrc5JlcHnMk9FjOwRDt0GY48cwmg 4WFi8INNvrvmCj1I4YogUEPmUTNRkyNOHrppnf2p2SMGqEvBCx0DADmRLAGZRuFlvQYW xm67BZOmpySpSjr4bUTPShUIbf594vYj1Hgh+yoQ8cxtKnVSonHeATv5XcfezA+bhyqW O6qfwvcRO3GThqNtn2qFnQxwiNDxLXFfyuZJyMFFXl5MikOBEfyOWpRj3mWgqYRGQwXS RbCg== X-Gm-Message-State: AOAM533B9NAdfu9pFdBF2M2Us9gxzSTsCf6wwZ+qJ5yC5CHZxWCyzkzN nxzJ5IHpJz+8iQNIrFiEPCJ8ORI/H85QcA== X-Google-Smtp-Source: ABdhPJzGuH1CjkgxwLeG5C9AvF4iZGpeo3U5jIVbiVJVGhqr9CPkxwxVovj5nYcmI0qGHhX7Lq5KdA== X-Received: by 2002:a05:620a:d58:: with SMTP id o24mr15180250qkl.517.1637180412238; Wed, 17 Nov 2021 12:20:12 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:11 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 12/17] btrfs: send: fix maximum command numbering Date: Wed, 17 Nov 2021 12:19:22 -0800 Message-Id: X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Commit e77fbf990316 ("btrfs: send: prepare for v2 protocol") added _BTRFS_SEND_C_MAX_V* macros equal to the maximum command number for the version plus 1, but as written this creates gaps in the number space. The maximum command number is currently 22, and __BTRFS_SEND_C_MAX_V1 is accordingly 23. But then __BTRFS_SEND_C_MAX_V2 is 24, suggesting that v2 has a command numbered 23, and __BTRFS_SEND_C_MAX is 25, suggesting that 23 and 24 are valid commands. Instead, let's explicitly set BTRFS_SEND_C_MAX_V* to the maximum command number. This requires repeating the command name, but it has a clearer meaning and avoids gaps. It also doesn't require updating __BTRFS_SEND_C_MAX for every new version. Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 4 ++-- fs/btrfs/send.h | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 500b866ede43..450c873684e8 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -316,8 +316,8 @@ __maybe_unused static bool proto_cmd_ok(const struct send_ctx *sctx, int cmd) { switch (sctx->proto) { - case 1: return cmd < __BTRFS_SEND_C_MAX_V1; - case 2: return cmd < __BTRFS_SEND_C_MAX_V2; + case 1: return cmd <= BTRFS_SEND_C_MAX_V1; + case 2: return cmd <= BTRFS_SEND_C_MAX_V2; default: return false; } } diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 23bcefc84e49..59a4be3b09cd 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -77,10 +77,10 @@ enum btrfs_send_cmd { BTRFS_SEND_C_END, BTRFS_SEND_C_UPDATE_EXTENT, - __BTRFS_SEND_C_MAX_V1, + BTRFS_SEND_C_MAX_V1 = BTRFS_SEND_C_UPDATE_EXTENT, /* Version 2 */ - __BTRFS_SEND_C_MAX_V2, + BTRFS_SEND_C_MAX_V2 = BTRFS_SEND_C_MAX_V1, /* End */ __BTRFS_SEND_C_MAX, From patchwork Wed Nov 17 20:19:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AB8CC4332F for ; Wed, 17 Nov 2021 20:20:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 52D5B61B9F for ; Wed, 17 Nov 2021 20:20:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240678AbhKQUXW (ORCPT ); Wed, 17 Nov 2021 15:23:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240675AbhKQUXN (ORCPT ); Wed, 17 Nov 2021 15:23:13 -0500 Received: from mail-qk1-x736.google.com (mail-qk1-x736.google.com [IPv6:2607:f8b0:4864:20::736]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30A75C061764 for ; Wed, 17 Nov 2021 12:20:14 -0800 (PST) Received: by mail-qk1-x736.google.com with SMTP id a11so3864189qkh.13 for ; Wed, 17 Nov 2021 12:20:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OHlERDHKyp9NFrg1TfRbEDXur2CZfF7mBtc6t/fE2V8=; b=cnSrLtBb/UjIx7e/RHQoXm3XtMkrQE1+PCztE7evTXOFHhYE/s9pQu9UFOnAPS9jww sTP9S4Flu80UAlaNo7NLRXE9OPJj2mrt/vo833QNGwG1FVEuq9W/QYAiitahLN2G+hK8 4HvoZQmRC2WFHAsSM/xmFcKdwLyZ/hQHdwWAHi8mHlKBOllxRWtas6tfRCIp2rf6D31w DVJD/JUnSPmZ5WASmoURJRwFhMTt5JGbos/OgIJF5VCT5fFv5JjAXXikElGAQGXIH+Em 0S4185xeB7mT0YhvdzDNObJ6X1n1kSs6IYr6PKDYKGbnpMdNw9OSdpHbQqTIknkjdxly mEXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OHlERDHKyp9NFrg1TfRbEDXur2CZfF7mBtc6t/fE2V8=; b=GmEtuAfwcqfjxCIEa+KIEM26J1p2OnCefrhwl/7ltjMQZmi7OPXK236fcUaXp8UFz0 Amua9ynv27CAv1/6yAakiUhLQ7t8Ix22m0MKEHYgxPad9XOzfwxEoRp3XjhLCtGwj3jI l8vDNZy/eB1jAIRty141PYqFp12HMXbYkhcCQlgG/19KoZ7HWc67TG7qJk5Svfky65vR MJvyvrezaOBoVOoi+oHDkz+dZqpqgPmlBQZ9E26YM2qZ62bOveUR1qGm/YwiJzbKJz9W gZmFc03NMTcnCplfGPYC2yl5ttRLHHmS2cyKaClem21S2p5+pZVOWwCm40Ya9ZTYz5Hu 1OUQ== X-Gm-Message-State: AOAM530D8xPoaaNR79FNWf5nIGXG394gSKlHbOfscWu6bhQZnueLm010 C6ap8yYLSwe8t7XxbZlY/B1kTE4iGWxh1Q== X-Google-Smtp-Source: ABdhPJxeq3CgeDAbUbufE4hBolDaon4mXShFLLa1lpuZiCP0oPGlJLZkhT+F1RugDEVm3RRVAWS7HQ== X-Received: by 2002:a05:620a:24c9:: with SMTP id m9mr15099164qkn.317.1637180413071; Wed, 17 Nov 2021 12:20:13 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:12 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 13/17] btrfs: add send stream v2 definitions Date: Wed, 17 Nov 2021 12:19:23 -0800 Message-Id: <09f343aa74b5cb2ce0a715f90c71ae64ae74ce94.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval This adds the definitions of the new commands for send stream version 2 and their respective attributes: fallocate, FS_IOC_SETFLAGS (a.k.a. chattr), and encoded writes. It also documents two changes to the send stream format in v2: the receiver shouldn't assume a maximum command size, and the DATA attribute is encoded differently to allow for writes larger than 64k. These will be implemented in subsequent changes, and then the ioctl will accept the new version and flag. Reviewed-by: Josef Bacik Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 2 +- fs/btrfs/send.h | 35 +++++++++++++++++++++++++++++++++-- include/uapi/linux/btrfs.h | 7 +++++++ 3 files changed, 41 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 450c873684e8..53b3cc2276ea 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -7292,7 +7292,7 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) sctx->clone_roots_cnt = arg->clone_sources_count; - sctx->send_max_size = BTRFS_SEND_BUF_SIZE; + sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1; sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL); if (!sctx->send_buf) { ret = -ENOMEM; diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 59a4be3b09cd..50c2284f08af 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -12,7 +12,11 @@ #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream" #define BTRFS_SEND_STREAM_VERSION 1 -#define BTRFS_SEND_BUF_SIZE SZ_64K +/* + * In send stream v1, no command is larger than 64k. In send stream v2, no limit + * should be assumed. + */ +#define BTRFS_SEND_BUF_SIZE_V1 SZ_64K enum btrfs_tlv_type { BTRFS_TLV_U8, @@ -80,7 +84,10 @@ enum btrfs_send_cmd { BTRFS_SEND_C_MAX_V1 = BTRFS_SEND_C_UPDATE_EXTENT, /* Version 2 */ - BTRFS_SEND_C_MAX_V2 = BTRFS_SEND_C_MAX_V1, + BTRFS_SEND_C_FALLOCATE, + BTRFS_SEND_C_SETFLAGS, + BTRFS_SEND_C_ENCODED_WRITE, + BTRFS_SEND_C_MAX_V2 = BTRFS_SEND_C_ENCODED_WRITE, /* End */ __BTRFS_SEND_C_MAX, @@ -91,6 +98,7 @@ enum btrfs_send_cmd { enum { BTRFS_SEND_A_UNSPEC, + /* Version 1 */ BTRFS_SEND_A_UUID, BTRFS_SEND_A_CTRANSID, @@ -113,6 +121,11 @@ enum { BTRFS_SEND_A_PATH_LINK, BTRFS_SEND_A_FILE_OFFSET, + /* + * In send stream v2, this attribute is special: it must be the last + * attribute in a command, its header contains only the type, and its + * length is implicitly the remaining length of the command. + */ BTRFS_SEND_A_DATA, BTRFS_SEND_A_CLONE_UUID, @@ -120,7 +133,25 @@ enum { BTRFS_SEND_A_CLONE_PATH, BTRFS_SEND_A_CLONE_OFFSET, BTRFS_SEND_A_CLONE_LEN, + BTRFS_SEND_A_MAX_V1 = BTRFS_SEND_A_CLONE_LEN, + /* Version 2 */ + BTRFS_SEND_A_FALLOCATE_MODE, + + BTRFS_SEND_A_SETFLAGS_FLAGS, + + BTRFS_SEND_A_UNENCODED_FILE_LEN, + BTRFS_SEND_A_UNENCODED_LEN, + BTRFS_SEND_A_UNENCODED_OFFSET, + /* + * COMPRESSION and ENCRYPTION default to NONE (0) if omitted from + * BTRFS_SEND_C_ENCODED_WRITE. + */ + BTRFS_SEND_A_COMPRESSION, + BTRFS_SEND_A_ENCRYPTION, + BTRFS_SEND_A_MAX_V2 = BTRFS_SEND_A_ENCRYPTION, + + /* End */ __BTRFS_SEND_A_MAX, }; #define BTRFS_SEND_A_MAX (__BTRFS_SEND_A_MAX - 1) diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 7505acfa18d7..9d5fbe8c36c4 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -776,6 +776,13 @@ struct btrfs_ioctl_received_subvol_args { */ #define BTRFS_SEND_FLAG_VERSION 0x8 +/* + * Send compressed data using the ENCODED_WRITE command instead of decompressing + * the data and sending it with the WRITE command. This requires protocol + * version >= 2. + */ +#define BTRFS_SEND_FLAG_COMPRESSED 0x10 + #define BTRFS_SEND_FLAG_MASK \ (BTRFS_SEND_FLAG_NO_FILE_DATA | \ BTRFS_SEND_FLAG_OMIT_STREAM_HEADER | \ From patchwork Wed Nov 17 20:19:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10B30C433F5 for ; Wed, 17 Nov 2021 20:20:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EEF3D61AEC for ; Wed, 17 Nov 2021 20:20:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240683AbhKQUXX (ORCPT ); Wed, 17 Nov 2021 15:23:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53102 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240677AbhKQUXO (ORCPT ); Wed, 17 Nov 2021 15:23:14 -0500 Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B6DCC061570 for ; Wed, 17 Nov 2021 12:20:15 -0800 (PST) Received: by mail-qt1-x835.google.com with SMTP id v22so3861507qtx.8 for ; Wed, 17 Nov 2021 12:20:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NYKW0sOXiErr7iXsY6pXtFa7lVn3lMfvSbAnD/gJyhA=; b=0wdJvZLGsRCD3FaUh+i7U74IuxYAm1HwakOP0YELIr+4q4o5B5aUemx9MpUqz6dzsF raB3DIzzeFIQ5mhQrU5CvR8JRxIBF8WAVntOgbOujApVTqkFM+NJ5E68DSsjkxOE7/s4 /EV6MbZmB552cOyvdGxw/4ZQUVlEDwoLyGxzxIiPTKEzCCyaNDcU7UHJu4xP5YBNJFBV 9fRj2+pw/G7jp7gDlRt8T7qZr+pg6qlJaPjtFpOJC/DDCc8BG9vGNRYhsSONCbLy2kn7 SOglMP21SH8fOXmu1fk2hqREc5PgcVS//jjX8/HRsjqpAuVmskrxamRpYk2K3f3F9wVa O18A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NYKW0sOXiErr7iXsY6pXtFa7lVn3lMfvSbAnD/gJyhA=; b=0SQT4GOp1cNTP4ZiozP3cbx0APYN++t5Pq5hxfajuQDIx9iCNEvPfwJsX62f7fewe8 Musd7pRM/n1ayNQ003dzVzGs2bqzeujUAf7N8mnmyx+jjYzFR6vc74Lm/xsda2FV7nnd hfNVHMw7XkhcckP3CsKNaJofmkVd3xiPfTdzERtimbTTfGcd8m3ZRAnSG4wjO3GctlXZ GPC+s5wgkN+MFl3jeb75ol91DKCXZzMJ8YsEilXygGkD3zEA05o/v85ssp2cuKpQcBtG WoCh6eAYKpQMNOIWu6QRS2SDnIRB3eA/B4ujs89kpNdZgbu1DSZCAnZa5DcaLYOWN1Rs 5ISg== X-Gm-Message-State: AOAM533bxTzPBmrxpmWinYJoly4XGGFG+39ECnis3Wvkj+1MBSnUkgFL mfmbPy52/+SUauv1l1v6uc9i/H3f/5PoRQ== X-Google-Smtp-Source: ABdhPJzrpTdgvwhwMT4hbyvwxNooNq65EGP39fGmHB+H3y6gMMQKQ2QlxCyUcJ+sxoqlIDezj+2Y9g== X-Received: by 2002:ac8:57ce:: with SMTP id w14mr19994518qta.252.1637180413987; Wed, 17 Nov 2021 12:20:13 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:13 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 14/17] btrfs: send: write larger chunks when using stream v2 Date: Wed, 17 Nov 2021 12:19:24 -0800 Message-Id: <051232485e4ac1a1a5fd35de7328208385c59f65.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval The length field of the send stream TLV header is 16 bits. This means that the maximum amount of data that can be sent for one write is 64k minus one. However, encoded writes must be able to send the maximum compressed extent (128k) in one command. To support this, send stream version 2 encodes the DATA attribute differently: it has no length field, and the length is implicitly up to the end of containing command (which has a 32-bit length field). Although this is necessary for encoded writes, normal writes can benefit from it, too. Also add a check to enforce that the DATA attribute is last. It is only strictly necessary for v2, but we might as well make v1 consistent with it. For v2, let's bump up the send buffer to the maximum compressed extent size plus 16k for the other metadata (144k total). Since this will most likely be vmalloc'd (and always will be after the next commit), we round it up to the next page since we might as well use the rest of the page on systems with >16k pages. Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 42 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 34 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 53b3cc2276ea..12844cb20584 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -81,6 +81,7 @@ struct send_ctx { char *send_buf; u32 send_size; u32 send_max_size; + bool put_data; u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ /* Protocol version compatibility requested */ u32 proto; @@ -584,6 +585,9 @@ static int tlv_put(struct send_ctx *sctx, u16 attr, const void *data, int len) int total_len = sizeof(*hdr) + len; int left = sctx->send_max_size - sctx->send_size; + if (WARN_ON_ONCE(sctx->put_data)) + return -EINVAL; + if (unlikely(left < total_len)) return -EOVERFLOW; @@ -721,6 +725,7 @@ static int send_cmd(struct send_ctx *sctx) &sctx->send_off); sctx->send_size = 0; + sctx->put_data = false; return ret; } @@ -4902,14 +4907,30 @@ static inline u64 max_send_read_size(const struct send_ctx *sctx) static int put_data_header(struct send_ctx *sctx, u32 len) { - struct btrfs_tlv_header *hdr; + if (WARN_ON_ONCE(sctx->put_data)) + return -EINVAL; + sctx->put_data = true; + if (sctx->proto >= 2) { + /* + * In v2, the data attribute header doesn't include a length; it + * is implicitly to the end of the command. + */ + if (sctx->send_max_size - sctx->send_size < 2 + len) + return -EOVERFLOW; + put_unaligned_le16(BTRFS_SEND_A_DATA, + sctx->send_buf + sctx->send_size); + sctx->send_size += 2; + } else { + struct btrfs_tlv_header *hdr; - if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len) - return -EOVERFLOW; - hdr = (struct btrfs_tlv_header *)(sctx->send_buf + sctx->send_size); - put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type); - put_unaligned_le16(len, &hdr->tlv_len); - sctx->send_size += sizeof(*hdr); + if (sctx->send_max_size - sctx->send_size < sizeof(*hdr) + len) + return -EOVERFLOW; + hdr = (struct btrfs_tlv_header *)(sctx->send_buf + + sctx->send_size); + put_unaligned_le16(BTRFS_SEND_A_DATA, &hdr->tlv_type); + put_unaligned_le16(len, &hdr->tlv_len); + sctx->send_size += sizeof(*hdr); + } return 0; } @@ -7292,7 +7313,12 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) sctx->clone_roots_cnt = arg->clone_sources_count; - sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1; + if (sctx->proto >= 2) { + sctx->send_max_size = ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED, + PAGE_SIZE); + } else { + sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1; + } sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL); if (!sctx->send_buf) { ret = -ENOMEM; From patchwork Wed Nov 17 20:19:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00D7EC433EF for ; Wed, 17 Nov 2021 20:20:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E00EC61B43 for ; Wed, 17 Nov 2021 20:20:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240699AbhKQUX1 (ORCPT ); Wed, 17 Nov 2021 15:23:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53108 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240680AbhKQUXP (ORCPT ); Wed, 17 Nov 2021 15:23:15 -0500 Received: from mail-qt1-x82c.google.com (mail-qt1-x82c.google.com [IPv6:2607:f8b0:4864:20::82c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C706C061764 for ; Wed, 17 Nov 2021 12:20:16 -0800 (PST) Received: by mail-qt1-x82c.google.com with SMTP id m25so3837418qtq.13 for ; Wed, 17 Nov 2021 12:20:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=K5Zki3IDt4r6PccCuQgtvcWDAYx/BgmU2YdgzEIORh8=; b=JNCQIDUtEmyVP0r123eqEtSUSjuERhrlbyt5BKU3uEiZwmxwrIKfq4Skcxd7ddyjmv A7crQDBrVA/76vqLmPwgvQqYLbbMqI8bXh+CPJQhYwmRMQRGmj+20UwWOtmg136M5WSS 6JotuPUYvONPn6HuUudaLoHNIFQEV4tT5KTkVyeHbyNcayd1WncjYrjTblh7kYZWOeMg A9Sp4JaiBXGVhHf5N3LLWQUbWOSbfxBkFQth9Anx9L7WSmsimJFhhKVypjiaEttF83Lx tlo0/qv+IlNzklGrxA/z+gLpOoejsYaUWlhs8Un4OvrgmbFmiKpapQQwtx94QS4PwEQl 2PBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=K5Zki3IDt4r6PccCuQgtvcWDAYx/BgmU2YdgzEIORh8=; b=5qszRgxj1nm11Jhr4YKsFd5qb8qKeDt/XHof/1HC0++iitOWfzVN6VsPZu88IWVcr3 Bx61jhU09+K9DNbDhrUd8pUrhYGIdK/gEDG3qty8x7BWKk8Kw85l2HfqnhSIc/q5Io+h 6C8qWytBwNULBSGJgz8OASBJ5TS2wRJnYy2jyoEFK7b24fi/sVXwwO0AxBpRZY/b3pVb DUj68j1UBBDa8GuLwjdANBe+6VtJD4pgMJMuPo0ydPf37On5M0MQPZ26mFmDpGsVzYuO iA+P50OB93z58BTSyH0fHGgKQeRw0EhOFLxiTlRDTTtmKDl7LmsDU4c1eBqh8TQFSndW A6lg== X-Gm-Message-State: AOAM5315IdFeTr2paLeDACaFCx3OYWft0JATPpXKYYgdzZ0sNTM4wWJi QPpW+wLosf1We1htPm8IwR++GcvEUr2VVA== X-Google-Smtp-Source: ABdhPJyzb8E1VJ9VX0VbLLyKqh8Zk2wKKNcCOY41eUjkf3FWEWuoaZhX/gBSArzs1MUInf/uG71i9A== X-Received: by 2002:ac8:5c50:: with SMTP id j16mr20418863qtj.255.1637180414922; Wed, 17 Nov 2021 12:20:14 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:14 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 15/17] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Date: Wed, 17 Nov 2021 12:19:25 -0800 Message-Id: <0a04aed8ad4bfd2137afbf5a5ffb6f07080c3d31.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval For encoded writes, we need the raw pages for reading compressed data directly via a bio. So, replace kvmalloc() with vmap() so we have access to the raw pages. 144k is large enough that it usually gets allocated with vmalloc(), anyways. Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 12844cb20584..8493335ef47a 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -82,6 +82,7 @@ struct send_ctx { u32 send_size; u32 send_max_size; bool put_data; + struct page **send_buf_pages; u64 flags; /* 'flags' member of btrfs_ioctl_send_args is u64 */ /* Protocol version compatibility requested */ u32 proto; @@ -7225,6 +7226,7 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) struct btrfs_root *clone_root; struct send_ctx *sctx = NULL; u32 i; + u32 send_buf_num_pages = 0; u64 *clone_sources_tmp = NULL; int clone_sources_to_rollback = 0; size_t alloc_size; @@ -7316,10 +7318,28 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) if (sctx->proto >= 2) { sctx->send_max_size = ALIGN(SZ_16K + BTRFS_MAX_COMPRESSED, PAGE_SIZE); + send_buf_num_pages = sctx->send_max_size >> PAGE_SHIFT; + sctx->send_buf_pages = kcalloc(send_buf_num_pages, + sizeof(*sctx->send_buf_pages), + GFP_KERNEL); + if (!sctx->send_buf_pages) { + send_buf_num_pages = 0; + ret = -ENOMEM; + goto out; + } + for (i = 0; i < send_buf_num_pages; i++) { + sctx->send_buf_pages[i] = alloc_page(GFP_KERNEL); + if (!sctx->send_buf_pages[i]) { + ret = -ENOMEM; + goto out; + } + } + sctx->send_buf = vmap(sctx->send_buf_pages, send_buf_num_pages, + VM_MAP, PAGE_KERNEL); } else { sctx->send_max_size = BTRFS_SEND_BUF_SIZE_V1; + sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL); } - sctx->send_buf = kvmalloc(sctx->send_max_size, GFP_KERNEL); if (!sctx->send_buf) { ret = -ENOMEM; goto out; @@ -7526,7 +7546,16 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) fput(sctx->send_filp); kvfree(sctx->clone_roots); - kvfree(sctx->send_buf); + if (sctx->proto >= 2) { + vunmap(sctx->send_buf); + for (i = 0; i < send_buf_num_pages; i++) { + if (sctx->send_buf_pages[i]) + __free_page(sctx->send_buf_pages[i]); + } + kfree(sctx->send_buf_pages); + } else { + kvfree(sctx->send_buf); + } name_cache_free(sctx); From patchwork Wed Nov 17 20:19:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03C9DC433EF for ; Wed, 17 Nov 2021 20:20:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFFA561AF0 for ; Wed, 17 Nov 2021 20:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240695AbhKQUXZ (ORCPT ); Wed, 17 Nov 2021 15:23:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53112 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240682AbhKQUXQ (ORCPT ); Wed, 17 Nov 2021 15:23:16 -0500 Received: from mail-qt1-x831.google.com (mail-qt1-x831.google.com [IPv6:2607:f8b0:4864:20::831]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 262D1C061570 for ; Wed, 17 Nov 2021 12:20:17 -0800 (PST) Received: by mail-qt1-x831.google.com with SMTP id l8so3879513qtk.6 for ; Wed, 17 Nov 2021 12:20:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=31OBZqgxa9l00VrefdLfSm02IqlsQlTwMAg8qYGjpSs=; b=WcRAb9uM/eTratyqcgarj7Vm90KkhyKYKA1X0tGva41FaUJZLHcfjm9rXsSicW0DXX qRhv9XZB4ccu5b1z5ab5ICLkZJgVhPOu3pb6w5IWPV0foMZ5k1r0e98kiMdq2Wroh7jy 4D756qWl6ydUmVV/4WJJFmbzJgFHjgRrZ/f7HIf7SxDphyE/gXRVA1294d/7plu1xYG1 6WyTABcl1zPTfThJLEJmJX4CtH3VayducZLUk4e9QkIaTu+LCucTVt+8FHlg1q4Z3+PD XQz8bHsmg2TGsd7riP5O7Ap75RpzsxiOPBawgHo0jE1vDhbysd3H3za1utL4ph5jxU0x cy0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=31OBZqgxa9l00VrefdLfSm02IqlsQlTwMAg8qYGjpSs=; b=6szn6eYVbPtGqoYjwXeYSNNhu4s2m2M+I2ipe17FiOd2jC9gAwzmBSqoboIp3ivZzT dKsyBEKBLynOhMUGP6Y1EapEeNP9U5ANkl5weXJtsg2inOW2WyVuFPfEOuzZM06jlvIg vX4ayYHj3/LSYTkQ4WQhXcu3EFfxKFbpC4l3X+5Cf8pCv/hcc80kLfLfVd6vM49V+ukd QV6TGd/VRsQGY3P/C1PwJyf73UIG4sVH2YVjqEFrfQavB21ezBY7H/MB0Bu2QRy1waD7 rUEcR1PDNmPvEjps3ieNSC1UNdDeFgSrG8iQS6mDTDdJ4sFc+r6hl+uVJO+ys9pKmnU7 8B+Q== X-Gm-Message-State: AOAM531+Dqo55E5TymtGox4doLJwIRmHRC6k1SW7QqBo3BP6x0aIYM8U u8CFb5vaLMekRFLemSNaggGwtUgMbfoCvQ== X-Google-Smtp-Source: ABdhPJzhCvbdAHOGM1//9DphKYXxVsGQGdTDvdaDOEKYEhQt5I7Z4hAqzJDDRPoKZpEybzP1OzL/pA== X-Received: by 2002:a05:622a:3c9:: with SMTP id k9mr20141567qtx.42.1637180415803; Wed, 17 Nov 2021 12:20:15 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:15 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 16/17] btrfs: send: send compressed extents with encoded writes Date: Wed, 17 Nov 2021 12:19:26 -0800 Message-Id: <56cc5e7ef33c02e0e0cf5f2a4b9f8534178bda2a.1637179348.git.osandov@fb.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Now that all of the pieces are in place, we can use the ENCODED_WRITE command to send compressed extents when appropriate. Signed-off-by: Omar Sandoval --- fs/btrfs/ctree.h | 4 + fs/btrfs/inode.c | 10 +- fs/btrfs/send.c | 234 +++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 225 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 98defa9cce0e..af1bb06f9ca0 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3275,6 +3275,10 @@ int btrfs_writepage_cow_fixup(struct page *page); void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode, struct page *page, u64 start, u64 end, bool uptodate); +int btrfs_encoded_io_compression_from_extent(int compress_type); +int btrfs_encoded_read_regular_fill_pages(struct inode *inode, u64 file_offset, + u64 disk_bytenr, u64 disk_io_size, + struct page **pages); struct btrfs_ioctl_encoded_io_args; ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, struct btrfs_ioctl_encoded_io_args *encoded); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 6919dc170c93..92e1dea7af9a 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -10527,7 +10527,7 @@ void btrfs_set_range_writeback(struct btrfs_inode *inode, u64 start, u64 end) } } -static int btrfs_encoded_io_compression_from_extent(int compress_type) +int btrfs_encoded_io_compression_from_extent(int compress_type) { switch (compress_type) { case BTRFS_COMPRESS_NONE: @@ -10731,11 +10731,9 @@ static void btrfs_encoded_read_endio(struct bio *bio) bio_put(bio); } -static int btrfs_encoded_read_regular_fill_pages(struct inode *inode, - u64 file_offset, - u64 disk_bytenr, - u64 disk_io_size, - struct page **pages) +int btrfs_encoded_read_regular_fill_pages(struct inode *inode, u64 file_offset, + u64 disk_bytenr, u64 disk_io_size, + struct page **pages) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_encoded_read_private priv = { diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 8493335ef47a..7a6a23a63950 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -609,6 +609,7 @@ static int tlv_put(struct send_ctx *sctx, u16 attr, const void *data, int len) return tlv_put(sctx, attr, &__tmp, sizeof(__tmp)); \ } +TLV_PUT_DEFINE_INT(32) TLV_PUT_DEFINE_INT(64) static int tlv_put_string(struct send_ctx *sctx, u16 attr, @@ -5205,16 +5206,215 @@ static int send_hole(struct send_ctx *sctx, u64 end) return ret; } -static int send_extent_data(struct send_ctx *sctx, - const u64 offset, - const u64 len) +static int send_encoded_inline_extent(struct send_ctx *sctx, + struct btrfs_path *path, u64 offset, + u64 len) { + struct btrfs_root *root = sctx->send_root; + struct btrfs_fs_info *fs_info = root->fs_info; + struct inode *inode; + struct fs_path *p; + struct extent_buffer *leaf = path->nodes[0]; + struct btrfs_key key; + struct btrfs_file_extent_item *ei; + u64 ram_bytes; + size_t inline_size; + int ret; + + inode = btrfs_iget(fs_info->sb, sctx->cur_ino, root); + if (IS_ERR(inode)) + return PTR_ERR(inode); + + p = fs_path_alloc(); + if (!p) { + ret = -ENOMEM; + goto out; + } + + ret = begin_cmd(sctx, BTRFS_SEND_C_ENCODED_WRITE); + if (ret < 0) + goto out; + + ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p); + if (ret < 0) + goto out; + + btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); + ei = btrfs_item_ptr(leaf, path->slots[0], + struct btrfs_file_extent_item); + ram_bytes = btrfs_file_extent_ram_bytes(leaf, ei); + inline_size = btrfs_file_extent_inline_item_len(leaf, path->slots[0]); + + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p); + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_FILE_LEN, + min(key.offset + ram_bytes - offset, len)); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_LEN, ram_bytes); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_OFFSET, offset - key.offset); + ret = btrfs_encoded_io_compression_from_extent( + btrfs_file_extent_compression(leaf, ei)); + if (ret < 0) + goto out; + TLV_PUT_U32(sctx, BTRFS_SEND_A_COMPRESSION, ret); + + ret = put_data_header(sctx, inline_size); + if (ret < 0) + goto out; + read_extent_buffer(leaf, sctx->send_buf + sctx->send_size, + btrfs_file_extent_inline_start(ei), inline_size); + sctx->send_size += inline_size; + + ret = send_cmd(sctx); + +tlv_put_failure: +out: + fs_path_free(p); + iput(inode); + return ret; +} + +static int send_encoded_extent(struct send_ctx *sctx, struct btrfs_path *path, + u64 offset, u64 len) +{ + struct btrfs_root *root = sctx->send_root; + struct btrfs_fs_info *fs_info = root->fs_info; + struct inode *inode; + struct fs_path *p; + struct extent_buffer *leaf = path->nodes[0]; + struct btrfs_key key; + struct btrfs_file_extent_item *ei; + u64 disk_bytenr, disk_num_bytes; + u32 data_offset; + struct btrfs_cmd_header *hdr; + u32 crc; + int ret; + + inode = btrfs_iget(fs_info->sb, sctx->cur_ino, root); + if (IS_ERR(inode)) + return PTR_ERR(inode); + + p = fs_path_alloc(); + if (!p) { + ret = -ENOMEM; + goto out; + } + + ret = begin_cmd(sctx, BTRFS_SEND_C_ENCODED_WRITE); + if (ret < 0) + goto out; + + ret = get_cur_path(sctx, sctx->cur_ino, sctx->cur_inode_gen, p); + if (ret < 0) + goto out; + + btrfs_item_key_to_cpu(leaf, &key, path->slots[0]); + ei = btrfs_item_ptr(leaf, path->slots[0], + struct btrfs_file_extent_item); + disk_bytenr = btrfs_file_extent_disk_bytenr(leaf, ei); + disk_num_bytes = btrfs_file_extent_disk_num_bytes(leaf, ei); + + TLV_PUT_PATH(sctx, BTRFS_SEND_A_PATH, p); + TLV_PUT_U64(sctx, BTRFS_SEND_A_FILE_OFFSET, offset); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_FILE_LEN, + min(key.offset + btrfs_file_extent_num_bytes(leaf, ei) - offset, + len)); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_LEN, + btrfs_file_extent_ram_bytes(leaf, ei)); + TLV_PUT_U64(sctx, BTRFS_SEND_A_UNENCODED_OFFSET, + offset - key.offset + btrfs_file_extent_offset(leaf, ei)); + ret = btrfs_encoded_io_compression_from_extent( + btrfs_file_extent_compression(leaf, ei)); + if (ret < 0) + goto out; + TLV_PUT_U32(sctx, BTRFS_SEND_A_COMPRESSION, ret); + TLV_PUT_U32(sctx, BTRFS_SEND_A_ENCRYPTION, 0); + + ret = put_data_header(sctx, disk_num_bytes); + if (ret < 0) + goto out; + + /* + * We want to do I/O directly into the send buffer, so get the next page + * boundary in the send buffer. This means that there may be a gap + * between the beginning of the command and the file data. + */ + data_offset = ALIGN(sctx->send_size, PAGE_SIZE); + if (data_offset > sctx->send_max_size || + sctx->send_max_size - data_offset < disk_num_bytes) { + ret = -EOVERFLOW; + goto out; + } + + /* + * Note that send_buf is a mapping of send_buf_pages, so this is really + * reading into send_buf. + */ + ret = btrfs_encoded_read_regular_fill_pages(inode, offset, disk_bytenr, + disk_num_bytes, + sctx->send_buf_pages + + (data_offset >> PAGE_SHIFT)); + if (ret) + goto out; + + hdr = (struct btrfs_cmd_header *)sctx->send_buf; + hdr->len = cpu_to_le32(sctx->send_size + disk_num_bytes - sizeof(*hdr)); + hdr->crc = 0; + crc = btrfs_crc32c(0, sctx->send_buf, sctx->send_size); + crc = btrfs_crc32c(crc, sctx->send_buf + data_offset, disk_num_bytes); + hdr->crc = cpu_to_le32(crc); + + ret = write_buf(sctx->send_filp, sctx->send_buf, sctx->send_size, + &sctx->send_off); + if (!ret) { + ret = write_buf(sctx->send_filp, sctx->send_buf + data_offset, + disk_num_bytes, &sctx->send_off); + } + sctx->send_size = 0; + sctx->put_data = false; + +tlv_put_failure: +out: + fs_path_free(p); + iput(inode); + return ret; +} + +static int send_extent_data(struct send_ctx *sctx, struct btrfs_path *path, + const u64 offset, const u64 len) +{ + struct extent_buffer *leaf = path->nodes[0]; + struct btrfs_file_extent_item *ei; u64 read_size = max_send_read_size(sctx); u64 sent = 0; if (sctx->flags & BTRFS_SEND_FLAG_NO_FILE_DATA) return send_update_extent(sctx, offset, len); + ei = btrfs_item_ptr(leaf, path->slots[0], + struct btrfs_file_extent_item); + if ((sctx->flags & BTRFS_SEND_FLAG_COMPRESSED) && + btrfs_file_extent_compression(leaf, ei) != BTRFS_COMPRESS_NONE) { + bool is_inline = (btrfs_file_extent_type(leaf, ei) == + BTRFS_FILE_EXTENT_INLINE); + + /* + * Send the compressed extent unless the compressed data is + * larger than the decompressed data. This can happen if we're + * not sending the entire extent, either because it has been + * partially overwritten/truncated or because this is a part of + * the extent that we couldn't clone in clone_range(). + */ + if (is_inline && + btrfs_file_extent_inline_item_len(leaf, + path->slots[0]) <= len) { + return send_encoded_inline_extent(sctx, path, offset, + len); + } else if (!is_inline && + btrfs_file_extent_disk_num_bytes(leaf, ei) <= len) { + return send_encoded_extent(sctx, path, offset, len); + } + } + while (sent < len) { u64 size = min(len - sent, read_size); int ret; @@ -5285,12 +5485,9 @@ static int send_capabilities(struct send_ctx *sctx) return ret; } -static int clone_range(struct send_ctx *sctx, - struct clone_root *clone_root, - const u64 disk_byte, - u64 data_offset, - u64 offset, - u64 len) +static int clone_range(struct send_ctx *sctx, struct btrfs_path *dst_path, + struct clone_root *clone_root, const u64 disk_byte, + u64 data_offset, u64 offset, u64 len) { struct btrfs_path *path; struct btrfs_key key; @@ -5314,7 +5511,7 @@ static int clone_range(struct send_ctx *sctx, */ if (clone_root->offset == 0 && len == sctx->send_root->fs_info->sectorsize) - return send_extent_data(sctx, offset, len); + return send_extent_data(sctx, dst_path, offset, len); path = alloc_path_for_send(); if (!path) @@ -5411,7 +5608,8 @@ static int clone_range(struct send_ctx *sctx, if (hole_len > len) hole_len = len; - ret = send_extent_data(sctx, offset, hole_len); + ret = send_extent_data(sctx, dst_path, offset, + hole_len); if (ret < 0) goto out; @@ -5484,14 +5682,16 @@ static int clone_range(struct send_ctx *sctx, if (ret < 0) goto out; } - ret = send_extent_data(sctx, offset + slen, + ret = send_extent_data(sctx, dst_path, + offset + slen, clone_len - slen); } else { ret = send_clone(sctx, offset, clone_len, clone_root); } } else { - ret = send_extent_data(sctx, offset, clone_len); + ret = send_extent_data(sctx, dst_path, offset, + clone_len); } if (ret < 0) @@ -5523,7 +5723,7 @@ static int clone_range(struct send_ctx *sctx, } if (len > 0) - ret = send_extent_data(sctx, offset, len); + ret = send_extent_data(sctx, dst_path, offset, len); else ret = 0; out: @@ -5554,10 +5754,10 @@ static int send_write_or_clone(struct send_ctx *sctx, struct btrfs_file_extent_item); disk_byte = btrfs_file_extent_disk_bytenr(path->nodes[0], ei); data_offset = btrfs_file_extent_offset(path->nodes[0], ei); - ret = clone_range(sctx, clone_root, disk_byte, data_offset, - offset, end - offset); + ret = clone_range(sctx, path, clone_root, disk_byte, + data_offset, offset, end - offset); } else { - ret = send_extent_data(sctx, offset, end - offset); + ret = send_extent_data(sctx, path, offset, end - offset); } sctx->cur_inode_next_write_offset = end; return ret; From patchwork Wed Nov 17 20:19:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Omar Sandoval X-Patchwork-Id: 12625359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64671C433FE for ; Wed, 17 Nov 2021 20:20:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CEE161B43 for ; Wed, 17 Nov 2021 20:20:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240701AbhKQUX3 (ORCPT ); Wed, 17 Nov 2021 15:23:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53118 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240684AbhKQUXQ (ORCPT ); Wed, 17 Nov 2021 15:23:16 -0500 Received: from mail-qt1-x82d.google.com (mail-qt1-x82d.google.com [IPv6:2607:f8b0:4864:20::82d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CF6B0C061766 for ; Wed, 17 Nov 2021 12:20:17 -0800 (PST) Received: by mail-qt1-x82d.google.com with SMTP id t34so3862665qtc.7 for ; Wed, 17 Nov 2021 12:20:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osandov-com.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YZiABG78Z6NCxyx4F8b5qP9JCofCOP7XTYPRaUs8yaA=; b=3cjQ5gzyaW+s+0LgAdtOudktUM8NK65ifyegjteuhtr5b51ai9LwCjQUpyMfMIMbaS 1lRRKI0G4p3SL//pG+d/zqQm1bCZ5S2ohC0RXNP3OzXhuC02bcbs6jpuOnLnzGX+ZOoB +c0e2c4Y76mK3AHQkr9IMD5PF8cSVtuA595mwsEw8ZN4gqW0n9CSULoSEkraFC/jMoy+ 2z9x7Qz9r+dG5zRYhrP3mtlMAD171jqlPZyLmwG1/sPOGrnVMsAreIBRxDVlsTqwsuQn 0SThhxkh4aFc3G+J/n4D9YxjGDjUMk7qsF3cABVVO1WBNXxefPSMRYlZXtS58IBG1NQO fF6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YZiABG78Z6NCxyx4F8b5qP9JCofCOP7XTYPRaUs8yaA=; b=h4xMSV3ftT3ur0Dq2JEAXAFs5ucP1Sb4veHdTAsFTct5UxX3Y6FROQX57dMSShZeng QP/gAaJ/nQ1m88i0iX1fQj0AJWd0kw8HpUgtkaSz1wLJi+wAb8AJ6pZ5mh3RMLgS7Din nG+g5ogNeoDwgel0Kfam/X/XhDcGpBzWLdCqtvWVIjTj5c/TIHINK5P0K3d3Ns1gWZsZ g4JznjLMFMZQxiML1Xi+D1ULLn/K3sRh6KaUBk9/gTY7H/xj40tJZSdQDj3SEYXhfHCz JwEqEE8hheWZaAbB0Jttt/ZrTxK/Tbh8XHMPzlNhkrLoIDudXCeBb9c8h+BUOkdtpZfO xKQA== X-Gm-Message-State: AOAM532Jr/4rFt6Wg8qXnuPfdUcgsUGFR29vNOwj6Y46jH4u9IBVTuwt /pCi8uncrC4BidrOkH+Fhf8jWeRPGWXWMw== X-Google-Smtp-Source: ABdhPJx2+eo8lp+abaNDbuBlmvNF3ln7wPVVjMStOG+GFFaEXo/eBO92TCDwR/HGPdIudX9lzrQiJQ== X-Received: by 2002:a05:622a:1888:: with SMTP id v8mr19662221qtc.206.1637180416826; Wed, 17 Nov 2021 12:20:16 -0800 (PST) Received: from relinquished.tfbnw.net ([2620:10d:c091:480::1:153e]) by smtp.gmail.com with ESMTPSA id bj1sm438074qkb.75.2021.11.17.12.20.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Nov 2021 12:20:16 -0800 (PST) From: Omar Sandoval To: linux-btrfs@vger.kernel.org Cc: kernel-team@fb.com Subject: [PATCH v12 17/17] btrfs: send: enable support for stream v2 and compressed writes Date: Wed, 17 Nov 2021 12:19:27 -0800 Message-Id: X-Mailer: git-send-email 2.34.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org From: Omar Sandoval Now that the new support is implemented, allow the ioctl to accept the v2 and the compressed flag, and update the version in sysfs. Signed-off-by: Omar Sandoval --- fs/btrfs/send.c | 7 +++++-- fs/btrfs/send.h | 2 +- include/uapi/linux/btrfs.h | 3 ++- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index 7a6a23a63950..e199b85e9c2c 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -685,8 +685,7 @@ static int send_header(struct send_ctx *sctx) struct btrfs_stream_header hdr; strcpy(hdr.magic, BTRFS_SEND_STREAM_MAGIC); - hdr.version = cpu_to_le32(BTRFS_SEND_STREAM_VERSION); - + hdr.version = cpu_to_le32(sctx->proto); return write_buf(sctx->send_filp, &hdr, sizeof(hdr), &sctx->send_off); } @@ -7496,6 +7495,10 @@ long btrfs_ioctl_send(struct file *mnt_file, struct btrfs_ioctl_send_args *arg) } else { sctx->proto = 1; } + if ((arg->flags & BTRFS_SEND_FLAG_COMPRESSED) && sctx->proto < 2) { + ret = -EINVAL; + goto out; + } sctx->send_filp = fget(arg->send_fd); if (!sctx->send_filp) { diff --git a/fs/btrfs/send.h b/fs/btrfs/send.h index 50c2284f08af..cf15d4078ac6 100644 --- a/fs/btrfs/send.h +++ b/fs/btrfs/send.h @@ -10,7 +10,7 @@ #include "ctree.h" #define BTRFS_SEND_STREAM_MAGIC "btrfs-stream" -#define BTRFS_SEND_STREAM_VERSION 1 +#define BTRFS_SEND_STREAM_VERSION 2 /* * In send stream v1, no command is larger than 64k. In send stream v2, no limit diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 9d5fbe8c36c4..2533da0a0de1 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -787,7 +787,8 @@ struct btrfs_ioctl_received_subvol_args { (BTRFS_SEND_FLAG_NO_FILE_DATA | \ BTRFS_SEND_FLAG_OMIT_STREAM_HEADER | \ BTRFS_SEND_FLAG_OMIT_END_CMD | \ - BTRFS_SEND_FLAG_VERSION) + BTRFS_SEND_FLAG_VERSION | \ + BTRFS_SEND_FLAG_COMPRESSED) struct btrfs_ioctl_send_args { __s64 send_fd; /* in */