From patchwork Sat Mar 2 07:41:59 2024
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13579470
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
 "Darrick J . Wong", Luis Chamberlain, John Garry,
 linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 2/8] fs: Reserve inode flag FS_ATOMICWRITES_FL for atomic writes
Date: Sat, 2 Mar 2024 13:11:59 +0530
Message-ID: <4c687c1c5322b4eaf0bb173f0b5d58b38fdaa847.1709361537.git.ritesh.list@gmail.com>
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>

Reserve the inode flag FS_ATOMICWRITES_FL and add fileattr support for
translating between this flag and the FS_XFLAG_ATOMICWRITES xflag, as
needed for ext4 and xfs.

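For illustration only, the flag/xflag translation described above can be
sketched in userspace C. This is not kernel code: the demo_* names are
hypothetical, DEMO_FS_ATOMICWRITES_FL uses the value reserved by this
patch, and DEMO_FS_XFLAG_ATOMICWRITES is a placeholder value because the
xflag itself is defined outside this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Value reserved by this patch in include/uapi/linux/fs.h. */
#define DEMO_FS_ATOMICWRITES_FL		0x01000000u
/* Placeholder (assumption): the real xflag value is defined elsewhere. */
#define DEMO_FS_XFLAG_ATOMICWRITES	0x00020000u

/* Hypothetical stand-in for the two flag words kept in struct fileattr. */
struct demo_fileattr {
	uint32_t flags;		/* FS_*_FL style flags */
	uint32_t fsx_xflags;	/* FS_XFLAG_* style flags */
};

/* Build the FS_*_FL view from xflags (atomic-writes bit only). */
void demo_fill_xflags(struct demo_fileattr *fa, uint32_t xflags)
{
	fa->fsx_xflags = xflags;
	fa->flags = 0;
	if (fa->fsx_xflags & DEMO_FS_XFLAG_ATOMICWRITES)
		fa->flags |= DEMO_FS_ATOMICWRITES_FL;
}

/* Build the FS_XFLAG_* view from flags (atomic-writes bit only). */
void demo_fill_flags(struct demo_fileattr *fa, uint32_t flags)
{
	fa->flags = flags;
	fa->fsx_xflags = 0;
	if (fa->flags & DEMO_FS_ATOMICWRITES_FL)
		fa->fsx_xflags |= DEMO_FS_XFLAG_ATOMICWRITES;
}

/* Round trip: setting the bit in one view must set it in the other. */
void demo_selftest(void)
{
	struct demo_fileattr fa;

	demo_fill_xflags(&fa, DEMO_FS_XFLAG_ATOMICWRITES);
	assert(fa.flags & DEMO_FS_ATOMICWRITES_FL);
	demo_fill_flags(&fa, fa.flags);
	assert(fa.fsx_xflags & DEMO_FS_XFLAG_ATOMICWRITES);
}
```

Keeping both directions in sync is what lets a filesystem accept the bit
through either the flags or the xflags interface.
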
Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ioctl.c               | 4 ++++
 include/linux/fileattr.h | 4 ++--
 include/uapi/linux/fs.h  | 1 +
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 76cf22ac97d7..e0f7fae4777e 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -481,6 +481,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
 		fa->flags |= FS_DAX_FL;
 	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
 		fa->flags |= FS_PROJINHERIT_FL;
+	if (fa->fsx_xflags & FS_XFLAG_ATOMICWRITES)
+		fa->flags |= FS_ATOMICWRITES_FL;
 }
 EXPORT_SYMBOL(fileattr_fill_xflags);

@@ -511,6 +513,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
 		fa->fsx_xflags |= FS_XFLAG_DAX;
 	if (fa->flags & FS_PROJINHERIT_FL)
 		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
+	if (fa->flags & FS_ATOMICWRITES_FL)
+		fa->fsx_xflags |= FS_XFLAG_ATOMICWRITES;
 }
 EXPORT_SYMBOL(fileattr_fill_flags);

diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 47c05a9851d0..ae9329afa46b 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -7,12 +7,12 @@
 #define FS_COMMON_FL \
 	(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
 	 FS_NODUMP_FL |	FS_NOATIME_FL | FS_DAX_FL | \
-	 FS_PROJINHERIT_FL)
+	 FS_PROJINHERIT_FL | FS_ATOMICWRITES_FL)

 #define FS_XFLAG_COMMON \
 	(FS_XFLAG_SYNC | FS_XFLAG_IMMUTABLE | FS_XFLAG_APPEND | \
 	 FS_XFLAG_NODUMP | FS_XFLAG_NOATIME | FS_XFLAG_DAX | \
-	 FS_XFLAG_PROJINHERIT)
+	 FS_XFLAG_PROJINHERIT | FS_XFLAG_ATOMICWRITES)

 /*
  * Merged interface for miscellaneous file attributes.  'flags' originates from
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b5b4e1db9576..17f52530f9c8 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -264,6 +264,7 @@ struct fsxattr {
 #define FS_EA_INODE_FL			0x00200000 /* Inode used for large EA */
 #define FS_EOFBLOCKS_FL			0x00400000 /* Reserved for ext4 */
 #define FS_NOCOW_FL			0x00800000 /* Do not cow file */
+#define FS_ATOMICWRITES_FL		0x01000000 /* Inode supports atomic writes */
 #define FS_DAX_FL			0x02000000 /* Inode is DAX */
 #define FS_INLINE_DATA_FL		0x10000000 /* Reserved for ext4 */
 #define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */

From patchwork Sat Mar 2 07:42:00 2024
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13579471
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
 "Darrick J . Wong", Luis Chamberlain, John Garry,
 linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 3/8] iomap: Add atomic write support for direct-io
Date: Sat, 2 Mar 2024 13:12:00 +0530
Message-ID: <6a09654d152d3d1a07636174f5abcfce9948c20c.1709361537.git.ritesh.list@gmail.com>
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>

This adds direct-io atomic write support to iomap. It -
1. Adds the IOMAP_ATOMIC flag for the iomap iter.
2. Sets REQ_ATOMIC in the bio opflags.
3. Adds the necessary checks in the iomap_dio code to ensure that a
   single bio is submitted for an atomic write request (only ubuf type
   iocbs are supported). Otherwise an error is returned: -EINVAL from
   the bio path and for non-ubuf iters, -EIO when the mapping fails the
   atomic checks.
4. Adds a common helper routine, iomap_dio_check_atomic(). It verifies
   the mapped length and the start/end physical offsets against the hw
   device constraints for atomic writes.

This patch is based on a patch from John Garry which adds such support
of DIO atomic writes to iomap.

Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/iomap/direct-io.c  | 75 +++++++++++++++++++++++++++++++++++++++++--
 fs/iomap/trace.h      |  3 +-
 include/linux/iomap.h |  1 +
 3 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index bcd3f8cf5ea4..b4548acb74e7 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -256,7 +256,7 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
  * clearing the WRITE_THROUGH flag in the dio request.
  */
 static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
-		const struct iomap *iomap, bool use_fua)
+		const struct iomap *iomap, bool use_fua, bool atomic_write)
 {
 	blk_opf_t opflags = REQ_SYNC | REQ_IDLE;

@@ -269,6 +269,9 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
 	else
 		dio->flags &= ~IOMAP_DIO_WRITE_THROUGH;

+	if (atomic_write)
+		opflags |= REQ_ATOMIC;
+
 	return opflags;
 }

@@ -279,11 +282,12 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	struct inode *inode = iter->inode;
 	unsigned int fs_block_size = i_blocksize(inode), pad;
 	loff_t length = iomap_length(iter);
+	const size_t orig_len = iter->len;
 	loff_t pos = iter->pos;
 	blk_opf_t bio_opf;
 	struct bio *bio;
 	bool need_zeroout = false;
-	bool use_fua = false;
+	bool use_fua = false, atomic_write = iter->flags & IOMAP_ATOMIC;
 	int nr_pages, ret = 0;
 	size_t copied = 0;
 	size_t orig_count;
@@ -356,6 +360,11 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	if (need_zeroout) {
 		/* zero out from the start of the block to the write offset */
 		pad = pos & (fs_block_size - 1);
+		if (unlikely(pad && atomic_write)) {
+			WARN_ON_ONCE("pos not atomic write aligned\n");
+			ret = -EINVAL;
+			goto out;
+		}
 		if (pad)
 			iomap_dio_zero(iter, dio, pos - pad, pad);
 	}
@@ -365,7 +374,7 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	 * can set up the page vector appropriately for a ZONE_APPEND
 	 * operation.
 	 */
-	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua);
+	bio_opf = iomap_dio_bio_opflags(dio, iomap, use_fua, atomic_write);

 	nr_pages = bio_iov_vecs_to_alloc(dio->submit.iter, BIO_MAX_VECS);
 	do {
@@ -397,6 +406,14 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 		}

 		n = bio->bi_iter.bi_size;
+
+		/* This bio should have covered the complete length */
+		if (unlikely(atomic_write && n != orig_len)) {
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+			bio_put(bio);
+			goto out;
+		}
 		if (dio->flags & IOMAP_DIO_WRITE) {
 			task_io_account_write(n);
 		} else {
@@ -429,6 +446,8 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
 	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
 		/* zero out from the end of the write to the end of the block */
 		pad = pos & (fs_block_size - 1);
+		/* This should never happen */
+		WARN_ON_ONCE(unlikely(pad && atomic_write));
 		if (pad)
 			iomap_dio_zero(iter, dio, pos, fs_block_size - pad);
 	}
@@ -516,6 +535,44 @@ static loff_t iomap_dio_iter(const struct iomap_iter *iter,
 	}
 }

+/*
+ * iomap_dio_check_atomic:	DIO Atomic checks before calling bio submission.
+ * @iter:			iomap iterator
+ * This function is called after filesystem block mapping and before bio
+ * formation/submission. This is the right place to verify hw device/block
+ * layer constraints to be followed for doing atomic writes. Hence do those
+ * common checks here.
+ */
+static bool iomap_dio_check_atomic(struct iomap_iter *iter)
+{
+	struct block_device *bdev = iter->iomap.bdev;
+	unsigned long long map_len = iomap_length(iter);
+	unsigned long long start = iomap_sector(&iter->iomap, iter->pos)
+						<< SECTOR_SHIFT;
+	unsigned long long end = start + map_len - 1;
+	unsigned int awu_min =
+			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
+	unsigned int awu_max =
+			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
+	unsigned long boundary =
+			queue_atomic_write_boundary_bytes(bdev->bd_queue);
+	unsigned long mask = ~(boundary - 1);
+
+
+	/* map_len should be same as user specified iter->len */
+	if (map_len < iter->len)
+		return false;
+	/* start should be aligned to block device min atomic unit alignment */
+	if (!IS_ALIGNED(start, awu_min))
+		return false;
+	/* If top bits doesn't match, means atomic unit boundary is crossed */
+	if (boundary && ((start | mask) != (end | mask)))
+		return false;
+
+	return true;
+}
+
+
 /*
  * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO
  * is being issued as AIO or not.  This allows us to optimise pure data writes
@@ -554,12 +611,16 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	struct blk_plug plug;
 	struct iomap_dio *dio;
 	loff_t ret = 0;
+	bool atomic_write = iocb->ki_flags & IOCB_ATOMIC;

 	trace_iomap_dio_rw_begin(iocb, iter, dio_flags, done_before);

 	if (!iomi.len)
 		return NULL;

+	if (atomic_write && !iter_is_ubuf(iter))
+		return ERR_PTR(-EINVAL);
+
 	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
 	if (!dio)
 		return ERR_PTR(-ENOMEM);
@@ -605,6 +666,9 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (iocb->ki_flags & IOCB_DIO_CALLER_COMP)
 		dio->flags |= IOMAP_DIO_CALLER_COMP;

+	if (atomic_write)
+		iomi.flags |= IOMAP_ATOMIC;
+
 	if (dio_flags & IOMAP_DIO_OVERWRITE_ONLY) {
 		ret = -EAGAIN;
 		if (iomi.pos >= dio->i_size ||
@@ -656,6 +720,11 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,

 	blk_start_plug(&plug);
 	while ((ret = iomap_iter(&iomi, ops)) > 0) {
+		if (atomic_write && !iomap_dio_check_atomic(&iomi)) {
+			ret = -EIO;
+			break;
+		}
+
 		iomi.processed = iomap_dio_iter(&iomi, dio);

 		/*
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index c16fd55f5595..c95576420bca 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -98,7 +98,8 @@ DEFINE_RANGE_EVENT(iomap_dio_rw_queued);
 	{ IOMAP_REPORT,		"REPORT" }, \
 	{ IOMAP_FAULT,		"FAULT" }, \
 	{ IOMAP_DIRECT,		"DIRECT" }, \
-	{ IOMAP_NOWAIT,		"NOWAIT" }
+	{ IOMAP_NOWAIT,		"NOWAIT" }, \
+	{ IOMAP_ATOMIC,		"ATOMIC" }

 #define IOMAP_F_FLAGS_STRINGS \
 	{ IOMAP_F_NEW,		"NEW" }, \
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 96dd0acbba44..9eac704a0d6f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -178,6 +178,7 @@ struct iomap_folio_ops {
 #else
 #define IOMAP_DAX		0
 #endif /* CONFIG_FS_DAX */
+#define IOMAP_ATOMIC		(1 << 9)

 struct iomap_ops {
 	/*

From patchwork Sat Mar 2 07:42:01 2024
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13579472
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
 "Darrick J . Wong", Luis Chamberlain, John Garry,
 linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 4/8] ext4: Add statx and other atomic write helper routines
Date: Sat, 2 Mar 2024 13:12:01 +0530
Message-ID: <9def15d6ffb88f7352713c65292513fab532112a.1709361537.git.ritesh.list@gmail.com>
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>

This patch adds statx (STATX_WRITE_ATOMIC) support to ext4_getattr() so
that userspace can query atomic_write_unit_min (awu_min), awu_max and
other attributes for atomic writes.
It also adds a new runtime mount flag (EXT4_MF_ATOMIC_WRITE_FSAWU),
which records whether ext4 supports atomic writes using fsawu
(filesystem atomic write unit).

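As a rough userspace sketch of how the advertised fsawu_min/fsawu_max
could be derived from the filesystem geometry and the device limits,
following the logic of the ext4_atomic_write_fsawu() helper in this
patch. The struct demo_limits type and the demo_* names are hypothetical;
in the kernel the inputs come from the superblock and the request queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs, all sizes in bytes and assumed powers of two. */
struct demo_limits {
	unsigned int blocksize;		/* filesystem block size */
	unsigned int clustersize;	/* bigalloc cluster size, else blocksize */
	unsigned int awu_min;		/* device atomic write unit min */
	unsigned int awu_max;		/* device atomic write unit max */
	bool bigalloc;			/* bigalloc feature enabled */
};

static inline unsigned int demo_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static inline unsigned int demo_max(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Compute the filesystem atomic write unit range; 0/0 means unsupported. */
void demo_fsawu(const struct demo_limits *l,
		unsigned int *fsawu_min, unsigned int *fsawu_max)
{
	/* The device range must overlap [blocksize, clustersize]. */
	if (l->awu_min > l->clustersize || l->awu_max < l->blocksize) {
		*fsawu_min = 0;
		*fsawu_max = 0;
		return;
	}

	/* Without bigalloc only a single block can be written atomically. */
	if (!l->bigalloc) {
		*fsawu_min = l->blocksize;
		*fsawu_max = l->blocksize;
		return;
	}

	/* With bigalloc, clamp the device range to block..cluster. */
	*fsawu_min = demo_max(l->blocksize, l->awu_min);
	*fsawu_max = demo_min(l->clustersize, l->awu_max);

	/* Mirrors the WARN_ON_ONCE in the patch: cluster must align to min. */
	assert(l->clustersize % *fsawu_min == 0);
}
```

For example, with 4096-byte blocks, 65536-byte clusters and a device
range of 512..16384 bytes, this sketch yields a range of 4096..16384.
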
Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ext4/ext4.h  | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fs/ext4/inode.c | 16 +++++++++++++++
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 023571f8dd1b..1d2bce26e616 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1817,7 +1817,8 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
  */
 enum {
 	EXT4_MF_MNTDIR_SAMPLED,
-	EXT4_MF_FC_INELIGIBLE	/* Fast commit ineligible */
+	EXT4_MF_FC_INELIGIBLE,	/* Fast commit ineligible */
+	EXT4_MF_ATOMIC_WRITE_FSAWU /* Atomic write via FSAWU */
 };

 static inline void ext4_set_mount_flag(struct super_block *sb, int bit)
@@ -3839,6 +3840,56 @@ static inline int ext4_buffer_uptodate(struct buffer_head *bh)
 	return buffer_uptodate(bh);
 }

+#define ext4_can_atomic_write_fsawu(sb)				\
+	ext4_test_mount_flag(sb, EXT4_MF_ATOMIC_WRITE_FSAWU)
+
+/**
+ * ext4_atomic_write_fsawu	Returns EXT4 filesystem atomic write unit.
+ * @sb				super_block
+ * This returns the filesystem min|max atomic write units.
+ * For !bigalloc it is filesystem blocksize (fsawu_min)
+ * For bigalloc it should be either blocksize or multiple of blocksize
+ * (fsawu_min)
+ */
+static inline void ext4_atomic_write_fsawu(struct super_block *sb,
+					   unsigned int *fsawu_min,
+					   unsigned int *fsawu_max)
+{
+	u8 blkbits = sb->s_blocksize_bits;
+	unsigned int blocksize = 1U << blkbits;
+	unsigned int clustersize = blocksize;
+	struct block_device *bdev = sb->s_bdev;
+	unsigned int awu_min =
+			queue_atomic_write_unit_min_bytes(bdev->bd_queue);
+	unsigned int awu_max =
+			queue_atomic_write_unit_max_bytes(bdev->bd_queue);
+
+	if (ext4_has_feature_bigalloc(sb))
+		clustersize = 1U << (EXT4_SB(sb)->s_cluster_bits + blkbits);
+
+	/* fs min|max should respect awu_[min|max] units */
+	if (unlikely(awu_min > clustersize || awu_max < blocksize))
+		goto not_supported;
+
+	/* in case of !bigalloc fsawu_[min|max] should be same as blocksize */
+	if (!ext4_has_feature_bigalloc(sb)) {
+		*fsawu_min = blocksize;
+		*fsawu_max = blocksize;
+		return;
+	}
+
+	/* bigalloc can support write in blocksize units. So advertize it */
+	*fsawu_min = max(blocksize, awu_min);
+	*fsawu_max = min(clustersize, awu_max);
+
+	/* This should never happen, but let's keep a WARN_ON_ONCE */
+	WARN_ON_ONCE(!IS_ALIGNED(clustersize, *fsawu_min));
+	return;
+not_supported:
+	*fsawu_min = 0;
+	*fsawu_max = 0;
+}
+
 #endif	/* __KERNEL__ */

 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2ccf3b5e3a7c..ea009ca9085d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5536,6 +5536,22 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
 		}
 	}

+	if (request_mask & STATX_WRITE_ATOMIC) {
+		unsigned int fsawu_min = 0, fsawu_max = 0;
+
+		/*
+		 * Get fsawu_[min|max] value which we can advertise to userspace
+		 * in statx call, if we support atomic writes using
+		 * EXT4_MF_ATOMIC_WRITE_FSAWU.
+		 */
+		if (ext4_can_atomic_write_fsawu(inode->i_sb)) {
+			ext4_atomic_write_fsawu(inode->i_sb, &fsawu_min,
+						&fsawu_max);
+		}
+
+		generic_fill_statx_atomic_writes(stat, fsawu_min, fsawu_max);
+	}
+
 	flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
 	if (flags & EXT4_APPEND_FL)
 		stat->attributes |= STATX_ATTR_APPEND;

From patchwork Sat Mar 2 07:42:02 2024
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13579473
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
 "Darrick J . Wong", Luis Chamberlain, John Garry,
 linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 5/8] ext4: Adds direct-io atomic writes checks
Date: Sat, 2 Mar 2024 13:12:02 +0530
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>

This patch adds ext4 specific checks for supporting atomic writes using
fsawu (filesystem atomic write unit). We can enable this support with
either -
1. bigalloc on a 4k pagesize system or
2. bs < ps system with -b
3. filesystems with LBS (large block size) support (future)

Use the generic_atomic_write_valid() helper to check the alignment
restrictions.

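The kind of alignment restriction involved can be sketched in userspace
C as below. This is an assumption about what such a check enforces (a
power-of-two length within [unit_min, unit_max] and a position naturally
aligned to that length); it is not a copy of
generic_atomic_write_valid(), and demo_atomic_write_valid() and its
parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static inline bool demo_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Return true if a write of len bytes at file offset pos could be issued
 * as one atomic unit, given the advertised unit range in bytes.
 */
bool demo_atomic_write_valid(uint64_t pos, size_t len,
			     unsigned int unit_min, unsigned int unit_max)
{
	/* A range of 0/0 means atomic writes are not supported. */
	if (unit_min == 0 || unit_max == 0)
		return false;

	/* An inverted range would be a caller bug in this sketch. */
	assert(unit_min <= unit_max);

	/* The length must be a power of two inside the advertised range. */
	if (!demo_is_power_of_2(len))
		return false;
	if (len < unit_min || len > unit_max)
		return false;

	/* Natural alignment: pos must be a multiple of len. */
	if (pos & (len - 1))
		return false;

	return true;
}
```

With a range of 4096..16384 bytes, an 8192-byte write at offset 8192
passes, while the same write at offset 4096 is rejected because the
offset is not a multiple of the length.
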
Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ext4/file.c | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 54d6ff22585c..8e309a9a0bd6 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -400,6 +400,21 @@ static const struct iomap_dio_ops ext4_dio_write_ops = {
 	.end_io = ext4_dio_write_end_io,
 };
 
+static bool ext4_dio_atomic_write_checks(struct kiocb *iocb,
+					 struct iov_iter *from)
+{
+	struct super_block *sb = file_inode(iocb->ki_filp)->i_sb;
+	loff_t pos = iocb->ki_pos;
+	unsigned int fsawu_min, fsawu_max;
+
+	if (!ext4_can_atomic_write_fsawu(sb))
+		return false;
+
+	ext4_atomic_write_fsawu(sb, &fsawu_min, &fsawu_max);
+
+	return generic_atomic_write_valid(pos, from, fsawu_min, fsawu_max);
+}
+
 /*
  * The intention here is to start with shared lock acquired then see if any
  * condition requires an exclusive inode lock. If yes, then we restart the
@@ -427,13 +442,19 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
 	loff_t offset;
 	size_t count;
 	ssize_t ret;
-	bool overwrite, unaligned_io;
+	bool overwrite, unaligned_io, atomic_write;
 
 restart:
 	ret = ext4_generic_write_checks(iocb, from);
 	if (ret <= 0)
 		goto out;
 
+	atomic_write = iocb->ki_flags & IOCB_ATOMIC;
+	if (atomic_write && !ext4_dio_atomic_write_checks(iocb, from)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	offset = iocb->ki_pos;
 	count = ret;
 
@@ -576,8 +597,15 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		iomap_ops = &ext4_iomap_overwrite_ops;
 	ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
 			   dio_flags, NULL, 0);
-	if (ret == -ENOTBLK)
-		ret = 0;
+
+	/* Fallback to buffered-io for non-atomic DIO */
+	if (ret == -ENOTBLK) {
+		if (iocb->ki_flags & IOCB_ATOMIC)
+			ret = -EIO;
+		else
+			ret = 0;
+	}
+
 	if (extend) {
 		/*
 		 * We always perform extending DIO write synchronously so by

From patchwork Sat Mar 2
07:42:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13579474 Received: from mail-oa1-f44.google.com (mail-oa1-f44.google.com [209.85.160.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2F0311B263; Sat, 2 Mar 2024 07:42:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365370; cv=none; b=sbxbR9kLf+qmrDozH0yXvmDvxu3KlX9SAdOm9PKz8WiJ11cR8YkAdfnh1K6AOEbbKe88EIbOq8pwobIO8Jq/Z297O56wwjWIx/GqSjUvp1ltSv0QaQiXc0zuLS6bweBklGMlljcw9yBY1+u4ZQfkp5UmD4MU15uZxsvCQYWdvkM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365370; c=relaxed/simple; bh=UEhxsrf39s3LpCg+dRj/c9p+DSGoMMdYusEHuKqzBcY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ejN+oBd11xPyrT9imu8W5TT/EOtaYQL8YJbOiHxg9eFnRxnec7GqT/guNKX5oViyqXwrgV9SdqKmBlVdKRinoCEKMn1SmI9SFNUGfiEUJYECOUiZQlrDkPNqLCgjAOcWsN9Oc3dY0bwNvd+AaKsMO+Sml7C4RWAykz7s95f7vHA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=lbMtpa69; arc=none smtp.client-ip=209.85.160.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lbMtpa69" Received: by mail-oa1-f44.google.com with SMTP id 586e51a60fabf-21f70f72fb5so1934679fac.1; Fri, 01 Mar 2024 23:42:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20230601; t=1709365368; x=1709970168; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=RPDq+e7qZZNUe2lBOgjexuQEg62+2SvW8S4Hg2hx7P8=; b=lbMtpa69SB34xQc7612j38A4KBcghi3XwxfJuYqEoBio6bKBzhX8/7X186gvXVsmj0 oxdkpNr/6i9/I0qRdBso528lWFWv3V77X5BwnE6ZJwuFJcPuzk/ShQkDjqTAlg5H+/xo GgASx3zz+DYzztxJZuf4237ms96FPYSL33ZMlyu+hN5KwkgweeT2bl12F5COJM7/D8qD uu0zKCc0bqjuVHo78w72bKrurLgFWT8br5Ih/oIfAryM1vJ0tJdxrP4HIpMj2AJKT/+c DSjb+robR+Opc9AwbPJC0lVYsR+OqaMaSLGg+wZ8qvZCLsxOtHpHa7laf6amk11VuThS mCGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709365368; x=1709970168; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RPDq+e7qZZNUe2lBOgjexuQEg62+2SvW8S4Hg2hx7P8=; b=mha9MbGDZe/9RMCqsa2hOruBEZKz78UBTO2fIhXv65uXN9L7IO68UVGq0Mi7ZS9bf0 NwycfR/RvkI3Q43SX1YXXrxrBMPkjKe3AYRBto0t0eBQJYBMfOBU/jGunBcC9FkQUEHe lEZ9Yx/VC2cOnp2UQ6bhSrNiGUjXIOF0wl7CwA2m20Sb+Ej2hVCsSfNMFFHrb3oIJNIR /EA3uYcmPJOvn54Dc3wX7KhuquPKx7LVMxpGiO9ZdpDjitt8iBtAn/WO7lfYIuar4Ywr 4IvZQSC3uHjJbeY82MruK9dvo2p5ZFmSaKRLmsn7d/bNsHk/KC/CX0GMrM2EySk2uYSz wMnw== X-Forwarded-Encrypted: i=1; AJvYcCUcWqLW1imGthmURhTzjLpVP/lcTSGvy+y4d29JijkP9XNO6wWNEfJdXtBkWzqa402R6EWtSovdQf7GUw2ZoplvAp+8G0xUB1GpD+q55vfgeQi1g52ErgVgGL2eUHj0fmPrEsEKQBMxkQ== X-Gm-Message-State: AOJu0Yz0AvGlNoJ02Irh15q1w4pXWUh1tvWpZs7ue7mLW2HsICpjbSAX APup1yNOlymPvHRARCSD1odgswx0wnQeDbxAuynoM/k0PIzyFi9c+qFB089P X-Google-Smtp-Source: AGHT+IE/m6H6F0kQKihprUz16XWVJsMaLAmr+GMFaQJzjx0uRY+iUh55PiV4LGG2omEAlU5HzeKCqA== X-Received: by 2002:a05:6870:e305:b0:21e:e583:25e1 with SMTP id z5-20020a056870e30500b0021ee58325e1mr4359058oad.32.1709365367792; Fri, 01 Mar 2024 23:42:47 -0800 (PST) Received: from dw-tp.. 
        ([49.205.218.89])
        by smtp.gmail.com with ESMTPSA id x11-20020aa784cb000000b006e45c5d7720sm4138206pfn.93.2024.03.01.23.42.44
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 01 Mar 2024 23:42:47 -0800 (PST)
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
        "Darrick J . Wong", Luis Chamberlain, John Garry,
        linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 6/8] ext4: Add an inode flag for atomic writes
Date: Sat, 2 Mar 2024 13:12:03 +0530
Message-ID: <33e9dc5cd81f85d86e3b2eb95df4f7831e4f96a6.1709361537.git.ritesh.list@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

This patch adds an inode flag for atomic writes to ext4
(EXT4_ATOMICWRITES_FL, which uses the FS_ATOMICWRITES_FL flag).
It also adds support for setting this flag via ioctl.

Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ext4/ext4.h  |  6 ++++++
 fs/ext4/ioctl.c | 11 +++++++++++
 2 files changed, 17 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1d2bce26e616..aa7fff2d6f96 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -495,8 +495,12 @@ struct flex_groups {
 #define EXT4_EA_INODE_FL	        0x00200000 /* Inode used for large EA */
 /* 0x00400000 was formerly EXT4_EOFBLOCKS_FL */
 
+#define EXT4_ATOMICWRITES_FL		FS_ATOMICWRITES_FL /* Inode supports atomic writes */
 #define EXT4_DAX_FL			0x02000000 /* Inode is DAX */
 
+/* 0x04000000 unused for now */
+/* 0x08000000 unused for now */
+
 #define EXT4_INLINE_DATA_FL		0x10000000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL		0x40000000 /* Casefolded directory */
@@ -519,6 +523,7 @@ struct flex_groups {
 					 0x00400000 /* EXT4_EOFBLOCKS_FL */ | \
 					 EXT4_DAX_FL | \
 					 EXT4_PROJINHERIT_FL | \
+					 EXT4_ATOMICWRITES_FL | \
 					 EXT4_CASEFOLD_FL)
 
 /* User visible flags */
@@ -593,6 +598,7 @@ enum {
 	EXT4_INODE_VERITY	= 20,	/* Verity protected inode */
 	EXT4_INODE_EA_INODE	= 21,	/* Inode used for large EA */
 /* 22 was formerly EXT4_INODE_EOFBLOCKS */
+	EXT4_INODE_ATOMIC_WRITE	= 24,	/* file does ATOMIC WRITE */
 	EXT4_INODE_DAX		= 25,	/* Inode is DAX */
 	EXT4_INODE_INLINE_DATA	= 28,	/* Data in inode. */
 	EXT4_INODE_PROJINHERIT	= 29,	/* Create with parents projid */
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 7160a71044c8..03d0b501cbc8 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -632,6 +632,17 @@ static int ext4_ioctl_setflags(struct inode *inode,
 		}
 	}
 
+	if (flags & EXT4_ATOMICWRITES_FL) {
+		if (!ext4_can_atomic_write_fsawu(sb))
+			return -EOPNOTSUPP;
+
+		/* TODO: Do we need locks to check i_reserved_data_blocks */
+		if (!S_ISREG(inode->i_mode) || ext4_has_inline_data(inode) ||
+		    READ_ONCE(ei->i_disksize) ||
+		    EXT4_I(inode)->i_reserved_data_blocks)
+			return -EOPNOTSUPP;
+	}
+
 	/*
 	 * Wait for all pending directio and then flush all the dirty pages
 	 * for this file.
The flush marks all the pages readonly, so any From patchwork Sat Mar 2 07:42:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13579475 Received: from mail-ot1-f41.google.com (mail-ot1-f41.google.com [209.85.210.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A44A720B3E; Sat, 2 Mar 2024 07:42:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365375; cv=none; b=hV9Lc78n3k9v2Y4emQBsX35WX7Ud1vQTotmGnLJb1IEjGnytAiv0LZJ4Ddu3ZCgVELG895iEhtVlcVIV/c0dvMfxjI3BYm7Fjgyq/DCYFAMLzXsKPlNPixp0LZlDb3TXVrVc9cgKPuWHqU8Bg/2/p6Xbnx/oXb6F3CLoLq4njxo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365375; c=relaxed/simple; bh=7K2GXZNUZuFdWwcE5Vtj/VA1EW5ZzCGgdr576kCe59M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=g9TDrapuThtMHgtX2eFVBlmyRFawWdxlc9l+AbcASJRsfuNECkKNEJ1AIHAf9U/5ODygJAyOoWhSStcTds09lBWmDog9qtHpiPepOMRuHp1OBQQqU1yV5OJyoocpydmgaFrpCMA1hxqxiI3ulgx0RAXhvn8iHFjU5HnLBuLZS0Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=HfRWpGeP; arc=none smtp.client-ip=209.85.210.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="HfRWpGeP" Received: by mail-ot1-f41.google.com with SMTP id 46e09a7af769-6e4b34f2455so1642234a34.2; Fri, 01 Mar 2024 
23:42:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709365371; x=1709970171; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+kzeTSfiJu806WMN/KPCzhuK/hpOJTvJcAEBv+CARb0=; b=HfRWpGePrGLULOH+SA/8rs6MD9/PfkwTDQvZ9oh2woxlUGTtPA0PGDzvOUw8S9Bvod BpltlcEyKJ2cuoLiVRuqdEJOLuYI+Vv8Vl44fwNfr2bNW66Sjc4lnknDEJjZcYNoHf+Q M1T3ZcpVaSiWetHZF0e+wJPcI1cEYE/P9dWzXNO9pDCPnGPicocO7wAzznz7c4rpTNFw 6yZDCyHWRnf5CZOX0O7miXb22LD36uxIAIJwWwdKYQAxlxccBwUnnSCA4/CJkBB6ZhpN qzahsrOP22KbnfhdxNcVIVuzIA4NcvGLuDwG77Xjy1e4uZx7ohzwBsGnO53vXHNHqIrV YUkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709365371; x=1709970171; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+kzeTSfiJu806WMN/KPCzhuK/hpOJTvJcAEBv+CARb0=; b=lirIAJq37hum70VvuZRoxvtwkHjQqmldIfCmu6jcnxYybM+/GhBQfjIc3Gd1afrb8u hjXbstInYJl7VRZICp0ALD32EtC28kvw/pmr8sbc2tP7n7aehYeQ+Fr8IrLVjSLqhz0Q pcQKRYmliFTEWfWOLa6b+kdaDBl9dBqZeYfAbMA0YnAZmYZY22cizAjqkRKOTGXtg0WM m69SlRFrvgbbACw2onAYqCWneR4TJWc8OKce6OrtYBRq+A+GmkRm02PWIZAloxUQOctE V6wXxTZUp+EQI5A3yzpdte2VKW4B9vFqdbXIzMJ7Fj5YFUC63bk7JzKoc+Z29t28QCEF Vshg== X-Forwarded-Encrypted: i=1; AJvYcCXkdyDHjQwC0n7MSgoqU0SUTh8djSULSBIJluWFJZZ/ThBQJMMRbpdSaJEtgLDB4UfInfVmYoaUgffPSe9cFFIVQNEot3R1LTWY9LJfqD52Ej2MnUZ8DmNMuc/uez2JnXoYqNrkC5w66g== X-Gm-Message-State: AOJu0YwseWLUZE/ekw/YMplvcLH9edJkypYbQYbd9CwKIA8OcxJI94Hc dap/+zZbQcCfpwEnNs2g4y21GQ9nbZoBwxy4y4iZYG1Zt4MP6EfxzGZ5pz2h X-Google-Smtp-Source: AGHT+IFva53f1HkSaVQ0Lv9O37kGnDFML6TGMRrRTSvsFAOCt1IzeQpMsbaFTgagj3y3o+g4gfwMMQ== X-Received: by 2002:a05:6808:ec9:b0:3c0:4b11:dc54 with SMTP id q9-20020a0568080ec900b003c04b11dc54mr4052409oiv.35.1709365371659; Fri, 01 Mar 2024 23:42:51 -0800 (PST) Received: from dw-tp.. 
        ([49.205.218.89])
        by smtp.gmail.com with ESMTPSA id x11-20020aa784cb000000b006e45c5d7720sm4138206pfn.93.2024.03.01.23.42.48
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 01 Mar 2024 23:42:51 -0800 (PST)
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
        "Darrick J . Wong", Luis Chamberlain, John Garry,
        linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 7/8] ext4: Enable FMODE_CAN_ATOMIC_WRITE in open for direct-io
Date: Sat, 2 Mar 2024 13:12:04 +0530
Message-ID: <703c48213ec033af5fd270c5338921db9898774c.1709361537.git.ritesh.list@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

For inodes that have the EXT4_INODE_ATOMIC_WRITE flag set, enable
FMODE_CAN_ATOMIC_WRITE in the ext4 file open method when the file is
opened with O_DIRECT.
Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ext4/file.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8e309a9a0bd6..800fd79e2738 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -913,6 +913,10 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
 			return ret;
 	}
 
+	if (ext4_test_inode_flag(inode, EXT4_INODE_ATOMIC_WRITE) &&
+	    (filp->f_flags & O_DIRECT))
+		filp->f_mode |= FMODE_CAN_ATOMIC_WRITE;
+
 	filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC |
 			FMODE_DIO_PARALLEL_WRITE;
 	return dquot_file_open(inode, filp);

From patchwork Sat Mar 2 07:42:05 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13579476
Received: from mail-oi1-f174.google.com (mail-oi1-f174.google.com [209.85.167.174])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (No client certificate requested)
        by smtp.subspace.kernel.org (Postfix) with ESMTPS id DA0D22233B;
        Sat, 2 Mar 2024 07:42:56 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.174
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365378; cv=none;
        b=jGE2ROM9VeyHwgD9g8PTQj4FBCe4BIoHgdQCAGJi7m+qySElU8dlLghkmpBJzGFAq1UHcAlpIoqxxZtLeGAb62Wl1b99IawSuYpLFxSaKyv6krRN9CPummptQgoFu3bNLogQH1XNQfcBVTPon9XLtQxrjwBqkqRUL3f/waPv3/Q=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709365378; c=relaxed/simple;
        bh=lN4TxJsiUeFt/fuNQPZHvw17F4IBxs8TjYTF/9YdosE=;
        h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
        b=gN0xmTOMjs3JXaDCVfx6vw8PIPmdyNbbb23+gSj3if6CpypFFJl+kCIDe1L+IKYBAcXx7UODrB+Yvp8uQtB+zuJi0rAuHCQdXrGxv7HPWAa3sR/ayxP9QjW323yC7IePdxwvhh6GDfJI0mPn92RyqocxWBxicK0O+FeYtU8WkQ8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none)
header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=kuGZblZm; arc=none smtp.client-ip=209.85.167.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kuGZblZm" Received: by mail-oi1-f174.google.com with SMTP id 5614622812f47-3bbbc6e51d0so2037384b6e.3; Fri, 01 Mar 2024 23:42:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1709365375; x=1709970175; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6Ng/bkdiZmf26t7FmcNycgQvNgpchjoGQ4iAQnfo8mo=; b=kuGZblZm9EvO2cmvh36J/3PIbHKBjPveYu7avwYOsDM0eGYOUb1HYGMnQm1JMLCfBc sFsyn/a8ogQ9UxVhMcaR228k09/rWfEfFJNw8Ia9ZHjxLDkOJifHKQGoA8wcLuO+AfOL ARJxBe85VRB+31WB4pAL0p3/cYtZIREuNwSQrludIkyCR5gTqdibDAjI/9fUqnPAukO8 sTLbwC3L4WZ9lqk8QGJwixRn4fukMqUOvYyPu/1qsJN7r1IBO30dJP/MqNAhayOuRMew qWvHeQbitJI7LMFUOlARhIQq7puOHWnk3MGV1JUK2jnrfOSeSAm5RF0in9x5Aq3DRTFs U5bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709365375; x=1709970175; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6Ng/bkdiZmf26t7FmcNycgQvNgpchjoGQ4iAQnfo8mo=; b=f604hjEWHFRMq+EhO6dLtMhe2faB/eYD3SAK3z1YR8MKQK6MGzgvqieIOz3ycyZpjV mMRItDP4Cwosn7DjDm1/6tLXp1/vIPDJ/LeqeJuoISZ+WU61DGsXGZzfr+rvfCfoQgWs ZPXP/TOwccxolfC9bWgZ5hxNqatTkDDTtsjzvZNgpwccjN70+Nxa+C7pZ3mAAvbH5ESv pOH8CQRfInuWwnJy+cipxnmuTNDAXoSodYRnPG5E0fjLIYtd4ZtqpR7gCr+vIAYeqIvg SlRQ/QFHbd2d28qC391OX0nzYnfgFAG1Z3tcthERm4dP64Lxh4th9OmPgtQe1JA2v+h2 XW1w== 
X-Forwarded-Encrypted: i=1;
        AJvYcCWr8yBnydXCxyiJ0Mhdn8MVRWzSByQakXl3ezAxNV3GBQ0GgKaQPBTMqtr0dQfZfjkcZVc1B/A+MXlYQB6lJHLoKxYnIBbjijUoUosvREYScLs3zAUMVFskSfdaPhNB9w/R7CMjKtI4dw==
X-Gm-Message-State: AOJu0Yy9T8gdCDwNeK/nqOv7ICjWPFNxWTPp2WOvzt5ZquW/2YnzcOD/
        nHZXr5kGj3zeUSQS+61qChrL3ktTcO9dk1ofGLVto9ixfnOInaEMzsdRqj9d
X-Google-Smtp-Source: AGHT+IFFRxF5/IA8cOYk9zucRjrGH6LJnL472Gl4p8hxv2v0pEQdGtAdKJuF/JsB60rnL70SP/ALcw==
X-Received: by 2002:a05:6808:639a:b0:3c1:a3df:fb6e with SMTP id ec26-20020a056808639a00b003c1a3dffb6emr3734033oib.18.1709365375491;
        Fri, 01 Mar 2024 23:42:55 -0800 (PST)
Received: from dw-tp.. ([49.205.218.89])
        by smtp.gmail.com with ESMTPSA id x11-20020aa784cb000000b006e45c5d7720sm4138206pfn.93.2024.03.01.23.42.51
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 01 Mar 2024 23:42:54 -0800 (PST)
From: "Ritesh Harjani (IBM)"
To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org
Cc: Ojaswin Mujoo, Jan Kara, Theodore Ts'o, Matthew Wilcox,
        "Darrick J . Wong", Luis Chamberlain, John Garry,
        linux-kernel@vger.kernel.org, "Ritesh Harjani (IBM)"
Subject: [RFC 8/8] ext4: Adds atomic writes using fsawu
Date: Sat, 2 Mar 2024 13:12:05 +0530
Message-ID: <52a5d4d2191b289fa013f764efdfad93c8acb3c9.1709361537.git.ritesh.list@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
References: <555cc3e262efa77ee5648196362f415a1efc018d.1709361537.git.ritesh.list@gmail.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Atomic writes using fsawu (filesystem atomic write unit) mean that a
filesystem can support atomic writes as long as all of the constraints
below are satisfied:
1. The underlying block device HW supports atomic writes.
2. fsawu_[min|max] (fs blocksize or bigalloc cluster size) is within
   the HW boundary range of awu_min and awu_max.
If these constraints are satisfied, then a filesystem can do atomic
writes. No underlying filesystem layout changes are required to enable
this.

This patch enables this support in ext4 at mount time if the underlying
HW supports it. We set a runtime mount flag to enable this support.

After this patch, ext4 can support atomic writes with pwritev2's
RWF_ATOMIC flag with direct-io with -
1. mkfs.ext4 -b (for a large pagesize system)
2. mkfs.ext4 -b -C (with bigalloc)

Co-developed-by: Ojaswin Mujoo
Signed-off-by: Ojaswin Mujoo
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/ext4/ext4.h  | 28 ++++++++++++++++++++++++++++
 fs/ext4/super.c |  1 +
 2 files changed, 29 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index aa7fff2d6f96..529ca32b9813 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3896,6 +3896,34 @@ static inline void ext4_atomic_write_fsawu(struct super_block *sb,
 	*fsawu_max = 0;
 }
 
+/**
+ * ext4_init_atomic_write ext4 init atomic writes using fsawu
+ * @sb super_block
+ *
+ * Function to initialize atomic/untorn write support using fsawu.
+ * TODO: In future, when mballoc will get aligned allocations support,
+ * then we can enable atomic write support for ext4 without fsawu restrictions.
+ */
+static inline void ext4_init_atomic_write(struct super_block *sb)
+{
+	struct block_device *bdev = sb->s_bdev;
+	unsigned int fsawu_min, fsawu_max;
+
+	if (!ext4_has_feature_extents(sb))
+		return;
+
+	if (!bdev_can_atomic_write(bdev))
+		return;
+
+	ext4_atomic_write_fsawu(sb, &fsawu_min, &fsawu_max);
+	if (fsawu_min && fsawu_max) {
+		ext4_set_mount_flag(sb, EXT4_MF_ATOMIC_WRITE_FSAWU);
+		ext4_msg(sb, KERN_NOTICE,
+			 "Supports atomic writes using EXT4_MF_ATOMIC_WRITE_FSAWU, fsawu_min %u fsawu_max: %u",
+			 fsawu_min, fsawu_max);
+	}
+}
+
 #endif	/* __KERNEL__ */
 
 #define EFSBADCRC	EBADMSG		/* Bad CRC detected */
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 0f931d0c227d..971bfd093997 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5352,6 +5352,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 	mutex_init(&sbi->s_orphan_lock);
 
 	ext4_fast_commit_init(sb);
+	ext4_init_atomic_write(sb);
 
 	sb->s_root = NULL;
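
The two constraints listed in the commit message of patch 8/8 (the
device supports atomic writes, and the filesystem unit range lies within
the hardware range) can be modelled in user space roughly as follows.
This is only a sketch under stated assumptions: struct hw_atomic_limits
and fs_can_atomic_write_sketch() are hypothetical names invented for
illustration, and the actual range computation lives in
ext4_atomic_write_fsawu(), which is introduced elsewhere in the series
and may differ.

```c
#include <stdbool.h>

/* Hypothetical description of what the block device advertises. */
struct hw_atomic_limits {
	bool supported;        /* device can do atomic writes at all */
	unsigned int awu_min;  /* smallest atomic write unit, in bytes */
	unsigned int awu_max;  /* largest atomic write unit, in bytes */
};

/*
 * Hypothetical model of the mount-time decision: atomic writes are
 * enabled only if the device supports them and the filesystem unit
 * range [fsawu_min, fsawu_max] fits inside [awu_min, awu_max].
 */
static bool fs_can_atomic_write_sketch(const struct hw_atomic_limits *hw,
				       unsigned int fsawu_min,
				       unsigned int fsawu_max)
{
	if (!hw->supported)
		return false;
	if (fsawu_min == 0 || fsawu_max == 0 || fsawu_min > fsawu_max)
		return false;
	return fsawu_min >= hw->awu_min && fsawu_max <= hw->awu_max;
}
```

For example, with an assumed device range of 4096 to 65536 bytes, a
filesystem unit range of 4096 to 16384 bytes would qualify, whereas a
maximum of 131072 bytes would not.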