xfs: add bounds checking to xlog_recover_process_ophdr

Message ID	20240603041410.79381-1-llfamsec@gmail.com (mailing list archive)
State	Accepted, archived
Headers	show Received: from mail-pf1-f180.google.com (mail-pf1-f180.google.com [209.85.210.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA2903207 for <linux-xfs@vger.kernel.org>; Mon, 3 Jun 2024 04:14:40 +0000 (UTC) From: lei lu <llfamsec@gmail.com> To: linux-xfs@vger.kernel.org Cc: chandan.babu@oracle.com, djwong@kernel.org, lei lu <llfamsec@gmail.com> Subject: [PATCH] xfs: add bounds checking to xlog_recover_process_ophdr Date: Mon, 3 Jun 2024 12:14:10 +0800 Message-Id: <20240603041410.79381-1-llfamsec@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	xfs: add bounds checking to xlog_recover_process_ophdr \| expand xfs: add bounds checking to xlog_recover_process_ophdr

Message ID

20240603041410.79381-1-llfamsec@gmail.com (mailing list archive)

State

Accepted, archived

Headers

From: lei lu <llfamsec@gmail.com>
To: linux-xfs@vger.kernel.org
Cc: chandan.babu@oracle.com,
	djwong@kernel.org,
	lei lu <llfamsec@gmail.com>
Subject: [PATCH] xfs: add bounds checking to xlog_recover_process_ophdr
Date: Mon,  3 Jun 2024 12:14:10 +0800
Message-Id: <20240603041410.79381-1-llfamsec@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

xfs: add bounds checking to xlog_recover_process_ophdr | expand

Commit Message

lei lu June 3, 2024, 4:14 a.m. UTC

Make sure xlog_op_header don't stray beyond valid memory region.
Check if there is enough space for the fixed members of each
xlog_op_header before visiting.

Signed-off-by: lei lu <llfamsec@gmail.com>
---
 fs/xfs/xfs_log_recover.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Dave Chinner June 3, 2024, 5:36 a.m. UTC | #1

On Mon, Jun 03, 2024 at 12:14:10PM +0800, lei lu wrote:
> Make sure xlog_op_header don't stray beyond valid memory region.
> Check if there is enough space for the fixed members of each
> xlog_op_header before visiting.
> 
> Signed-off-by: lei lu <llfamsec@gmail.com>

Finding a corrupted ophdr chain this deep in recovery implies that
there is either a runtime bug when writing out the log buffers
during checkpointing or there is memory corruption occurring during
log recovery after the log items have been read from the journal and
attached to the transaction structure for processing. If that's the
case, then addressing the buffer overrun is not fixing the
underlying issue that needs to be triaged and fixed.

The only other way I can think of that this might occur is malicious
corruption of the journal structures on disk. If that's the case,
then you need to state this explicitly so we know that this isn't
a potentially critical runtime bug we need to spend time on
immediately.

Hence we need to know how you found the issue, what it's symptoms
are, how to reproduce the issue, etc.  ie. you need to explain where
you found a CRC validated journal structure that is internally
inconsistent. That's -critical information- that we need to assess
the scope of the issue that you have uncovered.

I asked for you to provide this sort of detail to be provided with
your last overrun fix - we need to know how such failures come about
because they often are just a symptom of some other issue we failed
to capture. You might have all the details in your head, but none of
us know anything about what you are doing.

Can you describe what tools and processes you are using to find
these issues, and please try to include such descriptions in the
commit message for future fixes to issues you find?

> ---
>  fs/xfs/xfs_log_recover.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 1251c81e55f9..660e79a07ce6 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2361,6 +2361,11 @@ xlog_recover_process_ophdr(
>  	unsigned int		len;
>  	int			error;
>  
> +	if (dp > end) {
> +		xfs_warn(log->l_mp, "%s: op header overrun", __func__);
> +		return -EFSCORRUPTED;
> +	}

Why did you put the check here, rather than...

> +
>  	/* Do we understand who wrote this op? */
>  	if (ohead->oh_clientid != XFS_TRANSACTION &&
>  	    ohead->oh_clientid != XFS_LOG) {
> @@ -2456,7 +2461,6 @@ xlog_recover_process_data(
>  
>  		ohead = (struct xlog_op_header *)dp;
>  		dp += sizeof(*ohead);
> -		ASSERT(dp <= end);

.... where the pointer is advanced and the overrun is exposed?

-Dave.

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 1251c81e55f9..660e79a07ce6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2361,6 +2361,11 @@  xlog_recover_process_ophdr(
 	unsigned int		len;
 	int			error;
 
+	if (dp > end) {
+		xfs_warn(log->l_mp, "%s: op header overrun", __func__);
+		return -EFSCORRUPTED;
+	}
+
 	/* Do we understand who wrote this op? */
 	if (ohead->oh_clientid != XFS_TRANSACTION &&
 	    ohead->oh_clientid != XFS_LOG) {
@@ -2456,7 +2461,6 @@  xlog_recover_process_data(
 
 		ohead = (struct xlog_op_header *)dp;
 		dp += sizeof(*ohead);
-		ASSERT(dp <= end);
 
 		/* errors will abort recovery */
 		error = xlog_recover_process_ophdr(log, rhash, rhead, ohead,

xfs: add bounds checking to xlog_recover_process_ophdr

Commit Message

Comments

Patch