From patchwork Tue Jul 9 19:10:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728428
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 1/9] xfsprogs:
 introduce defrag command to spaceman
Date: Tue, 9 Jul 2024 12:10:20 -0700
Message-Id: <20240709191028.2329-2-wen.gang.wang@oracle.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

Non-exclusive defragment

This introduces a non-exclusive way to defragment a file, especially a
huge file, without blocking IO to it for long.

Non-exclusive defragmentation divides the whole file into small segments.
For each segment, we lock the file, defragment the segment and unlock the
file. Defragmenting a small segment doesn’t take long, so file IO requests
can be served between segments instead of being blocked for a long time.
We also put a (user-adjustable) idle time between defragmenting two
consecutive segments to balance defragmentation against file IO.

The first patch in the set checks for valid target files.

Valid target files to defrag must:
1. be accessible for read/write
2. be regular files
3. be on an XFS filesystem
4. be on an XFS filesystem that has reflink enabled. This is not checked
   before defragmentation starts; an error is reported later instead.
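As a rough illustration of the segment-at-a-time flow described above (this
is not code from the series), the sketch below walks a file in chunks, hands
each chunk to a per-segment callback, and idles between two consecutive
segments. The names defrag_by_segments(), segment_fn, struct seg_tally and
tally_segment() are hypothetical. The fixed-size split is a simplifying
assumption: a real implementation would build segments from the extent map
and do the per-segment work with the file locked.

```c
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical per-segment hook: defragment [off, off + len) of the file. */
typedef int (*segment_fn)(off_t off, off_t len, void *ctx);

/* Hypothetical stand-in for real work: only records what it was given. */
struct seg_tally {
	long nr_segments;
	off_t bytes;
};

int
tally_segment(off_t off, off_t len, void *ctx)
{
	struct seg_tally *t = ctx;

	(void)off;
	t->nr_segments++;
	t->bytes += len;
	return 0;
}

/*
 * Walk file_size bytes in chunks of at most seg_size bytes, calling fn()
 * once per chunk and sleeping idle_us microseconds between two
 * consecutive chunks so that pending file IO gets a chance to run.
 * Returns the number of segments processed, or -1 on bad arguments or
 * when fn() reports a failure.
 */
long
defrag_by_segments(off_t file_size, off_t seg_size, unsigned int idle_us,
		   segment_fn fn, void *ctx)
{
	long nr_seg = 0;
	off_t off;

	if (seg_size <= 0 || fn == NULL)
		return -1;

	for (off = 0; off < file_size; off += seg_size) {
		off_t len = file_size - off;

		/* the last segment may be shorter than seg_size */
		if (len > seg_size)
			len = seg_size;

		/* idle only between segments, not before the first one */
		if (nr_seg > 0 && idle_us > 0) {
			struct timespec ts;

			ts.tv_sec = idle_us / 1000000;
			ts.tv_nsec = (long)(idle_us % 1000000) * 1000;
			nanosleep(&ts, NULL);
		}

		if (fn(off, len, ctx) != 0)
			return -1;
		nr_seg++;
	}
	return nr_seg;
}
```

With a zeroed struct seg_tally t, defrag_by_segments(40 << 20, 16 << 20, 0,
tally_segment, &t) visits three segments (16 MiB, 16 MiB and 8 MiB). A larger
idle_us trades defragmentation speed for more room for regular file IO.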

Signed-off-by: Wengang Wang
---
 spaceman/Makefile |   2 +-
 spaceman/defrag.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++
 spaceman/init.c   |   1 +
 spaceman/space.h  |   1 +
 4 files changed, 201 insertions(+), 1 deletion(-)
 create mode 100644 spaceman/defrag.c

diff --git a/spaceman/Makefile b/spaceman/Makefile
index 1f048d54..9c00b20a 100644
--- a/spaceman/Makefile
+++ b/spaceman/Makefile
@@ -7,7 +7,7 @@ include $(TOPDIR)/include/builddefs
 
 LTCOMMAND = xfs_spaceman
 HFILES = init.h space.h
-CFILES = info.c init.c file.c health.c prealloc.c trim.c
+CFILES = info.c init.c file.c health.c prealloc.c trim.c defrag.c
 LSRCFILES = xfs_info.sh
 
 LLDLIBS = $(LIBXCMD) $(LIBFROG)
diff --git a/spaceman/defrag.c b/spaceman/defrag.c
new file mode 100644
index 00000000..c9732984
--- /dev/null
+++ b/spaceman/defrag.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2024 Oracle.
+ * All Rights Reserved.
+ */
+
+#include "libxfs.h"
+#include
+#include
+#include "libfrog/fsgeom.h"
+#include "command.h"
+#include "init.h"
+#include "libfrog/paths.h"
+#include "space.h"
+#include "input.h"
+
+/* defrag segment size limit in units of 512 bytes */
+#define MIN_SEGMENT_SIZE_LIMIT 8192 /* 4MiB */
+#define DEFAULT_SEGMENT_SIZE_LIMIT 32768 /* 16MiB */
+static int g_segment_size_lmt = DEFAULT_SEGMENT_SIZE_LIMIT;
+
+/* size of the defrag target file */
+static off_t g_defrag_file_size = 0;
+
+/* stats for the target file extents before defrag */
+struct ext_stats {
+	long nr_ext_total;
+	long nr_ext_unwritten;
+	long nr_ext_shared;
+};
+static struct ext_stats g_ext_stats;
+
+/*
+ * check if the target is a valid file to defrag
+ * also store file size
+ * returns:
+ * true for yes and false for no
+ */
+static bool
+defrag_check_file(char *path)
+{
+	struct statfs statfs_s;
+	struct stat stat_s;
+
+	if (access(path, F_OK|W_OK) == -1) {
+		if (errno == ENOENT)
+			fprintf(stderr, "file \"%s\" doesn't exist\n", path);
+		else
+			fprintf(stderr, "no access to \"%s\", %s\n", path,
+				strerror(errno));
+		return false;
+	}
+
+	if (stat(path, &stat_s) == -1) {
+		fprintf(stderr, "failed to get file info on \"%s\": %s\n",
+			path, strerror(errno));
+		return false;
+	}
+
+	g_defrag_file_size = stat_s.st_size;
+
+	if (!S_ISREG(stat_s.st_mode)) {
+		fprintf(stderr, "\"%s\" is not a regular file\n", path);
+		return false;
+	}
+
+	if (statfs(path, &statfs_s) == -1) {
+		fprintf(stderr, "failed to get FS info on \"%s\": %s\n",
+			path, strerror(errno));
+		return false;
+	}
+
+	if (statfs_s.f_type != XFS_SUPER_MAGIC) {
+		fprintf(stderr, "\"%s\" is not an XFS file\n", path);
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * defragment a file
+ * return 0 if successfully done, 1 otherwise
+ */
+static int
+defrag_xfs_defrag(char *file_path) {
+	int max_clone_us = 0, max_unshare_us = 0, max_punch_us = 0;
+	long nr_seg_defrag = 0, nr_ext_defrag = 0;
+	int scratch_fd = -1, defrag_fd = -1;
+	char tmp_file_path[PATH_MAX+1];
+	char *defrag_dir;
+	struct fsxattr fsx;
+	int ret = 0;
+
+	fsx.fsx_nextents = 0;
+	memset(&g_ext_stats, 0, sizeof(g_ext_stats));
+
+	if (!defrag_check_file(file_path)) {
+		ret = 1;
+		goto out;
+	}
+
+	defrag_fd = open(file_path, O_RDWR);
+	if (defrag_fd == -1) {
+		fprintf(stderr, "Opening %s failed. %s\n", file_path,
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	defrag_dir = dirname(file_path);
+	snprintf(tmp_file_path, PATH_MAX, "%s/.xfsdefrag_%d", defrag_dir,
+		getpid());
+	tmp_file_path[PATH_MAX] = 0;
+	scratch_fd = open(tmp_file_path, O_CREAT|O_EXCL|O_RDWR, 0600);
+	if (scratch_fd == -1) {
+		fprintf(stderr, "Opening temporary file %s failed. %s\n",
+			tmp_file_path, strerror(errno));
+		ret = 1;
+		goto out;
+	}
+out:
+	if (scratch_fd != -1) {
+		close(scratch_fd);
+		unlink(tmp_file_path);
+	}
+	if (defrag_fd != -1) {
+		ioctl(defrag_fd, FS_IOC_FSGETXATTR, &fsx);
+		close(defrag_fd);
+	}
+
+	printf("Pre-defrag %ld extents detected, %ld are \"unwritten\", "
+		"%ld are \"shared\"\n",
+		g_ext_stats.nr_ext_total, g_ext_stats.nr_ext_unwritten,
+		g_ext_stats.nr_ext_shared);
+	printf("Tried to defragment %ld extents in %ld segments\n",
+		nr_ext_defrag, nr_seg_defrag);
+	printf("Time stats(ms): max clone: %d, max unshare: %d,"
+		" max punch_hole: %d\n",
+		max_clone_us/1000, max_unshare_us/1000, max_punch_us/1000);
+	printf("Post-defrag %u extents detected\n", fsx.fsx_nextents);
+	return ret;
+}
+
+
+static void defrag_help(void)
+{
+	printf(_(
+"\n"
+"Defragment files on XFS where reflink is enabled. IOs to the target files \n"
+"can be served during the defragmentations.\n"
+"\n"
+" -s segment_size -- specify the segment size in MiB, minimum value is 4 \n"
+" default is 16\n"));
+}
+
+static cmdinfo_t defrag_cmd;
+
+static int
+defrag_f(int argc, char **argv)
+{
+	int i;
+	int c;
+
+	while ((c = getopt(argc, argv, "s:")) != EOF) {
+		switch(c) {
+		case 's':
+			g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512;
+			if (g_segment_size_lmt < MIN_SEGMENT_SIZE_LIMIT) {
+				g_segment_size_lmt = MIN_SEGMENT_SIZE_LIMIT;
+				printf("Using minimum segment size %d\n",
+					g_segment_size_lmt);
+			}
+			break;
+		default:
+			command_usage(&defrag_cmd);
+			return 1;
+		}
+	}
+
+	for (i = 0; i < filecount; i++)
+		defrag_xfs_defrag(filetable[i].name);
+	return 0;
+}
+void defrag_init(void)
+{
+	defrag_cmd.name = "defrag";
+	defrag_cmd.altname = "dfg";
+	defrag_cmd.cfunc = defrag_f;
+	defrag_cmd.argmin = 0;
+	defrag_cmd.argmax = 4;
+	defrag_cmd.args = "[-s segment_size]";
+	defrag_cmd.flags = CMD_FLAG_ONESHOT;
+	defrag_cmd.oneline = _("Defragment XFS files");
+	defrag_cmd.help = defrag_help;
+
+	add_command(&defrag_cmd);
+}
diff --git a/spaceman/init.c b/spaceman/init.c
index cf1ff3cb..396f965c 100644
--- a/spaceman/init.c
+++ b/spaceman/init.c
@@ -35,6 +35,7 @@ init_commands(void)
 	trim_init();
 	freesp_init();
 	health_init();
+	defrag_init();
 }
 
 static int
diff --git a/spaceman/space.h b/spaceman/space.h
index 723209ed..c288aeb9 100644
--- a/spaceman/space.h
+++ b/spaceman/space.h
@@ -26,6 +26,7 @@ extern void help_init(void);
 extern void prealloc_init(void);
 extern void quit_init(void);
 extern void trim_init(void);
+extern void defrag_init(void);
 #ifdef HAVE_GETFSMAP
 extern void freesp_init(void);
 #else

From patchwork Tue Jul 9 19:10:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728429
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 2/9] spaceman/defrag: pick up segments from target file
Date: Tue, 9 Jul 2024 12:10:21 -0700
Message-Id: <20240709191028.2329-3-wen.gang.wang@oracle.com>
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>

Segments are the smallest unit to defragment.

A segment
1. can't exceed the size limit
2. contains some extents
3. the contained extents can't be "unwritten"
4.
   the contained extents must be contiguous in file blocks

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 204 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 204 insertions(+)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index c9732984..175cf461 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -14,6 +14,32 @@
 #include "space.h"
 #include "input.h"
 
+#define MAPSIZE 512
+/* used to fetch bmap */
+struct getbmapx g_mapx[MAPSIZE];
+/* current offset of the file in units of 512 bytes, used to fetch bmap */
+static long long g_offset = 0;
+/* index to identify next extent, used to get next extent */
+static int g_ext_next_idx = -1;
+
+/*
+ * segment, the smallest unit to defrag
+ * it includes some contiguous extents.
+ * no holes included,
+ * no unwritten extents included
+ * the size is limited by g_segment_size_lmt
+ */
+struct defrag_segment {
+	/* segment offset in units of 512 bytes */
+	long long ds_offset;
+	/* length of segment in units of 512 bytes */
+	long long ds_length;
+	/* number of extents in this segment */
+	int ds_nr;
+	/* flag indicating if segment contains shared blocks */
+	bool ds_shared;
+};
+
 /* defrag segment size limit in units of 512 bytes */
 #define MIN_SEGMENT_SIZE_LIMIT 8192 /* 4MiB */
 #define DEFAULT_SEGMENT_SIZE_LIMIT 32768 /* 16MiB */
@@ -78,6 +104,165 @@ defrag_check_file(char *path)
 	return true;
 }
 
+/*
+ * get next extent in the file.
+ * Note: next call will get the same extent unless move_next_extent() is called.
+ * returns:
+ * -1: error happened.
+ * 0: extent returned
+ * 1: no more extent left
+ */
+static int
+defrag_get_next_extent(int fd, struct getbmapx *map_out)
+{
+	int err = 0, i;
+
+	/* when no extents are cached in g_mapx, fetch from kernel */
+	if (g_ext_next_idx == -1) {
+		g_mapx[0].bmv_offset = g_offset;
+		g_mapx[0].bmv_length = -1LL;
+		g_mapx[0].bmv_count = MAPSIZE;
+		g_mapx[0].bmv_iflags = BMV_IF_NO_HOLES | BMV_IF_PREALLOC;
+		err = ioctl(fd, XFS_IOC_GETBMAPX, g_mapx);
+		if (err == -1) {
+			perror("XFS_IOC_GETBMAPX failed");
+			goto out;
+		}
+		/* for stats */
+		g_ext_stats.nr_ext_total += g_mapx[0].bmv_entries;
+
+		/* no more extents */
+		if (g_mapx[0].bmv_entries == 0) {
+			err = 1;
+			goto out;
+		}
+
+		/* for stats */
+		for (i = 1; i <= g_mapx[0].bmv_entries; i++) {
+			if (g_mapx[i].bmv_oflags & BMV_OF_PREALLOC)
+				g_ext_stats.nr_ext_unwritten++;
+			if (g_mapx[i].bmv_oflags & BMV_OF_SHARED)
+				g_ext_stats.nr_ext_shared++;
+		}
+
+		g_ext_next_idx = 1;
+		g_offset = g_mapx[g_mapx[0].bmv_entries].bmv_offset +
+				g_mapx[g_mapx[0].bmv_entries].bmv_length;
+	}
+
+	map_out->bmv_offset = g_mapx[g_ext_next_idx].bmv_offset;
+	map_out->bmv_length = g_mapx[g_ext_next_idx].bmv_length;
+	map_out->bmv_oflags = g_mapx[g_ext_next_idx].bmv_oflags;
+out:
+	return err;
+}
+
+/*
+ * move to next extent
+ */
+static void
+defrag_move_next_extent()
+{
+	if (g_ext_next_idx == g_mapx[0].bmv_entries)
+		g_ext_next_idx = -1;
+	else
+		g_ext_next_idx += 1;
+}
+
+/*
+ * check if the given extent is a defrag target.
+ * no need to check for holes as we are using BMV_IF_NO_HOLES
+ */
+static bool
+defrag_is_target(struct getbmapx *mapx)
+{
+	/* unwritten */
+	if (mapx->bmv_oflags & BMV_OF_PREALLOC)
+		return false;
+	return mapx->bmv_length < g_segment_size_lmt;
+}
+
+static bool
+defrag_is_extent_shared(struct getbmapx *mapx)
+{
+	return !!(mapx->bmv_oflags & BMV_OF_SHARED);
+}
+
+/*
+ * get next segment to defragment.
+ * returns:
+ * -1 error happened.
+ * 0 segment returned.
+ * 1 no more segments to return
+ */
+static int
+defrag_get_next_segment(int fd, struct defrag_segment *out)
+{
+	struct getbmapx mapx;
+	int ret;
+
+	out->ds_offset = 0;
+	out->ds_length = 0;
+	out->ds_nr = 0;
+	out->ds_shared = false;
+
+	do {
+		ret = defrag_get_next_extent(fd, &mapx);
+		if (ret != 0) {
+			/*
+			 * no more extents, return current segment if it's not
+			 * empty
+			 */
+			if (ret == 1 && out->ds_nr > 0)
+				ret = 0;
+			/* otherwise, error happened, stop */
+			break;
+		}
+
+		/*
+		 * If the extent is not a defrag target, skip it.
+		 * go to next extent if the segment is empty;
+		 * otherwise return the segment.
+		 */
+		if (!defrag_is_target(&mapx)) {
+			defrag_move_next_extent();
+			if (out->ds_nr == 0)
+				continue;
+			else
+				break;
+		}
+
+		/* check for segment size limitation */
+		if (out->ds_length + mapx.bmv_length > g_segment_size_lmt)
+			break;
+
+		/* the segment is empty now, add this extent to it for sure */
+		if (out->ds_nr == 0) {
+			out->ds_offset = mapx.bmv_offset;
+			goto add_ext;
+		}
+
+		/*
+		 * the segment is not empty, check for hole since the last extent
+		 * if a hole exists before this extent, this extent can't be
+		 * added to the segment. return the segment
+		 */
+		if (out->ds_offset + out->ds_length != mapx.bmv_offset)
+			break;
+
+add_ext:
+		if (defrag_is_extent_shared(&mapx))
+			out->ds_shared = true;
+
+		out->ds_length += mapx.bmv_length;
+		out->ds_nr += 1;
+		defrag_move_next_extent();
+
+	} while (true);
+
+	return ret;
+}
+
 /*
  * defragment a file
  * return 0 if successfully done, 1 otherwise
@@ -92,6 +277,9 @@ defrag_xfs_defrag(char *file_path) {
 	struct fsxattr fsx;
 	int ret = 0;
 
+	g_offset = 0;
+	g_ext_next_idx = -1;
+
 	fsx.fsx_nextents = 0;
 	memset(&g_ext_stats, 0, sizeof(g_ext_stats));
 
@@ -119,6 +307,22 @@ defrag_xfs_defrag(char *file_path) {
 		ret = 1;
 		goto out;
 	}
+
+	do {
+		struct defrag_segment segment;
+
+		ret = defrag_get_next_segment(defrag_fd, &segment);
+		/* no more segments, we are done */
+		if (ret == 1) {
+			ret = 0;
+			break;
+		}
+		/* error happened when reading bmap, stop here */
+		if (ret == -1) {
+			ret = 1;
+			break;
+		}
+	} while (true);
 out:
 	if (scratch_fd != -1) {
 		close(scratch_fd);

From patchwork Tue Jul 9 19:10:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728431
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 3/9] spaceman/defrag: defrag segments
Date: Tue, 9 Jul 2024 12:10:22 -0700
Message-Id: <20240709191028.2329-4-wen.gang.wang@oracle.com>
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>

For each segment, the following steps are done trying to defrag it:
1.
   share the segment with a temporary file
2. unshare the segment in the target file. The kernel simulates CoW on the
   whole segment to complete the unshare (defrag).
3. release blocks from the temporary file.

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index 175cf461..9f11e36b 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -263,6 +263,40 @@ add_ext:
 	return ret;
 }
 
+/*
+ * check if the segment exceeds EoF.
+ * fix up the clone range and return true if EoF happens,
+ * return false otherwise.
+ */
+static bool
+defrag_clone_eof(struct file_clone_range *clone)
+{
+	off_t delta;
+
+	delta = clone->src_offset + clone->src_length - g_defrag_file_size;
+	if (delta > 0) {
+		clone->src_length = 0; // to the end
+		return true;
+	}
+	return false;
+}
+
+/*
+ * get the time delta since pre_time in us.
+ * pre_time should contain values fetched by gettimeofday()
+ * cur_time is used to store current time by gettimeofday()
+ */
+static long long
+get_time_delta_us(struct timeval *pre_time, struct timeval *cur_time)
+{
+	long long us;
+
+	gettimeofday(cur_time, NULL);
+	us = (cur_time->tv_sec - pre_time->tv_sec) * 1000000;
+	us += (cur_time->tv_usec - pre_time->tv_usec);
+	return us;
+}
+
 /*
  * defragment a file
  * return 0 if successfully done, 1 otherwise
@@ -273,6 +307,7 @@ defrag_xfs_defrag(char *file_path) {
 	long nr_seg_defrag = 0, nr_ext_defrag = 0;
 	int scratch_fd = -1, defrag_fd = -1;
 	char tmp_file_path[PATH_MAX+1];
+	struct file_clone_range clone;
 	char *defrag_dir;
 	struct fsxattr fsx;
 	int ret = 0;
@@ -296,6 +331,8 @@ defrag_xfs_defrag(char *file_path) {
 		goto out;
 	}
 
+	clone.src_fd = defrag_fd;
+
 	defrag_dir = dirname(file_path);
 	snprintf(tmp_file_path, PATH_MAX, "%s/.xfsdefrag_%d", defrag_dir,
 		getpid());
@@ -309,7 +346,11 @@ defrag_xfs_defrag(char *file_path) {
 	}
 
 	do {
+		struct timeval t_clone, t_unshare, t_punch_hole;
 		struct defrag_segment segment;
+		long long seg_size, seg_off;
+		int time_delta;
+		bool stop;
 
 		ret = defrag_get_next_segment(defrag_fd, &segment);
 		/* no more segments, we are done */
@@ -322,6 +363,79 @@ defrag_xfs_defrag(char *file_path) {
 			ret = 1;
 			break;
 		}
+
+		/* we are done if the segment contains only 1 extent */
+		if (segment.ds_nr < 2)
+			continue;
+
+		/* to bytes */
+		seg_off = segment.ds_offset * 512;
+		seg_size = segment.ds_length * 512;
+
+		clone.src_offset = seg_off;
+		clone.src_length = seg_size;
+		clone.dest_offset = seg_off;
+
+		/* checks for EoF and fix up clone */
+		stop = defrag_clone_eof(&clone);
+		gettimeofday(&t_clone, NULL);
+		ret = ioctl(scratch_fd, FICLONERANGE, &clone);
+		if (ret != 0) {
+			fprintf(stderr, "FICLONERANGE failed %s\n",
+				strerror(errno));
+			break;
+		}
+
+		/* for time stats */
+		time_delta = get_time_delta_us(&t_clone, &t_unshare);
+		if (time_delta > max_clone_us)
+			max_clone_us = time_delta;
+
+		/* for defrag stats */
+		nr_ext_defrag += segment.ds_nr;
+
+		/*
+		 * Force the shared range to be unshared via a copy-on-write
+		 * operation in the file to be defragged. This causes the
+		 * file needing to be defragged to have new extents allocated
+		 * and the data to be copied over and written out.
+		 */
+		ret = fallocate(defrag_fd, FALLOC_FL_UNSHARE_RANGE, seg_off,
+				seg_size);
+		if (ret != 0) {
+			fprintf(stderr, "UNSHARE_RANGE failed %s\n",
+				strerror(errno));
+			break;
+		}
+
+		/* for time stats */
+		time_delta = get_time_delta_us(&t_unshare, &t_punch_hole);
+		if (time_delta > max_unshare_us)
+			max_unshare_us = time_delta;
+
+		/*
+		 * Punch out the original extents we shared to the
+		 * scratch file so they are returned to free space.
+		 */
+		ret = fallocate(scratch_fd,
+				FALLOC_FL_PUNCH_HOLE|FALLOC_FL_KEEP_SIZE, seg_off,
+				seg_size);
+		if (ret != 0) {
+			fprintf(stderr, "PUNCH_HOLE failed %s\n",
+				strerror(errno));
+			break;
+		}
+
+		/* for defrag stats */
+		nr_seg_defrag += 1;
+
+		/* for time stats */
+		time_delta = get_time_delta_us(&t_punch_hole, &t_clone);
+		if (time_delta > max_punch_us)
+			max_punch_us = time_delta;
+
+		if (stop)
+			break;
 	} while (true);
 out:
 	if (scratch_fd != -1) {

From patchwork Tue Jul 9 19:10:23 2024
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728430
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 4/9] spaceman/defrag: ctrl-c handler
Date: Tue, 9 Jul 2024 12:10:23 -0700
Message-Id: <20240709191028.2329-5-wen.gang.wang@oracle.com>
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>

Add a SIGINT (Ctrl-C) handler so that an interrupted defrag stops
cleanly after the current segment and still
1. reports the stats
2. removes the temporary file

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index 9f11e36b..61e47a43 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -297,6 +297,13 @@ get_time_delta_us(struct timeval *pre_time, struct timeval *cur_time)
 	return us;
 }
 
+static volatile bool usedKilled = false;
+void defrag_sigint_handler(int dummy)
+{
+	usedKilled = true;
+	printf("Please wait until current segment is defragmented\n");
+};
+
 /*
  * defragment a file
  * return 0 if successfully done, 1 otherwise
@@ -345,6 +352,8 @@ defrag_xfs_defrag(char *file_path) {
 		goto out;
 	}
 
+	signal(SIGINT, defrag_sigint_handler);
+
 	do {
 		struct timeval t_clone, t_unshare, t_punch_hole;
 		struct defrag_segment segment;
@@ -434,7 +443,7 @@ defrag_xfs_defrag(char *file_path) {
 		if (time_delta > max_punch_us)
 			max_punch_us = time_delta;
 
-		if (stop)
+		if (stop || usedKilled)
 			break;
 	} while (true);
 out:

From patchwork Tue Jul 9 19:10:24 2024
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728432
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 5/9] spaceman/defrag: exclude shared segments on low free space
Date: Tue, 9 Jul 2024 12:10:24 -0700
Message-Id: <20240709191028.2329-6-wen.gang.wang@oracle.com>
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>

On some XFS filesystems, free blocks are over-committed to reflink
copies, and those free blocks are not enough if CoW happens to all the
shared blocks. Make the defrag tool skip shared segments when the free
space is below a threshold.

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 43 insertions(+), 3 deletions(-)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index 61e47a43..f8e6713c 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -304,6 +304,29 @@ void defrag_sigint_handler(int dummy)
 	printf("Please wait until current segment is defragmented\n");
 };
 
+/*
+ * limitation of filesystem free space in bytes.
+ * when filesystem has less free space than this number, segments which contain
+ * shared extents are skipped. 1GiB by default
+ */
+static long g_limit_free_bytes = 1024 * 1024 * 1024;
+
+/*
+ * check if the free space in the FS is less than the _limit_
+ * return true if so, false otherwise
+ */
+static bool
+defrag_fs_limit_hit(int fd)
+{
+	struct statfs statfs_s;
+
+	if (g_limit_free_bytes <= 0)
+		return false;
+
+	fstatfs(fd, &statfs_s);
+	return statfs_s.f_bsize * statfs_s.f_bavail < g_limit_free_bytes;
+}
+
 /*
  * defragment a file
  * return 0 if successfully done, 1 otherwise
@@ -377,6 +400,15 @@ defrag_xfs_defrag(char *file_path) {
 		if (segment.ds_nr < 2)
 			continue;
 
+		/*
+		 * When the segment is (partially) shared, defrag would
+		 * consume free blocks. We check the limit of FS free blocks
+		 * and skip defragmenting this segment in case the limit is
+		 * reached.
+		 */
+		if (segment.ds_shared && defrag_fs_limit_hit(defrag_fd))
+			continue;
+
 		/* to bytes */
 		seg_off = segment.ds_offset * 512;
 		seg_size = segment.ds_length * 512;
@@ -478,7 +510,11 @@ static void defrag_help(void)
 "can be served durning the defragmentations.\n"
 "\n"
 " -s segment_size -- specify the segment size in MiB, minmum value is 4 \n"
-" default is 16\n"));
+" default is 16\n"
+" -f free_space -- specify shrethod of the XFS free space in MiB, when\n"
+" XFS free space is lower than that, shared segments \n"
+" are excluded from defragmentation, 1024 by default\n"
+	));
 }
 
 static cmdinfo_t defrag_cmd;
@@ -489,7 +525,7 @@ defrag_f(int argc, char **argv)
 	int i;
 	int c;
 
-	while ((c = getopt(argc, argv, "s:")) != EOF) {
+	while ((c = getopt(argc, argv, "s:f:")) != EOF) {
 		switch(c) {
 		case 's':
 			g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512;
@@ -499,6 +535,10 @@ defrag_f(int argc, char **argv)
 					g_segment_size_lmt);
 			}
 			break;
+		case 'f':
+			g_limit_free_bytes = atol(optarg) * 1024 * 1024;
+			break;
+
 		default:
 			command_usage(&defrag_cmd);
 			return 1;
@@ -516,7 +556,7 @@ void defrag_init(void)
 	defrag_cmd.cfunc = defrag_f;
 	defrag_cmd.argmin = 0;
 	defrag_cmd.argmax = 4;
-	defrag_cmd.args = "[-s segment_size]";
+	defrag_cmd.args = "[-s segment_size] [-f free_space]";
 	defrag_cmd.flags = CMD_FLAG_ONESHOT;
 	defrag_cmd.oneline = _("Defragment XFS files");
 	defrag_cmd.help = defrag_help;

From patchwork Tue Jul 9 19:10:25 2024
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 13728433
From: Wengang Wang
To: linux-xfs@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH 6/9] spaceman/defrag: workaround kernel xfs_reflink_try_clear_inode_flag()
Date: Tue, 9 Jul 2024 12:10:25 -0700
Message-Id: <20240709191028.2329-7-wen.gang.wang@oracle.com>
In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com>
References: <20240709191028.2329-1-wen.gang.wang@oracle.com>

xfs_reflink_try_clear_inode_flag() takes a very long time when the file
has a huge number of extents and none of them are shared.

Workaround: share the first real extent so that
xfs_reflink_try_clear_inode_flag() returns quickly, which saves CPU time
and speeds up defrag significantly.

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 174 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 172 insertions(+), 2 deletions(-)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index f8e6713c..b5c5b187 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -327,6 +327,155 @@ defrag_fs_limit_hit(int fd)
 	return statfs_s.f_bsize * statfs_s.f_bavail < g_limit_free_bytes;
 }
 
+static bool g_enable_first_ext_share = true;
+
+static int
+defrag_get_first_real_ext(int fd, struct getbmapx *mapx)
+{
+	int err;
+
+	while (1) {
+		err = defrag_get_next_extent(fd, mapx);
+		if (err)
+			break;
+
+		defrag_move_next_extent();
+		if (!(mapx->bmv_oflags & BMV_OF_PREALLOC))
+			break;
+	}
+	return err;
+}
+
+static __u64 g_share_offset = -1ULL;
+static __u64 g_share_len = 0ULL;
+#define SHARE_MAX_SIZE 32768 /* 32KiB */
+
+/* share the first real extent with scrach */
+static void
+defrag_share_first_extent(int defrag_fd, int scratch_fd)
+{
+#define OFFSET_1PB 0x4000000000000LL
+	struct file_clone_range clone;
+	struct getbmapx mapx;
+	int err;
+
+	if (g_enable_first_ext_share == false)
+		return;
+
+	err = defrag_get_first_real_ext(defrag_fd, &mapx);
+	if (err)
+		return;
+
+	clone.src_fd = defrag_fd;
+	clone.src_offset = mapx.bmv_offset * 512;
+	clone.src_length = mapx.bmv_length * 512;
+	/* shares at most SHARE_MAX_SIZE length */
+	if (clone.src_length > SHARE_MAX_SIZE)
+		clone.src_length = SHARE_MAX_SIZE;
+	clone.dest_offset = OFFSET_1PB + clone.src_offset;
+	/* if the first is extent is reaching the EoF, no need to share */
+	if (clone.src_offset + clone.src_length >= g_defrag_file_size)
+		return;
+	err = ioctl(scratch_fd, FICLONERANGE, &clone);
+	if (err != 0) {
+		fprintf(stderr, "cloning first extent failed: %s\n",
+			strerror(errno));
+		return;
+	}
+
+	/* safe the offset and length for re-share */
+	g_share_offset = clone.src_offset;
+	g_share_len = clone.src_length;
+}
+
+/* re-share the blocks we shared previous if then are no longer shared */
+static void
+defrag_reshare_blocks_in_front(int defrag_fd, int scratch_fd)
+{
+#define NR_GET_EXT 9
+	struct getbmapx mapx[NR_GET_EXT];
+	struct file_clone_range clone;
+	__u64 new_share_len;
+	int idx, err;
+
+	if (g_enable_first_ext_share == false)
+		return;
+
+	if (g_share_len == 0ULL)
+		return;
+
+	/*
+	 * check if previous shareing still exist
+	 * we are done if (partially) so.
+ */ + mapx[0].bmv_offset = g_share_offset; + mapx[0].bmv_length = g_share_len; + mapx[0].bmv_count = NR_GET_EXT; + mapx[0].bmv_iflags = BMV_IF_NO_HOLES | BMV_IF_PREALLOC; + err = ioctl(defrag_fd, XFS_IOC_GETBMAPX, mapx); + if (err) { + fprintf(stderr, "XFS_IOC_GETBMAPX failed %s\n", + strerror(errno)); + /* won't try share again */ + g_share_len = 0ULL; + return; + } + + if (mapx[0].bmv_entries == 0) { + /* shared blocks all became hole, won't try share again */ + g_share_len = 0ULL; + return; + } + + if (g_share_offset != 512 * mapx[1].bmv_offset) { + /* first shared block became hole, won't try share again */ + g_share_len = 0ULL; + return; + } + + /* we check up to only the first NR_GET_EXT - 1 extents */ + for (idx = 1; idx <= mapx[0].bmv_entries; idx++) { + if (mapx[idx].bmv_oflags & BMV_OF_SHARED) { + /* some blocks still shared, done */ + return; + } + } + + /* + * The previously shared blocks are no longer shared, re-share. + * deallocate the blocks in scrath file first + */ + err = fallocate(scratch_fd, + FALLOC_FL_PUNCH_HOLE|FALLOC_FL_KEEP_SIZE, + OFFSET_1PB + g_share_offset, g_share_len); + if (err != 0) { + fprintf(stderr, "punch hole failed %s\n", + strerror(errno)); + g_share_len = 0; + return; + } + + new_share_len = 512 * mapx[1].bmv_length; + if (new_share_len > SHARE_MAX_SIZE) + new_share_len = SHARE_MAX_SIZE; + + clone.src_fd = defrag_fd; + /* keep starting offset unchanged */ + clone.src_offset = g_share_offset; + clone.src_length = new_share_len; + clone.dest_offset = OFFSET_1PB + clone.src_offset; + + err = ioctl(scratch_fd, FICLONERANGE, &clone); + if (err) { + fprintf(stderr, "FICLONERANGE failed %s\n", + strerror(errno)); + g_share_len = 0; + return; + } + + g_share_len = new_share_len; + } + /* * defragment a file * return 0 if successfully done, 1 otherwise @@ -377,6 +526,12 @@ defrag_xfs_defrag(char *file_path) { signal(SIGINT, defrag_sigint_handler); + /* + * share the first extent to work around kernel consuming time + * in 
xfs_reflink_try_clear_inode_flag() + */ + defrag_share_first_extent(defrag_fd, scratch_fd); + do { struct timeval t_clone, t_unshare, t_punch_hole; struct defrag_segment segment; @@ -454,6 +609,15 @@ defrag_xfs_defrag(char *file_path) { if (time_delta > max_unshare_us) max_unshare_us = time_delta; + /* + * if unshare used more than 1 second, time is very possibly + * used in checking if the file is sharing extents now. + * to avoid that happen again we re-share the blocks in front + * to workaround that. + */ + if (time_delta > 1000000) + defrag_reshare_blocks_in_front(defrag_fd, scratch_fd); + /* * Punch out the original extents we shared to the * scratch file so they are returned to free space. @@ -514,6 +678,8 @@ static void defrag_help(void) " -f free_space -- specify shrethod of the XFS free space in MiB, when\n" " XFS free space is lower than that, shared segments \n" " are excluded from defragmentation, 1024 by default\n" +" -n -- disable the \"share first extent\" featue, it's\n" +" enabled by default to speed up\n" )); } @@ -525,7 +691,7 @@ defrag_f(int argc, char **argv) int i; int c; - while ((c = getopt(argc, argv, "s:f:")) != EOF) { + while ((c = getopt(argc, argv, "s:f:n")) != EOF) { switch(c) { case 's': g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512; @@ -539,6 +705,10 @@ defrag_f(int argc, char **argv) g_limit_free_bytes = atol(optarg) * 1024 * 1024; break; + case 'n': + g_enable_first_ext_share = false; + break; + default: command_usage(&defrag_cmd); return 1; @@ -556,7 +726,7 @@ void defrag_init(void) defrag_cmd.cfunc = defrag_f; defrag_cmd.argmin = 0; defrag_cmd.argmax = 4; - defrag_cmd.args = "[-s segment_size] [-f free_space]"; + defrag_cmd.args = "[-s segment_size] [-f free_space] [-n]"; defrag_cmd.flags = CMD_FLAG_ONESHOT; defrag_cmd.oneline = _("Defragment XFS files"); defrag_cmd.help = defrag_help; From patchwork Tue Jul 9 19:10:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Wengang Wang X-Patchwork-Id: 13728434 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42352182A52 for ; Tue, 9 Jul 2024 19:10:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552239; cv=none; b=Jv9kI+qDN/MKJ1nGUsrIIBQi5pXxpfv1COdOEgny75hVVMyRS8dsDfIwLMXcQ5/nDBLUmS9YiNFyUAAn24r3bTWjGONH8y6Q+CPrFiXCTT1xswxqdS1jCzDwyW1z7856BWcR4Vke1Gc6m6Pa1d7RyaNnnzew0HDGLTV+pbJb4rM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552239; c=relaxed/simple; bh=jNdSGNnxYVWlWWVZ3VlOxE93Yy+DgiQM9jMw/bRqNTs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=YvZ1z2crdIxAbIWGxTVDAv1rwe13x6BEmGMkYpSBsy82HEg9hI984bONfe+yhJZ3VeIBjQmX8HQbjn2yKZ66qxKiO3p8GMTITqP4oGyI9oEifiCfhnphv2DzE6m/xX0iWdMjcl0asPen65iwa/+glR18jwqc/peJU8krVOyfg68= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=KkoNe+F6; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="KkoNe+F6" Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 469FtUUL003165 for ; Tue, 9 Jul 2024 19:10:37 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= 
from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=corp-2023-11-20; bh=f mltrwphEUrMVDHzYN30wxGU+PQbTJKuwsnCJxCwTOo=; b=KkoNe+F6QSqtFN/lB 2H987r3DclC+BWQTVn92m1uRMWM5rTStgPOJW+YOgCqinPlRe8eTidec4FokF83o rz2NTcTKAudHfPXYiiJPR05aBjnR+oQ/8boQCAkumV0QcNOBsb/vpDv5B90KtHod t2/a1JMCuC6ymKeb7P4atmQ4MKzjdcJHGRbzDwuVpFM7klzrUbKvsOlx8kXoYOVZ lf8YFCH4SkSj4u9GBKZ1UmMDldZl58ewuNzfwHPTb5a81Em50NmN7p/yXox8kU/V u6h5ZIVXH9iO2wNXwj6FyTEN6yI+zVveXKxANCFkHr6TR6Ci+EAySU2WfOYshkxC Pa/SQ== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 406wknnsas-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:37 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 469IIwoA014343 for ; Tue, 9 Jul 2024 19:10:36 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 407txheprp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:36 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 469JAUPc024440 for ; Tue, 9 Jul 2024 19:10:36 GMT Received: from wwg-mac.us.oracle.com (dhcp-10-159-146-188.vpn.oracle.com [10.159.146.188]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 407txhepkm-8; Tue, 09 Jul 2024 19:10:35 +0000 From: Wengang Wang To: linux-xfs@vger.kernel.org Cc: wen.gang.wang@oracle.com Subject: [PATCH 7/9] spaceman/defrag: sleeps between segments Date: Tue, 9 Jul 2024 12:10:26 -0700 Message-Id: <20240709191028.2329-8-wen.gang.wang@oracle.com> X-Mailer: 
git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com> References: <20240709191028.2329-1-wen.gang.wang@oracle.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-07-09_08,2024-07-09_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 suspectscore=0 malwarescore=0 bulkscore=0 mlxscore=0 spamscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2407090129 X-Proofpoint-GUID: 1m7Jffp0XgCCCf5aEWUdoDXVx2JilUU_ X-Proofpoint-ORIG-GUID: 1m7Jffp0XgCCCf5aEWUdoDXVx2JilUU_ Let user contol the time to sleep between segments (file unlocked) to balance defrag performance and file IO servicing time. Signed-off-by: Wengang Wang --- spaceman/defrag.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/spaceman/defrag.c b/spaceman/defrag.c index b5c5b187..415fe9c2 100644 --- a/spaceman/defrag.c +++ b/spaceman/defrag.c @@ -311,6 +311,9 @@ void defrag_sigint_handler(int dummy) */ static long g_limit_free_bytes = 1024 * 1024 * 1024; +/* sleep time in us between segments, overwritten by paramter */ +static int g_idle_time = 250 * 1000; + /* * check if the free space in the FS is less than the _limit_ * return true if so, false otherwise @@ -487,6 +490,7 @@ defrag_xfs_defrag(char *file_path) { int scratch_fd = -1, defrag_fd = -1; char tmp_file_path[PATH_MAX+1]; struct file_clone_range clone; + int sleep_time_us = 0; char *defrag_dir; struct fsxattr fsx; int ret = 0; @@ -574,6 +578,9 @@ defrag_xfs_defrag(char *file_path) { /* checks for EoF and fix up clone */ stop = defrag_clone_eof(&clone); + if (sleep_time_us > 0) + usleep(sleep_time_us); + gettimeofday(&t_clone, NULL); ret = 
ioctl(scratch_fd, FICLONERANGE, &clone); if (ret != 0) { @@ -587,6 +594,10 @@ defrag_xfs_defrag(char *file_path) { if (time_delta > max_clone_us) max_clone_us = time_delta; + /* sleeps if clone cost more than 500ms, slow FS */ + if (time_delta >= 500000 && g_idle_time > 0) + usleep(g_idle_time); + /* for defrag stats */ nr_ext_defrag += segment.ds_nr; @@ -641,6 +652,12 @@ defrag_xfs_defrag(char *file_path) { if (stop || usedKilled) break; + + /* + * no lock on target file when punching hole from scratch file, + * so minus the time used for punching hole + */ + sleep_time_us = g_idle_time - time_delta; } while (true); out: if (scratch_fd != -1) { @@ -678,6 +695,7 @@ static void defrag_help(void) " -f free_space -- specify shrethod of the XFS free space in MiB, when\n" " XFS free space is lower than that, shared segments \n" " are excluded from defragmentation, 1024 by default\n" +" -i idle_time -- time in ms to be idle between segments, 250ms by default\n" " -n -- disable the \"share first extent\" featue, it's\n" " enabled by default to speed up\n" )); @@ -691,7 +709,7 @@ defrag_f(int argc, char **argv) int i; int c; - while ((c = getopt(argc, argv, "s:f:n")) != EOF) { + while ((c = getopt(argc, argv, "s:f:ni")) != EOF) { switch(c) { case 's': g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512; @@ -709,6 +727,10 @@ defrag_f(int argc, char **argv) g_enable_first_ext_share = false; break; + case 'i': + g_idle_time = atoi(optarg) * 1000; + break; + default: command_usage(&defrag_cmd); return 1; @@ -726,7 +748,7 @@ void defrag_init(void) defrag_cmd.cfunc = defrag_f; defrag_cmd.argmin = 0; defrag_cmd.argmax = 4; - defrag_cmd.args = "[-s segment_size] [-f free_space] [-n]"; + defrag_cmd.args = "[-s segment_size] [-f free_space] [-i idle_time] [-n]"; defrag_cmd.flags = CMD_FLAG_ONESHOT; defrag_cmd.oneline = _("Defragment XFS files"); defrag_cmd.help = defrag_help; From patchwork Tue Jul 9 19:10:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wengang Wang X-Patchwork-Id: 13728435 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CE05C182A66 for ; Tue, 9 Jul 2024 19:10:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552240; cv=none; b=fFa7T6G7cWTP3GGeuEAIyAGNZkdmLNlvWB18zsMBhD00UhBGhIau030y45+GD0j+WraRHABs4F6ei+Ub/oB2i8/QhWt3P7ESJlTMx49rvKdNSAEeewCyUCLVQmzsVFnmrbY5MJc7gvoACxF3KBgyzsQoSsoS1trEcvHkaVHu0bI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552240; c=relaxed/simple; bh=ehEm9I4nQlDJly+S0ymwSvkUuAGnob6VmXHrv/iYiNY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=aQC3cEgphterYrDMIR1v7lVk7epFrExcZaSMEPcTdD2Gag/n+0O3tp4BJSgTnEYUSdzc4U/+8lxIFMY/M0BUb6nDB8cHi77IZdGYjb07itL7N+n8d7ZZUmPVRPhrtHu8ZO0LcDXhQxx1GaJylhdDiYpUD/1tsjUZeYmhyZv6BH0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=izSAo157; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="izSAo157" Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 469FtaPs005761 for ; Tue, 9 Jul 2024 19:10:37 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; 
h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=corp-2023-11-20; bh=3 TZmsd36ctSHK3/VgqYPCTn9ThtuP4lWsO06N633U0w=; b=izSAo157u/A3UuFwd spSKHY7iYu8SHz8UlbDLVsjLomkzR8lTr7rJapx9Tr8a1wAHG74Yp1ABO7CIjzrE rk9arRldycc+NNNIh4172TAULy+ZX3lKQqGAW/Ac3M1zG7Ty8+FPnxr0tTok9WeN 39ljsvYSfQXHmlOD2YaGRiSTtQ6BqBtbutGUHNW/hZvu955hAoRlUxFj8tHsaFV4 y7lp45mU5RlyYpBcPxqPRGzwYsuS/StN7qLwCSxS4nR2/tLVv9tRlqPLdW2jwId/ 4tqzN/YekDGH3I57KRL3GEMrlrmNE2o4Es3Td/3OoeQaBX9Dv+4isYqKwUxMvLoc VPMsQ== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 407emsw4qv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:37 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 469Hf1oA014442 for ; Tue, 9 Jul 2024 19:10:37 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 407txhepsa-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:37 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 469JAUPe024440 for ; Tue, 9 Jul 2024 19:10:36 GMT Received: from wwg-mac.us.oracle.com (dhcp-10-159-146-188.vpn.oracle.com [10.159.146.188]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 407txhepkm-9; Tue, 09 Jul 2024 19:10:36 +0000 From: Wengang Wang To: linux-xfs@vger.kernel.org Cc: wen.gang.wang@oracle.com Subject: [PATCH 8/9] spaceman/defrag: readahead for better performance Date: Tue, 9 Jul 2024 12:10:27 -0700 Message-Id: <20240709191028.2329-9-wen.gang.wang@oracle.com> 
X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com> References: <20240709191028.2329-1-wen.gang.wang@oracle.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-07-09_08,2024-07-09_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 suspectscore=0 malwarescore=0 bulkscore=0 mlxscore=0 spamscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2407090129 X-Proofpoint-GUID: 2ZbAIpBVkUZWQWHToUr52CbIIxnRG-au X-Proofpoint-ORIG-GUID: 2ZbAIpBVkUZWQWHToUr52CbIIxnRG-au

Reading ahead takes fewer locks on the file than "unsharing" the file
via ioctl. Do readahead while defrag sleeps, for better defrag
performance and thus more file IO time.

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index 415fe9c2..ab8508bb 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -331,6 +331,18 @@ defrag_fs_limit_hit(int fd)
 }
 
 static bool g_enable_first_ext_share = true;
+static bool g_readahead = false;
+
+static void defrag_readahead(int defrag_fd, off64_t offset, size_t count)
+{
+	if (!g_readahead || g_idle_time <= 0)
+		return;
+
+	if (readahead(defrag_fd, offset, count) < 0) {
+		fprintf(stderr, "readahead failed: %s, errno=%d\n",
+			strerror(errno), errno);
+	}
+}
 
 static int
 defrag_get_first_real_ext(int fd, struct getbmapx *mapx)
@@ -578,6 +590,8 @@ defrag_xfs_defrag(char *file_path) {
 		/* checks for EoF and fix up clone */
 		stop = defrag_clone_eof(&clone);
 
+		defrag_readahead(defrag_fd, seg_off, seg_size);
+
 		if (sleep_time_us > 0)
 			usleep(sleep_time_us);
 
@@ -698,6 +712,7 @@ static void defrag_help(void)
 " -i idle_time -- time in ms to be idle between segments, 250ms by default\n"
 " -n -- disable the \"share first extent\" featue, it's\n"
 " enabled by default to speed up\n"
+" -a -- do readahead to speed up defrag, disabled by default\n"
 	));
 }
 
@@ -709,7 +724,7 @@ defrag_f(int argc, char **argv)
 	int i;
 	int c;
 
-	while ((c = getopt(argc, argv, "s:f:ni")) != EOF) {
+	while ((c = getopt(argc, argv, "s:f:ni:a")) != EOF) {
 		switch(c) {
 		case 's':
 			g_segment_size_lmt = atoi(optarg) * 1024 * 1024 / 512;
@@ -731,6 +746,10 @@ defrag_f(int argc, char **argv)
 			g_idle_time = atoi(optarg) * 1000;
 			break;
 
+		case 'a':
+			g_readahead = true;
+			break;
+
 		default:
 			command_usage(&defrag_cmd);
 			return 1;

From patchwork Tue Jul 9 19:10:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wengang Wang X-Patchwork-Id: 13728436 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 52DEA182A6A for ; Tue, 9 Jul 2024 19:10:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552244; cv=none; b=qwAuzzcFhaphlGj9BHG5MXVJ64KiJI99y/gWVqSxxeTKv1UsaSpnU2eqY0+qH7YwWYH2kudPZOj1TUR8Emw1eRTeRtKeStxedUevt6waJzMxILPBmdnDehbAMYoFUw0eCGJjhho0hYjPluWq5D1Yk9cHXeepN0cl2xxbmqIOlFI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720552244; c=relaxed/simple; bh=0dsngSlkU3j83pK459H65qI6O2WohWO7ui5rcbdTRAw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=t0A87DYIQfGYjAeM2Aut8ziLwvBispAh9CfCMz36RDaEcvXgKlxi3aDhF+mURgN2H6cLZ7nTNyclnpKpabgekvzQt5OMjlm0kO1RiONUHmZd6sRY+mO9JjkA0xbXHWjdE18OueDFT2SLjUP0h5Ldx5i5FsisF6rwISGvtF32vwA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=Bd+yM6tf; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Bd+yM6tf" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 469FtVNI027834 for ; Tue, 9 Jul 2024 19:10:39 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=corp-2023-11-20; bh=D Fyxzp2jMdJNwUDCiZDRHdA8MTj1yAm7R/veI9G0X9s=; b=Bd+yM6tfYqtFoXZSi 
/GIBTCuJkw3Ui2AIv/NZ664eVSC6b81L3qI3On9ee9pD7JvIwiitn3dsW5VUcyOv rvsNRwKccArYyhPmWch11WzjkgeDw5F46gnpg1aTVO2DBKIFEv3uNQvZurWBqhFf 1Lq34l95Ux4Fkpyva54RBM3WTZRwggfmHn6nJui+I+8Q09lO3c4RgUmS2/INuFE4 2dJ2acxnqblNlbtxe9PQvVJyNwng2KQWXnBtgl+EEiaEc3sg2LACFAqJjH41S7iZ HWIDUCtQHNBeBKOwlYJ6Ug0ntIRKKEX32/1R2g02/E/103KRdu0L5jmS59tO4AN7 lh8jQ== Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.appoci.oracle.com [147.154.18.20]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 406xfsnsg4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:39 +0000 (GMT) Received: from pps.filterd (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 469J6JB1013665 for ; Tue, 9 Jul 2024 19:10:38 GMT Received: from pps.reinject (localhost [127.0.0.1]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTPS id 407txhept1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 09 Jul 2024 19:10:38 +0000 Received: from iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 469JAUPg024440 for ; Tue, 9 Jul 2024 19:10:37 GMT Received: from wwg-mac.us.oracle.com (dhcp-10-159-146-188.vpn.oracle.com [10.159.146.188]) by iadpaimrmta02.imrmtpd1.prodappiadaev1.oraclevcn.com (PPS) with ESMTP id 407txhepkm-10; Tue, 09 Jul 2024 19:10:37 +0000 From: Wengang Wang To: linux-xfs@vger.kernel.org Cc: wen.gang.wang@oracle.com Subject: [PATCH 9/9] spaceman/defrag: warn on extsize Date: Tue, 9 Jul 2024 12:10:28 -0700 Message-Id: <20240709191028.2329-10-wen.gang.wang@oracle.com> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: <20240709191028.2329-1-wen.gang.wang@oracle.com> References: <20240709191028.2329-1-wen.gang.wang@oracle.com> Precedence: bulk X-Mailing-List: 
linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16 definitions=2024-07-09_08,2024-07-09_01,2024-05-17_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 suspectscore=0 malwarescore=0 bulkscore=0 mlxscore=0 spamscore=0 mlxlogscore=833 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2406180000 definitions=main-2407090129 X-Proofpoint-ORIG-GUID: tP4VAaza0L4jqioYOhfCBVbP0ckWhIQQ X-Proofpoint-GUID: tP4VAaza0L4jqioYOhfCBVbP0ckWhIQQ

According to the current kernel implementation, a non-zero extsize might
affect the result of defragmentation. Just print a warning if a non-zero
extsize is set on the file.

Signed-off-by: Wengang Wang
---
 spaceman/defrag.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/spaceman/defrag.c b/spaceman/defrag.c
index ab8508bb..b6b89dd9 100644
--- a/spaceman/defrag.c
+++ b/spaceman/defrag.c
@@ -526,6 +526,18 @@ defrag_xfs_defrag(char *file_path) {
 		goto out;
 	}
 
+	if (ioctl(defrag_fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
+		fprintf(stderr, "FSGETXATTR failed %s\n",
+			strerror(errno));
+		ret = 1;
+		goto out;
+	}
+
+	if (fsx.fsx_extsize != 0)
+		fprintf(stderr, "%s has extsize set %u. That might affect defrag "
+			"according to kernel implementation\n",
+			file_path, fsx.fsx_extsize);
+
 	clone.src_fd = defrag_fd;
 	defrag_dir = dirname(file_path);
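
For reference, the extsize check above can be tried outside of
xfs_spaceman. What follows is a minimal standalone sketch and not part of
the patch: the helper names query_extsize() and format_extsize_warning()
are hypothetical, and only FS_IOC_FSGETXATTR, struct fsxattr and its
fsx_extsize field are taken from the diff. It assumes a Linux system that
provides <linux/fs.h>.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* struct fsxattr, FS_IOC_FSGETXATTR */

/*
 * Hypothetical helper (not in the patch): read the extent size hint of
 * the file at 'path'. Returns 0 and stores the hint in *extsize on
 * success, or -1 with errno set if open() or the ioctl fails.
 */
int query_extsize(const char *path, unsigned int *extsize)
{
	struct fsxattr fsx;
	int fd, saved_errno;

	assert(path != NULL && extsize != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(&fsx, 0, sizeof(fsx));
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
		/* keep the ioctl's errno across close() */
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	close(fd);
	*extsize = fsx.fsx_extsize;
	return 0;
}

/*
 * Hypothetical helper (not in the patch): format a warning into 'buf'
 * when 'extsize' is non-zero. Returns 0 and leaves 'buf' empty when no
 * hint is set, otherwise returns what snprintf() returns.
 */
int format_extsize_warning(char *buf, size_t len, const char *path,
			   unsigned int extsize)
{
	assert(buf != NULL && path != NULL);

	if (len > 0)
		buf[0] = '\0';
	if (extsize == 0)
		return 0;

	return snprintf(buf, len,
			"%s has extsize set %u. That might affect defrag\n",
			path, extsize);
}
```

A caller would pass the value from query_extsize() to
format_extsize_warning() and print the buffer to stderr if it is not
empty. The ioctl may fail on filesystems that do not support it, so the
return value of query_extsize() should be checked first.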