From patchwork Mon Oct 21 16:37:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?= X-Patchwork-Id: 13844431 Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44BD61F706A; Mon, 21 Oct 2024 16:37:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=178.60.130.6 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729528680; cv=none; b=HiZiBFh6p4dBPPWNdUpE4VmsIG9+gN4GtFusOKZ6oBl8Ky6tt9GnkoQ4YDY8z+SBqSVyQVjC6eObzlfmlRaAQBYJj9q9xmeUIlaPGvvfpA+cV/hQ1Pq5LgDastJ4amXhQDHgnS7Uq/jEvKjRnJpzDs9umJD8yuzQmg+57ZuOrxc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729528680; c=relaxed/simple; bh=vl3JoB/DwEm8nWzjX2Krc1IOE1I7PuBEAMF2rU+Lsac=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=FwETPuUEwSnEWeswbU4HrXl9ReK3POCdg6ZsZfGCa0/tj6lsHldZLKq/ayptty4lcFO8GOB3ETAX/t35yPBN0NO4ostqEVqZh1OAWcCixVI99O3tMbkxdJIHti6rEXCfLr0GWe2HcgYpLz248+9ZaQ1HNM+2i1fWISRxaZrq56k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=TWyfiSiM; arc=none smtp.client-ip=178.60.130.6 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="TWyfiSiM" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=xvtpXgeaP5+3Qey4SeTiBPWN2JmcKdnKWOsB+lrnGJo=; b=TWyfiSiMpn2JY+LAKRTmhYlWm0 ctHt/AVzOxuRttTp1AjNMOv8enBzWmA/p0TaQJBuTdFZTN3bH0P/LlKL2UzpIqu1cJJescJqSlQtw FjGjK+HFZ0K8aXxXh63UIjs4icp1ME5rw4sXMVwMZvcTvbbRApgXEqqATGFCekTNnRmakbdodbfv9 ZAsvZI3iSsDSEmgCa6fdNlbMkB9tDqjqFJFF+/byI9/Zj/IueginCoUSXBYTHQApbzfG+L0VEW6me LloWvz/NEwxuQognsc29f3R/4/h90YVP0wjNItE+1r9uzMKRgZjUQ0Kclk6q7Ci5lnnqH5+ydInM5 riVbX8WA==; Received: from [191.204.195.205] (helo=[192.168.15.100]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1t2vPo-00DECf-2D; Mon, 21 Oct 2024 18:37:56 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= Date: Mon, 21 Oct 2024 13:37:20 -0300 Subject: [PATCH v8 4/9] unicode: Recreate utf8_parse_version() Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241021-tonyk-tmpfs-v8-4-f443d5814194@igalia.com> References: <20241021-tonyk-tmpfs-v8-0-f443d5814194@igalia.com> In-Reply-To: <20241021-tonyk-tmpfs-v8-0-f443d5814194@igalia.com> To: Gabriel Krisman Bertazi , Alexander Viro , Christian Brauner , Jan Kara , Theodore Ts'o , Andreas Dilger , Hugh Dickins , Andrew Morton , Jonathan Corbet , smcv@collabora.com Cc: kernel-dev@igalia.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, =?utf-8?q?Andr=C3=A9_Almeid?= =?utf-8?q?a?= , Gabriel Krisman Bertazi X-Mailer: b4 0.14.2 All filesystems that currently support UTF-8 casefold can fetch the UTF-8 version from the filesystem metadata stored on disk. They can get the data stored and directly match it to a integer, so they can skip the string parsing step, which motivated the removal of this function in the first place. However, for tmpfs, the only way to tell the kernel which UTF-8 version we are about to use is via mount options, using a string. Re-introduce utf8_parse_version() to be used by tmpfs. This version differs from the original by skipping the intermediate step of copying the version string to an auxiliary string before calling match_token(). This versions calls match_token() in the argument string. The paramenters are simpler now as well. utf8_parse_version() was created by 9d53690f0d4 ("unicode: implement higher level API for string handling") and later removed by 49bd03cc7e9 ("unicode: pass a UNICODE_AGE() tripple to utf8_load"). Signed-off-by: André Almeida Reviewed-by: Theodore Ts'o Reviewed-by: Gabriel Krisman Bertazi --- Changes from v3: - Return version on the return value, instead of returning version at function args. --- fs/unicode/utf8-core.c | 26 ++++++++++++++++++++++++++ include/linux/unicode.h | 2 ++ 2 files changed, 28 insertions(+) diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c index 8395066341a437d0c20d6ab49b0a022eac7eec5c..7f7cb14e01ce8aa87d14dffdd767f63a90cf11f7 100644 --- a/fs/unicode/utf8-core.c +++ b/fs/unicode/utf8-core.c @@ -214,3 +214,29 @@ void utf8_unload(struct unicode_map *um) } EXPORT_SYMBOL(utf8_unload); +/** + * utf8_parse_version - Parse a UTF-8 version number from a string + * + * @version: input string + * + * Returns the parsed version on success, negative code on error + */ +int utf8_parse_version(char *version) +{ + substring_t args[3]; + unsigned int maj, min, rev; + static const struct match_token token[] = { + {1, "%d.%d.%d"}, + {0, NULL} + }; + + if (match_token(version, token, args) != 1) + return -EINVAL; + + if (match_int(&args[0], &maj) || match_int(&args[1], &min) || + match_int(&args[2], &rev)) + return -EINVAL; + + return UNICODE_AGE(maj, min, rev); +} +EXPORT_SYMBOL(utf8_parse_version); diff --git a/include/linux/unicode.h b/include/linux/unicode.h index 0c0ab04e84ee80227f9390ad0498f21a7ab7d34b..5e6b212a2aedab7ebf4363083339f4c5e9b82f8f 100644 --- a/include/linux/unicode.h +++ b/include/linux/unicode.h @@ -78,4 +78,6 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt, struct unicode_map *utf8_load(unsigned int version); void utf8_unload(struct unicode_map *um); +int utf8_parse_version(char *version); + #endif /* _LINUX_UNICODE_H */