[v2,4/8] unicode: Recreate utf8_parse_version()

Message ID 20240902225511.757831-5-andrealmeid@igalia.com
State New
Series tmpfs: Add case-insesitive support for tmpfs

Commit Message

André Almeida Sept. 2, 2024, 10:55 p.m. UTC
All filesystems that currently support UTF-8 casefold can fetch the
UTF-8 version from the filesystem metadata stored on disk. They can get
the data stored and directly match it to an integer, so they can skip the
string parsing step, which motivated the removal of this function in the
first place.

However, for tmpfs, the only way to tell the kernel which UTF-8 version
we are about to use is via mount options, using a string. Re-introduce
utf8_parse_version() to be used by tmpfs.

This version differs from the original by skipping the intermediate step
of copying the version string to an auxiliary string before calling
match_token(). This version calls match_token() directly on the argument string.

utf8_parse_version() was created by 9d53690f0d4 ("unicode: implement
higher level API for string handling") and later removed by 49bd03cc7e9
("unicode: pass a UNICODE_AGE() tripple to utf8_load").

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 fs/unicode/utf8-core.c  | 30 ++++++++++++++++++++++++++++++
 include/linux/unicode.h |  3 +++
 2 files changed, 33 insertions(+)

Comments

Theodore Ts'o Sept. 3, 2024, 11:41 a.m. UTC | #1
On Mon, Sep 02, 2024 at 07:55:06PM -0300, André Almeida wrote:
> All filesystems that currently support UTF-8 casefold can fetch the
> UTF-8 version from the filesystem metadata stored on disk. They can get
> the data stored and directly match it to an integer, so they can skip the
> string parsing step, which motivated the removal of this function in the
> first place.
> 
> However, for tmpfs, the only way to tell the kernel which UTF-8 version
> we are about to use is via mount options, using a string. Re-introduce
> utf8_parse_version() to be used by tmpfs.
> 
> This version differs from the original by skipping the intermediate step
> of copying the version string to an auxiliary string before calling
> match_token(). This version calls match_token() directly on the argument string.
> 
> utf8_parse_version() was created by 9d53690f0d4 ("unicode: implement
> higher level API for string handling") and later removed by 49bd03cc7e9
> ("unicode: pass a UNICODE_AGE() tripple to utf8_load").
> 
> Signed-off-by: André Almeida <andrealmeid@igalia.com>

Reviewed-by: Theodore Ts'o <tytso@mit.edu>

Patch

diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 4966e175ed71..3e8afd637b28 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -240,3 +240,33 @@  bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name)
 	       utf8_validate(dir->i_sb->s_encoding, d_name));
 }
 EXPORT_SYMBOL(utf8_check_strict_name);
+
+/**
+ * utf8_parse_version - Parse a UTF-8 version number from a string
+ *
+ * @version: input string
+ * @maj: output major version number
+ * @min: output minor version number
+ * @rev: output revision number
+ *
+ * Return: 0 on success, -EINVAL if @version is not in maj.min.rev form
+ */
+int utf8_parse_version(char *version, unsigned int *maj,
+		       unsigned int *min, unsigned int *rev)
+{
+	substring_t args[3];
+	static const struct match_token token[] = {
+		{1, "%d.%d.%d"},
+		{0, NULL}
+	};
+
+	if (match_token(version, token, args) != 1)
+		return -EINVAL;
+
+	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
+	    match_int(&args[2], rev))
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL(utf8_parse_version);
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index fb56fb5e686c..724db2cd709d 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -78,4 +78,7 @@  void utf8_unload(struct unicode_map *um);
 
 bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name);
 
+int utf8_parse_version(char *version, unsigned int *maj, unsigned int *min,
+		       unsigned int *rev);
+
 #endif /* _LINUX_UNICODE_H */
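
Illustrative note (not part of the patch): the parsing that
utf8_parse_version() performs can be approximated in userspace as below.
This is only a sketch under stated assumptions. sketch_parse_version(),
parse_field() and SKETCH_UNICODE_AGE() are hypothetical names; the
strtoul()-based field parsing stands in for the kernel's match_token() and
match_int() helpers and may differ from them in edge cases (signs,
overflow); and the macro assumes the kernel's UNICODE_AGE() packs the
triple with shifts of 16 and 8.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Assumption: mirrors the kernel's UNICODE_AGE() packing (shifts 16 and 8). */
#define SKETCH_UNICODE_AGE(maj, min, rev) \
	(((unsigned int)(maj) << 16) | ((unsigned int)(min) << 8) | \
	 (unsigned int)(rev))

/* Parse one decimal field; on success, advance *s past its digits. */
static int parse_field(const char **s, unsigned int *out)
{
	char *end;
	unsigned long val;

	/* Reject empty fields, signs and leading whitespace. */
	if (!isdigit((unsigned char)**s))
		return -EINVAL;

	errno = 0;
	val = strtoul(*s, &end, 10);
	if (errno || val > UINT_MAX)
		return -EINVAL;

	*out = (unsigned int)val;
	*s = end;
	return 0;
}

/*
 * Hypothetical userspace stand-in for utf8_parse_version(): accepts
 * exactly "maj.min.rev" and rejects anything shorter, longer or
 * non-numeric.
 */
static int sketch_parse_version(const char *version, unsigned int *maj,
				unsigned int *min, unsigned int *rev)
{
	assert(version && maj && min && rev);

	if (parse_field(&version, maj) || *version++ != '.')
		return -EINVAL;
	if (parse_field(&version, min) || *version++ != '.')
		return -EINVAL;
	if (parse_field(&version, rev) || *version != '\0')
		return -EINVAL;

	return 0;
}
```

A caller would pass the version string taken from the mount options, for
example "12.1.0", check for a zero return value, and could then combine
the three numbers with SKETCH_UNICODE_AGE(maj, min, rev) to obtain the
kind of packed integer that the on-disk filesystems already have available
without any string parsing.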