diff mbox series

libfs: Attempt exact-match comparison first during casefold lookup

Message ID 20240117222836.11086-1-krisman@suse.de (mailing list archive)
State New
Headers show
Series libfs: Attempt exact-match comparison first during casefold lookup | expand

Commit Message

Gabriel Krisman Bertazi Jan. 17, 2024, 10:28 p.m. UTC
Casefolded comparisons are (obviously) way more costly than a simple
memcmp.  Try the case-sensitive comparison first, falling-back to the
case-insensitive lookup only when needed.  This allows any exact-match
lookup to complete without having to walk the utf8 trie.

Note that, for strict mode, generic_ci_d_compare used to reject an
invalid UTF-8 string, which would now be considered valid if it
exact-matches the disk-name.  But, if that is the case, the filesystem
is corrupt.  More than that, it really doesn't matter in practice,
because the name-under-lookup will have already been rejected by
generic_ci_d_hash and we won't even get here.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
 fs/libfs.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

Comments

Al Viro Jan. 17, 2024, 10:38 p.m. UTC | #1
On Wed, Jan 17, 2024 at 07:28:36PM -0300, Gabriel Krisman Bertazi wrote:

> Note that, for strict mode, generic_ci_d_compare used to reject an
> invalid UTF-8 string, which would now be considered valid if it
> exact-matches the disk-name.  But, if that is the case, the filesystem
> is corrupt.  More than that, it really doesn't matter in practice,
> because the name-under-lookup will have already been rejected by
> generic_ci_d_hash and we won't even get here.

> -	if (ret >= 0)
> -		return ret;
> -
> -	if (sb_has_strict_encoding(sb))
> +	qstr.len = len;
> +	qstr.name = str;
> +	ret = utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
> +	if (ret < 0)
>  		return -EINVAL;

Umm...  So why bother with -EINVAL?  The rest looks sane, but
considering the fact that your string *has* passed ->d_hash(),
do we need anything beyond
	return utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
here?
Gabriel Krisman Bertazi Jan. 18, 2024, 12:02 a.m. UTC | #2
Al Viro <viro@zeniv.linux.org.uk> writes:

> On Wed, Jan 17, 2024 at 07:28:36PM -0300, Gabriel Krisman Bertazi wrote:
>
>> Note that, for strict mode, generic_ci_d_compare used to reject an
>> invalid UTF-8 string, which would now be considered valid if it
>> exact-matches the disk-name.  But, if that is the case, the filesystem
>> is corrupt.  More than that, it really doesn't matter in practice,
>> because the name-under-lookup will have already been rejected by
>> generic_ci_d_hash and we won't even get here.
>
>> -	if (ret >= 0)
>> -		return ret;
>> -
>> -	if (sb_has_strict_encoding(sb))
>> +	qstr.len = len;
>> +	qstr.name = str;
>> +	ret = utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
>> +	if (ret < 0)
>>  		return -EINVAL;
>
> Umm...  So why bother with -EINVAL?  The rest looks sane, but
> considering the fact that your string *has* passed ->d_hash(),
> do we need anything beyond
> 	return utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
> here?

No, you are right.  In fact, it seems d_compare can't propagate errors
anyway as it is only compared against 0, anyway.

I spin the v2 dropping this bit as suggested.

thanks,
Linus Torvalds Jan. 18, 2024, 12:40 a.m. UTC | #3
On Wed, 17 Jan 2024 at 16:02, Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>
> No, you are right.  In fact, it seems d_compare can't propagate errors
> anyway as it is only compared against 0, anyway.

Note that the whole "malformed utf-8 is an error" is actually wrong anyway.

Yes, if you *output* utf-8, and your output is malformed, then that's
an error that needs fixing.

But honestly, "malformed utf-8" on input is almost always just "oh, it
wasn't utf-8 to begin with, and somebody is still using Latin-1 or
Shift-JIS or whatever".

And then treating that as some kind of hard error is actually really
really wrong and annoying, and may end up meaning that the user cannot
*fix* it, because they can't access the data at all.

And yes, I realize that if somebody is an unicode person, they go "oh,
but it *IS* an error, and you need to detect it".

At that point you should just block that paper-pushing pinhead from your life.

Bad locales happen. It's just a fact of life. The patch that makes an
exact lookup work even when some locale rule is being violated should
be seen as an absolute win, not as a failure to detect an error.

This is another case of the old adage: be conservative in what you do,
be liberal in what you accept from others.

I find libraries that just error out on "malformed utf-8" to be
actively harmful.

                Linus
Theodore Ts'o Jan. 18, 2024, 2:05 a.m. UTC | #4
On Wed, Jan 17, 2024 at 04:40:17PM -0800, Linus Torvalds wrote:
> Note that the whole "malformed utf-8 is an error" is actually wrong anyway.
> 
> Yes, if you *output* utf-8, and your output is malformed, then that's
> an error that needs fixing.
> 
> But honestly, "malformed utf-8" on input is almost always just "oh, it
> wasn't utf-8 to begin with, and somebody is still using Latin-1 or
> Shift-JIS or whatever".
> 
> And then treating that as some kind of hard error is actually really
> really wrong and annoying, and may end up meaning that the user cannot
> *fix* it, because they can't access the data at all.

A file system which supports casefolding can support "strict" mode
(not the default) where attempts to create files that have invalid
UTF-8 characters are rejected before a file or hard link is created
(or renamed) with an error.

This is what MacOS does, by the way.  If you try to rsync a file from
a Linux box where the file was created by unpacking a Windows Zip file
created by downloading a directory hierarchy from a Microsoft
Sharepoint, and then you try to scp or rsync it over to MacOS, MacOS
will will refuse to allow the file to be created if it contains
invalid UTF-8 characters, and rsync or scp will report an error.  I
just ran into this earlier today...

So we don't need to worry about the user not being able to fix it,
because they won't have been able to create the file in the first
place.  This is not the default, since we know there are a bunch of
users who might be creating files using the unofficial "Klingon"
characters (for example) that are not officially part of Unicode since
Unicode will only allow characters used by human languages, and
Klingon doesn't qualify.  I believe though that Android has elected to
enable casefolding in strict mode, which is fine as far as I'm concerned.

> I find libraries that just error out on "malformed utf-8" to be
> actively harmful.

I admit that when I discovered that MacOS errored out on illegal utf-8
characters it was mildly annoying, but it wasn't that hard to fix it
on the Linux side and then I retried the rsync.  It also turned out
that if I unpacked the zip file on MacOS, the filename was created
without the illegal utf-8 characters, so there may have been something
funky going on with the zip userspace program on Linux.  I haven't
cared enough to try to debug it...

       		      	     	   	    - Ted
Linus Torvalds Jan. 18, 2024, 2:33 a.m. UTC | #5
On Wed, 17 Jan 2024 at 18:06, Theodore Ts'o <tytso@mit.edu> wrote:
>
> A file system which supports casefolding can support "strict" mode

.. and we can support "shit" mode that craps all over your filesystem
too. The fact that you can support something doesn't make it good.

> This is what MacOS does, by the way.

The MacOS filesystem is the CVS of filesystems: if you see that it
does something, the right thing is almost certainly to do the exact
opposite.

Case-folding is a horrible thing to do in the first place, but MacOS
compounds on its bad ways by then using NFD normalization, AND
EXPOSING THAT TO THE USER.

IOW, MacOS actively changes and corrupts the data the user gave it for
filenames.

There are tons of Apple apologists out there that will make any number
of excuses for it, but it's actively shit. It's broken garbage.

So the fact that MacOS then has "strict" mode, really is an argument
*against* it. The MacOS filesystem designers were fed the wrong kind
of drugs, and I suspect they may not have been the brightest bunch to
begin with.

> So we don't need to worry about the user not being able to fix it,
> because they won't have been able to create the file in the first
> place.

Yeah, that's a fine argument, until you have a bug or subtle bit flip
data corruption, and now instead of having something you can recover,
the system actively says "Nope".

> I admit that when I discovered that MacOS errored out on illegal utf-8
> characters it was mildly annoying,

We may have to be able to interoperate with shit, but let's call it what it is.

Nobody pretends FAT is a great filesystem that made great design
decisions. That doesn't mean that we can't interoperate with it just
fine.

But we don't need to take those idiotic and bad design decisions to
heart, and we don't need to hide the fact that they are horrendous
design mistakes.

So "strict" mode should mean that you can't *create* a misformed UTF-8 filename.

It's that same "be conservative in what you do".

But *dammit*, if "strict" mode means that you can't even read other
peoples mistakes because your "->lookup()" function refuses to even
look at it, then "strict" mode is GARBAGE.

That's the "be liberal in what you accept" part. Do it, or be damned.

Really. No excuses for shit modes.

               Linus
Gabriel Krisman Bertazi Jan. 18, 2024, 3:06 p.m. UTC | #6
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, 17 Jan 2024 at 18:06, Theodore Ts'o <tytso@mit.edu> wrote:
>> So we don't need to worry about the user not being able to fix it,
>> because they won't have been able to create the file in the first
>> place.
>
> Yeah, that's a fine argument, until you have a bug or subtle bit flip
> data corruption, and now instead of having something you can recover,
> the system actively says "Nope".

I know this is not your point, but I should add that, in case of a
bug or bit flip, we support "fixing" the "bad utf8" string through fsck.

>> I admit that when I discovered that MacOS errored out on illegal utf-8
>> characters it was mildly annoying,
>
> We may have to be able to interoperate with shit, but let's call it what it is.
>
> Nobody pretends FAT is a great filesystem that made great design
> decisions. That doesn't mean that we can't interoperate with it just
> fine.
>
> But we don't need to take those idiotic and bad design decisions to
> heart, and we don't need to hide the fact that they are horrendous
> design mistakes.

There is a correctness issue with accepting the creation of invalid
utf-8 names that justifies the existence of strict mode.  Currently
undefined code-points can become a casefold match to some other file in
a later unicode version. When you decide to update your unicode version
or even copy the file to a volume with a different version, the lookup
might yield a different file, making one of them inaccessible or
overwriting the wrong file.

Obviously, not all corruptions would yield a "valid" undefined
code-point.  But those are possible.

We currently don't care much, since mkfs will create the volume with a
fixed, never-changed unicode version. That is, unless the user goes out
of their way to shoot themselves in the foot.

Strict mode is an easy way to prevent this class of issues (aside from
corruptions).

> So "strict" mode should mean that you can't *create* a misformed UTF-8
> filename.
>
> It's that same "be conservative in what you do".
>
> But *dammit*, if "strict" mode means that you can't even read other
> peoples mistakes because your "->lookup()" function refuses to even
> look at it, then "strict" mode is GARBAGE.
>
> That's the "be liberal in what you accept" part. Do it, or be damned.

Yes, we could be more liberal in the lookup while restricting the
creation of invalid utf8 sequences.  But, the only case where it would
matter is for corrupted volumes, where a file-name suddenly changed to
something invalid.  Considering ext4 and f2fs, since the disk direntry
hash (which is hash(casefolded(filename))) didn't get corrupted exactly
right, looking up the exact-match of the invalid name might fail.

This would create an even more inconsistent semantics, where small,
non-hashed directories can find these files, but larger, hashed
directories might not.  And that is even more confusing to users,
since it exposes internal filesystem details.

I get the point about how annoying the current semantics is.  But I
still think this is the sanest approach to a fundamentally insane
feature.
diff mbox series

Patch

diff --git a/fs/libfs.c b/fs/libfs.c
index eec6031b0155..d17a8adb4a35 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1704,17 +1704,28 @@  bool is_empty_dir_inode(struct inode *inode)
 static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 				const char *str, const struct qstr *name)
 {
-	const struct dentry *parent = READ_ONCE(dentry->d_parent);
-	const struct inode *dir = READ_ONCE(parent->d_inode);
-	const struct super_block *sb = dentry->d_sb;
-	const struct unicode_map *um = sb->s_encoding;
-	struct qstr qstr = QSTR_INIT(str, len);
+	const struct dentry *parent;
+	const struct inode *dir;
 	char strbuf[DNAME_INLINE_LEN];
+	struct qstr qstr;
 	int ret;
 
+	/*
+	 * Attempt a case-sensitive match first. It is cheaper and
+	 * should cover most lookups, including all the sane
+	 * applications that expect a case-sensitive filesystem.
+	 */
+	if (len == name->len && !memcmp(str, name->name, len))
+		return 0;
+
+	parent = READ_ONCE(dentry->d_parent);
+	dir = READ_ONCE(parent->d_inode);
 	if (!dir || !IS_CASEFOLDED(dir))
-		goto fallback;
+		return 1;
+
 	/*
+	 * Finally, try the case-insensitive match.
+	 *
 	 * If the dentry name is stored in-line, then it may be concurrently
 	 * modified by a rename.  If this happens, the VFS will eventually retry
 	 * the lookup, so it doesn't matter what ->d_compare() returns.
@@ -1724,20 +1735,17 @@  static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 	if (len <= DNAME_INLINE_LEN - 1) {
 		memcpy(strbuf, str, len);
 		strbuf[len] = 0;
-		qstr.name = strbuf;
+		str = strbuf;
 		/* prevent compiler from optimizing out the temporary buffer */
 		barrier();
 	}
-	ret = utf8_strncasecmp(um, name, &qstr);
-	if (ret >= 0)
-		return ret;
-
-	if (sb_has_strict_encoding(sb))
+	qstr.len = len;
+	qstr.name = str;
+	ret = utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
+	if (ret < 0)
 		return -EINVAL;
-fallback:
-	if (len != name->len)
-		return 1;
-	return !!memcmp(str, name->name, len);
+
+	return ret;
 }
 
 /**