diff mbox series

[1/2] ksmbd: casefold utf-8 share names and fix ascii lowercase conversion

Message ID 20220911205729.299358-1-atteh.mailbox@gmail.com (mailing list archive)
State New, archived
Headers show
Series [1/2] ksmbd: casefold utf-8 share names and fix ascii lowercase conversion | expand

Commit Message

Atte Heikkilä Sept. 11, 2022, 8:57 p.m. UTC
strtolower() corrupts all UTF-8 share names that have a byte in the C0
(À ISO8859-1) to DE (Þ ISO8859-1) range, since the non-ASCII part of
ISO8859-1 is incompatible with UTF-8. Prevent this by checking that a
byte is in the ASCII range with isascii(), before the conversion to
lowercase with tolower(). Properly handle case-insensitivity of UTF-8
share names by casefolding them, but fallback to ASCII lowercase
conversion on failure or if CONFIG_UNICODE is not set. Refactor to move
the share name casefolding immediately after the share name extraction.
Also, make the associated constness corrections.

Signed-off-by: Atte Heikkilä <atteh.mailbox@gmail.com>
---
 fs/ksmbd/connection.c        |  8 +++++++
 fs/ksmbd/connection.h        |  1 +
 fs/ksmbd/mgmt/share_config.c | 18 ++++-----------
 fs/ksmbd/mgmt/share_config.h |  2 +-
 fs/ksmbd/mgmt/tree_connect.c |  2 +-
 fs/ksmbd/mgmt/tree_connect.h |  2 +-
 fs/ksmbd/misc.c              | 45 +++++++++++++++++++++++++++++-------
 fs/ksmbd/misc.h              |  2 +-
 fs/ksmbd/smb2pdu.c           |  2 +-
 fs/ksmbd/unicode.h           |  3 ++-
 10 files changed, 57 insertions(+), 28 deletions(-)

Comments

Namjae Jeon Sept. 12, 2022, 10:20 a.m. UTC | #1
2022-09-12 5:57 GMT+09:00, Atte Heikkilä <atteh.mailbox@gmail.com>:
Hi Atte,

[snip]
> +static char *casefold_sharename(struct unicode_map *um, const char *name)
> +{
> +	char *cf_name;
> +	int cf_len;
> +
> +	cf_name = kzalloc(KSMBD_REQ_MAX_SHARE_NAME, GFP_KERNEL);
> +	if (!cf_name)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (IS_ENABLED(CONFIG_UNICODE)) {
> +		const struct qstr q_name = {.name = name, .len = strlen(name)};
> +
> +		if (!um)
> +			goto out_ascii;
Minor nit, Wouldn't it be simpler to change something like the one below?

+	if (IS_ENABLED(CONFIG_UNICODE) && um) {

Thanks!
> +
> +		cf_len = utf8_casefold(um, &q_name, cf_name,
> +				       KSMBD_REQ_MAX_SHARE_NAME);
> +		if (cf_len < 0)
> +			goto out_ascii;
> +
> +		return cf_name;
> +	}
> +
> +out_ascii:
> +	cf_len = strscpy(cf_name, name, KSMBD_REQ_MAX_SHARE_NAME);
> +	if (cf_len < 0)
> +		return ERR_PTR(-E2BIG);
> +
> +	for (; *cf_name; ++cf_name)
> +		*cf_name = isascii(*cf_name) ? tolower(*cf_name) : *cf_name;
> +	return cf_name - cf_len;
> +}
> +
Atte Heikkilä Sept. 12, 2022, 4:14 p.m. UTC | #2
On Mon, 12 Sep 2022 19:20:54 +0900, Namjae Jeon wrote:
>2022-09-12 5:57 GMT+09:00, Atte Heikkilä <atteh.mailbox@gmail.com>:
>Hi Atte,
>
>[snip]
>> +static char *casefold_sharename(struct unicode_map *um, const char *name>)
>> +{
>> +	char *cf_name;
>> +	int cf_len;
>> +
>> +	cf_name = kzalloc(KSMBD_REQ_MAX_SHARE_NAME, GFP_KERNEL);
>> +	if (!cf_name)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	if (IS_ENABLED(CONFIG_UNICODE)) {
>> +		const struct qstr q_name = {.name = name, .len = strlen(name)};
>> +
>> +		if (!um)
>> +			goto out_ascii;
>Minor nit, Wouldn't it be simpler to change something like the one below?
>
>+	if (IS_ENABLED(CONFIG_UNICODE) && um) {

This mailing list already has a v2 patch series. Please, reply to that one.
As for your suggestion, I thought to keep the statements separate since the
block with the IS_ENABLED() macro is optimized away when CONFIG_UNICODE is
not set. I understand that the behavior is the same with your suggestion.

Thank you.

>
>Thanks!
>> +
>> +		cf_len = utf8_casefold(um, &q_name, cf_name,
>> +				       KSMBD_REQ_MAX_SHARE_NAME);
>> +		if (cf_len < 0)
>> +			goto out_ascii;
>> +
>> +		return cf_name;
>> +	}
>> +
>> +out_ascii:
>> +	cf_len = strscpy(cf_name, name, KSMBD_REQ_MAX_SHARE_NAME);
>> +	if (cf_len < 0)
>> +		return ERR_PTR(-E2BIG);
>> +
>> +	for (; *cf_name; ++cf_name)
>> +		*cf_name = isascii(*cf_name) ? tolower(*cf_name) : *cf_name;
>> +	return cf_name - cf_len;
>> +}
>> +
Namjae Jeon Sept. 13, 2022, 2:04 p.m. UTC | #3
2022-09-13 1:14 GMT+09:00, Atte Heikkilä <atteh.mailbox@gmail.com>:
> On Mon, 12 Sep 2022 19:20:54 +0900, Namjae Jeon wrote:
>>2022-09-12 5:57 GMT+09:00, Atte Heikkilä <atteh.mailbox@gmail.com>:
>>Hi Atte,
>>
>>[snip]
>>> +static char *casefold_sharename(struct unicode_map *um, const char
>>> *name>)
>>> +{
>>> +	char *cf_name;
>>> +	int cf_len;
>>> +
>>> +	cf_name = kzalloc(KSMBD_REQ_MAX_SHARE_NAME, GFP_KERNEL);
>>> +	if (!cf_name)
>>> +		return ERR_PTR(-ENOMEM);
>>> +
>>> +	if (IS_ENABLED(CONFIG_UNICODE)) {
>>> +		const struct qstr q_name = {.name = name, .len = strlen(name)};
>>> +
>>> +		if (!um)
>>> +			goto out_ascii;
>>Minor nit, Wouldn't it be simpler to change something like the one below?
>>
>>+	if (IS_ENABLED(CONFIG_UNICODE) && um) {
>
> This mailing list already has a v2 patch series. Please, reply to that one.
Okay, but please add cc me when you send the patch to the list.

> As for your suggestion, I thought to keep the statements separate since the
> block with the IS_ENABLED() macro is optimized away when CONFIG_UNICODE is
> not set. I understand that the behavior is the same with your suggestion.
When CONFIG_UNICODE is not set, um is not checked in my suggestion.
Please tell me why my suggestion is worse. if you are okay, I will
update it directly.(i.e. no need to send v3 patch). and please check
the use of strncasecmp() in __caseless_lookup() also.

And I need to do full test for this patches, I think it will take
about two days.
>
> Thank you.
>
>>
>>Thanks!
>>> +
>>> +		cf_len = utf8_casefold(um, &q_name, cf_name,
>>> +				       KSMBD_REQ_MAX_SHARE_NAME);
>>> +		if (cf_len < 0)
>>> +			goto out_ascii;
>>> +
>>> +		return cf_name;
>>> +	}
>>> +
>>> +out_ascii:
>>> +	cf_len = strscpy(cf_name, name, KSMBD_REQ_MAX_SHARE_NAME);
>>> +	if (cf_len < 0)
>>> +		return ERR_PTR(-E2BIG);
>>> +
>>> +	for (; *cf_name; ++cf_name)
>>> +		*cf_name = isascii(*cf_name) ? tolower(*cf_name) : *cf_name;
>>> +	return cf_name - cf_len;
>>> +}
>>> +
>
Atte Heikkilä Sept. 13, 2022, 5:44 p.m. UTC | #4
On Tue, 13 Sep 2022 23:04:02 +0900, Namjae Jeon wrote:
>2022-09-13 1:14 GMT+09:00, Atte Heikkilä <atteh.mailbox@gmail.com>:
>>
>> This mailing list already has a v2 patch series. Please, reply to that one.
>Okay, but please add cc me when you send the patch to the list.

I sent the v2 patch to the mailing list an hour after the first one:
https://lore.kernel.org/linux-cifs/20220911215542.104935-1-atteh.mailbox@gmail.com/

Please, let's have further discussion as replies to the v2 patch series
and not this one.

>
>> As for your suggestion, I thought to keep the statements separate since the
>> block with the IS_ENABLED() macro is optimized away when CONFIG_UNICODE is
>> not set. I understand that the behavior is the same with your suggestion.
>When CONFIG_UNICODE is not set, um is not checked in my suggestion.

I didn't mean to imply that it was.

>Please tell me why my suggestion is worse. if you are okay, I will
>update it directly.(i.e. no need to send v3 patch).

I didn't mean that your suggestion is worse. Feel free to update it directly
if you'd like.

>and please check the use of strncasecmp() in __caseless_lookup() also.

I thought that the change to utf8_strncasecmp() would be out of place for
this patch series.

>
>And I need to do full test for this patches, I think it will take
>about two days.

Please do not feel inclined to hurry, considering the mistake I managed
to make in the the v1 patch series (this one you're replying to).

>>
>> Thank you.
>>
diff mbox series

Patch

diff --git a/fs/ksmbd/connection.c b/fs/ksmbd/connection.c
index 756ad631c019..12be8386446a 100644
--- a/fs/ksmbd/connection.c
+++ b/fs/ksmbd/connection.c
@@ -60,6 +60,12 @@  struct ksmbd_conn *ksmbd_conn_alloc(void)
 	conn->local_nls = load_nls("utf8");
 	if (!conn->local_nls)
 		conn->local_nls = load_nls_default();
+	if (IS_ENABLED(CONFIG_UNICODE))
+		conn->um = utf8_load(UNICODE_AGE(12, 1, 0));
+	else
+		conn->um = ERR_PTR(-EOPNOTSUPP);
+	if (IS_ERR(conn->um))
+		conn->um = NULL;
 	atomic_set(&conn->req_running, 0);
 	atomic_set(&conn->r_count, 0);
 	conn->total_credits = 1;
@@ -350,6 +356,8 @@  int ksmbd_conn_handler_loop(void *p)
 	wait_event(conn->r_count_q, atomic_read(&conn->r_count) == 0);
 
 
+	if (IS_ENABLED(CONFIG_UNICODE))
+		utf8_unload(conn->um);
 	unload_nls(conn->local_nls);
 	if (default_conn_ops.terminate_fn)
 		default_conn_ops.terminate_fn(conn);
diff --git a/fs/ksmbd/connection.h b/fs/ksmbd/connection.h
index e7f7d5707951..41d96f5cef06 100644
--- a/fs/ksmbd/connection.h
+++ b/fs/ksmbd/connection.h
@@ -46,6 +46,7 @@  struct ksmbd_conn {
 	char				*request_buf;
 	struct ksmbd_transport		*transport;
 	struct nls_table		*local_nls;
+	struct unicode_map		*um;
 	struct list_head		conns_list;
 	/* smb session 1 per user */
 	struct xarray			sessions;
diff --git a/fs/ksmbd/mgmt/share_config.c b/fs/ksmbd/mgmt/share_config.c
index c9bca1c2c834..5d039704c23c 100644
--- a/fs/ksmbd/mgmt/share_config.c
+++ b/fs/ksmbd/mgmt/share_config.c
@@ -26,7 +26,7 @@  struct ksmbd_veto_pattern {
 	struct list_head	list;
 };
 
-static unsigned int share_name_hash(char *name)
+static unsigned int share_name_hash(const char *name)
 {
 	return jhash(name, strlen(name), 0);
 }
@@ -72,7 +72,7 @@  __get_share_config(struct ksmbd_share_config *share)
 	return share;
 }
 
-static struct ksmbd_share_config *__share_lookup(char *name)
+static struct ksmbd_share_config *__share_lookup(const char *name)
 {
 	struct ksmbd_share_config *share;
 	unsigned int key = share_name_hash(name);
@@ -119,7 +119,7 @@  static int parse_veto_list(struct ksmbd_share_config *share,
 	return 0;
 }
 
-static struct ksmbd_share_config *share_config_request(char *name)
+static struct ksmbd_share_config *share_config_request(const char *name)
 {
 	struct ksmbd_share_config_response *resp;
 	struct ksmbd_share_config *share = NULL;
@@ -190,20 +190,10 @@  static struct ksmbd_share_config *share_config_request(char *name)
 	return share;
 }
 
-static void strtolower(char *share_name)
-{
-	while (*share_name) {
-		*share_name = tolower(*share_name);
-		share_name++;
-	}
-}
-
-struct ksmbd_share_config *ksmbd_share_config_get(char *name)
+struct ksmbd_share_config *ksmbd_share_config_get(const char *name)
 {
 	struct ksmbd_share_config *share;
 
-	strtolower(name);
-
 	down_read(&shares_table_lock);
 	share = __share_lookup(name);
 	if (share)
diff --git a/fs/ksmbd/mgmt/share_config.h b/fs/ksmbd/mgmt/share_config.h
index 902f2cb1963a..7f7e89ecfe61 100644
--- a/fs/ksmbd/mgmt/share_config.h
+++ b/fs/ksmbd/mgmt/share_config.h
@@ -74,7 +74,7 @@  static inline void ksmbd_share_config_put(struct ksmbd_share_config *share)
 	__ksmbd_share_config_put(share);
 }
 
-struct ksmbd_share_config *ksmbd_share_config_get(char *name);
+struct ksmbd_share_config *ksmbd_share_config_get(const char *name);
 bool ksmbd_share_veto_filename(struct ksmbd_share_config *share,
 			       const char *filename);
 #endif /* __SHARE_CONFIG_MANAGEMENT_H__ */
diff --git a/fs/ksmbd/mgmt/tree_connect.c b/fs/ksmbd/mgmt/tree_connect.c
index 97ab7987df6e..867c0286b901 100644
--- a/fs/ksmbd/mgmt/tree_connect.c
+++ b/fs/ksmbd/mgmt/tree_connect.c
@@ -17,7 +17,7 @@ 
 
 struct ksmbd_tree_conn_status
 ksmbd_tree_conn_connect(struct ksmbd_conn *conn, struct ksmbd_session *sess,
-			char *share_name)
+			const char *share_name)
 {
 	struct ksmbd_tree_conn_status status = {-ENOENT, NULL};
 	struct ksmbd_tree_connect_response *resp = NULL;
diff --git a/fs/ksmbd/mgmt/tree_connect.h b/fs/ksmbd/mgmt/tree_connect.h
index 71e50271dccf..0f97ddc1e39c 100644
--- a/fs/ksmbd/mgmt/tree_connect.h
+++ b/fs/ksmbd/mgmt/tree_connect.h
@@ -42,7 +42,7 @@  struct ksmbd_session;
 
 struct ksmbd_tree_conn_status
 ksmbd_tree_conn_connect(struct ksmbd_conn *conn, struct ksmbd_session *sess,
-			char *share_name);
+			const char *share_name);
 
 int ksmbd_tree_conn_disconnect(struct ksmbd_session *sess,
 			       struct ksmbd_tree_connect *tree_conn);
diff --git a/fs/ksmbd/misc.c b/fs/ksmbd/misc.c
index df991107ad2c..8316e2bf926d 100644
--- a/fs/ksmbd/misc.c
+++ b/fs/ksmbd/misc.c
@@ -7,6 +7,7 @@ 
 #include <linux/kernel.h>
 #include <linux/xattr.h>
 #include <linux/fs.h>
+#include <linux/unicode.h>
 
 #include "misc.h"
 #include "smb_common.h"
@@ -226,26 +227,54 @@  void ksmbd_conv_path_to_windows(char *path)
 	strreplace(path, '/', '\\');
 }
 
+static char *casefold_sharename(struct unicode_map *um, const char *name)
+{
+	char *cf_name;
+	int cf_len;
+
+	cf_name = kzalloc(KSMBD_REQ_MAX_SHARE_NAME, GFP_KERNEL);
+	if (!cf_name)
+		return ERR_PTR(-ENOMEM);
+
+	if (IS_ENABLED(CONFIG_UNICODE)) {
+		const struct qstr q_name = {.name = name, .len = strlen(name)};
+
+		if (!um)
+			goto out_ascii;
+
+		cf_len = utf8_casefold(um, &q_name, cf_name,
+				       KSMBD_REQ_MAX_SHARE_NAME);
+		if (cf_len < 0)
+			goto out_ascii;
+
+		return cf_name;
+	}
+
+out_ascii:
+	cf_len = strscpy(cf_name, name, KSMBD_REQ_MAX_SHARE_NAME);
+	if (cf_len < 0)
+		return ERR_PTR(-E2BIG);
+
+	for (; *cf_name; ++cf_name)
+		*cf_name = isascii(*cf_name) ? tolower(*cf_name) : *cf_name;
+	return cf_name - cf_len;
+}
+
 /**
  * ksmbd_extract_sharename() - get share name from tree connect request
  * @treename:	buffer containing tree name and share name
  *
  * Return:      share name on success, otherwise error
  */
-char *ksmbd_extract_sharename(char *treename)
+char *ksmbd_extract_sharename(struct unicode_map *um, const char *treename)
 {
-	char *name = treename;
-	char *dst;
-	char *pos = strrchr(name, '\\');
+	const char *name = treename, *pos = strrchr(name, '\\');
 
 	if (pos)
 		name = (pos + 1);
 
 	/* caller has to free the memory */
-	dst = kstrdup(name, GFP_KERNEL);
-	if (!dst)
-		return ERR_PTR(-ENOMEM);
-	return dst;
+	return casefold_sharename(um, name);
 }
 
 /**
diff --git a/fs/ksmbd/misc.h b/fs/ksmbd/misc.h
index aae2a252945f..fbb7db560a28 100644
--- a/fs/ksmbd/misc.h
+++ b/fs/ksmbd/misc.h
@@ -20,7 +20,7 @@  int get_nlink(struct kstat *st);
 void ksmbd_conv_path_to_unix(char *path);
 void ksmbd_strip_last_slash(char *path);
 void ksmbd_conv_path_to_windows(char *path);
-char *ksmbd_extract_sharename(char *treename);
+char *ksmbd_extract_sharename(struct unicode_map *um, const char *treename);
 char *convert_to_unix_name(struct ksmbd_share_config *share, const char *name);
 
 #define KSMBD_DIR_INFO_ALIGNMENT	8
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
index 19412ac701a6..62a8da520810 100644
--- a/fs/ksmbd/smb2pdu.c
+++ b/fs/ksmbd/smb2pdu.c
@@ -1883,7 +1883,7 @@  int smb2_tree_connect(struct ksmbd_work *work)
 		goto out_err1;
 	}
 
-	name = ksmbd_extract_sharename(treename);
+	name = ksmbd_extract_sharename(conn->um, treename);
 	if (IS_ERR(name)) {
 		status.ret = KSMBD_TREE_CONN_STATUS_ERROR;
 		goto out_err1;
diff --git a/fs/ksmbd/unicode.h b/fs/ksmbd/unicode.h
index 5593024230ae..076f6034a789 100644
--- a/fs/ksmbd/unicode.h
+++ b/fs/ksmbd/unicode.h
@@ -24,6 +24,7 @@ 
 #include <asm/byteorder.h>
 #include <linux/types.h>
 #include <linux/nls.h>
+#include <linux/unicode.h>
 
 #define  UNIUPR_NOLOWER		/* Example to not expand lower case tables */
 
@@ -69,7 +70,7 @@  char *smb_strndup_from_utf16(const char *src, const int maxlen,
 			     const struct nls_table *codepage);
 int smbConvertToUTF16(__le16 *target, const char *source, int srclen,
 		      const struct nls_table *cp, int mapchars);
-char *ksmbd_extract_sharename(char *treename);
+char *ksmbd_extract_sharename(struct unicode_map *um, const char *treename);
 #endif
 
 /*