From patchwork Thu Dec 6 23:08:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gabriel Krisman Bertazi
X-Patchwork-Id: 10717199
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8EE7D15A6
	for ; Thu, 6 Dec 2018 23:09:53 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7FAFE2F205
	for ; Thu, 6 Dec 2018 23:09:53 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 747812F21D; Thu, 6 Dec 2018 23:09:53 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_HI,UNPARSEABLE_RELAY autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0FEA02F205
	for ; Thu, 6 Dec 2018 23:09:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726335AbeLFXJv (ORCPT ); Thu, 6 Dec 2018 18:09:51 -0500
Received: from bhuna.collabora.co.uk ([46.235.227.227]:56108 "EHLO
	bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1726322AbeLFXJt (ORCPT );
	Thu, 6 Dec 2018 18:09:49 -0500
Received: from [127.0.0.1] (localhost [127.0.0.1])
	(Authenticated sender: krisman)
	with ESMTPSA id CFE2927ED58
From: Gabriel Krisman Bertazi
To: tytso@mit.edu
Cc: linux-fsdevel@vger.kernel.org, kernel@collabora.com,
	linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi
Subject: [PATCH v4 11/23] nls: ascii: Support validation and normalization operations
Date: Thu, 6 Dec 2018 18:08:51 -0500
Message-Id: <20181206230903.30011-12-krisman@collabora.com>
X-Mailer: git-send-email 2.20.0.rc2
In-Reply-To:
 <20181206230903.30011-1-krisman@collabora.com>
References: <20181206230903.30011-1-krisman@collabora.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Gabriel Krisman Bertazi

Validation is trivial. Any byte that has the MSB set is an invalid
sequence.

Casefold can be implemented with uppercase or lowercase, and we have no
specification on that. Callers should be safe using either of them, as
long as it doesn't change.

Signed-off-by: Gabriel Krisman Bertazi
---
 fs/nls/nls_ascii.c  | 50 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/nls.h |  8 ++++++++
 2 files changed, 58 insertions(+)

diff --git a/fs/nls/nls_ascii.c b/fs/nls/nls_ascii.c
index 2f4826478d3d..079a1574c19d 100644
--- a/fs/nls/nls_ascii.c
+++ b/fs/nls/nls_ascii.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 static const wchar_t charset2uni[256] = {
 	/* 0x00*/
@@ -117,6 +118,8 @@ static const unsigned char charset2upper[256] = {
 	0x58, 0x59, 0x5a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, /* 0x78-0x7f */
 };
 
+#define VALID_ASCII(c) (c < 128)
+
 static int uni2char(wchar_t uni, unsigned char *out, int boundlen)
 {
 	const unsigned char *uni2charset;
@@ -142,6 +145,16 @@ static int char2uni(const unsigned char *rawstring, int boundlen, wchar_t *uni)
 	return 1;
 }
 
+static int ascii_validate(const struct nls_table *table,
+			  const unsigned char *str, size_t len)
+{
+	int i;
+	for (i = 0; i < len && str[i]; i++)
+		if (!VALID_ASCII(str[i]))
+			return -1;
+	return 0;
+}
+
 static unsigned char charset_tolower(const struct nls_table *table,
 				     unsigned int c){
 	return charset2lower[c];
@@ -152,11 +165,36 @@ static unsigned char charset_toupper(const struct nls_table *table,
 	return charset2upper[c];
 }
 
+static int ascii_casefold(const struct nls_table *charset,
+			  const unsigned char *str, size_t len,
+			  unsigned char *dest, size_t dlen)
+{
+	unsigned int i;
+
+	if (dlen <= len)
+		return -EINVAL;
+
+	for (i = 0; i < len; i++) {
+		if (IS_STRICT_MODE(charset) && !VALID_ASCII(str[i]))
+			return -EINVAL;
+
+		if (IS_CASEFOLD_TYPE_ASCII_TOLOWER(charset))
+			dest[i] = charset_tolower(charset, str[i]);
+		else
+			dest[i] = charset_toupper(charset, str[i]);
+	}
+	dest[len] = '\0';
+
+	return len;
+}
+
 static const struct nls_ops charset_ops = {
+	.validate = ascii_validate,
 	.lowercase = charset_toupper,
 	.uppercase = charset_tolower,
 	.uni2char = uni2char,
 	.char2uni = char2uni,
+	.casefold = ascii_casefold,
 };
 
 static struct nls_charset nls_charset;
@@ -165,9 +203,21 @@ static struct nls_table table = {
 	.ops = &charset_ops,
 };
 
+struct nls_table *ascii_load_table(const char *version, unsigned int flags)
+{
+	if (flags & ~(NLS_STRICT_MODE) ||
+	    (flags & NLS_NORMALIZATION_TYPE_MASK) != NLS_NORMALIZATION_TYPE_PLAIN)
+		return ERR_PTR(-EINVAL);
+
+	table.flags = flags;
+	return &table;
+}
+
+
 static struct nls_charset nls_charset = {
 	.charset = "ascii",
 	.tables = &table,
+	.load_table = ascii_load_table,
 };
 
 static int __init init_nls_ascii(void)
diff --git a/include/linux/nls.h b/include/linux/nls.h
index 44a06a9c69e7..aab60d4858ee 100644
--- a/include/linux/nls.h
+++ b/include/linux/nls.h
@@ -178,6 +178,14 @@ IS_CASEFOLD_TYPE_##charset##_##type(const struct nls_table *c) \
 NLS_NORMALIZATION_FUNCS(ALL, PLAIN, NLS_NORMALIZATION_TYPE_PLAIN)
 NLS_CASEFOLD_FUNCS(ALL, TOUPPER, NLS_CASEFOLD_TYPE_TOUPPER)
 
+/* ASCII */
+
+#define NLS_ASCII_CASEFOLD_TOUPPER NLS_CASEFOLD_TYPE_TOUPPER
+#define NLS_ASCII_CASEFOLD_TOLOWER NLS_CASEFOLD_TYPE(1)
+
+NLS_CASEFOLD_FUNCS(ASCII, TOUPPER, NLS_ASCII_CASEFOLD_TOUPPER)
+NLS_CASEFOLD_FUNCS(ASCII, TOLOWER, NLS_ASCII_CASEFOLD_TOLOWER)
+
 /* nls_base.c */
 extern int __register_nls(struct nls_charset *, struct module *);
 extern int unregister_nls(struct nls_charset *);
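
For readers who want to try the semantics outside the kernel, the
validation and casefold rules described in the commit message can be
sketched in plain userspace C roughly as follows. This is only an
illustration under stated assumptions, not the kernel code: the names
sketch_validate, sketch_casefold and enum fold_dir are hypothetical, the
strict flag stands in for IS_STRICT_MODE(), and the lookup tables are
replaced by range checks. The sketch also assumes the destination buffer
must have room for the terminating NUL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the casefold type carried in the table flags. */
enum fold_dir { FOLD_TOUPPER, FOLD_TOLOWER };

/* A byte is valid ASCII when its most significant bit is clear. */
int valid_ascii(unsigned char c)
{
	return c < 128;
}

/*
 * Return 0 if every byte up to len (or up to the first NUL) is ASCII,
 * -1 otherwise.
 */
int sketch_validate(const unsigned char *str, size_t len)
{
	size_t i;

	assert(str != NULL);
	for (i = 0; i < len && str[i]; i++)
		if (!valid_ascii(str[i]))
			return -1;
	return 0;
}

/*
 * Fold len bytes of str into dest and NUL-terminate the result.
 * Returns len on success. Returns -EINVAL if dest cannot hold
 * len + 1 bytes, or if strict is set and a non-ASCII byte is found.
 */
int sketch_casefold(const unsigned char *str, size_t len,
		    unsigned char *dest, size_t dlen,
		    enum fold_dir dir, int strict)
{
	size_t i;

	assert(str != NULL && dest != NULL);
	if (dlen <= len)
		return -EINVAL;

	for (i = 0; i < len; i++) {
		unsigned char c = str[i];

		if (strict && !valid_ascii(c))
			return -EINVAL;

		/* Only the 26 ASCII letters change; other bytes pass through. */
		if (dir == FOLD_TOLOWER && c >= 'A' && c <= 'Z')
			c += 'a' - 'A';
		else if (dir == FOLD_TOUPPER && c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		dest[i] = c;
	}
	dest[len] = '\0';

	return (int)len;
}
```

Whichever direction is chosen, two names compare equal after folding as
long as the same direction is used for both, which is the property the
commit message relies on.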