From patchwork Wed Apr 29 17:25:53 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton <jlayton@redhat.com>
X-Patchwork-Id: 20768
From: Jeff Layton <jlayton@redhat.com>
To: linux-cifs-client@lists.samba.org
Date: Wed, 29 Apr 2009 13:25:53 -0400
Message-Id: <1241025962-14370-4-git-send-email-jlayton@redhat.com>
In-Reply-To: <1241025962-14370-1-git-send-email-jlayton@redhat.com>
References: <1241025962-14370-1-git-send-email-jlayton@redhat.com>
Cc: smfrench@gmail.com, sjayaraman@suse.de, eugene@redhat.com
Subject: [linux-cifs-client] [PATCH 03/12] cifs: add replacement for
	cifs_strfromUCS_le called cifs_from_ucs2le

Add a replacement function for cifs_strfromUCS_le. cifs_from_ucs2le
takes args for the source and destination buffer lengths so that we can
ensure that the conversion stays within the intended buffers.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/cifs/cifs_unicode.c |  126 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/cifs/cifs_unicode.h |    2 +
 2 files changed, 128 insertions(+), 0 deletions(-)
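
For anyone who wants to poke at the bounds handling outside the kernel,
here is a rough userspace sketch of the same idea. It is not the code in
this patch: it always converts into a scratch buffer and copies (no
"safelen" fast path), it assumes a 1-byte null terminator rather than
calling nls_nullsize(), and it skips the mapchars remapping. The names
sketch_uni2char, sketch_le16, sketch_from_ucs2le and MAX_CHARSET_SIZE
are made up for illustration, with a UTF-8 encoder standing in for
cp->uni2char.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stands in for NLS_MAX_CHARSET_SIZE. */
#define MAX_CHARSET_SIZE 6

/*
 * Hypothetical stand-in for cp->uni2char(): encode one UCS-2 code unit
 * as UTF-8. Returns the number of bytes written (1 to 3).
 */
static int sketch_uni2char(uint16_t uni, char *out)
{
	if (uni < 0x80) {
		out[0] = (char)uni;
		return 1;
	}
	if (uni < 0x800) {
		out[0] = (char)(0xC0 | (uni >> 6));
		out[1] = (char)(0x80 | (uni & 0x3F));
		return 2;
	}
	out[0] = (char)(0xE0 | (uni >> 12));
	out[1] = (char)(0x80 | ((uni >> 6) & 0x3F));
	out[2] = (char)(0x80 | (uni & 0x3F));
	return 3;
}

/* Read one little-endian 16-bit value, independent of host endianness. */
static uint16_t sketch_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Convert at most fromlen bytes of UCS-2LE into 'to', never writing more
 * than tolen bytes. The result is always null terminated. Returns the
 * number of bytes written, including the terminator.
 */
static int sketch_from_ucs2le(char *to, const unsigned char *from,
			      int tolen, int fromlen)
{
	int i, charlen;
	int outlen = 0;
	int fromwords = fromlen / 2;
	char tmp[MAX_CHARSET_SIZE];

	if (tolen < 1)
		return 0;

	for (i = 0; i < fromwords; i++) {
		uint16_t uni = sketch_le16(from + 2 * i);

		if (uni == 0)
			break;

		/* convert into scratch space first, then check the fit */
		charlen = sketch_uni2char(uni, tmp);
		if (outlen + charlen > tolen - 1)
			break;

		memcpy(to + outlen, tmp, (size_t)charlen);
		outlen += charlen;
	}

	to[outlen++] = '\0';
	return outlen;
}
```

With a 4-byte destination and the source "a" followed by U+20AC, the
sketch stops after "a": the 3-byte encoding of the second character
would not leave room for the terminator, so it is dropped whole rather
than split. That is the same guarantee the patch aims for with its
tolen/nullsize check.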

diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
index 7d75272..f3fdf59 100644
--- a/fs/cifs/cifs_unicode.c
+++ b/fs/cifs/cifs_unicode.c
@@ -26,6 +26,132 @@
 #include "cifs_debug.h"
 
 /*
+ * cifs_mapchar - convert a little-endian char to proper char in codepage
+ * @target - where converted character should be copied
+ * @src_char - 2 byte little-endian source character
+ * @cp - codepage to which character should be converted
+ * @mapchar - should character be mapped according to mapchars mount option?
+ *
+ * This function handles the conversion of a single character. It is the
+ * responsibility of the caller to ensure that the target buffer is large
+ * enough to hold the result of the conversion (at least NLS_MAX_CHARSET_SIZE).
+ */
+static int
+cifs_mapchar(char *target, const __le16 src_char, const struct nls_table *cp,
+	     bool mapchar)
+{
+	int len = 1;
+
+	if (!mapchar)
+		goto cp_convert;
+
+	/*
+	 * BB: Cannot handle remapping UNI_SLASH until all the calls to
+	 *     build_path_from_dentry are modified, as they use slash as
+	 *     separator.
+	 */
+	switch (le16_to_cpu(src_char)) {
+	case UNI_COLON:
+		*target = ':';
+		break;
+	case UNI_ASTERIK:
+		*target = '*';
+		break;
+	case UNI_QUESTION:
+		*target = '?';
+		break;
+	case UNI_PIPE:
+		*target = '|';
+		break;
+	case UNI_GRTRTHAN:
+		*target = '>';
+		break;
+	case UNI_LESSTHAN:
+		*target = '<';
+		break;
+	default:
+		goto cp_convert;
+	}
+
+out:
+	return len;
+
+cp_convert:
+	len = cp->uni2char(le16_to_cpu(src_char), target,
+			   NLS_MAX_CHARSET_SIZE);
+	if (len <= 0) {
+		*target = '?';
+		len = 1;
+	}
+	goto out;
+}
+
+/*
+ * cifs_from_ucs2le - convert utf16le string to local charset
+ * @to - destination buffer
+ * @from - source buffer
+ * @tolen - destination buffer size (in bytes)
+ * @fromlen - source buffer size (in bytes)
+ * @codepage - codepage to which characters should be converted
+ * @mapchar - should characters be remapped according to the mapchars option?
+ *
+ * Convert a UCS-2LE string (as sent by the server) to a string in the
+ * provided codepage. The tolen and fromlen parameters are to ensure
+ * that the code doesn't walk off of the end of the buffer (which is always
+ * a danger if the alignment of the source buffer is off). The destination
+ * string is always properly null terminated and fits in the destination
+ * buffer. Returns the length of the destination string in bytes (including
+ * null terminator).
+ *
+ * Note that some Windows versions actually send multiword UTF-16 characters
+ * instead of straight UCS-2. The Linux NLS routines, however, aren't able to
+ * deal with those characters properly. In the event that we get some of
+ * those characters, they won't be translated properly.
+ */
+int
+cifs_from_ucs2le(char *to, const __le16 *from, int tolen, int fromlen,
+		 const struct nls_table *codepage, bool mapchar)
+{
+	int i, charlen, safelen;
+	int outlen = 0;
+	int nullsize = nls_nullsize(codepage);
+	int fromwords = fromlen / 2;
+	char tmp[NLS_MAX_CHARSET_SIZE];
+
+	/*
+	 * Because the chars can be of varying widths, we need to take care
+	 * not to overflow the destination buffer when we get close to the
+	 * end of it. Until we get to this offset, however, we don't need to
+	 * check for overflow.
+	 */
+	safelen = tolen - (NLS_MAX_CHARSET_SIZE + nullsize);
+
+	for (i = 0; i < fromwords && from[i]; i++) {
+		/*
+		 * check to see if converting this character might make the
+		 * conversion bleed into the null terminator
+		 */
+		if (outlen >= safelen) {
+			charlen = cifs_mapchar(tmp, from[i], codepage, mapchar);
+			if (charlen <= 0)
+				charlen = 1;
+			if ((outlen + charlen) > (tolen - nullsize))
+				break;
+		}
+
+		/* put converted char into 'to' buffer */
+		charlen = cifs_mapchar(&to[outlen], from[i], codepage, mapchar);
+		outlen += charlen;
+	}
+
+	/* properly null-terminate string */
+	for (i = 0; i < nullsize; i++)
+		to[outlen++] = 0;
+
+	return outlen;
+}
+
+/*
  * NAME:	cifs_strfromUCS()
  *
  * FUNCTION:	Convert little-endian unicode string to character string
diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
index d6fe8ec..51cdfb1 100644
--- a/fs/cifs/cifs_unicode.h
+++ b/fs/cifs/cifs_unicode.h
@@ -72,6 +72,8 @@ extern struct UniCaseRange UniLowerRange[];
 #endif				/* UNIUPR_NOLOWER */
 
 #ifdef __KERNEL__
+int cifs_from_ucs2le(char *to, const __le16 *from, int tolen, int fromlen,
+		     const struct nls_table *codepage, bool mapchar);
 int cifs_strfromUCS_le(char *, const __le16 *, int, const struct nls_table *);
 int cifs_strtoUCS(__le16 *, const char *, int, const struct nls_table *);
 #endif